Re: [boost] C++Now 2021 -- Library in a Week: Boosting Boost
On 2021-05-01 00:55, Jeff Garland via Boost wrote:
Every year at the end of the session, I solicit ideas from the community about what 'library' we should work on in the following year. One
Boost.Tokenizer could do with a modernization. It generates a lot of code for even simple tokenizations, which makes it unsuitable for embedded devices. Some issues to consider:
* Separate the browsing API from the conversion API.
** The browsing API should use string_view.
** The conversion API should use output buffers instead of returning a string.
* Provide an API with narrow contracts.
* Provide a constexpr API.
I haven't thought about tokenizer in a long time since string_algo is the
swiss army knife of strings -- but yeah, interesting -- it's small and well
contained. More thoughts inline:
On Sun, May 2, 2021 at 5:33 AM Bjorn Reese via Boost
On 2021-05-01 00:55, Jeff Garland via Boost wrote:
Every year at the end of the session, I solicit ideas from the community about what 'library' we should work on in the following year. One
Boost.Tokenizer could do with a modernization. It generates a lot of code for even simple tokenizations, which makes it unsuitable for embedded devices. Some issues to consider:
Is there something specific in the implementation other than using templates that causes this?
* Separate the browsing API from the conversion API.
Not sure I follow -- are the sub-bullets what you mean?
** The browsing API should use string_view.
For sure.
** The conversion API should use output buffers instead of returning a string.
* Provide an API with narrow contracts
* Provide a constexpr API.
Yes. In 2021 it seems like we should turn the whole thing into range based token_view that takes string_view and provides string_view of each token.
Jeff
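For illustration, the range-based token_view idea above could look roughly like the following. This is a minimal sketch assuming C++17 and a single-character delimiter. The names token_iterator, token_view and count_tokens, and the choice to keep empty tokens, are assumptions made for the example; none of them is part of Boost.Tokenizer or was proposed in the thread.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical sketch (not Boost.Tokenizer's API): walks a string_view and
// exposes each delimiter-separated token as a string_view into the input.
// Nothing is copied, allocated, or thrown.
class token_iterator {
public:
    // A default-constructed iterator acts as the end iterator.
    constexpr token_iterator() noexcept {}

    constexpr token_iterator(std::string_view input, char delim) noexcept
        : rest_(input), delim_(delim), exhausted_(false), done_(false) {
        advance();  // load the first token
    }

    constexpr std::string_view operator*() const noexcept { return token_; }

    constexpr token_iterator& operator++() noexcept {
        advance();
        return *this;
    }

    constexpr bool operator==(const token_iterator& other) const noexcept {
        if (done_ || other.done_) return done_ == other.done_;
        return token_.data() == other.token_.data();
    }
    constexpr bool operator!=(const token_iterator& other) const noexcept {
        return !(*this == other);
    }

private:
    constexpr void advance() noexcept {
        if (exhausted_) {  // the last token was already handed out
            done_ = true;
            return;
        }
        const std::size_t pos = rest_.find(delim_);
        if (pos == std::string_view::npos) {
            token_ = rest_;  // final token: everything that is left
            rest_ = std::string_view{};
            exhausted_ = true;
        } else {
            token_ = rest_.substr(0, pos);
            rest_.remove_prefix(pos + 1);  // skip the token and its delimiter
        }
    }

    std::string_view rest_{};   // input not yet consumed
    std::string_view token_{};  // current token
    char delim_ = '\0';
    bool exhausted_ = true;     // no input left after the current token
    bool done_ = true;          // iterator has moved past the last token
};

// Hypothetical range wrapper. Empty tokens are kept, so "a,,b" yields
// "a", "" and "b", and an empty input yields one empty token.
class token_view {
public:
    constexpr token_view(std::string_view input, char delim) noexcept
        : input_(input), delim_(delim) {}

    constexpr token_iterator begin() const noexcept {
        return token_iterator(input_, delim_);
    }
    constexpr token_iterator end() const noexcept { return token_iterator(); }

private:
    std::string_view input_;
    char delim_;
};

// Small helper showing that the range is usable in constant expressions.
constexpr std::size_t count_tokens(std::string_view input, char delim) noexcept {
    std::size_t n = 0;
    for (const std::string_view token : token_view(input, delim)) {
        (void)token;
        ++n;
    }
    return n;
}
```

Usage would be a plain range-for, e.g. for (std::string_view tok : token_view("a,b,c", ',')) { ... }. Because every member is constexpr and noexcept, the same loop can also run at compile time.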
On 2021-05-02 15:12, Jeff Garland wrote:
Boost.Tokenizer could do with a modernization. It generates a lot of code for even simple tokenizations, which makes it unsuitable for embedded devices. Some issues to consider:
Is there something specific in the implementation other than using templates that causes this?
The main culprit with regard to generated code size is the use of std::string and exceptions (with exception wrapping).
* Separate the browsing API from the conversion API.
Not sure I follow -- are the sub-bullets what you mean?
Yes. Specifically, I want to be able to iterate through an input and obtain a view of each entry without any conversions (e.g. to integers or unescaped strings) taking place. Conversion should only happen when I explicitly request it. I do not mind a more user-friendly high-level API that works like the existing API as long as I can reach for a low-level API when I need to.
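The split described above, browsing first and converting only on request, might be sketched as follows. This assumes C++17. The names next_field, to_unsigned and sum_fields are hypothetical and exist only for this example. The integer parser is hand-written so that it stays constexpr; at run time std::from_chars from <charconv> would be the more usual non-throwing choice.

```cpp
#include <cstddef>
#include <limits>
#include <string_view>

// Browsing step (hypothetical name): return the next field as a view and
// remove it, together with its delimiter, from `rest`. No conversion and
// no copying happens here.
constexpr std::string_view next_field(std::string_view& rest, char delim) noexcept {
    const std::size_t pos = rest.find(delim);
    if (pos == std::string_view::npos) {
        const std::string_view field = rest;
        rest = std::string_view{};
        return field;
    }
    const std::string_view field = rest.substr(0, pos);
    rest.remove_prefix(pos + 1);
    return field;
}

// Conversion step (hypothetical name), run only when the caller asks for it:
// parse decimal digits into the caller-provided `out`. Failure is reported
// through the return value instead of an exception, and `out` is left
// untouched on failure.
constexpr bool to_unsigned(std::string_view field, unsigned& out) noexcept {
    if (field.empty()) return false;
    unsigned value = 0;
    for (const char c : field) {
        if (c < '0' || c > '9') return false;
        const unsigned digit = static_cast<unsigned>(c - '0');
        // Reject input whose value would not fit in an unsigned.
        if (value > (std::numeric_limits<unsigned>::max() - digit) / 10) return false;
        value = value * 10 + digit;
    }
    out = value;
    return true;
}

// Example of combining the two steps: browse every comma-separated field,
// convert only the ones that parse, and add them up. Overflow of the running
// total is not checked in this sketch.
constexpr unsigned sum_fields(std::string_view line) noexcept {
    unsigned total = 0;
    while (!line.empty()) {
        const std::string_view field = next_field(line, ',');
        unsigned value = 0;
        if (to_unsigned(field, value)) total += value;
    }
    return total;
}
```

The browsing function never looks inside a field, so a caller that only needs to locate or compare fields pays nothing for parsing. Neither function uses std::string or exceptions, the two things named earlier in the thread as the main sources of code size.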
In 2021 it seems like we should turn the whole thing into range based token_view that takes string_view and provides string_view of each token.
Good idea.
participants (2)
- Bjorn Reese
- Jeff Garland