Hi all,

I have to check large CSV text files (up to several GB) with Xpressive. As far as I can see, the only way to do this is to load the whole file into a std::string and hand it over to Xpressive. I don't want to do that because my files are too big.

So, how can I open a CSV file in text mode, get bidirectional iterators and hand them over to Xpressive?

Regards
Goran
On 8/11/2014 5:27 AM, Goran wrote:
> Hi all,
> I have to check large CSV text files (up to several GB) with Xpressive.
> As far as I can see, the only way to do this is to load the whole file into a std::string and hand it over to Xpressive. I don't want to do that because my files are too big.
> So, how can I open a CSV file in text mode, get bidirectional iterators and hand them over to Xpressive?
The question isn't really specific to xpressive. xpressive will work with any bidirectional iterators. The hard part is implementing a bidirectional iterator that pages in sections of a text file on demand.

I'm not aware of such a beast in Boost. If I were to implement one, I'd probably start by looking at the existing support for memory mapped files in Boost.Interprocess.

Good luck.

--
Eric Niebler
Boost.org
http://www.boost.org
On Wed, Aug 13, 2014 at 7:18 PM, Eric Niebler wrote:
> On 8/11/2014 5:27 AM, Goran wrote:
>> So, how can I open a CSV file in text mode, get bidirectional iterators and hand them over to Xpressive?
> The hard part is implementing a bidirectional iterator that pages in sections of a text file on demand.
> I'm not aware of such a beast in Boost. If I were to implement one, I'd probably start by looking at the existing support for memory mapped files in Boost.Interprocess.
You might also take a look at multi-pass iterators from Spirit: http://www.boost.org/doc/libs/1_56_0/libs/spirit/doc/html/spirit/support/mul...
On 8/13/14 8:16 PM, Nat Goodspeed wrote:
> On Wed, Aug 13, 2014 at 7:18 PM, Eric Niebler wrote:
>> On 8/11/2014 5:27 AM, Goran wrote:
>>> So, how can I open a CSV file in text mode, get bidirectional iterators and hand them over to Xpressive?
>> The hard part is implementing a bidirectional iterator that pages in sections of a text file on demand.
>> I'm not aware of such a beast in Boost. If I were to implement one, I'd probably start by looking at the existing support for memory mapped files in Boost.Interprocess.
> You might also take a look at multi-pass iterators from Spirit: http://www.boost.org/doc/libs/1_56_0/libs/spirit/doc/html/spirit/support/mul...
Unless used with Spirit's expectation points, multi-pass will suffer from the same issue raised by the OP, since the data is buffered in memory (in a std::vector, IIRC) as it is read from the file.

Jeff
On 8/15/14, 8:09 AM, Jeff Flinn wrote:
> On 8/13/14 8:16 PM, Nat Goodspeed wrote:
>> On Wed, Aug 13, 2014 at 7:18 PM, Eric Niebler wrote:
>>> On 8/11/2014 5:27 AM, Goran wrote:
>>>> So, how can I open a CSV file in text mode, get bidirectional iterators and hand them over to Xpressive?
>>> The hard part is implementing a bidirectional iterator that pages in sections of a text file on demand.
>>> I'm not aware of such a beast in Boost. If I were to implement one, I'd probably start by looking at the existing support for memory mapped files in Boost.Interprocess.
>> You might also take a look at multi-pass iterators from Spirit: http://www.boost.org/doc/libs/1_56_0/libs/spirit/doc/html/spirit/support/mul...
> Unless used with spirit's expectation points, multi-pass will suffer the same issue raised by the OP as the data is stored in memory as it's read from file via a std::vector IIRC.
A little-known utility in Spirit Classic is the file_iterator:

http://www.boost.org/doc/libs/1_39_0/boost/spirit/home/classic/iterator/file...

It uses memory mapping whenever available.

Regards,
--
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/
On 08/14/2014 01:18 AM, Eric Niebler wrote:
> I'm not aware of such a beast in Boost. If I were to implement one, I'd probably start by looking at the existing support for memory mapped files in Boost.Interprocess.
http://www.boost.org/libs/iostreams/doc/classes/mapped_file.html
participants (6)
- Bjorn Reese
- Eric Niebler
- Goran
- Jeff Flinn
- Joel de Guzman
- Nat Goodspeed