Reading a file in reverse using boost::iostreams
Hi,

Is there an easy way to read a file in reverse using boost::iostreams? I've got a case where I need to detect whether some text is present, and it's closer to the end of the file than the beginning. Any help appreciated.

Kind regards,
Sean
Reading in reverse is likely to be much slower because of buffering, so I doubt it will give you the performance gain you seem to be looking for.

Craig
On 13/02/2019 05:33, Sean Farrow wrote:
Is there an easy way to read a file in reverse using boost::iostreams?
I’ve got a case where I need to detect whether text is present and it’s closer to the end of the file than the beginning.
You should be able to read the length of the stream, then seek to a position near the end and read forwards from there. Of course, you need to know a suitable value to use as the range where you expect the value to be present; if you get this wrong then you'll either have a false negative or you'll waste a bit more time jumping back further and trying again.
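The seek-and-read-forwards approach described above can be sketched roughly as follows. This is only an illustration using plain std::ifstream rather than boost::iostreams; the helper name tail_contains and the tail_bytes window parameter are hypothetical and not taken from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns true if `needle` occurs within the last
// `tail_bytes` bytes of the file at `path`. Returns false if the file
// cannot be opened.
bool tail_contains(const std::string& path, const std::string& needle,
                   std::size_t tail_bytes)
{
    assert(!needle.empty());

    // Open positioned at the end so tellg() reports the file length.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;

    const std::streamoff size = in.tellg();
    if (size < 0) return false;

    // Clamp the window so we never seek before the start of the file.
    const std::streamoff want = static_cast<std::streamoff>(tail_bytes);
    const std::streamoff start = size > want ? size - want : 0;

    // Seek near the end, then read forwards to the end of the file.
    in.seekg(start, std::ios::beg);
    std::string tail(static_cast<std::size_t>(size - start), '\0');
    in.read(&tail[0], static_cast<std::streamsize>(tail.size()));
    tail.resize(static_cast<std::size_t>(in.gcount()));

    return tail.find(needle) != std::string::npos;
}
```

If the first window misses, the caller can retry with a larger tail_bytes, which is the "jumping back further and trying again" cost mentioned above.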
My solution would be:

1. memory map the file (either use boost.interprocess or trivially hand-roll a few OS calls)
2. build an iterator pair (i.e. char *) representing the extent of the mapped memory
3. call std::make_reverse_iterator on the iterator pair
4. use a standard algorithm

Richard Hodges
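Steps 2 to 4 of the memory-mapping suggestion might look something like the sketch below. It assumes the mapping has already been made and that first and last are char pointers to its extent (with boost.interprocess these could be derived from a mapped_region's get_address() and get_size()); here any in-memory buffer stands in for the mapping. The function name find_from_end is hypothetical, and std::make_reverse_iterator requires C++14.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: searches [first, last) backwards and returns the
// forward offset of the occurrence of `needle` nearest the end, or -1.
std::ptrdiff_t find_from_end(const char* first, const char* last,
                             const std::string& needle)
{
    // Step 3: wrap the pointer pair in reverse iterators.
    auto rfirst = std::make_reverse_iterator(last);
    auto rlast = std::make_reverse_iterator(first);

    // Step 4: a standard algorithm. The data is walked backwards, so the
    // pattern has to be presented backwards too (rbegin/rend).
    auto hit = std::search(rfirst, rlast, needle.rbegin(), needle.rend());
    if (hit == rlast) return -1;

    // hit.base() points one past the end of the match in forward order.
    return (hit.base() - first) - static_cast<std::ptrdiff_t>(needle.size());
}
```

For example, calling it on the buffer "abc needle xyz needle end" with the pattern "needle" yields the offset of the second occurrence.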
On 13/02/2019 04:43, Richard Hodges via Boost-users wrote:
My solution would be:
1. memory map the file (either use boost.interprocess or trivially hand-roll a few OS calls)
2. build an iterator pair (i.e. char *) representing the extent of the mapped memory
3. call std::make_reverse_iterator on the iterator pair
4. use a standard algorithm
Unless the file is warm cached, this will be slow. I know of no kernel which performs read-behind, only read-ahead. It is safer to do as Gavin suggests: jump to some offset back from the maximum extent and read forwards.

Niall
On 14/02/2019 09:15, Niall Douglas wrote:
Unless the file is warm cached, this will be slow. I know of no kernel which performs readbehind, only readahead.
I imagine what would happen is that it would either read the whole file into memory at once (presumably only if it's small), which would then be fast to iterate but not really any better than just reading it normally, or it would reserve pages and then, when you started reading from the end, commit a page or two read from the end of the file. It would then be reasonably fast reading forwards or backwards until you cross a page boundary. So it may not be a problem if your target is within the last 4kB or so of the file.

Having said that, by definition this can't really be any faster than doing what I suggested (modulo some issues with page sizes and alignments). And if you use a reverse iterator, it also requires you to recognise your search pattern in reverse, which is usually inconvenient.
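One possible way around having to reverse the pattern, once the tail of the file is in memory, is to let std::find_end locate the last occurrence with the pattern in its normal order. This is a sketch only; the name last_occurrence is hypothetical, and how a given standard library walks the range internally is not something the thread discusses.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the offset of the last occurrence of
// `needle` in [first, last), or -1 if it is absent or empty.
std::ptrdiff_t last_occurrence(const char* first, const char* last,
                               const std::string& needle)
{
    if (needle.empty()) return -1;

    // std::find_end takes the pattern in forward order and returns `last`
    // when there is no match.
    const char* hit = std::find_end(first, last, needle.begin(), needle.end());
    return hit == last ? -1 : hit - first;
}
```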
participants (5)
- Craig Henderson
- Gavin Lambert
- Niall Douglas
- Richard Hodges
- Sean Farrow