using regex for in-place string modification
I want to use regex to modify a string in place. In particular, I've read the contents of a file into a string, and now I want to do some search/replace operations on the string, then write the modified string back out to the original file. I've looked through the regex documentation, including the October 2001 DDJ article, but I don't see any way to modify a string in place. From what I can tell, regex_merge looks like it does the kind of thing I want, except it doesn't seem to modify in place. I'd prefer to avoid creating two strings of each file's contents, one before regex processing, one after. Is there functionality in regex that will let me modify a string in place?
Thanks,
Scott
My free Regular Expression Component Library ( http://www.tropicsoft.com/Components/RegularExpression ), which uses regex++ under the covers, can do in-place modification of a file.
Minuses: the source is not available, it works only on Windows, it uses regex++ from Boost 1.28 rather than the latest version, and it supports only BCB3, 4, 5, 6 and VC6, 7, 7.1.
Pluses: it adds functionality, and I support it fully. It works solidly, though I give much of the credit for that to John Maddock. You don't need to download the Boost regex++ files for it to work, since it is self-contained.
I modified the regex++ source to do the in-place modification of a file using the same technique that the library uses for searching through files under Windows, namely the Windows file mapping API. As you have noticed, regex_merge sends its output to a separate string or to an output iterator. Maybe John Maddock can suggest something for the regex++ implementation itself that will solve your problem.
Are you sure you can't use http://www.boost.org/libs/regex/template_class_ref.htm#partial_matches to solve your problem? By the way, repeated in-place modification, especially if you start from the front of the string, is O(N^2), because every replacement that changes the length has to shift all the characters after it. You really might be better off doing this into a separate string. Don't forget that you could end up with the same instantaneous memory usage anyway, because when a string grows it has to reallocate its buffer. Text editors normally use something called a "gap buffer" to make this sort of thing more efficient, but you really have to have a very specialized application before that's the best solution.
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com
I have read the documentation and have seen that there is an add_vertex method that allows a vertex to be added at the same time that a vertex property is set. Unfortunately the docs do not make it clear to me exactly what type I'm supposed to pass as the vertex property.
If I have code like the following, where I get a vertex property map from a graph, add the vertex, and then set the property:
    member_names_t member_names = boost::get(boost::vertex_name_t(), __graph);
    member_vertex_t vertex = boost::add_vertex(__graph);
    boost::put(member_names, vertex, name);
How can I edit this code to use the add_vertex call from the MutablePropertyGraph, à la:
    // how to express vp?
    member_vertex_t vertex = boost::add_vertex(vp, __graph);
--aj
Hi AJ,
The file example/transitive_closure.cpp contains an example of how to do this.
Cheers,
Jeremy
On Fri, 12 Dec 2003, AJ wrote:
ajames> I have read the documentation and have seen that there is an add_vertex
ajames> method that allows a vertex to be added at the same time that a vertex
ajames> property is set. Unfortunately the docs do not make it clear to me
ajames> exactly what type I'm supposed to pass as the vertex property.
ajames>
ajames> If I have code like the following where I get a vertex property map from
ajames> a graph, add the vertex, and then set the property:
ajames>
ajames> member_names_t member_names = boost::get(boost::vertex_name_t(),
ajames> __graph);
ajames> member_vertex_t vertex = boost::add_vertex(__graph);
ajames> boost::put(member_names, vertex, name);
ajames>
ajames> How can I edit this code to using the add_vertex call from the
ajames> MutablePropertyGraph ala:
ajames>
ajames> // how to express vp?
ajames> member_vertex_t vertex = boost::add_vertex(vp, __graph);
ajames>
ajames> --aj
ajames>
----------------------------------------------------------------------
 Jeremy Siek                          http://php.indiana.edu/~jsiek/
 Ph.D. Student, Indiana Univ. B'ton   email: jsiek@osl.iu.edu
 C++ Booster (http://www.boost.org)   office phone: (812) 856-1820
----------------------------------------------------------------------
Not directly, but you could search through the string repeatedly: find each match, replace the matched section with the result of regex_format, and restart the search from the appropriate place (just past the text you inserted).
John.
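The loop John describes can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses std::regex from the C++ standard library as a stand-in for regex++, with match_results::format playing the role of regex_format, and the function name replace_in_place is hypothetical. It also returns the number of matches replaced.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: replaces every match of `re` inside `text` itself and
// returns how many matches were replaced. `fmt` is an ECMAScript-style format
// string ($&, $1, $2, ...).
std::size_t replace_in_place(std::string& text, const std::regex& re,
                             const std::string& fmt)
{
    std::size_t count = 0;
    std::size_t pos = 0;  // where the next search starts
    std::smatch m;
    while (pos <= text.size()) {
        // After the first search, tell the engine that a character precedes
        // the start position, so ^ and \b do not treat `pos` as the
        // beginning of the text.
        const auto flags = pos == 0 ? std::regex_constants::match_default
                                    : std::regex_constants::match_prev_avail;
        if (!std::regex_search(text.cbegin() + static_cast<std::ptrdiff_t>(pos),
                               text.cend(), m, re, flags)) {
            break;
        }
        // Capture everything we need before modifying `text`, because the
        // modification invalidates the iterators stored in `m`.
        const std::size_t start = pos + static_cast<std::size_t>(m.position(0));
        const std::size_t length = static_cast<std::size_t>(m.length(0));
        const std::string replacement = m.format(fmt);

        text.replace(start, length, replacement);
        ++count;

        // Restart just past the inserted text; step one extra character
        // after an empty match so the loop always makes progress.
        pos = start + replacement.size() + (length == 0 ? 1 : 0);
    }
    return count;
}
```

Each call to text.replace may shift the rest of the string, so this keeps a single copy of the data at the cost of the quadratic behaviour Dave mentions when there are many length-changing replacements.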
On Sat, Dec 13, 2003 at 12:12:46PM -0000, John Maddock wrote:
Not directly, but you could:
search through the string repeatedly: find each match, replace the matched section with the result of regex_format, and restart the search from the appropriate place.
John.
Alternatively, you can get the string_algo library from the boost-sandbox. It includes an algorithm, replace_regex, that does in-place regex replacement.
Pavol
On Sat, 13 Dec 2003 12:12:46 -0000, John Maddock wrote:
Not directly, but you could:
search through the string repeatedly: find each match, replace the matched section with the result of regex_format, and restart the search from the appropriate place.
I got this working, but I'm curious about a design decision, or perhaps about a couple of related design decisions.
What I really wanted to do was a regex_merge, but I also needed to know how many matches had been detected. Is there a particular reason that regex_merge doesn't provide that information? It's there for the plucking, because regex_merge calls regex_grep (at least that's what the docs suggest), and regex_grep returns how many matches were found.
Second, in learning to use regex_grep, I was struck by what appears to be an asymmetry in the division of responsibility between the functor passed to regex_grep and regex_grep's caller. To perform a full search and replace over a string (say), the functor is responsible for copying everything from the beginning of the string to the end of the final match (performing replacements as it goes), but regex_grep's caller is responsible for copying everything from the end of the final match to the end of the string. This strikes me as really odd, but it's what's done in the DDJ article, so I assume that this is the right way to do it. Would it not have been reasonable for regex_grep to pass an additional bool to the functor indicating whether the call was not for a match but was instead for a final after-the-last-match call? That way the functor could have done all the copying.
Thanks for both the library and for any enlightenment you'd care to shed on the above.
Scott
What I really wanted to do was a regex_merge, but I also needed to know how many matches had been detected. Is there a particular reason that regex_merge doesn't provide that information? It's there for the plucking, because regex_merge calls regex_grep (at least that's what the docs suggest), and regex_grep returns how many matches were found.
I guess it just didn't quite fit with the interface - I happen to think that you're unusual in wanting this information, or at least you're the first to ask for it :-)
Second, in learning to use regex_grep, I was struck by what appears to be an asymmetry in the division of responsibility between the functor passed to regex_grep and regex_grep's caller. To perform a full search and replace over a string (say), the functor is responsible for copying everything from the beginning of the string to the end of the final match (performing replacements as it goes), but regex_grep's caller is responsible for copying everything from the end of the final match to the end of the string. This strikes me as really odd, but it's what's done in the DDJ article, so I assume that this is the right way to do it. Would it not have been reasonable for regex_grep to pass an additional bool to the functor indicating whether the call was not for a match but was instead for a final after-the-last-match call? That way the functor could have done all the copying.
Good point, but the last call to the functor then wouldn't be a match - and I think that this would cause as many if not more problems than it solves. In any case regex_grep is being deprecated in favor of an iterator based interface in the next revision. Sorry for the delay, John.
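To illustrate how an iterator-based interface sidesteps both of Scott's points, here is a minimal sketch. It assumes std::sregex_iterator from the C++ standard library as a stand-in for the interface John mentions, and merge_and_count is a hypothetical name, not a function from regex++. Because the caller owns the loop, counting matches and copying the text after the final match both happen in one place.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: returns the rewritten text together with the number
// of matches that were replaced. `fmt` is an ECMAScript-style format string.
std::pair<std::string, std::size_t>
merge_and_count(const std::string& text, const std::regex& re,
                const std::string& fmt)
{
    std::string out;
    out.reserve(text.size());
    std::size_t count = 0;
    auto tail = text.cbegin();  // end of the previous match

    for (std::sregex_iterator it(text.cbegin(), text.cend(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(tail, m[0].first);  // unmatched text before this match
        out += m.format(fmt);          // the replacement for this match
        tail = m[0].second;
        ++count;
    }
    out.append(tail, text.cend());     // text after the final match
    return {std::move(out), count};
}
```

This version builds the result in a second string, which is the trade-off discussed earlier in the thread: linear work in exchange for holding two copies of the text for a moment.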
participants (7)
- AJ
- David Abrahams
- Edward Diener
- Jeremy Siek
- John Maddock
- Pavol Droba
- Scott Meyers