Dear All, The String Algorithm Library submission by Pavol Droba should be accepted into boost given that the issues that came up during the review are fixed. I know that my decision will please some and disapoint others, but I will try explain my decision later. I studied many former decisions made by review managers to see if I could find a pattaern of when to accept or reject a library. Rejection was done for these reasons: * too poor documentation * design does not meet enough user's needs * design was flawed When I summarize a decision on a certain issue, then I have read the whole thread, but sometimes the conclusion is not too easy to read from the discussion; so please say if you think I have made a mistake. Here's an overview of the rest of this mail: (1) Statistics (2) Critique of the library (3) Improvements that must be adressed before the inclusion of the library (1) Statistics -------------- Let my first show the statistics of this review: people doing a review: 7 people voting yes: 6 people voting no: 1 people with general comments: many In general it is understanding that most people payed most attention to the algortihms themselves; the traits, iterator_range, and find/formatter concepts and generating algorithms where not studied in detail. For future reference, I think it would be nice if people stated more explicitly what they tried and how long time they used on it. This is part of the checklist that I gave a link to in the mail that started the review. Many had wishes for extra algorithms and I think Pavol has them on his todo list. (2) Critique of the library -------------------------- In what follows I only discuss the major concerns during the review. (a) Two libraries are intertwined. On one side there a string algorithms and on the otherside support components for implementing these: container_traits, sequence_traits, iterator_range. This is indeed a relevant isssue that we must discuss. I think most agree that container_traits, sequence_traits and iterator_range (or half_open_range) should be a separate libraries. However, as long as the documentation clearly states that these sub-libs are just waiting for somebody to replace them, it shouldn't be any problem. On the contrary, the string library gives us a good starting point for evaluating the usefulness of these features, and I think we should thank Pavol for his courage to do this even though they might not be perfect yet. If we look at how other libraries evolve in boost, then they too change quite a bit as we get more experience with something. Examples like iterator_adaptors and regex come into mind. (b) Documentation is inadequate; imprecise. There was very different opinions on this. Some said the documentation was very good; others we're more pessimistic. Among those who actually did a review, most felt the docs where ok, but they did not mind more precise specifications and more examples. Some libraries have been rejected because of very poor documentation, but I don't think it would be fair to do it here. It counts that most reviewers were able to use the library without this extra documentation. The algorithms seems quite intuitive to use, and the docs wasn't really a show-stopper. Also, if we were to evaluate the documentation of other boost libraries just as harshly, they probably wouldn't make it into boost. Several other libraries are somewhat more demanding to use than the string algorithm library. (c) The library is more than a string library; ie. it will work with vector<foobar> even though it is not even close to a string. AFAICT, this is true for all functions not taking a locale parameter. When the review started, I mailed Craig Henderson because I know he has boyer more algorithms in the sandbox. Here is what he wrote: "WRT overlaps with my algorithms in the sequence_algo sandbox, I have already considered this, but concluded that the string algorithms are very specific to sequences of characters. The Boyer Moore, Edit Distance and LCS algorithms work on any sequence of data. I think they would loose flexibility and impact if I included them in the specific string algos." At that time I actually thought that Pavol's library was "merely" a generic string algorithm library. But in fact it is much more general than that: it searches and replaces and erases for subranges in a bigger range and strings are only one example of where this is handy. It's too early to say if this will cause changes to the Formatter/Finder concepts. I asked myself the question "is this reason enough for rejecting the library?". After some thought my answer was 'no' because 1) the algorithms that might be affected by this don't take a locale argument; the ones that do often have a similar name with an 'i' prefix. When and if the generic versions have their own library, the user should (mostly) be able to switch namespace and include a new header and everything works as before. 2) it doesn't hurt anyone that the implementation is a bit more general than it has to be 3) it will be easier to test if these function can actually be used as generic algorithms on their own How Craig's algorithms will fit into all this I cannot answer, but it should be dealt with during his review. (3) Improvements that must be adressed before the inclusion of the library -------------------------------------------------------------------------- Some of these changes are probably already incorporated by Pavol, but I'll mention them here for sake of completeness. If Pavol thinks that some of the points should be given a mini-review once again, he is free to do so on the list. It might be that some doc issues cannot be resolved before boost-book is updated; in that case there is not much to do but wait. (a) The documentation should have an overview table with all the libraries functions (b) implementation details should be better hided by using <i>Implementation defined</i> in docs (c) the namespace should be changed to boost::algorithm::string (d) it should be mentioned explicitly that traits and ranges are just temporary parts of the library and theit use should be discouraged. If there is no alternative under way, the docs could state that too. (e) concepts should be provided so it is clear what a string is or what type of containers that the algorithm works with (f) a general description about exception safety when using the library should be made (g) the header policy should changed so there is eg. only one find.hpp and not find.hpp and find2.hpp (h) grammar should be checked in docs (i) naming throughout the library should be made consistent; functions that take a predicate ends with _if; make_range should be renamed make_iterator_range or vice versa. (j) a safe-bool implicit conversion of the iterator range should be added (k) mutating algorithms should return void That's it folks! One thing that you might consider is critique of the review manager; should I have done something different? Earlier? etc. Don't be afraid to criticise me...the worst thing that can happen is that I might actually learn something. best regards Thorsten Ottosen, String Algo review manager