Boost summer of formal reviews
Hello,

This is a (half-)joke suggestion.

There are currently 19 libraries in the review schedule ( http://www.boost.org/community/review_schedule.html ) which need a review manager (yes, including mine). Some of them have been there for years.

Instead of focusing on getting people to write new libraries, which could potentially end up stuck in the review schedule (like AFIO), why don't we focus on getting reviews for the ones that are already there? Or at least on revising the review process.

For example, I would be willing to review some of these libraries, but I am not qualified to be a review manager (since I am not very active in the community).

I am not saying that most (or any) of the libraries stuck in review limbo should pass the process, but some of them at least deserve a look (again, like AFIO).

-- Borislav
On March 11, 2014 6:29:15 AM EDT, Borislav Stanimirov wrote:
Hello,
This is a (half-)joke suggestion.
There are currently 19 libraries in the review schedule ( http://www.boost.org/community/review_schedule.html ) which need a review manager (yes, including mine). Some of them have been there for years.
Sad, but true.
Instead of focusing on getting people to write new libraries, which could potentially end up stuck in the review schedule (like AFIO), why don't we focus on getting reviews for the ones that are already there? Or at least on revising the review process.
Robert Ramey has ideas on improving the process. Others have made suggestions over the years, but we've made little change to it thus far. Still, if you've got concrete ideas for improving the backlog, speak up.
For example, I would be willing to review some of these libraries, but I am not qualified to be a review manager (since I am not very active in the community).
Being a review manager is a big job. You need to ensure that the library is ready, you need domain expertise in order to judge the review comments well and fairly, and you need to commit time to the review period and to writing the report. That's a significant burden and it's little wonder few are stepping up to do it. ___ Rob (Sent from my portable computation engine)
On 11-Mar-14 14:18, Rob Stewart wrote:
On March 11, 2014 6:29:15 AM EDT, Borislav Stanimirov wrote:
Still, if you've got concrete ideas for improving the backlog, speak up.
Well, I am not familiar with all the ideas that have been given, and the following could have been mentioned before (and even shown to be bad), but what about openly accepting donations for Boost and using them to actually pay the review managers for their time?

Apart from that, what is the actual risk of adding a bad library to the collection?

The following is an idea for a fundamental change in the review process:

* Have a short informal review process based on short reviews from the community. More of a spam filter than an actual review process, actually. The "reviews" could be something like "Well. It looks OK to me".

* Then have an _automated_ test for eligibility. Does the library compile as part of Boost in the most popular compilers and configurations? Static and dynamic analyzers can provide code coverage info by the library's tests (100% would be a requirement). They will check whether it crashes, has memory leaks, and more.

* After a library passes the automated tests, it just becomes a part of Boost. _BUT!_ It does so in the namespace boost::bleeding_edge. Have a disclaimer that bleeding_edge libraries haven't passed a formal review yet. Still, even with such a disclaimer, they would be exposed to the public, and more people would be encouraged to try them.

* To get out of the bleeding_edge namespace, a library needs to either receive a formal review (in the format formal reviews are made now), or demonstrate its use in several real-life working projects (independent from the author).

How about that?

-- Borislav
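P.S. To make the bleeding_edge idea a bit more concrete, here is a purely hypothetical sketch. Neither boost::bleeding_edge nor the "frobnicate" library exist; the names are made up for illustration only.

    // Hypothetical: a not-yet-reviewed library lives under
    // boost::bleeding_edge so that users opt in to it knowingly.
    #include <iostream>
    #include <string>

    namespace boost { namespace bleeding_edge { namespace frobnicate {

    inline std::string greet(const std::string& name)
    {
        return "Hello, " + name + "!";
    }

    }}} // namespace boost::bleeding_edge::frobnicate

    int main()
    {
        // The extra namespace level makes the library's unreviewed
        // status explicit at every call site.
        std::cout << boost::bleeding_edge::frobnicate::greet("world") << "\n";
        return 0;
    }

Once such a library passed a formal review, the extra namespace level could simply be dropped (or aliased), and existing users could migrate with a using-declaration.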
On 11 Mar 2014 at 17:57, Borislav Stanimirov wrote:
Well, I am not familiar with all the ideas that have been given, and the following could have been mentioned before (and even shown to be bad), but what about openly accepting donations for Boost and using them to actually pay the review managers for their time?
That would open up a legal minefield. Boost would have to register as an employer and deduct taxes, etc.
Apart from that, what is the actual risk of adding a bad library to the collection?
Surprisingly high, even if a library ticked all the boxes in that list I provided earlier. For example, TypeIndex v2.0 had dangerous undefined behaviour in it that both Antony and I missed before bringing it for review here. That UB compiled and worked fine on all present compiler technologies, but could have caused significant problems if, say, Reflection entered the C++ language in some years' time.
The following is an idea for a fundamental change in the review process:
* Have a short informal review process based on short reviews from the community. More of a spam filter than an actual review process, actually. The "reviews" could be something like "Well. It looks OK to me".
We already have that. I posted asking for a mini design review only last week.
* Then have an _automated_ test for eligibility. Does the library compile as part of Boost in the most popular compilers and configurations? Static and dynamic analyzers can provide code coverage info by the library's tests (100% would be a requirement). They will check whether it crashes, has memory leaks, and more.
On this I very much agree, and I listed some good criteria in an earlier post.
* After a library passes the automated tests, it just becomes a part of boost. _BUT!_ It does so in the namespace boost::bleeding_edge. Have a disclaimer that bleeding_edge libraries haven't passed a formal review yet. Still, even with such a disclaimer, they would be exposed to the public, and more people would be encouraged to try them.
I would be wary of diminishing the Boost brand by distributing non-peer-reviewed libraries in the main distro. I wouldn't oppose a secondary "add on" distro containing libraries in the review queue which have passed the metrics in my previous post, but I still think that passing peer review is what makes Boost Boost. If you don't pass peer review, you can't say you're a Boost library.
* To get out of the bleeding_edge namespace, a library needs to either receive a formal review (in the format formal reviews are made now), or demonstrate its use in several real-life working projects (independent from the author).
To get a willing peer review manager a library already needs to become popular enough, so we already have that too. For example, not enough people are using AFIO, therefore no one wants to peer review manage it, therefore I agree AFIO should not enter Boost until it is popular enough - after all, no one may be using it because it has an awful, unintuitive design, irrespective of technical merit :) Equally, it may simply be a useless library that solves no real problem, another good reason it should not enter Boost. Niall -- Currently unemployed and looking for work in Ireland. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
To get a willing peer review manager a library already needs to become popular enough, so we already have that too. For example, not enough people are using AFIO, therefore no one wants to peer review manage it, therefore I agree AFIO should not enter Boost until it is popular enough -
This makes some sense, but... if it doesn't get into Boost, how does it become popular? As a policy, it's in danger of circular dependency: "AFIO never made it into Boost because it wasn't popular. It never got popular because it never made it into Boost..."
On 03/12/2014 07:08 AM, Erik Erlandson wrote:
To get a willing peer review manager a library already needs to become popular enough, so we already have that too. For example, not enough people are using AFIO, therefore no one wants to peer review manage it, therefore I agree AFIO should not enter Boost until it is popular enough -
This makes some sense, but... if it doesn't get into Boost, how does it become popular? As a policy, it's in danger of circular dependency: "AFIO never made it into Boost because it wasn't popular. It never got popular because it never made it into Boost..."
I'll second that. IMO Boost should be proactive in that regard, setting popularity trends rather than following them. IMO, for that task the Boost community has an immensely higher level of competency and expertise than the average programming community.
On 11 Mar 2014 at 16:08, Erik Erlandson wrote:
To get a willing peer review manager a library already needs to become popular enough, so we already have that too. For example, not enough people are using AFIO, therefore no one wants to peer review manage it, therefore I agree AFIO should not enter Boost until it is popular enough -
This makes some sense, but... if it doesn't get into Boost, how does it become popular? As a policy, it's in danger of circular dependency: "AFIO never made it into Boost because it wasn't popular. It never got popular because it never made it into Boost..."
I'm not unsympathetic, certainly. But in open source there's always an element of marketing and building reputation, and that usually comes down to spreading the word and finding a well-recognised "famous" user such as RedHat to publicly endorse a library, etc. After all, between 2002 and 2006 I dropped 60k lines of C++ into TnFOX, yet no one ever used it. A real waste of effort in some ways, but equally I learned a ton doing it, stuff I could never learn writing Boost libraries. Niall -- Currently unemployed and looking for work in Ireland. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
On 11.3.2014 21:47, Niall Douglas wrote:
On 11 Mar 2014 at 17:57, Borislav Stanimirov wrote:
* After a library passes the automated tests, it just becomes a part of boost. _BUT!_ It does so in the namespace boost::bleeding_edge. Have a disclaimer that bleeding_edge libraries haven't passed a formal review yet. Still, even with such a disclaimer, they would be exposed to the public, and more people would be encouraged to try them.
I would be wary of diminishing the Boost brand by distributing non-peer-reviewed libraries in the main distro. I wouldn't oppose a secondary "add on" distro containing libraries in the review queue which have passed the metrics in my previous post, but I still think that passing peer review is what makes Boost Boost. If you don't pass peer review, you can't say you're a Boost library.
That seems like a great idea, and it's a great way of giving more exposure to the libraries that are waiting for a review: helping them find more early adopters, track down issues, and ultimately become more useful and worthy of a review. I (and I'm sure many of the people here) know lots of people who use Boost, follow the news and are experts in some of the libraries, but hardly any of them have even heard about AFIO. Almost none of them knew about Log until it became an official part of Boost. And as I understand it, Log had a fairly big user base even before it was officially in Boost. How could we go about having such an add-on distro?
* To get out of the bleeding_edge namespace, a library needs to either receive a formal review (in the format formal reviews are made now), or demonstrate its use in several real-life working projects (independent from the author).
To get a willing peer review manager a library already needs to become popular enough, so we already have that too. For example, not enough people are using AFIO, therefore no one wants to peer review manage it, therefore I agree AFIO should not enter Boost until it is popular enough - after all, no one may be using it because it has an awful, unintuitive design, irrespective of technical merit :) Equally, it may simply be a useless library that solves no real problem, another good reason it should not enter Boost.
I could say the same thing about my library. But the thing is, while the gist of AFIO can be explained in a single sentence (it's in the library's name), the same cannot be said about Mixin. Personally, I have no idea how to find users for it, especially since it seems that even if it does turn out to be useful to many people, it will be as something they didn't realize they needed until they gave it a shot. -- Borislav
----- Original Message -----
* After a library passes the automated tests, it just becomes a part of boost. _BUT!_ It does so in the namespace boost::bleeding_edge. Have a disclaimer that bleeding_edge libraries haven't passed a formal review yet. Still, even with such a disclaimer, they would be exposed to the public, and more people would be encouraged to try them.
The idea has precedent, for example Apache "incubator."
On 11 Mar 2014 at 8:18, Rob Stewart wrote:
Robert Ramey has ideas on improving the process. Others have made suggestions over the years, but we've made little change to it thus far. Still, if you've got concrete ideas for improving the backlog, speak up.
You'll see my suggestion for filtering out the libraries submitted in an earlier post, but I'd like to re-raise an earlier idea of mine which wasn't popular: the submitted libraries ought to be ranked according to the number of peer reviews previously managed by the submitter. That ought to shake loose some more review managers. Having managed the TypeIndex v2.0 review, and with me due to manage TypeIndex v3.0 in about a month from now, I don't think there is much potential for a conflict of interest. As manager you're just collating other people's votes and trying to tease votes out of people. You then count up the votes and objectively list the stated sentiments in the report, and that's peer review done. One's ability to influence the outcome is actually quite limited, because people will call you out if your report isn't fair.
For example, I would be willing to review some of these libraries, but I am not qualified to be a review manager (since I am not very active in the community).
Being a review manager is a big job. You need to ensure that the library is ready, you need domain expertise in order to judge the review comments well and fairly, and you need to commit time to the review period and to writing the report. That's a significant burden and it's little wonder few are stepping up to do it.
It's also open-ended: if a library needs a second or third peer review round then you're likely to manage those too. Committing to being a review manager can be a three- or four-month thing. Niall -- Currently unemployed and looking for work in Ireland. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
On 11.03.2014 14:29, Borislav Stanimirov wrote:
Hello,
This is a (half-)joke suggestion.
There are currently 19 libraries in the review schedule ( http://www.boost.org/community/review_schedule.html ) which need a review manager (yes, including mine). Some of them have been there for years.
This is certainly not the biggest problem, but anyway, if that page had any descriptions it would increase the chances of finding somebody to manage a review. Say, I have no idea what "Join" is (is that a DB layer? Nope), or "Sorting" (is that the ultimate library with all the sorting algorithms in the world?), or "Singularity" - is that a Boost library proposal for uploading one's brain into a computer? (Also, on a quick look the "Sorting" entry does not appear to have the right directory structure; GitHub has a bunch of zip files.) HTH, Volodya
On 11 Mar 2014 at 12:29, Borislav Stanimirov wrote:
I am not saying that most (or any) of the libraries stuck in review limbo should pass the process, but some of them at least deserve a look (again, like AFIO).
What slightly irritates me about the present review queue is that there is a wide disparity between the quality of the libraries in there, with some clearly not ready for peer review.

What I'd really, really like is if the review schedule also listed answers to at least the following questions so queue submitters have a better idea of what is demanded:

1. List of compilers supported and their earliest and latest versions tested upon.

2. List of operating systems supported and their earliest and latest versions tested upon.

3. List of architectures supported and their earliest and latest versions tested upon, including endianness.

4. Code coverage of the test and functional suite (in percentage) with an auto-reject if it's not above 75%. Travis CI + coveralls.io will do this for you for free; no one has any excuse not to have this nowadays. For code not covered, give a breakdown of what it consists of.

5. A link to the Continuous Integration server for the project which does a unit test run for every pull request or commit (with a big red X if there is none, given that Travis CI is free - I'd auto-reject any new library without CI integration). Treble extra points if you also have a CI for Windows, twice that again if you also have a CI for Mac OS X or BSD, and multiply the result by ten if all unit and functional tests are 100% stable over time.

6. Do all the unit and functional tests pass under valgrind's memcheck tool? Does the library pass when compiled with clang's static analyser?

7. Is the library C++11 aware? Do all appropriate objects have move constructors available? (A sketch of what this and item 9 mean in practice follows in the P.S. below.)

8. Is the documentation supplied in BoostBook and using the *standard* Boost templates? (Hint: there are a number of libraries in the queue without either.) Does the documentation provide at least the following: (i) a design rationale, (ii) a tutorial, (iii) examples of usage, (iv) a reference section?

9. Is the library fully exception safe using the Abrahams guarantees? List all areas in which it is not, with justifications (a common one will be failure to always guarantee correct handling of std::bad_alloc).

10. Has every API been audited for worst-case complexity, and is that value listed on every API documentation page? If not, explain why, with justification.

11. Has every API been audited for thread safety, and are the results listed on every API documentation page? If not, explain why, with justification.

12. Does the documentation include real-world benchmarks for both compilation times and execution times, demonstrating that the library has to some extent been tuned for performance? If not, explain why, with justification.

I know all these requirements are already listed in the wiki, but back when I was looking to peer review manage a library I found it hard to find answers to most of the above questions for any given library. Having the submitter fill them in at the point of submission would be hugely useful.

Just so I eat my own dog food, here are my answers to the above questions for my own review submission, Boost.AFIO:

1. GCC 4.6-4.8, clang 3.1-3.4, Visual Studio 2010-2013.
2. All POSIX and Windows. Tested on Linux 3.0-3.3, Windows XP - Windows 8, FreeBSD 10.
3. All architectures with atomic ops. Tested on ARM v7 le and 486 le to x64 le. Big endian ought to work, but is untested.
4. Code coverage by the CI tests is 90%. The code not covered is almost entirely error handling, with 60% of it being errors that should never happen and are only there to detect memory corruption/race conditions.
5. AFIO's CI is at https://ci.nedprod.com/. On each commit a full unit and functional test run is done for Linux, Windows, FreeBSD and valgrind. The full unit and functional test suite survives being soak tested for 24 hours with no failures.
6. Yes, AFIO is valgrind memcheck clean. AFIO has one failure in the clang static analyser, but this is due to bad code in Boost.Filesystem.
7. AFIO requires C++11, and therefore is C++11 aware. All appropriate objects provide move construction.
8. AFIO's documentation is 100% Boost standard, and contains all the specified sections.
9. The library is completely exception safe, except when multiple std::bad_alloc's are thrown in sequence, where the process will probably hang due to entering an undefined state (yes, I feel ashamed about this, but I haven't figured out a good solution yet).
10. Every API has been audited for complexity, and that value is listed in the documentation.
11. Every API has been audited for thread safety, and that value is listed in the documentation.
12. The documentation contains real-world benchmarks for execution times, but not for compilation times; however, they are for v1.1 of AFIO and both ought to have changed significantly with v1.2. This will be fixed soon.

Niall -- Currently unemployed and looking for work in Ireland. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
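P.S. As a purely generic illustration of what items 7 and 9 above mean in practice (this is not code from AFIO or from any library in the queue; "buffer" is a made-up type), a move-aware class offering the strong exception safety guarantee via copy-and-swap might look like this:

    #include <cstddef>
    #include <utility>
    #include <vector>

    class buffer
    {
        std::vector<unsigned char> data_;

    public:
        explicit buffer(std::size_t n) : data_(n) {}   // may throw std::bad_alloc

        buffer(const buffer&) = default;               // copying may throw
        buffer(buffer&& other) noexcept                // moving never throws (item 7)
            : data_(std::move(other.data_)) {}

        // Copy-and-swap: the copy happens in the by-value parameter, so if it
        // throws, *this is left untouched - the strong guarantee (item 9).
        buffer& operator=(buffer other) noexcept
        {
            data_.swap(other.data_);
            return *this;
        }

        std::size_t size() const noexcept { return data_.size(); }
    };

    int main()
    {
        buffer a(1024);
        buffer b = std::move(a);   // no allocation, cannot throw
        a = b;                     // copies b, then swaps; a is safe if the copy throws
        return a.size() == b.size() ? 0 : 1;
    }

The noexcept move constructor is also what lets standard containers of such a type relocate elements without falling back to copying.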
On 03/12/2014 04:45 AM, Niall Douglas wrote:
On 11 Mar 2014 at 12:29, Borislav Stanimirov wrote:
I am not saying that most (or any) of the libraries stuck in review limbo should pass the process, but some of them at least deserve a look (again, like AFIO). What slightly irritates me about the present review queue is that there is a wide disparity between the quality of the libraries in there, with some clearly not ready for peer review.
What I'd really, really like is if the review schedule also listed answers to at least the following questions so queue submitters have a better idea of what is demanded: ... Apologies for truncating your long list of "demands" :-) ... done so only to keep the conversation short and flowing.
That is a seriously big list... and IMO unreasonable, given the author has no guarantees whatsoever that all that effort will not be wasted... if the respective library is rejected for completely different reasons -- high-level design, applicability, you name it. More practical (less off-putting) IMO might be a 2-level review, where an idea/design, API, first-cut implementation and readable/sensible documentation are presented for evaluation. If that's rejected outright, then it saves the author a lot of effort that he might instead direct toward improving his original design/offering. If the initial concept is accepted, then the author would have a real incentive to keep working on and improving his original submission behind a more-or-less stable and already-approved API. I think in reality that happens all the time in Boost (or any public library, for that matter). Spirit's considerable evolution/transformation might be an example.
More practical (less off-putting) IMO might be a 2-level review, where an idea/design, API, first-cut implementation and readable/sensible documentation are presented for evaluation. If that's rejected outright, then it saves the author a lot of effort that he might instead direct toward improving his original design/offering. If the initial concept is accepted, then the author would have a real incentive to keep working on and improving his original submission behind a more-or-less stable and already-approved API.
This looks like a pre-proposal often done in academia (and other realms) when preparing a full proposal is a serious amount of work. That said, it would likely increase the number of review managers required as one has to manage the pre-reviews now. - Rhys
On 03/12/2014 07:57 AM, Rhys Ulerich wrote:
More practical (less off-putting) IMO might be a 2-level review, where an idea/design, API, first-cut implementation and readable/sensible documentation are presented for evaluation. If that's rejected outright, then it saves the author a lot of effort that he might instead direct toward improving his original design/offering. If the initial concept is accepted, then the author would have a real incentive to keep working on and improving his original submission behind a more-or-less stable and already-approved API.
This looks like a pre-proposal often done in academia (and other realms) when preparing a full proposal is a serious amount of work.
That said, it would likely increase the number of review managers required as one has to manage the pre-reviews now.
- Rhys
My initial expectation would be that the same person would curate/manage both reviews. More so, come to think of it, I am not sure the second review can be considered/treated as such, or is even, I dare say, needed. All the work related to the 2nd phase will be largely invisible to the user... implementation details, so to speak. It might need the submission manager empowered with the last word on whether the submission is ready for actual inclusion or not. Seems like there is no need for a 2nd -- implementation -- review. Obviously just my thoughts...
On 12 Mar 2014 at 7:54, Vladimir Batov wrote:
What I'd really, really like is if the review schedule also listed answers to at least the following questions so queue submitters have a better idea of what is demanded: ... Apologies for truncating your long list of "demands" :-) ... done so only to keep the conversation short and flowing.
That is a seriously big list... and IMO unreasonable given the author has no guarantees whatsoever that all that effort will not be wasted...
Sorry, I probably wasn't clear: apart from the CI test requirement, all fields are *optional*.
if the respective library is rejected for completely different reasons -- high-level design, applicability, you name it. More practical (less off-putting) IMO might be a 2-level review, where an idea/design, API, first-cut implementation and readable/sensible documentation are presented for evaluation. If that's rejected outright, then it saves the author a lot of effort that he might instead direct toward improving his original design/offering. If the initial concept is accepted, then the author would have a real incentive to keep working on and improving his original submission behind a more-or-less stable and already-approved API. I think in reality that happens all the time in Boost (or any public library, for that matter). Spirit's considerable evolution/transformation might be an example.
I'm also not unsympathetic to this - after all, witness last week how only Vicente responded to my request for a design review of a generic continuation monad framework. Equally, I would say that writing top-notch software is something you rarely get to do at work, because management never lets you finish properly, or if they do let you finish, most of your workload is filling in never-ending compliance forms and statements and repeatedly explaining your implementation in presentations to people who can't understand it, rather than coding. For me, at least, properly finishing software is why I exchange family and personal time for writing open source - it keeps me sane when working in corporations where it's very rarely a positive coding environment. In other words, I find writing top-notch software rewarding in itself, because I am normally never allowed to. Niall -- Currently unemployed and looking for work in Ireland. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
On 11.3.2014 19:45, Niall Douglas wrote:
4. Code coverage of the test and functional suite (in percentage) with an auto-reject if it's not above 75%. Travis CI + coveralls.io will do this for you for free; no one has any excuse not to have this nowadays. For code not covered, give a breakdown of what it consists of.
At first that got me really excited, but as I found out after a bit of research, even Boost.AFIO's .travis.yml indicates that getting a Boost library to build with Travis CI is very hard, if reasonable at all. I may be willing to add CI to my library, but it doesn't seem like Travis would be the way to go about that. -- Borislav
On 11 Mar 2014 at 23:51, Borislav Stanimirov wrote:
with an auto-reject if it's not above 75%. Travis CI + coveralls.io
At first that got me really excited, but as I found out after a bit of research, even Boost.AFIO's .travis.yml indicates that getting a Boost library to build with Travis CI is very hard, if reasonable at all.
You surprise me. Simply clone AFIO's travis.yml and adjust to fit; it's only Unix shell scripting. Note that most of the tests are commented out because I have Jenkins CI doing all the testing.
I may be willing to add CI to my library, but it doesn't seem like Travis would be the way to go about that
You may need to reorganise your Boost testing to compile all your tests into a single, monolithic test binary which runs all the tests at once (you can see how AFIO did this; Paul came up with a cunning portable batch/shell script). You can then feed that binary into valgrind, have clang do a ThreadSanitize, and of course run it standalone. Travis gives you 20 mins for download, install, build and test (longer if you're printing output), so make sure all your tests finish (including when run under valgrind) before then. Tip: precompiled headers are useful here; look into AFIO's --fast-build build switch. Travis isn't perfect, and only does CI on Linux, but it is very good. And it's free. Niall -- Currently unemployed and looking for work in Ireland. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
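P.S. For what it's worth, here is a rough sketch of the single monolithic test binary idea using Boost.Test's header-only runner. The test cases below are placeholders, not anything from AFIO or from your library.

    // One translation unit that pulls every test suite into a single executable.
    #define BOOST_TEST_MODULE all_tests
    #include <boost/test/included/unit_test.hpp>

    #include <string>

    BOOST_AUTO_TEST_SUITE(basics)

    BOOST_AUTO_TEST_CASE(addition_works)
    {
        BOOST_CHECK_EQUAL(2 + 2, 4);
    }

    BOOST_AUTO_TEST_CASE(string_size_is_correct)
    {
        BOOST_CHECK_EQUAL(std::string("abc").size(), 3u);
    }

    BOOST_AUTO_TEST_SUITE_END()

    // Because everything links into one executable, the very same binary can be
    // run standalone on Travis, fed to valgrind, or rebuilt with
    // -fsanitize=thread for clang's ThreadSanitizer.

Running it under valgrind is then just a matter of invoking valgrind on that one binary.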
On 12.3.2014 02:08, Niall Douglas wrote:
On 11 Mar 2014 at 23:51, Borislav Stanimirov wrote:
with an auto-reject if it's not above 75%. Travis CI + coveralls.io
At first that got me really excited, but as I found out after a bit of research, even Boost.AFIO's .travis.yml indicates that getting a Boost library to build with Travis CI is very hard, if reasonable at all.
You surprise me. Simply clone AFIO's travis.yml and adjust to fit; it's only Unix shell scripting. Note that most of the tests are commented out because I have Jenkins CI doing all the testing.
Ah, my mistake then. Looking at the yml, with most of its lines commented out, it just seemed to me that you had spent some time trying to get it to work and eventually gave up and switched to your own CI on your own server/machine (https://ci.nedprod.com). I'm going to try using it this weekend. Thanks -- Borislav
On 12 Mar 2014 at 10:26, Borislav Stanimirov wrote:
At first that got me really excited, but as I found out after a bit of research, even Boost.AFIO's .travis.yml indicates that getting a Boost library to build with Travis CI is very hard, if reasonable at all.
You surprise me. Simply clone AFIO's travis.yml and adjust to fit; it's only Unix shell scripting. Note that most of the tests are commented out because I have Jenkins CI doing all the testing.
Ah, my mistake then. Looking at the yml, with most of its lines commented out, it just seemed to me that you had spent some time trying to get it to work and eventually gave up and switched to your own CI on your own server/machine (https://ci.nedprod.com).
I'll go add some explanatory comments, thanks for the tip. No, I am probably happier with the Travis CI output than the Jenkins CI output. Travis understands branches much better for example. Travis actually tells me useful information, whereas Jenkins tells me lots of stuff I don't need to know and forces me to click down multiple pages to get at what I actually want to know. If it weren't for needing Windows and BSD CI, I'd have stuck with Travis.
I'm going to try using it this weekend. Thanks
If you need any help, come back here. Antony Polukhin and I are very keen to get a lot more Travis testing into Boost. The best start is with new libraries. BTW, that CI caught my very first C++11-induced bug a few months back! The clue was that only very new compilers segfaulted, i.e. if the compiler implemented rvalue refs v3 you got the segfault, otherwise not. Without the CI I'd have never noticed, nor had such a great clue as to what had gone wrong. Niall -- Currently unemployed and looking for work in Ireland. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
participants (7)
- Borislav Stanimirov
- Erik Erlandson
- Niall Douglas
- Rhys Ulerich
- Rob Stewart
- Vladimir Batov
- Vladimir Prus