[boost] [Review results] Accumulators accepted
The accumulator library submitted by Eric Niebler has been accepted into Boost. Thanks, once again, to all of the people who contributed to the development of the library, to everyone who contributed to the review, and to Eric for a fine submission.

During the review, Eric and some of the reviewers had a useful discussion about issues with the library and possible improvements. A condensed version of the outcomes from that discussion is included below. Also, in perusing the archives for other comments about the library, I found a couple of other issues that were discussed. I have included them below as well. In many cases, Eric acknowledged issues and committed to a fix during the review. Those issues are included below, along with the current state of some of the more open-ended discussions, to provide easy reference for all interested parties.

For my own organizational convenience, I have numbered the entries below. This does not indicate relative importance.

1) Prior to the review it was asked why the function that returns a histogram is called the density. Little was done at the time, as Eric suggested it should be addressed during the review. As far as I can see, it was not brought back up during the review process. This should be examined to make sure that density is the best name for the function, and changed if needed.

2) There are a number of variant implementations of some statistical functions. The documentation should clearly indicate which variants focus on quick and dirty implementations, and which provide more numeric stability. Eric has stated that this will be addressed in the revisions of the documentation.

3) The question was asked whether or not it is feasible to provide outlier rejection for some or all of the accumulators. I can find no further discussion of this question in the review. However, since this is a very important statistical technique, it would be a good feature to investigate for future addition.

4) Michael Stevens expressed interest in a form of the variance that accumulates the differences squared iteratively and divides the result by n. I saw no direct response, but this is basically what the immediate_variance does.

5) Interest was expressed in making the compensated sum the default version of the sum. It could be supplemented by the quick and dirty version as an alternative. However, since there are reports that many optimizing compilers turn the compensated sum algorithm into the same compiled code as the simple sum, tests should be done to see if there is any real gain before the modification.

6) There are a number of broken links in the quickbook docs. This has been acknowledged and is in line for a fix.

7) Steven Watanabe suggested changing the structure to move away from the fusion vector dependence. After discussion, it appears that fusion cons may be a better choice. However, there may be a use case for the original library that precipitated the choice of fusion vector. This should be checked, and if Steven's idea holds up under scrutiny, it should be implemented.

8) There is a request that the user guide specify which header each component is in. This has been acknowledged and is in line for a fix.

9) The macro BOOST_PARAMETER_NESTED_KEYWORD has no description in the docs. This has been acknowledged and is in line for a fix.

10) There were a number of requests for improvements to the reference manual, ranging from wording changes to a reorganization that reflects class structure instead of header structure. In some cases, these improvements are direct, and they are in line for fixes. In other cases, there is currently no known way to do it with the Boost tool chain. Eric and many of the other members of Boost would greatly appreciate anyone who has ideas and time to fix the harder problems.

11) There was a suggestion that the documentation could mention the TR1 reference_wrapper as a future solution to the accumulator_set_wrapper issues, and that the implementation could be changed once there is a Boost-accepted implementation of the reference_wrapper. My impression was that I was not the only person who didn't realize that it could solve that problem.

12) Functions that allow a range of values to be pushed into an accumulator set all at once should be added. This could include forms that take begin and end iterators and forms that take a sequence. Eric agreed that this is a good idea.

13) Paul Bristow mentioned that the kurtosis has a confusing naming history. What the docs refer to as the kurtosis would better be called the "kurtosis excess." His suggestion was that a name change be considered and that the docs be modified to acknowledge the confusion whether the name is changed or not. Eric agreed with this suggestion.

14) Accessors for the standard deviation, the unbiased variance and the unbiased (N-1) standard deviation were requested. The fact that John Maddock fell into the trap of miscalculating the standard deviation shows that even experts can make mistakes when converting from the variance. Thus, these accessors are good candidates for inclusion. Eric said he is interested, and he would also welcome submissions that provide these functions.

15) The docs mention "Even the extractors can accept named parameters. In a bit, we'll see a situation where that is useful," but there is no later mention. Eric has an example he intended that to refer to, but forgot to include it in the last edition of the docs. He intends to fix that.

16) The ability to "reset" an accumulator was requested. Eric pointed out that it is quite possible to make accumulators with reset methods. He also pointed out that it might be desirable to reset accumulator_sets, but this could also pose a problem.
It is not clear what should happen if one of the accumulators used by an accumulator set does not have a reset method. Further thought should be given to this for possible inclusion in a later revision.

17) Autocorrelation for accumulator_sets should be explored. Matthias already plans to do this in the coming year for possible inclusion in a later revision.

18) The documentation on what to expect when one accumulator is dropped while a second accumulator that depends on the first is not dropped should be clarified. Careful thought on this may lead to a change in current behavior.

19) There is a request that the docs state more clearly what happens when more than one accumulator maps to the same feature. Eric plans to include this in documentation revisions.

20) It is not currently possible to combine accumulators. In some very common use cases, this would be an important feature. However, not all accumulators can be combined in any sensible way. A solution to this will require some study and design work, but it is a valuable addition for a future revision. One possible solution is to have a compile-time check to see if the accumulators are combinable.

21) While finding newer/faster/more robust algorithms is not a bad thing, the focus of this review has been the interfaces. If the interfaces are good, improved algorithms can be worked in later. There is some minimum standard for performance, but it is not the focal point of the submission.

22) Javier and Hans submitted lists of documentation corrections that Eric acknowledged and plans to fix.

23) It is agreed that there is a need for a more gentle and thorough getting started document. This should include some compelling examples that show why this is a good design decision.

24) More of the formulas should be available in the documentation. Eric requested volunteers to make some formulas into LaTeX, and I have already contacted him to do so.

25) The docs for how to incorporate new features should be improved.
This can be helped tremendously if the people who had problems doing this would send Eric descriptions of where they had problems with the docs and the process.

Thanks again to everyone for your time and work. Any problems or misrepresentations in the above list are purely my fault, and I apologize for them.

John
John Phillips
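
For readers who want a concrete picture of the trade-off in item 5, here is a minimal sketch of Kahan-style compensated summation next to a plain running sum. The names naive_sum, kahan_sum and make_hostile_input are hypothetical and are not taken from the Accumulators library. The sketch assumes IEEE double arithmetic without value-unsafe optimization flags; with options such as -ffast-math a compiler is allowed to cancel the compensation term, which is the concern item 5 raises.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Plain left-to-right sum: each addition can lose the low-order bits of x.
double naive_sum(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum;
}

// Kahan compensated sum: c carries the rounding error of the previous
// addition and feeds it back into the next one.
double kahan_sum(const std::vector<double>& xs) {
    double sum = 0.0;
    double c = 0.0;                 // running compensation
    for (double x : xs) {
        const double y = x - c;     // apply the correction to the new term
        const double t = sum + y;   // low-order bits of y may be lost here
        c = (t - sum) - y;          // recover (negated) what was lost
        sum = t;
    }
    return sum;
}

// Hypothetical test input: one large value followed by tiny_count values
// that are each too small to change the large value when added alone.
std::vector<double> make_hostile_input(std::size_t tiny_count) {
    assert(tiny_count > 0);
    std::vector<double> xs(tiny_count + 1, 1e-16);
    xs[0] = 1.0;
    return xs;
}
```

Calling both functions on make_hostile_input(1000000) and comparing against 1.0 + 1e-10 is one simple way to run the kind of test item 5 asks for, on a given compiler and set of flags.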
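
Items 4, 14 and 20 all touch on the same small set of quantities, so a sketch may help show how they relate. The class below is a hypothetical illustration, not the library's immediate_variance or any proposed interface. It uses Welford's online update for the mean and the sum of squared deviations, exposes both the divide-by-n and the divide-by-(N-1) variance with their square roots, and merges two partial results with the pairwise formula usually attributed to Chan et al. All names are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical illustration type; not part of Boost.Accumulators.
class running_moments {
public:
    // Welford's update: fold one sample into the running mean and into
    // m2_, the running sum of squared deviations from the mean.
    void push(double x) {
        ++n_;
        const double delta = x - mean_;
        mean_ += delta / static_cast<double>(n_);
        m2_ += delta * (x - mean_);
    }

    // Merge the state of another accumulator into this one (pairwise
    // combination), so two halves of a data set can be processed separately.
    void combine(const running_moments& other) {
        if (other.n_ == 0) return;
        if (n_ == 0) { *this = other; return; }
        const double na = static_cast<double>(n_);
        const double nb = static_cast<double>(other.n_);
        const double n = na + nb;
        const double delta = other.mean_ - mean_;
        mean_ += delta * nb / n;
        m2_ += other.m2_ + delta * delta * na * nb / n;
        n_ += other.n_;
    }

    std::size_t count() const { return n_; }
    double mean() const { assert(n_ > 0); return mean_; }

    // Divide-by-n variance and its square root.
    double variance() const { assert(n_ > 0); return m2_ / static_cast<double>(n_); }
    double std_dev() const { return std::sqrt(variance()); }

    // Divide-by-(N-1) variance and its square root.
    double unbiased_variance() const {
        assert(n_ > 1);
        return m2_ / static_cast<double>(n_ - 1);
    }
    double sample_std_dev() const { return std::sqrt(unbiased_variance()); }

private:
    std::size_t n_ = 0;
    double mean_ = 0.0;
    double m2_ = 0.0;
};
```

Keeping the square root inside the accessors is the point of item 14: the caller never has to remember which divisor a given variance used before converting it to a standard deviation. The combine member only works because mean and variance happen to have a closed-form merge, which is why item 20 notes that not every accumulator can be combined this way.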