Dear Steven, thank you very much for the careful reading of the docs and sorry for the overuse of bold text. Now that people actually notice the project I can stop shouting. I will implement all these wording mistakes that you point out. A few clarifcations below: abstract.html:
- "The C implementations of the GSL are elegant..." Are there multiple implementations?
Ok, one implementation per set, but there are two sets of functions, one set for 1d histograms and one set for 2d histograms. - Is there a reason that CMakeLists.txt is in
build/ instead of test/?
Mateusz moved the CMakeLists.txt into the root directory which is better/more straight forward and follows the example of hana and compute. We could also move it to test as you suggest, but I prefer it in the root directory. I read in some Boost guides (I think it was on the blincubator, but I can't find the source just now and blincubator.com gives me a 403 error today) that all build scripts should reside in the build directory or next to the source. So I put it in "build". - "in addition, the factory also accepts iterators over a sequence of
axis::any, the polymorphic type that can hold concrete axis types" The relationship between axis::any and axis::any_std (the type actually used) in unclear.
True, this is not explained well. axis::any is a closed-set polymorphic type derived from boost::variant and templated on a list of types that the polymorphic type should be able to represent. axis::any_std is a concrete type where the type list is filled with standard axis types which users are expected to use most frequently.
- I wonder whether you could handle the issue that h * 2 != h + h, by using h * weight(2) to make it clear that multiplication changes the weights of the samples instead of changing the number of samples.
If I understand correctly, you would only allow scaling by a weight type, not by an arbitrary number. That makes sense and is more explicit. Sounds like a good idea.
benchmarks.html:
- No GSL?
There are GSL benchmarks for 1D and 2D, where our static variants are faster. The GSL does not provide histograms for more than two dimensions. https://www.gnu.org/software/gsl/manual/html_node/Histograms.html
- You say uniform and normal distributions, but there's only a single plot. Does this mean that you are putting both uniform and normal variates into a single histogram?
Good point, this is a mistake in my plotting script. I originally wanted to plot both results but ended up plotting only one. The benchmarks for uniform and normal data are slightly different, because they trigger the branches in the axis code differently. By design of the benchmark, branch prediction works better for the uniform data.
- "Axis access" Not the most fortuitous word combination.
As long as you don't try to say it out loud... ;) - It seems that using weighted fills breaks the special
overflow handling properties of adaptive_storage. I think that this deserves a distinct note.
Right.
- It doesn't seem very sane to have the value_type be a reference. Is there any reason for this other than the fact that you specify the exact signature of axis_type::index?
This is a mistake, the method is not required to accept a reference, pass-by-value is also possible. How should I write it to indicate that passing by value or reference is possible?
- I would prefer to set the size of a storage_type in the constructor instead of using reset, if that's possible.
I was flip-flopping on that one. A call `value.reset(n)` can be replaced by `value = storage_type(n)`. I don't see a good reason to prefer one over the other, do you?
- I don't understand the difference between the two overloads of storage_type::add.
I should explain better, the whole section about concepts is very brief. The first version of "add" is used when you add two histograms and integral counts are added to integral counts. The second version of add - which accepts the weight type - is used when you increment the counter not by 1, but by a weight. In an earlier stage of this library, the corresponding method was called `increment_by_weight`. The second overload is required because it causes an internal transformation from integral counters into dual counts which track the sum of weights and the sum of weights squared in case of the adaptive_storage (so that a variance estimate can be computed). This transformation is not needed when you simply add two histograms.
- Unfortunately forward declarations seem to mess up the doxygen/boostbook toolchain. As a result a number of components are listed in histogram_fwd.hpp I'll look into this.
Thanks, I am glad for any help on this issue.
- I'll deal with the reference content along with the code. I will note that the reference documentation seems a bit lacking overall as there are many components without any description.
True, just like the concepts it got a bit less love.
- The passive voice is overused.
And in the last paper that I submitted to a journal, they complained about my use of active voice... On Mon, Sep 17, 2018 at 10:58 PM Steven Watanabe via Boost < boost@lists.boost.org> wrote:
AMDG
abstract.html:
- "the one- and multi-dimensional case" "cases" should be plural.
- "..submit this project to Boost, that's why..." Comma splice
- Please don't overuse *bold text*
acknowledgments.html
- "converting the documentation, and adding Jamfiles" No comma here.
motivation.html
- "The GNU Scientific Library (GSL) and in the ROOT framework" Remove "in"
- "The C implementations of the GSL are elegant..." Are there multiple implementations?
- "The implementations are not customizable, you have to..." Comma splice
build.html
- Is there a reason that CMakeLists.txt is in build/ instead of test/?
getting_started.html
- "int main(int, char**) {" If you're not using the parameters, `int main()` is a valid signature.
- Since you require C++11, you might as well use <random> instead of Boost.Random. The Boost implementation of mt19937 has better performance, but that doesn't really matter here.
- "in addition, the factory also accepts iterators over a sequence of axis::any, the polymorphic type that can hold concrete axis types" The relationship between axis::any and axis::any_std (the type actually used) in unclear.
guide.html:
- "A histogram consists a number of..." "consists /of/ a number of..."
- "which provides safe counting, is fast and memory efficient." There are only two elements, so use "and" instead of a comma or say "is fast, and /is/ memory efficient."
- "Axes objects with different label do..." -> s/Axes/Axis/, s/label/labels/
- "...sequence of bin indices for each axes in order." s/axes/axis/
- I wonder whether you could handle the issue that h * 2 != h + h, by using h * weight(2) to make it clear that multiplication changes the weights of the samples instead of changing the number of samples.
- "...remove some axes and look the equivalent lower..." "...look /at/ the equivalent..."
- "so it is not worth keeping that axies around" s/axies/axis/
- "Users can create new axis classes ..., or implemented..." s/implemented/implement/
- "Here is a contrieved example of a..." s/contrieved/contrived/
- "The guarantees ... are only valid, if the default..." No comma.
- "... you need know what you are doing" "you need /to/ know"
benchmarks.html:
- No GSL?
- You say uniform and normal distributions, but there's only a single plot. Does this mean that you are putting both uniform and normal variates into a single histogram?
rationale.html
- "Axis access" Not the most fortuitous word combination.
- "The buildin storage types..." s/buildin/builtin/
- It seems that using weighted fills breaks the special overflow handling properties of adaptive_storage. I think that this deserves a distinct note.
- "Why is Boost.Histogram not build on top..." s/build/built/
concepts.html:
- Concept names use CamelCase, by convention.
- It doesn't seem very sane to have the value_type be a reference. Is there any reason for this other than the fact that you specify the exact signature of axis_type::index?
- I would prefer to set the size of a storage_type in the constructor instead of using reset, if that's possible.
- I don't understand the difference between the two overloads of storage_type::add.
reference.html:
- Unfortunately forward declarations seem to mess up the doxygen/boostbook toolchain. As a result a number of components are listed in histogram_fwd.hpp I'll look into this.
- I'll deal with the reference content along with the code. I will note that the reference documentation seems a bit lacking overall as there are many components without any description.
General:
- The passive voice is overused.
In Christ, Steven Watanabe
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost