On 30.11.18 at 11:36, Hans Dembinski wrote:
> You are overestimating the importance of the *-flow bins, I think. Users usually ignore them when they analyse their histograms. They must be there for other reasons which are explained in the rationale and they are very useful for expert-level statistical analyses and for debugging. The beginner, however, should not notice their presence.
> In fact, the `indexed` range adaptor should probably skip them by default, and only iterate over them when that is explicitly requested.

Sounds reasonable: one range excluding the over/underflow bins and one including them.

> An axis is not a container. It does not hold values and it has no operator[], precisely to emphasise this difference. It has size() though. See my email to Gavin with a long explanation why I think that makes sense.

Your code example was the following:

    for (unsigned i = 0; i < axis.size(); ++i) {
        auto x = h[i];
        // do something with bin
    }

So it looks like a container, although size() and the []-operator are in different instances (which feels weird, but ok).

>> Other idea: If those bins are so special that they don't fit into the [0, size()) range, why not use a different function for getting them, which is not the index operator? high_bin()/low_bin() come to mind.
>
> See explanation to Gavin why this is worse.

Combining this with "Users usually ignore them [...] the `indexed` range adaptor should probably skip them by default", I do see the need for extra functions here too. Your argument against high_bin()/low_bin() was that iteration must be split. But your comment above already suggests that there are iterators which can cover the whole range. Could they solve this split-iteration problem?
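To illustrate what I mean by two ranges, here is a minimal sketch. Everything in it (`toy_axis`, `toy_histogram`, `inner_indices`, `all_indices`) is a hypothetical stand-in and not the library's API; I only assume the current convention that underflow sits at index -1 and overflow at size().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a regular axis; not the Boost.Histogram type.
struct toy_axis {
    int n;          // number of regular bins
    double lo, hi;  // range covered by the regular bins

    int size() const { return n; }

    // Maps a value to a bin index: -1 for underflow, size() for overflow.
    int index(double x) const {
        if (x < lo) return -1;
        if (x >= hi) return n;
        return static_cast<int>((x - lo) / (hi - lo) * n);
    }
};

// Hypothetical 1D histogram with two extra cells for the flow bins.
struct toy_histogram {
    toy_axis axis;
    std::vector<unsigned> counts;

    explicit toy_histogram(toy_axis a) : axis(a), counts(a.size() + 2, 0u) {}

    // The +1 shifts every index past the underflow cell at position 0.
    void fill(double x) {
        ++counts[static_cast<std::size_t>(axis.index(x) + 1)];
    }

    // Accepts i in [-1, size()].
    unsigned at(int i) const {
        assert(i >= -1 && i <= axis.size());
        return counts[static_cast<std::size_t>(i + 1)];
    }
};

// Indices of the regular bins only: what a default iteration would visit.
std::vector<int> inner_indices(const toy_axis& a) {
    std::vector<int> r;
    for (int i = 0; i < a.size(); ++i) r.push_back(i);
    return r;
}

// Indices of all bins, flow bins included, as one unbroken range.
std::vector<int> all_indices(const toy_axis& a) {
    std::vector<int> r;
    for (int i = -1; i <= a.size(); ++i) r.push_back(i);
    return r;
}

// Sums the counts over whichever index range is passed in.
unsigned sum(const toy_histogram& h, const std::vector<int>& indices) {
    unsigned total = 0;
    for (int i : indices) total += h.at(i);
    return total;
}
```

With this, `sum(h, inner_indices(h.axis))` ignores the flow bins, while `sum(h, all_indices(h.axis))` covers everything in one loop, so the iteration would not need to be split.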
>> But WHY was this chosen? Wouldn't it be ok if 0 is the first bin, which starts at -inf, and size()-1 is the last one, spanning to inf? This would allow a histogram of size 1 which has a single bin holding all values.
>
> And why would you want such an axis? It would be pointless and make the histogram operate slower.
I was not saying this should be done. It would just be consistent. There are 2 dimensions:
- open-ranged bins yes/no
- number of bins

In my mind, enabling open-ranged bins does not ADD bins but makes the first and last go to +-inf:

    axis(4,0,10,"",uoflow_type::on)  -> [-inf,0), [0,5), [5,10), [10, inf]
    axis(4,0,10,"",uoflow_type::off) -> [0,2.5), [2.5,5), [5,7.5), [7.5,10)

Of course this might be confusing, so the default should be "off": "users usually ignore them", so they are advanced things one does not generally need, right?

(Side note: The parameter description at https://hdembinski.github.io/histogram/doc/html/boost/histogram/axis/regular... is confusing because the list order does not match the parameter order.)

So my TL;DR of this is: consistency and meeting expectations. If a choice breaks either, think again about it. For this it is either:
- *-flow bins are kinda regular bins -> included in size() and iteration, same behavior as regular bins
- *-flow bins are special bins -> not included in size(), special accessors and iterators, with the default ones not including them

Given that: why not have special constants for the underflow AND overflow bin (e.g. -1 and -2, instead of -1 and size(), where the latter is only known at runtime)? Then you could have an `int find_including_ouflow` and an `unsigned find`, as well as `get(unsigned)` and `get_with_uoflow(int)`. The idea is to make the special handling obvious.

Alex

PS: I don't want to push anything. These are just my thoughts on your issue, in the hope that they help you find a solution you are happy with.
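PPS: To make the special-constants idea a bit more concrete, here is a rough sketch. All names (`toy_axis`, `toy_histogram`, `underflow_bin`, `overflow_bin`) are hypothetical and nothing here is the library's API. I have not said what `find` should do for values outside the regular range; returning an empty `std::optional` (C++17) is just one assumption, as is the storage layout.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical compile-time constants for the two flow bins.
constexpr int underflow_bin = -1;
constexpr int overflow_bin = -2;

// Hypothetical stand-in for a regular axis; not the Boost.Histogram type.
struct toy_axis {
    unsigned n;     // number of regular bins
    double lo, hi;  // range covered by the regular bins

    unsigned size() const { return n; }

    // Regular lookup: yields an index only for lo <= x < hi.
    // (Assumption: out-of-range values give an empty optional.)
    std::optional<unsigned> find(double x) const {
        if (x < lo || x >= hi) return std::nullopt;
        return static_cast<unsigned>((x - lo) / (hi - lo) * n);
    }

    // Expert lookup: the signed result can also name a flow bin.
    int find_including_ouflow(double x) const {
        if (x < lo) return underflow_bin;
        if (x >= hi) return overflow_bin;
        return static_cast<int>(*find(x));
    }
};

// Hypothetical 1D histogram.
// Assumed storage layout: [regular bins..., underflow, overflow].
struct toy_histogram {
    toy_axis axis;
    std::vector<unsigned> counts;

    explicit toy_histogram(toy_axis a) : axis(a), counts(a.size() + 2, 0u) {}

    void fill(double x) { ++counts[cell(axis.find_including_ouflow(x))]; }

    // Default accessor: regular bins only.
    unsigned get(unsigned i) const {
        assert(i < axis.size());
        return counts[i];
    }

    // Explicit accessor: also accepts underflow_bin and overflow_bin.
    unsigned get_with_uoflow(int i) const { return counts[cell(i)]; }

private:
    // Translates a (possibly special) bin index into a storage position.
    std::size_t cell(int i) const {
        if (i == underflow_bin) return axis.size();
        if (i == overflow_bin) return axis.size() + 1;
        assert(i >= 0 && static_cast<unsigned>(i) < axis.size());
        return static_cast<std::size_t>(i);
    }
};
```

The point is only that the unsigned functions can never hand out or accept a flow bin, so anyone touching those bins has to spell it out by calling the `*_uoflow` variants.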