Re: [boost] [histogram] should some_axis::size() return unsigned or int?

30 Nov 2018

      On 30.11.18 at 11:36, Hans Dembinski wrote:
...
You are overestimating the importance of the *-flow bins, I think. Users usually ignore them when they analyse their histograms. They must be there for other reasons which are explained in the rationale and they are very useful for expert-level statistical analyses and for debugging. The beginner, however, should not notice their presence.
In fact, the `indexed` range adaptor should probably skip them by default, and only iterate over them when that is explicitly requested.
Sounds reasonable: one range excluding the over/underflow bins and one 
including them.
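
A minimal sketch of those two ranges, using a toy axis rather than the real 
Boost.Histogram types. The names `toy_axis`, `coverage` and `bin_indices` 
are made up for illustration, and the index convention (-1 for underflow, 
size() for overflow) is the one discussed in this thread:

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy axis, NOT the Boost.Histogram API: size() counts only
// the inner bins; underflow is index -1 and overflow is index size().
struct toy_axis {
    int n;  // number of inner bins
    int size() const { return n; }
};

// Hypothetical switch for this sketch: visit inner bins only, or all bins.
enum class coverage { inner, all };

// Collect the bin indices a range adaptor would visit.
// The default skips the flow bins; they are visited only on request.
std::vector<int> bin_indices(const toy_axis& a, coverage c = coverage::inner) {
    assert(a.size() >= 0);
    const int first = (c == coverage::all) ? -1 : 0;
    const int last = (c == coverage::all) ? a.size() + 1 : a.size();  // one past the end
    std::vector<int> out;
    for (int i = first; i < last; ++i) out.push_back(i);
    return out;
}
```

With three inner bins the default visits 0, 1, 2, while `coverage::all` 
additionally visits -1 and 3.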
An axis is not a container. It does not hold values and it has no operator[], precisely to emphasise this difference. It has size() though. See my email to Gavin with a long explanation why I think that makes sense.
Your code example was the following:

for (unsigned i = 0; i < axis.size(); ++i) {
   auto x = h[i];
   // do something with bin
}

So it looks like a container, although size() and the [] operator live on 
different objects (which feels weird, but OK).
...
...
Other idea: If those bins are so special that they don't fit into the [0, size()) range, why not use a different function for getting them, which is not the index operator? high_bin()/low_bin() come to mind.
See explanation to Gavin why this is worse.
Combining this with "Users usually ignore them[...] the `indexed` range 
adaptor should probably skip them by default", I do see the need for 
extra functions here too. Your argument against "high_bin()/low_bin()" 
was that iteration must be split. But your comment above already suggests 
that there are iterators which can cover the whole range. Could they 
solve this split-iteration problem?
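
To illustrate the question, here is a rough sketch of a histogram that keeps 
the flow bins behind named accessors but still offers a single-pass visit 
over everything. `toy_hist`, `for_each_bin` and `total` are hypothetical 
names for this example only; `low_bin()`/`high_bin()` are the accessors 
suggested above, not existing library functions:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D histogram for illustration only (not Boost.Histogram):
// inner counts are addressed by [0, size()), flow bins by named accessors.
class toy_hist {
public:
    explicit toy_hist(std::size_t n) : inner_(n, 0) {}

    std::size_t size() const { return inner_.size(); }

    int& operator[](std::size_t i) {
        assert(i < inner_.size());
        return inner_[i];
    }
    int& low_bin() { return under_; }   // underflow count
    int& high_bin() { return over_; }   // overflow count

    // Visit underflow, then the inner bins, then overflow in one pass,
    // so the caller does not need to split the loop.
    template <class F>
    void for_each_bin(F f) {
        f(under_);
        for (int& c : inner_) f(c);
        f(over_);
    }

private:
    std::vector<int> inner_;
    int under_ = 0;
    int over_ = 0;
};

// Sum all counts, flow bins included, using the single-pass visitor.
inline int total(toy_hist& h) {
    int sum = 0;
    h.for_each_bin([&sum](int& c) { sum += c; });
    return sum;
}
```

The plain index loop over [0, size()) stays available for users who want to 
ignore the flow bins.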
...
...
But WHY was this chosen? Wouldn't it be OK if 0 were the first bin, starting at -inf, and size()-1 the last one, extending to +inf? This would allow a histogram of size 1 which has a single bin holding all values.
And why would you want such an axis? It would be pointless and make the histogram operate slower.
I was not saying this should be done. It would just be consistent. There 
are 2 dimensions:
- open ranged bins yes/no
- number of bins
In my mind enabling open ranged bins does not ADD bins but makes the 
first and last go to +-inf:

axis(4,0,10,"",uoflow_type::on) -> [-inf,0), [0,5), [5,10), [10,inf)
axis(4,0,10,"",uoflow_type::off) -> [0,2.5), [2.5,5), [5,7.5), [7.5,10)

Of course this might be confusing, so the default should be "off": since 
"users usually ignore them", they are an advanced feature one does not 
generally need, right?
(Side note: The parameter description at 
https://hdembinski.github.io/histogram/doc/html/boost/histogram/axis/regular... 
is confusing due to the list order not matching the parameter order.)
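
A small sketch of the edge layout I have in mind for the two axis examples 
above. `bin_edges` is a made-up helper, not part of the library, and the 
requirement of at least three bins in the open-ended case is an assumption 
of this sketch:

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical helper: returns the n+1 edges of an n-bin axis over [lo, hi).
// With open_ended == true no bins are added; instead the first and last bin
// are stretched to -inf and +inf, so only n-2 bins remain finite.
std::vector<double> bin_edges(int n, double lo, double hi, bool open_ended) {
    assert(n >= (open_ended ? 3 : 1));  // assumption: keep at least one finite bin
    const double inf = std::numeric_limits<double>::infinity();
    const int finite = open_ended ? n - 2 : n;  // bins of equal, finite width
    std::vector<double> edges;
    if (open_ended) edges.push_back(-inf);
    for (int i = 0; i <= finite; ++i) {
        edges.push_back(lo + (hi - lo) * i / finite);
    }
    if (open_ended) edges.push_back(inf);
    return edges;
}
```

In both modes the result has n+1 edges, i.e. size() stays at n and only the 
meaning of the outermost bins changes.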

So my TLDR of this is: Consistency and meeting expectations. If it 
breaks either, think again about the choices made.

For this it is either:

- *-flow bins are kinda regular bins -> included in size() and iteration, 
same behavior as regular bins
- *-flow bins are special bins -> not included in size(); special accessors 
and iterators, with the default ones not including them.
     Given that, why not have special constants for BOTH the underflow and 
overflow bin (e.g. -1 and -2) instead of -1 and size(), where the 
latter is only known at runtime? Then you could have an `int 
find_including_ouflow` and an `unsigned find`, as well as `get(unsigned)` 
and `get_with_uoflow(int)`. The idea is to make the special handling obvious.
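
Roughly what the second option could look like. This is only a sketch: 
`flow_axis`, `underflow_bin` and `overflow_bin` are hypothetical names, the 
uniform binning is an assumption, and none of it is the actual 
Boost.Histogram interface:

```cpp
#include <cassert>

// Hypothetical compile-time constants for the two special bins.
constexpr int underflow_bin = -1;
constexpr int overflow_bin = -2;

// Hypothetical axis with n equal-width inner bins over [lo, hi).
struct flow_axis {
    unsigned n;
    double lo;
    double hi;

    unsigned size() const { return n; }  // inner bins only

    // Signed lookup: returns an inner index or one of the two constants.
    int find_including_ouflow(double x) const {
        if (x < lo) return underflow_bin;
        if (!(x < hi)) return overflow_bin;  // also catches NaN
        int i = static_cast<int>((x - lo) / (hi - lo) * n);
        if (i >= static_cast<int>(n)) i = static_cast<int>(n) - 1;  // guard against rounding
        return i;
    }

    // Unsigned lookup: the caller promises that x lies inside [lo, hi).
    unsigned find(double x) const {
        const int i = find_including_ouflow(x);
        assert(i >= 0);
        return static_cast<unsigned>(i);
    }
};
```

Because both special values are fixed constants, code that handles the flow 
bins has to name them explicitly, which is the point of the proposal.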

Alex

PS: I don't want to push anything. These are just my thoughts on your issue, 
in the hope that they help you find a solution you are happy with.

Alexander Grund