On 09 Oct 2015, at 21:47, Robert Ramey wrote:
On 10/9/15 10:54 AM, Raffi Enficiaud wrote:
It's hard to tell, but it seems to me that so far we're in agreement.
b) Testing on other platforms.
We have a system which has worked pretty well for many years. Still, it has some features that I'm not crazy about.
i) it doesn't scale well - as boost gets bigger the testing load gets bigger.
I suggested a test procedure based on "stages of quality" in my previous post:
- fast feedback from continuous runners, giving a quick status on some mainstream compilers. Runners may have overlapping configurations/setups, so that the load is balanced somehow.
- scheduling of the less-available runners on candidates selected from the previous stage. The interface can be advancing a git branch, with those runners picking up that branch only.
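As a minimal sketch of the staged idea above (the revision names, pass-rate threshold, and function names are all my own assumptions, not an existing Boost tool): stage-1 continuous runners report pass/fail per revision, and only revisions that are clean in stage 1 get queued for the scarcer stage-2 runners.

```python
from collections import deque

def promote_candidates(stage1_results, threshold=1.0):
    """Select revisions whose stage-1 pass rate meets the threshold."""
    candidates = []
    for revision, results in stage1_results.items():
        passed = sum(1 for ok in results if ok)
        if results and passed / len(results) >= threshold:
            candidates.append(revision)
    return candidates

# Example: three revisions, each tested by three fast continuous runners.
stage1 = {
    "r101": [True, True, True],   # clean on all mainstream compilers
    "r102": [True, False, True],  # one failure -> not promoted
    "r103": [True, True, True],
}

# Revisions the scarce stage-2 runners would pick up, in order.
stage2_queue = deque(promote_candidates(stage1))
print(list(stage2_queue))  # ['r101', 'r103']
```

In the git-branch variant of the interface, "promoting" a candidate would simply mean fast-forwarding the dedicated branch to that revision.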
This is a pretty elaborate setup, and also fairly ambiguous to me. It seems like implementing such a thing would be quite an effort - by whom, I don't know.
I am not sure any real solution to Boost's testing needs is simple, so it may be that some elaborate setup is needed. Let's hope not.
ii) it tests the develop branch of each library against the develop branch of all the other libraries
...
Exactly,
OK - so we're in agreement about this.
but also not being able to track down the history of the versions on the current dashboard is far from helping. As a developer, I would like to see a summary of, e.g., the number of failing tests vs. the number of tests, and *per revision*.
I don't think such information would be useful to me. But maybe that's just me.
iii) it relies on volunteer testers to select compilers/platforms to test under. So it's not exhaustive, and the selection might not reflect what people are actually using.
I would say that it would be good if each runner published its setup (not the runtime, but how it has been deployed), and maybe a script to reproduce this runner. I am thinking of Docker (and how easy it makes fully describing a system); there are tools for the other platforms too, though more complicated.
The idea behind that is to be able to reproduce the runners, so that they are shown not by name (e.g. teeks99-08) but by property (e.g. win2012R2-64on64, msvc-12). I am not saying that the current setup should not be followed; I am suggesting a way to address the scalability issue. With that, we can have equivalent runners and balance the load.
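A small sketch of that property-based view (the runner names besides teeks99-08, the property keys, and the round-robin choice are invented examples, not a real Boost registry): runners advertising the same (os, toolset) properties are grouped as equivalent, and jobs can then be balanced across a group.

```python
from collections import defaultdict
from itertools import cycle

# Each runner advertises properties instead of (only) a name.
runners = [
    {"name": "teeks99-08",  "os": "win2012R2-64on64", "toolset": "msvc-12"},
    {"name": "acme-ci-1",   "os": "win2012R2-64on64", "toolset": "msvc-12"},
    {"name": "linux-box-3", "os": "ubuntu-14.04-x64", "toolset": "gcc-4.9"},
]

def group_by_property(runners):
    """Group equivalent runners under an (os, toolset) key."""
    groups = defaultdict(list)
    for r in runners:
        groups[(r["os"], r["toolset"])].append(r["name"])
    return dict(groups)

groups = group_by_property(runners)

# The two equivalent Windows runners can now share the load round-robin:
win_pool = cycle(groups[("win2012R2-64on64", "msvc-12")])
jobs = ["job-1", "job-2", "job-3"]
assignment = {job: next(win_pool) for job in jobs}
print(assignment)
```

The point is only that once runners are described by reproducible properties, "which machine runs this" becomes a scheduling detail rather than something a library author has to know.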
Sounds very ambitious and complex.
I would like to see us encourage our users to test the libraries that they use. This system would work in the following way.
If by users you mean the post-release /end users/, are you expecting post-release feedback? I am not sure I understand.
This suggestion doesn't address pre-release issues. Frankly, except for a few issues cited above (develop vs. master), I don't think these are a big problem, and I think the current testing setup is adequate.
But this system can really only test the combinations that the testers select. The problem comes up after release, when one gets bug reports from users of the released library. I would like to get these sooner rather than later, and on the platforms that people are actually using. I often get issues reported which are related to the user's current configuration, but the user hasn't run the latest tests on that setup, so all I get is a complaint. If the user ran the tests on the libraries which he's using (which he should be doing in any case!), I'd have a lot more to work with, and bugs would get discovered and addressed sooner with less effort.
Of course, if users want to switch to the develop branch on those libraries they use and run the tests pre-release, that would be great. But I'm not really expecting many people to do that.
That is really the big question. Is it realistic to hope for sufficient numbers of users, with interesting configurations that are not already very well tested, to set up test runners like this? I do think users would seriously consider it if challenged, but in practical life this depends on practical things, such as:
- Time to set up and maintain the test runner vs. just running tests privately on what you use.
- Hardware and software availability, including licenses, in an environment which for security reasons often needs to be isolated from the rest of the local development environment.
- Willingness to publish their use of Boost. This may be more important than many expect, as this is not a boolean state; there is a lot of fuzziness, and uncertain users are harder to convince.
Clearly, if there are ways of improving this, so it is simpler, easier, less costly, and more private, then there may be good hope. If it is not improved, I am a bit sceptical that sufficient interesting new test runners will arrive. But I guess it is hard to know if you do not try. — Bjørn