AMDG

On 12/12/2015 08:11 PM, Sergey Sprogis wrote:
> I’m wondering how difficult it would be to add one option to bjam, something like:
> -negative_tests=off
> which would not launch any negative (expected-to-fail) tests.
> For people who use Boost to test newly developed compilers, negative tests are quite a nuisance. I mean those tests which get “compile_fail” or “run_fail” status inside the final *.xml file generated at the end of a Boost regression run. [They are also marked similarly inside the Jamfiles.]
If you're using the xml files, then isn't there enough information to filter out these results automatically?
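For example, here's a minimal sketch of such a filter. It assumes each test is recorded as a <test-log> element carrying "test-type" and "result" attributes; treat those names (and the whole layout) as assumptions about the results schema, not a documented interface:

    import sys
    import xml.etree.ElementTree as ET

    # Test types that are expected to fail.  The names here are an
    # assumption about the regression XML schema, as is everything
    # else about the layout.
    NEGATIVE = {"compile_fail", "link_fail", "run_fail"}

    def positive_logs(path):
        # Yield every test record that is not an expected-failure test.
        for log in ET.parse(path).iter("test-log"):
            if log.get("test-type") not in NEGATIVE:
                yield log

    def pass_rate(path):
        # Pass rate over the positive tests only, as a percentage.
        logs = list(positive_logs(path))
        passed = sum(1 for log in logs if log.get("result") == "success")
        return 100.0 * passed / len(logs) if logs else 0.0

    if __name__ == "__main__":
        print("pass rate excluding negative tests: %.1f%%"
              % pass_rate(sys.argv[1]))

The same filter also gives the adjusted pass rate you ask about below, since it decides which tests count.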
> That’s because newly developed compilers, in their early stages of implementation, normally have bugs which produce a lot of false error messages for correct Boost code. The important task here is to extract those messages, to evaluate them, and to prove that the code is indeed correct but the compiler is wrong.
> And when hundreds of such false error messages are mixed together with the thousands of legitimate error messages produced by negative tests (there are more than 700 such tests in Boost 1.59.0, for example), filtering them out becomes a non-trivial task. So the natural desire is not to launch such tests at all.
If you're using the jam output directly, you can filter by searching for "...failed" or just run the tests a second time, which will only attempt to run tests that failed the first time.
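A trivial sketch of the first approach, run over a saved log (the "...failed" prefix is what bjam actually prints; the rest is plain filtering):

    import sys

    # Keep only bjam's "...failed" summary lines from a saved log,
    # e.g.  python filter_failures.py < bjam.log
    failures = [line for line in sys.stdin if line.startswith("...failed")]
    sys.stdout.writelines(failures)
    print("%d failed actions" % len(failures), file=sys.stderr)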
> Another, slightly unpleasant effect of negative tests is that they are counted in the so-called “test pass rate” calculations made during compiler testing.
> Typically, the managers responsible for compiler development want to know the progress in terms of the test pass rate the new compiler achieves on the whole Boost test suite, or on some specific libraries. Normally such a pass rate is calculated as the ratio of the number of passed tests to the total number of tests. But ideally, the tests in those calculations should be correct code, so that a failure always means a compiler bug. Here again, negative tests are not useful, and should be excluded to make the calculations more accurate.
I'm not sure I follow. Shouldn't the compiler accepting incorrect code also be considered a compiler bug?
> On a side note, I think it could also be useful to add such overall pass rates to the Boost Regression Dashboard, so that the quality of every tested compiler could easily be seen. Many more people might be interested in looking at such a dashboard, I guess.
In Christ,
Steven Watanabe