[testing] Proposal - regression tests results display upgrade
Hi,

Currently, test failures of a new feature are always marked with a yellow color and the word 'fail'. Would it make sense to use a more descriptive color/naming scheme? In particular, it would be nice to distinguish between actual failures and situations where the compilation of a test took too much time or an output file was too big. Would it also make sense to distinguish between compilation, linking and run failures? E.g. instead of a plain 'fail', use 'fail (c)', 'fail (l)' and 'fail (r)'. The color for actual failures could be orange, while the "others" could stay yellow.

Furthermore, it's currently impossible to see the compilation results if a test passed. Would it make sense to change that? It would allow us to view and fix warnings even when a test passes. A test passed with warnings could be marked e.g. with 'pass (w)', or its color could be set to something between yellow and the light green.

If you pointed me to where this should be done, I could try to change it. I'm guessing it would require modifying some script that generates the page, and some CSS?

Regards,
Adam
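For illustration, here is a hypothetical sketch (in C++, since the report generator is a C++ program) of how the proposed scheme could be modeled: an extended result kind mapped to a display label and a CSS class. None of the names below come from the actual report tool.

// Hypothetical sketch only, not the actual report tool's types: an extended
// result kind mapped to a display label plus a CSS class that the page
// generator would emit and the stylesheet would color (orange for actual
// failures, yellow for "others", etc.).
#include <string>

enum class result_kind {
    pass,
    pass_with_warnings,  // proposed 'pass (w)'
    compile_fail,        // proposed 'fail (c)'
    link_fail,           // proposed 'fail (l)'
    run_fail,            // proposed 'fail (r)'
    unfinished           // time limit exceeded, output file too big, ...
};

struct cell_display { std::string label, css_class; };

cell_display display_for(result_kind k)
{
    switch (k) {
    case result_kind::pass:               return { "pass",     "pass" };
    case result_kind::pass_with_warnings: return { "pass (w)", "pass-warn" };   // between yellow and green
    case result_kind::compile_fail:       return { "fail (c)", "fail-actual" }; // orange
    case result_kind::link_fail:          return { "fail (l)", "fail-actual" }; // orange
    case result_kind::run_fail:           return { "fail (r)", "fail-actual" }; // orange
    case result_kind::unfinished:         return { "fail",     "fail-other" };  // yellow
    }
    return { "fail", "fail-other" };
}

The stylesheet would then only need one rule per class, e.g. orange for "fail-actual" and yellow for "fail-other".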
On Wed, May 28, 2014 at 9:08 AM, Adam Wulkiewicz wrote:
Hi,
Currently, test failures of a new feature are always marked with a yellow color and the word 'fail'. Would it make sense to use a more descriptive color/naming scheme? In particular, it would be nice to distinguish between actual failures and situations where the compilation of a test took too much time or an output file was too big. Would it also make sense to distinguish between compilation, linking and run failures? E.g. instead of a plain 'fail', use 'fail (c)', 'fail (l)' and 'fail (r)'. The color for actual failures could be orange, while the "others" could stay yellow.

Furthermore, it's currently impossible to see the compilation results if a test passed. Would it make sense to change that? It would allow us to view and fix warnings even when a test passes. A test passed with warnings could be marked e.g. with 'pass (w)', or its color could be set to something between yellow and the light green.
Makes sense to me. IIRC, Marshall Clow has given some thought to regression test reporting needs, so you might want to get input from him.
If you pointed me to where this should be done, I could try to change it. I'm guessing it would require modifying some script that generates the page, and some CSS?
See boost-root\tools\regression\src\report

Thanks,
--Beman
-----Original Message-----
From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Adam Wulkiewicz
Sent: 28 May 2014 14:09
To: boost@lists.boost.org
Subject: [boost] [testing] Proposal - regression tests results display upgrade
Hi,
Currently, test failures of a new feature are always marked with a yellow color and the word 'fail'. Would it make sense to use a more descriptive color/naming scheme? In particular, it would be nice to distinguish between actual failures and situations where the compilation of a test took too much time or an output file was too big. Would it also make sense to distinguish between compilation, linking and run failures? E.g. instead of a plain 'fail', use 'fail (c)', 'fail (l)' and 'fail (r)'. The color for actual failures could be orange, while the "others" could stay yellow.

Furthermore, it's currently impossible to see the compilation results if a test passed. Would it make sense to change that? It would allow us to view and fix warnings even when a test passes. A test passed with warnings could be marked e.g. with 'pass (w)', or its color could be set to something between yellow and the light green.
+1 definitely.

This is particularly a problem for Boost.Math - the largest library in Boost, by far, in both code and tests, with several tests that often time out.

Various summary counts of passes and fails would be good as well.

It takes a while to scan the current (nice) display, e.g. http://www.boost.org/development/tests/develop/developer/math.html

Paul

---
Paul A. Bristow
Prizet Farmhouse
Kendal UK LA8 8AB
+44 01539 561830
Paul A. Bristow wrote:
+1 definitely.
This is particularly a problem for Boost.Math - the largest library in Boost, by far, in both code and tests, with several tests that often time out.
Various summary counts of passes and fails would be good as well.
It takes a while to scan the current (nice) display, e.g. http://www.boost.org/development/tests/develop/developer/math.html
I thought about a list of phrases/regexps which, if found/matched in the compiler output or even in the output of the regression tool, would result in different coloring of cells and names. E.g. something like:

fail     yellow  "compile with /bigobj"
fail     yellow  "300 second time limit exceeded"
fail(c)  orange  ^"Compile".+"fail"$
# ...

Or something similar in XML. It could also be set per-library if needed.

As much as I'd like to play with it, I'm afraid I won't find enough time.

Regards,
Adam
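As a rough illustration of the rule-table idea above, here is a hypothetical C++ sketch (the names and the rule set are invented, not taken from the regression tools): each rule pairs a regular expression, matched against the compiler or regression-tool output, with the label and color for the result cell, and the first matching rule wins.

// Hypothetical sketch of the rule-table idea, not code from the report tool.
#include <regex>
#include <string>
#include <utility>
#include <vector>

struct rule
{
    std::regex  pattern;  // matched against the tool output
    std::string label;    // text shown in the cell, e.g. "fail (c)"
    std::string color;    // or a CSS class name
};

// Example rule set, roughly following the phrases suggested above.
const std::vector<rule> default_rules = {
    { std::regex("compile with /bigobj"),           "fail",     "yellow" },
    { std::regex("300 second time limit exceeded"), "fail",     "yellow" },
    { std::regex("^Compile.+fail$"),                "fail (c)", "orange" },
};

// Returns the label and color for a test result, first matching rule wins.
std::pair<std::string, std::string>
classify(const std::string& output, const std::vector<rule>& rules = default_rules)
{
    for (const rule& r : rules)
        if (std::regex_search(output, r.pattern))
            return { r.label, r.color };
    return { "fail", "orange" };  // default: treat as an actual failure
}

A per-library rule set would then simply be a different vector passed in for that library.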
Paul A. Bristow wrote:
-----Original Message-----
From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Adam Wulkiewicz

Would it make sense to use a more descriptive color/naming scheme? In particular, it would be nice to distinguish between actual failures and situations where the compilation of a test took too much time or an output file was too big. Would it also make sense to distinguish between compilation, linking and run failures?
+1 definitely.
This is particularly a problem for Boost.Math - the largest library in Boost, by far, in both code and tests, with several tests that often time out.
Various summary counts of passes and fails would be good as well.
It takes a while to scan the current (nice) display, eg
http://www.boost.org/development/tests/develop/developer/math.html
Ok, I managed to find some time to play with it.

AFAIU the reports are generated using the code from https://github.com/boostorg/boost/tree/develop/tools/regression/src/report, is that correct?

The first change is simple, since I'd like to see if I'm doing everything right. I changed the way those specific failures are displayed. If the compilation fails and one of the following strings can be found at the end of the compiler output (the last 25 characters):

- "File too big"
- "time limit exceeded"

then the test is considered an "unfinished compilation" and displayed on the library results page as a yellow cell with a link named "fail?", so it's distinguishable from a normal "fail".

Here is the PR: https://github.com/boostorg/boost/pull/25

I only tested it locally on results from one runner for the Geometry and Math libraries. Is there a way I could test it on the results sent by all of the runners? How is this program executed? Is there some script which e.g. checks some directory and passes all of the tests as command arguments?

Regards,
Adam
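For reference, a minimal sketch of the check described in the message above, assuming the report tool hands the compiler output to a helper like this (the function name is invented; the real change is in the PR):

// Look at the last 25 characters of the compiler output and treat the two
// known phrases as an "unfinished compilation" rather than a regular failure.
#include <string>

bool is_unfinished_compilation(const std::string& compiler_output)
{
    const std::string::size_type tail_len = 25;
    const std::string tail = compiler_output.size() > tail_len
        ? compiler_output.substr(compiler_output.size() - tail_len)
        : compiler_output;

    return tail.find("File too big") != std::string::npos
        || tail.find("time limit exceeded") != std::string::npos;
}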
On Wed, Aug 13, 2014 at 10:42 AM, Adam Wulkiewicz wrote:

AFAIU the reports are generated using the code from https://github.com/boostorg/boost/tree/develop/tools/regression/src/report, is that correct?

Yes, I believe so.

Here is the PR: https://github.com/boostorg/boost/pull/25

I skimmed it and didn't see any red flags.

I only tested it locally on results from one runner for the Geometry and Math libraries. Is there a way I could test it on the results sent by all of the runners? How is this program executed?

IIRC, Noel Belcourt at Sandia runs the tests. Thus he would be a good person to merge your pull request, since he will likely be the first to notice any problems. Noel, are you reading this :-?

Is there some script which e.g. checks some directory and passes all of the tests as command arguments?

He would be the best person to answer that.

Thanks for working on this!

--Beman
On Aug 13, 2014, at 1:51 PM, Beman Dawes wrote:

Noel, are you reading this :-?

A little delayed but yes.

Is there some script which e.g. checks some directory and passes all of the tests as command arguments?

I'll have to investigate, don't know off the top of my head.

-- Noel
OK, I only now have time to read email. Is there something I can help answer? (Not sure what the question refers to.)
Rene.
--
-- Rene Rivera
-- Grafik - Don't Assume Anything
-- Robot Dreams - http://robot-dreams.net
-- rrivera/acm.org (msn) - grafikrobot/aim,yahoo,skype,efnet,gmail
Rene Rivera wrote:
OK, I only now have time to read email. Is there something I can help answer? (Not sure what the question refers to.)
I prepared a PR (https://github.com/boostorg/boost/pull/25) changing the way the library results page is generated. It works locally, in the sense that it generates the results page, but I'm not really sure whether this is the code of the program actually used to generate the regression summary pages, or how it is executed. Am I playing with the correct program?

What does the procedure for contributing to Regression look like? In this particular case I'm unable to test what I've done on a "living organism", since AFAIU the tests are processed on an external server. Assuming that my contribution is desired, should someone with sufficient access rights test the code from my fork, or should the code be merged and then tested as "official"? Or should I request the access rights required to test it myself?

Another thing: I'd like to investigate why the results are missing for the Geometry and Spirit master branch. I'm guessing that if I weren't able to somehow figure it out locally, then some form of access to all of the test results sent by the runners would be needed, for checking the XMLs and debugging the program generating the pages. Would this be possible, or do you have a better idea?

Regards,
Adam
Another thing: I'd like to investigate why the results are missing for the Geometry and Spirit master branch. I'm guessing that if I weren't able to somehow figure it out locally, then some form of access to all of the test results sent by the runners would be needed, for checking the XMLs and debugging the program generating the pages. Would this be possible, or do you have a better idea?
I am not sure how the summary is generated, but I looked into the XML and as far as I can tell spirit has been moved to spirit/test. As for why it still shows up in the summary, I don't know enough to say. Similarly, geometry seems to have moved to geometry/test but for some reason does not appear in the summary.

Hope that helps point you in the right direction.

Regards,
Thomas
Thomas Suckow wrote:
I am not sure how the summary is generated but I looked into the XML and as far as I can tell spirit has been moved to spirit/test. As for why it still shows up in the summary, I don't know enough to say. Similarly geometry seems to have moved to geometry/test but for some reason does not appear in the summary.
Yes, AFAIK everything is ok with the libraries, though I have an idea what may be the cause of the problem.

On the page http://www.boost.org/development/tests/master/developer/geometry_.html the empty cells have their class set to "library-missing". I can't find anything in the program I modified that could generate this HTML. However, I can find it in the XSL belonging to the scripts that are probably also involved in generating the reports, in the tree https://github.com/boostorg/regression/tree/develop/xsl_reports.

So the question remains: how exactly are the results generated? Are they generated the same way for the develop and master branches? Is the C++ program I modified used in any of these cases?

Regards,
Adam
participants (6)
- Adam Wulkiewicz
- Belcourt, Kenneth
- Beman Dawes
- Paul A. Bristow
- Rene Rivera
- Thomas Suckow