On Mon, May 8, 2023 at 5:22 PM Andrey Semashev via Boost <boost@lists.boost.org> wrote:
On 5/8/23 20:40, John Maddock via Boost wrote:
Machine time could well be donated by volunteers and perhaps replace the current test/status matrix, which is fine, but requires you to go off looking for results, which may or may not have cycled yet. Plus that matrix relies on a "build the whole of Boost" approach, which increasingly does not scale.
I'm really grateful to the volunteers that run the tests and maintain the official test matrix, but honestly, I'm not paying attention to it anymore. I have three main issues with it:
1. Slow turnaround. From my memory, it could take weeks or more for the runners to run the tests over a commit I made. With turnaround times of that order, it is impossible to continue development while keeping the code in a working state.
2. Lack of notifications.
3. Problematic debugging. It was not uncommon that a test run failed because of some misconfiguration on the runner's side. And it was also not uncommon that build logs were unavailable.
So, while, again, I'm most grateful to the people that made public testing possible at all before we had the current CI services, today we do have CI services with the above problems fixed (more or less), and I'm spoiled by them. It is true that the public CI resources are limited and insufficient at times, so, IMHO, the way forward would be towards fixing this problem without losing the convenience of the CI services we've become used to.
As the person responsible for most (currently all) of the test runners in the official test matrix, I thought I'd throw a couple of cents in here.

I currently have three machines running these tests. These machines were purchased by Boost (then from SFC) in 2017, and are a bit old but still running well.

One of the machines is running Windows Server; it cycles through the last six versions of Visual Studio (plus the latest one with `/std:c++latest`) for develop + master.

The two other machines are Linux runners. One of them just runs the latest versions of clang + gcc for develop + master; at about 2hr a pop, there should never be a commit that goes more than ~8hr without one of these configurations running. The other machine runs a *huge* number (>150) of gcc/clang configs. This takes approximately a week to get through all of them, but provides a breadth of testing that isn't available anywhere else. See the table in this readme [1] for the full list, as well as a bit more about the runners. To address a bit of #3 above, these are fully running in docker containers, and I'd be more than happy to add other users' configs to this list.

I've also got a couple of Raspberry Pi machines running gcc + clang / develop + master, but they are *very* slow (20hrs?). I've also got a RISC-V SBC that I want to get this going on, but haven't found the time yet. Architecture *shouldn't* matter for most of what we do, but there are a couple of edge cases where it can be useful.

I don't spend a lot of time on the care and feeding of this, so would be happy to keep it going in the future...but the downside of that is that there have been (and I'm sure will continue to be) instances where things break and go unnoticed for weeks. (I just noticed MSVC has been failing all the develop runs for weeks, almost certainly a config issue on my end.)

All that said, I'm not sure where this fits into the picture going forward. Andrey's points above (esp. 2 & 3) are very valid. I wouldn't depend on a CI system like this in my day job. There are plenty more issues with the system as well...specifically:

* Tons of time wasted re-building things that haven't changed (sometimes for years!)
* No history saved beyond the most recent build
* Processing of results is a centralized point of failure

I'm happy to go with whatever the community is looking to do on this front. Some options as I see it:

* Keep the current test matrix as a complement to the various CI systems starting to roll out.
* Merge my runner system in with the new C++ Alliance CI system
* Shut down the test matrix and move fully to C++ Alliance cloud CI

Tom

[1] https://github.com/teeks99/boost-build/blob/master/Regression/README.md