Hi all,

For someone who hasn't been involved in Boost for very long, it feels strange to distribute the documentation in the same release tarball as the code, especially considering the usual complaints about "Boost size". I'm sure there are reasons behind it, and I'd like to learn about them. This is a candid question from a newbie here.

Thanks,
Ruben
Ruben Perez wrote:
Hi all,
For someone who hasn't been involved in Boost for very long, it feels strange to distribute the documentation in the same release tarball as the code, especially considering the usual complaints about "Boost size".
I'm sure there are reasons behind it, and I'd like to learn about them. This is a candid question from a newbie here.
In the beginning, when dinosaurs roamed the Earth, documentation consisted of each library having a heap of .html files in its directory, following no convention in either naming or placement. So there was no way to have a separate tarball with the docs, but there wasn't any need either, because Boost was much smaller.
Does that mean that dinosaurs still roam around here? :)

Is there any reason as of today to keep doing it? (other than no-one volunteering to make the change). I guess there must be, since we have separate no-docs GitHub releases.

Regards,
Ruben.
Ruben Perez wrote:
Does that mean that dinosaurs still roam around here? :)
In some cases yes. :-)
Is there any reason as of today to keep doing it? (other than no-one volunteering to make the change). I guess there must be, since we have separate no-docs GitHub releases.
Our documentation build infrastructure is not set up to produce separate documentation tarballs, and it's not clear what needs to change, and how, in order for this to happen. (Apart from the obvious - do something for the libraries that still have their docs in .html form and don't participate in the doc build at all.)
can use an archive that is smaller (because it contains no documentation)
There is an idea that we have started experimenting with, going even further than skipping html docs compilation.

- delete the entire doc/ folder, from every library.
- remove the test/ folder from each library.
- remove the example/ folder from each library.

In other words, only important C++ "source" code such as include/ and src/ would remain.

Technical feedback is welcome. Do you believe this is untenable or would cause problems for users? Deleting the above mentioned directories breaks compilation in certain cases. It must be investigated further.
Sam Darwin wrote:
can use an archive that is smaller (because it contains no documentation)
There is an idea that we have started experimenting with, going even further than skipping html docs compilation.
- delete the entire doc/ folder, from every library.
- remove the test/ folder from each library.
- remove the example/ folder from each library.
How much does this gain? And same question, but only with doc/ deleted. I suspect that most of the gains can be realized by deleting three or four carefully chosen libs/X/doc folders.
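One way to answer the "how much does this gain" question is to measure the directories directly. The following is a minimal sketch, assuming an unpacked release tree with the usual libs/<library>/doc layout; the function names are made up for illustration, the numbers are uncompressed on-disk sizes rather than archive sizes, and nested libraries (those living one level deeper under libs/) would need extra handling.

```python
import os
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = Path(root) / name
            if fp.is_file() and not fp.is_symlink():
                total += fp.stat().st_size
    return total


def subdir_sizes(boost_root: Path, subdir: str = "doc") -> list[tuple[str, int]]:
    """Return (library, bytes) pairs for libs/<library>/<subdir>, largest first."""
    libs = boost_root / "libs"
    sizes = []
    if libs.is_dir():
        for lib in sorted(libs.iterdir()):
            target = lib / subdir
            if target.is_dir():
                sizes.append((lib.name, dir_size(target)))
    # Stable sort: libraries with equal sizes stay in alphabetical order.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes


def print_report(boost_root: Path, top: int = 5) -> None:
    """Print the total and the largest offenders for doc/, test/ and example/."""
    for name in ("doc", "test", "example"):
        ranked = subdir_sizes(boost_root, name)
        total = sum(size for _, size in ranked)
        print(f"{name}/: {total / 1e6:.1f} MB in total")
        for lib, size in ranked[:top]:
            print(f"  libs/{lib}/{name}: {size / 1e6:.1f} MB")
```

Running `print_report` on an unpacked tree would show whether a handful of libs/X/doc folders really account for most of the size, as suspected above.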
On 5/24/24 22:24, Sam Darwin via Boost wrote:
How much does this gain?
Tested a few months ago. Don't have the exact numbers, but I think the archives were 50% smaller. You are right, focusing on doc/ might be enough. Still, if other folders can be removed, and not lose functionality...
It depends on what you're willing to lose. Removing docs is mostly acceptable since they are available online, and going online is usually not a problem. Though going online might still be a problem. For example, one of my past employers limited Internet access in the workplace, so employees could only visit whitelisted websites.

Tests and examples don't have an online equivalent, so if a user wants to test a library in his particular environment, he can't do that if tests are not distributed. Similarly, if he's reading a library's docs that refer to an example, he can't view the example and may not be able to continue reading the docs.
On Fri, May 24, 2024 at 2:57 PM Andrey Semashev via Boost <boost@lists.boost.org> wrote:
Removing docs is mostly acceptable since they are available online
Well... I don't think removing them completely is a great idea. Instead, the Boost release could consist of two separate archives. One with the documentation, and the other with everything else. And if these two archives are unpacked into the same directory, the result is the same as when unpacking an all-in-one archive.

Thanks
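The two-archive idea can be sketched as follows. This is only an illustration, not how the Boost release scripts work: the `is_doc_path` rule (a top-level doc/ plus every libs/<library>/doc/) and both function names are assumptions, the output archives are assumed to live outside the tree being packed, and empty directories are not preserved.

```python
import tarfile
from pathlib import Path, PurePosixPath


def is_doc_path(rel: PurePosixPath) -> bool:
    """Assumed rule: top-level doc/ and libs/<library>/doc/ count as documentation."""
    parts = rel.parts
    if parts and parts[0] == "doc":
        return True
    return len(parts) >= 3 and parts[0] == "libs" and parts[2] == "doc"


def split_release(root: Path, docs_tar: Path, rest_tar: Path) -> None:
    """Write every file under root into exactly one of the two archives."""
    with tarfile.open(docs_tar, "w:gz") as docs, tarfile.open(rest_tar, "w:gz") as rest:
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            rel = PurePosixPath(path.relative_to(root).as_posix())
            target = docs if is_doc_path(rel) else rest
            # Store paths relative to root so both archives unpack into the same layout.
            target.add(path, arcname=str(rel), recursive=False)
```

Because each file goes into exactly one archive under its original relative path, unpacking both archives into the same directory gives back the same set of files as the original tree.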
On 6/6/24 20:26, Vinnie Falco wrote:
On Fri, May 24, 2024 at 2:57 PM Andrey Semashev via Boost <boost@lists.boost.org> wrote:
Removing docs is mostly acceptable since they are available online
Well... I don't think removing them completely is a great idea.
Right, to be clear, I didn't suggest completely removing them from downloads. I'm just saying that an archive without the docs would make sense - in addition to an archive that contains docs (whether it is a full package with sources or just the docs).
The current release processes are designed to generate full archives, and it is convenient to keep this as one of potentially many build options. They are permanent archive records of a complete boost release. They can be deployed on the website, as-is. Scripts don't need to change. Any long-time boost users who might expect this choice, may still access it, the same as before. The explanation is clear to advertise: full downloads.

The next choice: minimal, fast, source-only. This may become the new default "source-only". A question is whether test/ and example/ should also be removed. I did try compiling this, and it appeared to succeed. The result was around 25% of the previous size. That's quite an improvement.

If both of the above choices exist, and are available as releases (source-only, and full), then this may be the sweet spot of complexity. It answers the needs of basically everyone, while only requiring one additional bundle.
On 6/7/24 00:56, Sam Darwin wrote:
The current release processes are designed to generate full archives, and it is convenient to keep this as one of potentially many build options. They are permanent archive records of a complete boost release. They can be deployed on the website, as-is. Scripts don't need to change. Any long-time boost users who might expect this choice, may still access it, the same as before. The explanation is clear to advertise: full downloads.
The next choice: minimal, fast, source-only. This may become the new default "source-only". A question is whether test/ and example/ should also be removed. I did try compiling this, and it appeared to succeed. The result was around 25% of the previous size. That's quite an improvement.
I think an archive that includes at least tests is useful for third-party CI. And some libraries also build examples as part of their test suites, so those should be included as well. Perhaps we could publish three variants: full, source-only (which includes tests and examples but not docs) and minimal (which only includes library sources)?
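The three proposed variants can be written down as a small filter over paths. This is a sketch under assumptions: the variant names follow the proposal above, but the table, the function name, and the rule that only libs/<library>/<dir> is filtered are illustrative choices, not part of any existing release tooling.

```python
from pathlib import PurePosixPath

# Per-library directories dropped by each variant, following the proposal above.
EXCLUDED_DIRS = {
    "full": frozenset(),
    "source-only": frozenset({"doc"}),
    "minimal": frozenset({"doc", "test", "example"}),
}


def in_variant(rel_path: str, variant: str) -> bool:
    """True if a path relative to the Boost root belongs in the given variant."""
    excluded = EXCLUDED_DIRS[variant]  # raises KeyError for an unknown variant
    parts = PurePosixPath(rel_path).parts
    # Only libs/<library>/<dir>/... is filtered; every other path is always kept.
    if len(parts) >= 3 and parts[0] == "libs" and parts[2] in excluded:
        return False
    return True
```

For example, `in_variant("libs/asio/test/Jamfile.v2", "source-only")` is true while the same path is rejected for "minimal". As noted earlier in the thread, dropping these directories breaks compilation in certain cases, so a real filter would likely need per-library exceptions.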
On Thu, Jun 6, 2024 at 5:51 PM Andrey Semashev via Boost wrote:
Perhaps, we could publish three variants: full, source-only (which includes tests and examples but not docs) and minimal (which only includes library sources)?
While it would be interesting to publish source (boost-dev) and docs (boost-docs) archives separately (and other variations thereof)... I would prefer if we instead put effort into entirely automating and securing the current release archives and process before branching out.

--
-- René Ferdinand Rivera Morell
-- Don't Assume Anything -- No Supone Nada
-- Robot Dreams - http://robot-dreams.net
I think, an archive that includes at least tests is useful for third party CI. And some libraries also build examples as part of their test suites, so those should be included as well.
This is the case, yes. For automated installation using EasyBuild (similar to Spack), the tests of the software are run after building to ensure everything works in the current environment.
Perhaps, we could publish three variants: full, source-only (which includes tests and examples but not docs) and minimal (which only includes library sources)?
Makes sense to me
On Jun 7, 2024, at 00:56, Sam Darwin via Boost wrote:
The next choice: minimal, fast, source-only. This may become the new default "source-only". A question is whether test/ and example/ should also be removed. I did try compiling this, and it appeared to succeed. The result was around 25% of the previous size. That's quite an improvement.
Last I looked, there were some JPEG image files in there, some of them a few megabytes in size.
participants (8)
- Alexander Grund
- Andrey Semashev
- Kostas Savvidis
- Peter Dimov
- René Ferdinand Rivera Morell
- Ruben Perez
- Sam Darwin
- Vinnie Falco