Library metadata


Daniel James

23 Feb 2014

2:55 p.m.

Hi,

I'm starting to add metadata to modules for updating the website, and hopefully other uses in the future. I added the metadata to Boost.Unordered and Boost.Functional to give you an idea of what this will look like:

https://github.com/boostorg/functional/tree/develop/meta
https://github.com/boostorg/unordered/tree/develop/meta

You can see several libraries listed in the Boost.Functional file. I'll write some documentation soon, but it should be fairly easy to understand. I basically took what was in the website library list and split it up into modules. Quick summary of the fields:

key - Used by the website to identify each library. Don't change it, or the website will think it's a new library.
boost-version - The version in which the library was released.
name - The name of the library.
authors - The authors of the library - will probably add a 'maintainers' field later.
description - Library description - this is used in the library list, so keep it short.
std-proposal - Is the library proposed for the standard.
std-tr1 - Is the library part of TR1.
category - The categories the library belongs to. One element for each category. The documentation will include a list of available categories.
documentation - Path to the documentation.

The documentation field is optional. If it isn't included, it'll use the default path: /libs/module-name/. If it is included, relative paths are resolved relative to the module directory, absolute paths from boost root - although I'd suggest you always use a path within the module, and if there's only a single library in a module, use the default path.

This isn't the final format - I think the standards status fields could do with an overhaul to support C++11 etc. You also don't need to write the initial files yourself, I'll generate them and create pull requests once this is settled.

Let me know what you think.

thanks,
Daniel
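The rule for the documentation field can be read as a small path-resolution function. The following is only a sketch of that reading, not the website's actual code: the function name resolve_documentation and its arguments are hypothetical, and the sample paths are placeholders.

```python
import posixpath


def resolve_documentation(module, documentation=None):
    """Return the documentation path for a library, relative to the boost root.

    `module` is the module directory name under /libs/ (assumed layout).
    `documentation` is the optional field from the metadata file.
    """
    module_root = "/libs/" + module + "/"
    if documentation is None:
        # Field omitted: fall back to the default path /libs/module-name/.
        return module_root
    # posixpath.join keeps an absolute second argument as-is (boost root)
    # and appends a relative one to the module directory.
    return posixpath.join(module_root, documentation)


default_path = resolve_documentation("unordered")
relative_path = resolve_documentation("functional", "hash/")
absolute_path = resolve_documentation("functional", "/doc/html/example.html")
```

With this reading, a module containing a single library never needs the field at all, which matches the suggestion to use the default path in that case.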


Ahmed Charles

24 Feb

12:09 a.m.

----------------------------------------

...

Date: Sun, 23 Feb 2014 14:55:51 +0000
From: dnljms@gmail.com
To: boost@lists.boost.org
Subject: [boost] Library metadata

Hi,

I'm starting to add metadata to modules for updating the website, and hopefully other uses in the future. I added the metadata to Boost.Unordered and Boost.Functional to give you an idea of what this will look like:

https://github.com/boostorg/functional/tree/develop/meta
https://github.com/boostorg/unordered/tree/develop/meta

You can see several libraries listed in the Boost.Functional file. I'll write some documentation soon, but it should be fairly easy to understand. I basically took what was in the website library list and split it up into modules. Quick summary of the fields:

key - Used by the website to identify each library. Don't change it, or the website will think it's a new library.
boost-version - The version in which the library was released.
name - The name of the library.
authors - The authors of the library - will probably add a 'maintainers' field later.
description - Library description - this is used in the library list, so keep it short.
std-proposal - Is the library proposed for the standard.
std-tr1 - Is the library part of TR1.
category - The categories the library belongs to. One element for each category. The documentation will include a list of available categories.
documentation - Path to the documentation.

The documentation field is optional. If it isn't included, it'll use the default path: /libs/module-name/. If it is included, relative paths are resolved relative to the module directory, absolute paths from boost root - although I'd suggest you always use a path within the module, and if there's only a single library in a module, use the default path.

This isn't the final format - I think the standards status fields could do with an overhaul to support C++11 etc. You also don't need to write the initial files yourself, I'll generate them and create pull requests once this is settled.

Let me know what you think.

+1 on adding a maintainers field. On another note, did you consider JSON as a viable format or is there a strong reason to go with XML?

Daniel James

12:45 a.m.

On 24 February 2014 00:09, Ahmed Charles <acharles@outlook.com> wrote:

...

On another note, did you consider JSON as a viable format or is there a strong reason to go with XML?

I'm just using it because it's the existing data format. JSON is fine, the data is simple enough that it should be easy to support. Some time ago, I decided not to use it on the site because I wanted to support Python 2.5, but I don't think that's necessary any more. Although if we do something similar for the expected failures, that will probably have to be in xml.

Daniel James

26 Feb

7:31 a.m.

On 24 February 2014 00:45, Daniel James <dnljms@gmail.com> wrote:

...

On 24 February 2014 00:09, Ahmed Charles <acharles@outlook.com> wrote:

...
On another note, did you consider JSON as a viable format or is there a strong reason to go with XML?

I'm just using it because it's the existing data format. JSON is fine, the data is simple enough that it should be easy to support. Some time ago, I decided not to use it on the site because I wanted to support Python 2.5, but I don't think that's necessary any more. Although if we do something similar for the expected failures, that will probably have to be in xml.

OK, I've added support for json and added a 'maintainers' field. The functional metadata is now in JSON:

https://github.com/boostorg/functional/blob/develop/meta/libraries.json

Both 'authors' and 'maintainers' can be a single string or an array of strings.
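Since 'authors' and 'maintainers' may each be a string or an array, a consumer of the file probably wants to normalise them first. A minimal sketch, assuming the file is a JSON array of library objects (that top-level shape is an assumption here); the sample entry and the helper names as_list and load_libraries are hypothetical, not taken from the real libraries.json.

```python
import json

# Hypothetical entry: names and values are placeholders, not real metadata.
SAMPLE = """
[
    {
        "key": "example",
        "name": "Example",
        "authors": "A. Author",
        "maintainers": ["A. Author", "B. Maintainer"]
    }
]
"""


def as_list(value):
    """Return a list whether the field was a string, an array, or absent."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def load_libraries(text):
    """Parse the metadata and normalise the two people fields to lists."""
    libraries = json.loads(text)
    for library in libraries:
        for field in ("authors", "maintainers"):
            library[field] = as_list(library.get(field))
    return libraries


libraries = load_libraries(SAMPLE)
```

After loading, code that renders the library list can iterate over both fields without checking their type.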

Robert Kawulak

17 Mar

1:13 a.m.

...

From: Daniel James

OK, I've added support for json and added a 'maintainers' field.

So how about a single 'std' field instead of those 'std-...' ones?

Daniel James

11:47 a.m.

On 17 March 2014 01:13, Robert Kawulak <robert.kawulak@gmail.com> wrote:

...

...
From: Daniel James

OK, I've added support for json and added a 'maintainers' field.

So how about a single 'std' field instead of those 'std-...' ones?

I'll try to do that soon. It's not a big job, but I still have the reservations I mentioned before.

Robert Kawulak

24 Feb

1:52 a.m.

...

From: Daniel James

std-proposal - Is the library proposed for the standard.
std-tr1 - Is the library part of TR1.

Just an idea: it seems like this approach is not too flexible given that there are numerous standards/TRs/TSs coming now and in the future that possibly include Boost libraries. Instead of adding a new field for each one of them, maybe it would be better to have one "std" field with a list specifying in which standard/TR/TS a library is included/proposed for?

Best regards,
Robert

Glen Fernandes

2:55 a.m.

On Sun, Feb 23, 2014 at 5:52 PM, Robert Kawulak <robert.kawulak@gmail.com> wrote:

...

Just an idea: it seems like this approach is not too flexible given that there are numerous standards/TRs/TSs coming now and in the future that possibly include Boost libraries. Instead of adding a new field for each one of them, maybe it would be better to have one "std" field with a list specifying in which standard/TR/TS a library is included/proposed for?

Something flexible, like a list, would be good. In Boost.Smart_Ptr, we have class templates like shared_ptr and function templates like make_shared that are part of C++11, but the Boost implementations are now improved and proposed for the next standard (N3920 in TS1, and N3939 in TS2).

Glen

Daniel James

12:55 p.m.

On 24 February 2014 02:55, Glen Fernandes <glen.fernandes@gmail.com> wrote:

...

On Sun, Feb 23, 2014 at 5:52 PM, Robert Kawulak <robert.kawulak@gmail.com> wrote:

...
Just an idea: it seems like this approach is not too flexible given that there are numerous standards/TRs/TSs coming now and in the future that possibly include Boost libraries. Instead of adding a new field for each one of them, maybe it would be better to have one "std" field with a list specifying in which standard/TR/TS a library is included/proposed for?

Something flexible, like a list, would be good. In Boost.Smart_Ptr, we have class templates like shared_ptr and function templates like make_shared that are part of C++11, but the Boost implementations are now improved and proposed for the next standard (N3920 in TS1, and N3939 in TS2).

This is what I was referring to when I said that the fields could do with an overhaul. A list should be fine, especially if we use json. I'm not sure how useful the existing fields are, the reality is often more complicated than a simple data structure can handle, and I don't think the existing filters on the library list are very useful.
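One way to picture the proposed change is a conversion from the two boolean fields to a single list. This is only an illustration of the idea discussed in this subthread: the list values "proposal" and "tr1" and the function name merge_std_fields are made up here, and no format for the 'std' field was settled in the thread.

```python
def merge_std_fields(entry):
    """Replace 'std-proposal' and 'std-tr1' with one 'std' list.

    The list values used here are hypothetical placeholders.
    """
    std = []
    if entry.get("std-proposal"):
        std.append("proposal")
    if entry.get("std-tr1"):
        std.append("tr1")
    # Copy every other field unchanged, then add the merged list.
    merged = {
        key: value
        for key, value in entry.items()
        if key not in ("std-proposal", "std-tr1")
    }
    merged["std"] = std
    return merged


# Hypothetical entry for illustration only.
old_entry = {"key": "example", "std-proposal": False, "std-tr1": True}
new_entry = merge_std_fields(old_entry)
```

A list leaves room for further entries (a later standard, a TS) without adding a new field each time, which is the flexibility asked for above.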

Jason Roehm

12:18 p.m.

On 02/23/2014 09:55 AM, Daniel James wrote:

...

This isn't the final format - I think the standards status fields could do with an overhaul to support C++11 etc. You also don't need to write the initial files yourself, I'll generate them and create pull requests once this is settled.

Let me know what you think.

Maybe this is out of scope, but another piece of metadata that could be useful would be some indication of compiler support (or non-support). Since some libraries are known to not work with certain compilers, it would be nice if there was a standard, central place to find that information on a library-by-library basis. I once spent a while trying to find open bugs that would explain why Boost.Coroutine didn't compile on my system with gcc 4.4. After a lot of searching, I found that this fact was known and indicated in the regression test configuration (i.e. gcc 4.4 isn't a supported compiler). I was unable to find this stated anywhere in the documentation, though. I think it could be beneficial to users if there was a standard place to find this information quickly.

Jason

Daniel James

1:10 p.m.

On 24 February 2014 12:18, Jason Roehm <jasonr@3db-labs.com> wrote:

...

Maybe this is out of scope, but another piece of metadata that could be useful would be some indication of compiler support (or non-support). Since some libraries are known to not work with certain compilers, it would be nice if there was a standard, central place to find that information on a library-by-library basis. I once spent a while trying to find open bugs that would explain why Boost.Coroutine didn't compile on my system with gcc 4.4. After a lot of searching, I found that this fact was known and indicated in the regression test configuration (i.e. gcc 4.4 isn't a supported compiler). I was unable to find this stated anywhere in the documentation, though. I think it could be beneficial to users if there was a standard place to find this information quickly.

This data is sort of stored in the explicit failure xml file (in the status directory), although perhaps not in a form that is easily summarised for users. As an example, the coroutine entry follows. It only gives very specific versions of gcc, rather than stating 'gcc 4.4 and earlier', or whatever is appropriate. This is pretty much baked into the xml schema, which is only designed for testing purposes. There's a vague plan to break up the file and put it in the metadata folders; maybe that could be extended to include some documentation info which we could present to users.

    <library name="coroutine">
        <mark-unusable>
            <toolset name="cray-*"/>
            <toolset name="darwin-4.4"/>
            <toolset name="darwin-4.4_0x"/>
            <toolset name="gcc-4.4.4"/>
            <toolset name="gcc-4.4.4_0x"/>
            <toolset name="msvc-8.0"/>
            <toolset name="pgi-*"/>
            <toolset name="vacpp-*"/>
            <toolset name="gcc-mingw-4.4*"/>
        </mark-unusable>
    </library>
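To show what the coroutine entry above does and doesn't tell a user, here is a rough sketch that reads it with Python's standard XML parser and checks a toolset name against the patterns. Treating the '*' in the names as a shell-style wildcard is an assumption, and the helper names are hypothetical; this is not how the regression tools themselves are implemented.

```python
import fnmatch
import xml.etree.ElementTree as ET

# The coroutine entry from the message above.
COROUTINE_ENTRY = """
<library name="coroutine">
    <mark-unusable>
        <toolset name="cray-*"/>
        <toolset name="darwin-4.4"/>
        <toolset name="darwin-4.4_0x"/>
        <toolset name="gcc-4.4.4"/>
        <toolset name="gcc-4.4.4_0x"/>
        <toolset name="msvc-8.0"/>
        <toolset name="pgi-*"/>
        <toolset name="vacpp-*"/>
        <toolset name="gcc-mingw-4.4*"/>
    </mark-unusable>
</library>
"""


def unusable_patterns(library):
    """Collect the toolset name patterns listed under mark-unusable."""
    return [t.get("name") for t in library.findall("./mark-unusable/toolset")]


def is_marked_unusable(library, toolset):
    """True if `toolset` matches any pattern (assumed glob semantics)."""
    return any(
        fnmatch.fnmatchcase(toolset, pattern)
        for pattern in unusable_patterns(library)
    )


library = ET.fromstring(COROUTINE_ENTRY)
exact_match = is_marked_unusable(library, "gcc-4.4.4")
other_point_release = is_marked_unusable(library, "gcc-4.4.7")
```

Under these assumptions gcc-4.4.4 is matched but another 4.4 point release is not, which is the "very specific versions" limitation described above.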

Paul A. Bristow

3:14 p.m.

...

-----Original Message-----
From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Daniel James
Sent: Monday, February 24, 2014 1:10 PM
To: boost@lists.boost.org
Subject: Re: [boost] Library metadata

On 24 February 2014 12:18, Jason Roehm <jasonr@3db-labs.com> wrote:

...
Maybe this is out of scope, but another piece of metadata that could be useful would be some indication of compiler support (or non-support). Since some libraries are known to not work with certain compilers, it would be nice if there was a standard, central place to find that information on a library-by-library basis. I once spent a while trying to find open bugs that would explain why Boost.Coroutine didn't compile on my system with gcc 4.4. After a lot of searching, I found that this fact was known and indicated in the regression test configuration (i.e. gcc 4.4 isn't a supported compiler). I was unable to find this stated anywhere in the documentation, though. I think it could be beneficial to users if there was a standard place to find this information quickly.

This data is sort of stored in the explicit failure xml file (in the status directory), although perhaps not in a form that is easily summarised for users. As an example, the coroutine entry follows. It only gives very specific versions of gcc, rather than stating 'gcc 4.4 and earlier', or whatever is appropriate. This is pretty much baked into the xml schema, which is only designed for testing purposes. There's a vague plan to break up the file and put it in the metadata folders; maybe that could be extended to include some documentation info which we could present to users.

    <library name="coroutine">
        <mark-unusable>
            <toolset name="cray-*"/>
            <toolset name="darwin-4.4"/>
            <toolset name="darwin-4.4_0x"/>
            <toolset name="gcc-4.4.4"/>
            <toolset name="gcc-4.4.4_0x"/>
            <toolset name="msvc-8.0"/>
            <toolset name="pgi-*"/>
            <toolset name="vacpp-*"/>
            <toolset name="gcc-mingw-4.4*"/>
        </mark-unusable>
    </library>

Will this be at a low enough level to be useful for a rambling library like Boost.Math?

When one could look at the http://www.boost.org/development/tests/trunk/developer/math.html results, they showed a lot of yellow, but only for some compilers and platforms for *some tests*.

There are:

* very few compilers where the advice is "No way :-(" - don't bother.
* lots where many uses will work (but not quite all uses, despite heroic efforts to cater for their foibles).
* a few up-to-date ones with a "Everything works" advice.

If all but the latter are marked as unusable, this will give an unnecessarily pessimistic view on the library.

Boost.Multiprecision poses similar problems, and I am sure there are others too.

Paul

PS I didn't find the regression page immediately using my favourite search engine. I'm not quite sure what it should be called, but regression isn't the first name I would think of? It would be nice if users got to this metadata easily - without searching for 'regression' & 'metadata' ;-)

Steven Watanabe

10:48 p.m.

AMDG

On 02/24/2014 07:14 AM, Paul A. Bristow wrote:

...

Will this be at a low enough level to be useful for a rambling library like Boost.Math?

When one could look at the http://www.boost.org/development/tests/trunk/developer/math.html results, they showed a lot of yellow, but only for some compilers and platforms for *some tests*.

You can also mark up individual test cases like this:

    <library name="algorithm">
        <mark-expected-failures>
            <test name="empty_search_test"/>
            <test name="search_test1"/>
            <test name="search_test2"/>
            <test name="search_test3"/>
            <test name="is_permutation_test1"/>
            <toolset name="vacpp-10.1"/>
            <note author="Marshall Clow">
                These failures are caused by a lack of support/configuration for Boost.Tr1
            </note>
        </mark-expected-failures>
    </library>

In Christ,
Steven Watanabe
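The per-test markup above carries enough structure to produce a finer-grained summary than "unusable". A small sketch of reading it, using only the elements visible in the algorithm entry; the function name expected_failures and the shape of the returned dictionaries are hypothetical choices for illustration.

```python
import xml.etree.ElementTree as ET

# The algorithm entry from the message above.
ALGORITHM_ENTRY = """
<library name="algorithm">
    <mark-expected-failures>
        <test name="empty_search_test"/>
        <test name="search_test1"/>
        <test name="search_test2"/>
        <test name="search_test3"/>
        <test name="is_permutation_test1"/>
        <toolset name="vacpp-10.1"/>
        <note author="Marshall Clow">
            These failures are caused by a lack of support/configuration for Boost.Tr1
        </note>
    </mark-expected-failures>
</library>
"""


def expected_failures(library):
    """Return one summary dict per mark-expected-failures block."""
    summaries = []
    for mark in library.findall("mark-expected-failures"):
        note = mark.find("note")
        has_note_text = note is not None and note.text
        summaries.append({
            "tests": [t.get("name") for t in mark.findall("test")],
            "toolsets": [t.get("name") for t in mark.findall("toolset")],
            "author": note.get("author") if note is not None else None,
            # Collapse the indentation and line breaks inside the note.
            "note": " ".join(note.text.split()) if has_note_text else "",
        })
    return summaries


summary = expected_failures(ET.fromstring(ALGORITHM_ENTRY))[0]
```

A report built from such summaries could say "five tests fail on vacpp-10.1" rather than marking the whole library unusable for that compiler.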

Daniel James

11:16 p.m.

On 24 February 2014 15:14, Paul A. Bristow <pbristow@hetp.u-net.com> wrote:

...

Will this be at a low enough level to be useful for a rambling library like Boost.Math?

This is what Boost.Math is using already; its entry is a lot longer. As Steven pointed out, I didn't show a lot of what can be represented.

...

When one could look at the http://www.boost.org/development/tests/trunk/developer/math.html results, they showed a lot of yellow, but only for some compilers and platforms for *some tests*.

There are:

* very few compilers where the advice is "No way :-(" - don't bother.

* lots where many uses will work (but not quite all uses, despite heroic efforts to cater for their foibles).

* a few up-to-date ones with a "Everything works" advice.

If all but the latter are marked as unusable, this will give an unnecessarily pessimistic view on the library.

Boost.Multiprecision poses similar problems, and I am sure there are others too.

Sure, that's why it's tricky to add to the library metadata, I don't want that file to be too complicated.

...

PS I didn't find the regression page immediately using my favourite search engine.

You won't, because the site's robots.txt tells bots not to access any pages under 'development'. This was done because the test results are expensive to serve, and web crawlers were accessing every single test result. I could probably relax it a bit so that the plain html pages aren't blocked.
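The effect of such a rule can be checked with Python's standard robots.txt parser. The robots.txt text below is an assumed minimal form of the rule described above (the site's real file is not quoted in this thread), and the helper name crawlable is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: block everything under /development/ for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /development/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url):
    """True if a generic crawler may fetch `url` under the assumed rules."""
    return parser.can_fetch("*", url)


regression_page = crawlable(
    "http://www.boost.org/development/tests/trunk/developer/math.html"
)
front_page = crawlable("http://www.boost.org/")
```

Under this assumed rule the regression results page is blocked while the front page is not, which would explain why a search engine never indexed it.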

...

I'm not quite sure what it should be called, but regression isn't the first name I would think of? It would be nice if users got to this metadata easily - without searching for 'regression' & 'metadata' ;-)

The idea is that the contents of the metadata files would be used to generate information that's presented to users. Currently this is just the library list on the website, but there should be other uses for this.

Eric Niebler

7:20 p.m.

On 02/23/2014 06:55 AM, Daniel James wrote:

...

This isn't the final format - I think the standards status fields could do with an overhaul to support C++11 etc. You also don't need to write the initial files yourself, I'll generate them and create pull requests once this is settled.

Let me know what you think.

For the ryppl project, a lot of thought was put into metadata. With Dave working at Apple now, I'm not sure whether ryppl is still "funct" (as opposed to defunct), but maybe you could ask on the ryppl-dev list for some guidance? It would suck if the formats/fields were incompatible. If nobody replies, then for sure do your own thing.

Eric

Daniel James

11:14 p.m.

On 24 February 2014 19:20, Eric Niebler <eniebler@boost.org> wrote:

...

On 02/23/2014 06:55 AM, Daniel James wrote:

...
This isn't the final format - I think the standards status fields could do with an overhaul to support C++11 etc. You also don't need to write the initial files yourself, I'll generate them and create pull requests once this is settled.

Let me know what you think.

For the ryppl project, a lot of thought was put into metadata. With Dave working at Apple now, I'm not sure whether ryppl is still "funct" (as opposed to defunct), but maybe you could ask on the ryppl-dev list for some guidance? It would suck if the formats/fields were incompatible. If nobody replies, then for sure do your own thing.

I find it very hard to care about Ryppl. I'm more concerned with what we need to do now and what we've been doing for years than compatibility with a format we may never use. If in the future the metadata format needs to change, it shouldn't be a huge undertaking to do that, it can be mostly automated. Compared to the other challenges involved with ryppl it would be trivial. I'm not dealing with dependencies or anything tricky.
