[text] Copyright notices


Phil Endecott

15 Jun 2020

4:19 p.m.

Dear All,

Looking at the proposed Boost.Text source, I find that:

1. Many files don't have a copyright header. Random example: https://github.com/tzlaine/text/blob/master/include/boost/text/unencoded_rop... I guess this is an oversight and is easily fixed.

2. Some files have a copyright header which is not the Boost licence. Example: https://github.com/tzlaine/text/blob/master/include/boost/text/transcode_alg...

This one has a requirement that binary distribution includes the copyright notice, which is not a requirement of the Boost licence, and will be problematic for some users.

Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

3. The documentation includes a Unicode copyright notice: https://tzlaine.github.io/text/doc/html/boost_text__proposed_/unicode_copyri... This also includes a requirement that the copyright notice is included in data files, software or associated documentation, whatever that means.

It's not clear to me what this actually covers; again, can some functionality (e.g. UTF-N-to-M conversion without normalisation) work without the Unicode data?

Regards, Phil.


Andrey Semashev

15 Jun

4:35 p.m.

On 2020-06-15 19:19, Phil Endecott via Boost wrote:

...

2. Some files have a copyright header which is not the Boost licence. Example: https://github.com/tzlaine/text/blob/master/include/boost/text/transcode_alg...

This one has a requirement that binary distribution includes the copyright notice, which is not a requirement of the Boost licence, and will be problematic for some users.

Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

I think the fact that Boost uses the Boost Software License is a valuable feature. Having differently licensed files complicates inclusion in the downstream projects.

In the particular case you referenced, it seems the license is 3-clause BSD. That license is incompatible with Boost license requirements:

- Must not require that the license appear with executables or other binary uses of the library.

https://www.boost.org/development/requirements.html#License

I think this issue should be blocking library acceptance.

degski

7:30 p.m.

On Mon, 15 Jun 2020 at 11:35, Andrey Semashev via Boost <boost@lists.boost.org> wrote:

...

On 2020-06-15 19:19, Phil Endecott via Boost wrote:

...
2. Some files have a copyright header which is not the Boost licence. Example:

https://github.com/tzlaine/text/blob/master/include/boost/text/transcode_alg...

...
This one has a requirement that binary distribution includes the copyright notice, which is not a requirement of the Boost licence, and will be problematic for some users.

Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

I think the fact that Boost uses the Boost Software License is a valuable feature. Having differently licensed files complicates inclusion in the downstream projects.

Yes, also for libraries outside of Boost (some obviously by ml'ers), I always like to see them released under the Boost License (or MIT).

degski

Niall Douglas

16 Jun

12:53 p.m.

On 15/06/2020 17:19, Phil Endecott via Boost wrote:

...

Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

If that's *the* Bob Steagall I'm almost certain he'll relicence his code under Boost if you ask him. Niall

Hans Dembinski

1:45 p.m.

...

On 16. Jun 2020, at 14:53, Niall Douglas via Boost <boost@lists.boost.org> wrote:

On 15/06/2020 17:19, Phil Endecott via Boost wrote:

...
Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

If that's *the* Bob Steagall I'm almost certain he'll relicence his code under Boost if you ask him.

I think a multi-platform library like Boost should avoid using SIMD intrinsics and write the code so that the auto-vectorisers of the compilers understand it - if that is possible. I don't claim it is possible here, but I have a lot of trust in the auto-vectorisers of gcc and clang. Godbolt is very helpful in designing code that can be auto-vectorised well. Best regards, Hans
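As a concrete illustration of the kind of loop under discussion, here is a minimal sketch of an "is this buffer all ASCII?" check, a common fast-path test in UTF-8 transcoders. The function name `all_ascii` is hypothetical and the code is not taken from Boost.Text or from the SIMD implementation mentioned above. It is written without an early exit so that the loop is a plain reduction, which is a shape auto-vectorisers can usually handle. Whether a given compiler and flag set actually vectorises it is worth checking on Godbolt rather than assuming.

```cpp
#include <cstddef>

// Hypothetical helper for illustration; not part of Boost.Text.
// Returns true if every byte in [p, p + n) has its high bit clear.
// All bytes are ORed into one accumulator and the high bit is tested once
// at the end, so the loop body has no data-dependent branch.
inline bool all_ascii(const unsigned char* p, std::size_t n) {
    unsigned char acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc = static_cast<unsigned char>(acc | p[i]);
    }
    return (acc & 0x80) == 0;
}
```

The trade-off is that this version always scans the whole buffer, whereas a hand-written intrinsic version can stop at the first non-ASCII byte.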

Andrey Semashev

2:32 p.m.

On 2020-06-16 16:45, Hans Dembinski via Boost wrote:

...

...
On 16. Jun 2020, at 14:53, Niall Douglas via Boost <boost@lists.boost.org> wrote:

On 15/06/2020 17:19, Phil Endecott via Boost wrote:

...
Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

If that's *the* Bob Steagall I'm almost certain he'll relicence his code under Boost if you ask him.

I think a multi-platform library like Boost should avoid using SIMD intrinsics and write the code so that the auto-vectorisers of the compilers understand it - if that is possible. I don't claim it is possible here, but I have a lot of trust in the auto-vectorisers of gcc and clang. Godbolt is very helpful in designing code that can be auto-vectorised well.

In my experience, auto-vectorization is useless in all but trivial cases.

Zach Laine

3:54 p.m.

On Mon, Jun 15, 2020 at 11:20 AM Phil Endecott via Boost <boost@lists.boost.org> wrote:

...

Dear All,

Looking at the proposed Boost.Text source, I find that:

1. Many files don't have a copyright header. Random example: https://github.com/tzlaine/text/blob/master/include/boost/text/unencoded_rop... I guess this is an oversight and is easily fixed.

Yes, definitely an oversight. I'll fix these.

...

2. Some files have a copyright header which is not the Boost licence. Example: https://github.com/tzlaine/text/blob/master/include/boost/text/transcode_alg...

This one has a requirement that binary distribution includes the copyright notice, which is not a requirement of the Boost licence, and will be problematic for some users.

Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

I can ask Bob for his permission to release these under the Boost license. I'm certain he'll say yes.

...

3. The documentation includes a Unicode copyright notice: https://tzlaine.github.io/text/doc/html/boost_text__proposed_/unicode_copyri... This also includes a requirement that the copyright notice is included in data files, software or associated documentation, whatever that means.

I think it means exactly what I already did -- I put the copyright notice in the associated documentation. Note that this code comes from ICU, and Boost already has an optional dependency on ICU.

...

It's not clear to me what this actually covers; again, can some functionality (e.g. UTF-N-to-M conversion without normalisation) work without the Unicode data?

It covers the files in include/boost/text/detail/icu. Those files provide normalization. The transcoding stuff is unaffected. Zach
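To illustrate why transcoding can work without any Unicode data tables: converting between UTF encodings is pure bit arithmetic on code points. The following is a minimal sketch of UTF-32 to UTF-8 conversion. The names `append_utf8` and `utf32_to_utf8` are hypothetical and this is not Boost.Text's API or implementation; replacing invalid input with U+FFFD is an assumption made for the example.

```cpp
#include <string>

// Hypothetical helpers for illustration; not Boost.Text's API.

// Appends the UTF-8 encoding of cp to out. Returns false, appending nothing,
// if cp is a surrogate (U+D800..U+DFFF) or lies above U+10FFFF.
inline bool append_utf8(char32_t cp, std::string& out) {
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes: 11110xxx then 3 x 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        return false;
    }
    return true;
}

// Converts a UTF-32 string to UTF-8, substituting U+FFFD for invalid input.
inline std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        if (!append_utf8(cp, out)) append_utf8(0xFFFD, out);
    }
    return out;
}
```

Normalization, by contrast, needs per-code-point property data (decomposition mappings, combining classes), which is where the ICU-derived files come in.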

Zach Laine

5 p.m.

On Tue, Jun 16, 2020 at 10:54 AM Zach Laine <whatwasthataddress@gmail.com> wrote:

...

On Mon, Jun 15, 2020 at 11:20 AM Phil Endecott via Boost <boost@lists.boost.org> wrote:

...
2. Some files have a copyright header which is not the Boost licence. Example: https://github.com/tzlaine/text/blob/master/include/boost/text/transcode_alg...

This one has a requirement that binary distribution includes the copyright notice, which is not a requirement of the Boost licence, and will be problematic for some users.

Are the only affected files the SIMD implementation, (c) Robert N Steagall? If so, can this be disabled (by default?) by the user to avoid the copyright notice requirement?

I can ask Bob for his permission to release these under the Boost license. I'm certain he'll say yes.

Bob just gave his permission to relicense that code under the Boost license. That just leaves the issue with the ICU copyright. Zach

Phil Endecott

17 Jun

4 p.m.

Zach Laine wrote:

...

On Mon, Jun 15, 2020 at 11:20 AM Phil Endecott via Boost <boost@lists.boost.org> wrote:

...
3. The documentation includes a Unicode copyright notice: https://tzlaine.github.io/text/doc/html/boost_text__proposed_/unicode_copyri... This also includes a requirement that the copyright notice is included in data files, software or associated documentation, whatever that means.

I think it means exactly what I already did -- I put the copyright notice in the associated documentation. Note that this code comes from ICU, and Boost already has an optional dependency on ICU.

I believe that the requirement also applies to the users of Boost.Text, and that's where the problem lies. In the case of Boost.Locale, IIUC it links with ICU; you don't link with ICU but instead you've embedded tables derived from ICU into your source, so it's less obvious to users that they are using ICU and need to acknowledge its copyright somewhere.

There's a rather longer ICU copyright statement here: https://github.com/unicode-org/icu/blob/master/icu4c/LICENSE

Is Boost.Text using the various CJK word-splitting tables that it refers to?

If users of Boost.Text need to include that in their products' smallprint, your docs should make that clear.

Regards, Phil.

Peter Dimov

4:30 p.m.

Phil Endecott wrote:

...

If users of Boost.Text need to include that in their products' smallprint, your docs should make that clear.

A Boost library is not allowed to mandate this from users, regardless of whether its docs make it clear. Such a requirement is a hard blocker, a library like that can't be distributed in a Boost release. Or is not supposed to be, anyway.

Zach Laine

19 Jun

10:58 p.m.

On Wed, Jun 17, 2020 at 11:31 AM Peter Dimov via Boost <boost@lists.boost.org> wrote:

...

Phil Endecott wrote:

...
If users of Boost.Text need to include that in their products' smallprint, your docs should make that clear.

A Boost library is not allowed to mandate this from users, regardless of whether its docs make it clear. Such a requirement is a hard blocker, a library like that can't be distributed in a Boost release. Or is not supposed to be, anyway.

Of course, and I don't think Boost.Text actually imposes any such requirement. Boost.Locale also uses ICU headers and links to ICU, and neither shows the Unicode copyright, nor requires its users to do so. Boost.Text will have some code derived from ICU, and that implies that Boost.Text should impose the same license-notification obligations on the user that ICU does on Boost.Locale. That is, none. If Boost.Text users are violating the Unicode license by not showing it to their users, Boost.Locale has been in violation as well for a decade or so. It may be that I'm wrong about showing the Unicode license in Boost.Text, and it does not need to. I did it "just in case". Zach

Peter Dimov

11:31 p.m.

Zach Laine wrote:

...

Of course, and I don't think Boost.Text actually imposes any such requirement. Boost.Locale also uses ICU headers and links to ICU, and neither shows the Unicode copyright, nor requires its users to do so.

Boost.Text will have some code derived from ICU, and that implies that Boost.Text should impose the same license-notification obligations on the user that ICU does on Boost.Locale. That is, none.

This is one of those things for which lawyers say "this is an interesting position, and I'd be glad to try it in court for you and see how it goes." That is, while you could argue that, it's not really clear cut. Boost.Locale does not distribute any ICU code, and the idea that using API headers constitutes creating a derived work is an interesting proposition, one that we should certainly not hope to be true. Is it really necessary to use the original ICU files?

Darryl Green

20 Jun

6:58 a.m.

On Sat, 20 Jun 2020 at 09:32, Peter Dimov via Boost <boost@lists.boost.org> wrote:

...

Zach Laine wrote:

...
Of course, and I don't think Boost.Text actually imposes any such requirement. Boost.Locale also uses ICU headers and links to ICU, and neither shows the Unicode copyright, nor requires its users to do so.

But Boost.Locale merely depends on ICU being installed. It doesn't include any ICU code.

...

...
Boost.Text will have some code derived from ICU, and that implies that Boost.Text should impose the same license-notification obligations on the user that ICU does on Boost.Locale. That is, none.

This is one of those things for which lawyers say "this is an interesting position, and I'd be glad to try it in court for you and see how it goes."

IANAL. But I also don't see how one can extrapolate from a case where headers and a library binary are obtained/built/installed by a user and then used to resolve a Boost.Locale dependency to one where some of that library's "source files" (not to be specific about file/content type/semantics) are used to produce files that form part of the Boost.Text source distribution. Note this is based on my rough understanding of what Boost.Text is actually doing/depending on from unicode.org / ICU. Please correct me if I am wrong.

...

<snip>

...

Is it really necessary to use the original ICU files?

A different set of questions:

1. Does a Boost.Text library user (not author) actually need any ICU files when using Boost.Text?
2. Is the use of ICU files by Boost.Text limited to the generation (by the author/committer, not an end user) of C++ static definitions from ICU .txt file definitions / data?
3. Is there code in Boost.Text itself that was copied and modified from ICU sources?

I had been (in another thread) assuming the answers to the above were No, Yes, No. Had that been the case I would have argued (and still will, in the context of that part of the usage) that the use of ICU data (in .txt files) was by Zach to extract values from that published data and use them to produce C++. This extraction of data (effectively, parts of the Unicode spec - one could have extracted/obtained it other ways) has been automated by Zach, but that doesn't mean that the output of the automated process (the generator program/template being a creative work by Zach, not ICU) is necessarily a derived work. This depends on whether the source data used was, by itself, without the rest of the ICU code etc., a creative work vs. just a list of values/information, i.e. "just data".

But question 3 is more problematic - I think the answer is actually Yes. Some headers in https://github.com/tzlaine/text/tree/master/include/boost/text/detail/icu seem to be modified from ICU headers. Some of them have quite complex derivation/licence pre-history - as well as significant modification by Zach. This is more of a problem than any use of original ICU files imho because they can't be replaced with references to the ICU files (vs including them in Boost).

Possible(?) solutions are:

1. I am wrong (let's hope).
2. A "clean-room" re-implementation of these headers without use/reference to ICU code.
3. Modify Boost.Text detail to be an adapter between the unmodified ICU code/interface and Boost.Text (I have no idea if that is practical).
4. Give in and include ICU and unicode.org licence terms etc. as/where required (I realise this won't happen and is a very messy proposition - ICU shows some of the issues with not having a policy like Boost's).

Only 1 and 2 avoid requiring that users install ICU (or at least ICU headers)...

Perhaps I am wrong about the answers to the questions - but I think they do need answering/resolving.

Andrey Semashev

8:09 a.m.

On 2020-06-20 01:58, Zach Laine via Boost wrote:

...

On Wed, Jun 17, 2020 at 11:31 AM Peter Dimov via Boost <boost@lists.boost.org> wrote:

...
Phil Endecott wrote:

...
If users of Boost.Text need to include that in their products' smallprint, your docs should make that clear.

A Boost library is not allowed to mandate this from users, regardless of whether its docs make it clear. Such a requirement is a hard blocker, a library like that can't be distributed in a Boost release. Or is not supposed to be, anyway.

Of course, and I don't think Boost.Text actually imposes any such requirement. Boost.Locale also uses ICU headers and links to ICU, and neither shows the Unicode copyright, nor requires its users to do so.

Boost.Locale is different in two respects:

1. The dependency on ICU is optional. Users can use Boost.Locale with other backends, and I believe one of these backends doesn't add any dependencies but the standard C++ library.

2. Even if you use ICU backend, Boost.Locale code is still covered by BSL. Whatever the additional requirements on the user, they are coming from ICU, not Boost, and per p.1 the user is in control of this.

If Boost.Text offered the same kind of flexibility then this would be much less of an issue.

On this note I'm going to ask something probably obvious, but still:

1. Why do you have to embed ICU code or data in Boost.Text? Why can't you link with it?

2. Is it possible to implement the ICU part locally, in Boost.Text, and cover that implementation with BSL?

Zach Laine

9:35 p.m.

On Sat, Jun 20, 2020 at 3:10 AM Andrey Semashev via Boost <boost@lists.boost.org> wrote:

...

On 2020-06-20 01:58, Zach Laine via Boost wrote:

...
On Wed, Jun 17, 2020 at 11:31 AM Peter Dimov via Boost <boost@lists.boost.org> wrote:

...
Phil Endecott wrote:

...
If users of Boost.Text need to include that in their products' smallprint, your docs should make that clear.

A Boost library is not allowed to mandate this from users, regardless of whether its docs make it clear. Such a requirement is a hard blocker, a library like that can't be distributed in a Boost release. Or is not supposed to be, anyway.

Of course, and I don't think Boost.Text actually imposes any such requirement. Boost.Locale also uses ICU headers and links to ICU, and neither shows the Unicode copyright, nor requires its users to do so.

Boost.Locale is different in two respects:

1. The dependency on ICU is optional. Users can use Boost.Locale with other backends, and I believe one of these backends doesn't add any dependencies but the standard C++ library.

2. Even if you use ICU backend, Boost.Locale code is still covered by BSL. Whatever the additional requirements on the user, they are coming from ICU, not Boost, and per p.1 the user is in control of this.

If Boost.Text offered the same kind of flexibility then this would be much less of an issue.

On this note I'm going to ask something probably obvious, but still:

1. Why do you have to embed ICU code or data in Boost.Text? Why can't you link with it?

Because that implies a 20% slowdown (more, in cases where you're not using the exact pointer types used by ICU), and of course a dependency on ICU.

...

2. Is it possible to implement the ICU part locally, in Boost.Text, and cover that implementation with BSL?

I don't know. I think that requires a "clean room reimplementation", or it's legally no different in many places. Zach

Zach Laine

9:40 p.m.

On Sat, Jun 20, 2020 at 4:35 PM Zach Laine <whatwasthataddress@gmail.com> wrote:

...

On Sat, Jun 20, 2020 at 3:10 AM Andrey Semashev via Boost <boost@lists.boost.org> wrote:

...
On this note I'm going to ask something probably obvious, but still:

1. Why do you have to embed ICU code or data in Boost.Text? Why can't you link with it?

Because that implies a 20% slowdown (more, in cases where you're not using the exact pointer types used by ICU), and of course a dependency on ICU.

To be clear, I think this is the least bad option on the table at the moment. I'm not happy with losing that extra perf, but I also think it's more important to get the interfaces into the hands of users, as this library is explicitly targeting standardization. I honestly had not considered the fact that including ICU code under the Unicode license in a Boost distribution of course requires all users to review the Unicode license in addition to the Boost one. I agree that this is not a reasonable way forward. Zach

15 comments, 8 participants:

Andrey Semashev
Darryl Green
degski
Hans Dembinski
Niall Douglas
Peter Dimov
Phil Endecott
Zach Laine