On Mon, Aug 24, 2020 at 8:30 AM Phil Endecott via Boost wrote:
Zach Laine wrote:
On Sun, Aug 23, 2020 at 11:08 AM Phil Endecott via Boost wrote:
Could you please explain what you've done about the copyright issues?
Sure. I've reimplemented the code that originally came from ICU, and ...
As far as I can tell, you still depend on the Unicode data files that have a Boost-incompatible licence. You previously included this Unicode copyright text in the documentation but that page has now been removed, if I'm looking in the right place.
... removed the ICU copyright from these files. They are the output of a code generation tool, and so are not copyrightable individually (like the output of lex and yacc).
For the benefit of everyone else let me describe what Zach does:
1. There are some files at unicode.org that have a Boost-incompatible licence.
2. Zach has some Python scripts at https://github.com/tzlaine/text/tree/master/scripts
3. The scripts download the files from unicode.org, convert them into C++ source files, and prefix the result "(C) Zach Laine Boost License".
4. These generated files are checked in at https://github.com/tzlaine/text/tree/master/include/boost/text/data
The intention is not that end-users of Boost.Text will run the scripts, but rather that the generated files will be included in the Boost source distribution.
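For readers who have not seen this kind of pipeline, the conversion in step 3 can be sketched roughly as follows. This is a hypothetical illustration only, not the actual scripts in the repository: the names `parse_ccc`, `generate_cpp`, `ccc_entry` and `ccc_table` are invented here, the download step is replaced by two inline sample lines in the UnicodeData.txt field layout, and the header comment is a placeholder.

```python
# Hypothetical sketch of a "data file -> C++ table" generator.
# Not the real Boost.Text scripts; all names here are assumptions.

# Two sample lines in the semicolon-separated UnicodeData.txt layout
# (field 0 = code point in hex, field 3 = canonical combining class).
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0301;COMBINING ACUTE ACCENT;Mn;230;NSM;;;;;N;NON-SPACING ACUTE;;;;
"""

# Placeholder for whatever notice the generator prefixes to its output.
HEADER = "// Generated file; do not edit. License notice would go here.\n"


def parse_ccc(text):
    """Yield (code_point, combining_class) for entries with a nonzero class."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(";")
        code_point = int(fields[0], 16)
        combining_class = int(fields[3])
        if combining_class != 0:
            yield code_point, combining_class


def generate_cpp(text, array_name="ccc_table"):
    """Return C++ source text containing the parsed data as a constexpr array."""
    lines = [
        HEADER,
        "#include <cstdint>",
        "",
        "struct ccc_entry { std::uint32_t cp; std::uint8_t ccc; };",
        "",
        f"constexpr ccc_entry {array_name}[] = {{",
    ]
    lines += [f"    {{0x{cp:04X}, {ccc}}}," for cp, ccc in parse_ccc(text)]
    lines.append("};")
    return "\n".join(lines) + "\n"


cpp_source = generate_cpp(SAMPLE)
print(cpp_source)
```

In a sketch like this, every entry in the emitted array is a mechanical transformation of a line in the input file; only the surrounding scaffolding (the struct, the array declaration, the header comment) originates in the script.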
Zach thinks this is OK because "they are the output of a code generation tool, and so are not copyrightable individually (like the output of lex and yacc)".
I think that's completely wrong. I believe it's a well-established principle of software copyright law that the output of a tool - whether that is g++, bison, or rot13 - is a derived work of the input to that tool. You cannot (without permission) take example.y that's (C) Megacorp, run bison on it, and claim that the resulting example.tab.c is now (C) Someone Else.
This worries me. We really, really don't want to be shipping code that has copyright violations!
Agreed, though I don't think this is an instance of one. If this is a copyright violation, we have been in violation for years and years already. Look in

boost/spirit/home/support/char_encoding/unicode/UnicodeData.txt
boost/spirit/home/support/char_encoding/unicode/DerivedCoreProperties.txt

and the other files in that directory. Note that these are in the header paths, not inside src/ or something. DerivedCoreProperties.txt even has the Unicode copyright still on it. Moreover, the data in the files in that directory is derived from the .txt files. Even though the .txt files appear to have been removed on November 19, we end up with the same problem, to the extent it is a problem, that Phil raises -- distribution of code derived from non-code .txt files.

Zach