On 8/01/2020 14:43, Peter Dimov wrote:
Yes, concatenating two character sequences can result in technically invalid WTF-8. But that's not an issue unique to Windows. You can do the same on any non-Windows platform. It's still not clear how this prevents a `path` class from storing ~WTF-8 on Windows, or exposing a char-based API that ~WTF-8 decodes when passing to Windows, and encodes on the reverse trip.
It could. And if you're only round-tripping it to file APIs and doing nothing else, then you can probably get away with that. But there's probably other code that wants to manipulate the path (swapping extensions, passing it to some UI, truncating the filename to 10 characters, etc). Now there are more parts of the system that need to know you have data that is not legal WTF-8, and how to deal with that. (Or more likely you end up passing it to something that expects legal UTF-8 without telling it otherwise, and it mostly works -- until it doesn't.)
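
To make the concatenation case concrete, here is a rough sketch of how two strings that are each fine as WTF-8 can join into something that is not. The helper names (`encode_surrogate`, `has_encoded_surrogate_pair`, `demo`) are made up for illustration and are not part of any `path` API. The sketch assumes the usual WTF-8 rule that an unpaired surrogate is stored as a three-byte sequence, and that a high surrogate immediately followed by a low surrogate must never appear in that form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one UTF-16 surrogate code unit (0xD800..0xDFFF)
// as the three-byte generalized UTF-8 sequence WTF-8 uses for lone surrogates.
inline std::string encode_surrogate(std::uint16_t s) {
    std::string out;
    out += static_cast<char>(0xE0 | (s >> 12));
    out += static_cast<char>(0x80 | ((s >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (s & 0x3F));
    return out;
}

// Hypothetical helper: true if the bytes contain an encoded high surrogate
// (ED A0..AF xx) directly followed by an encoded low surrogate (ED B0..BF xx).
// Assumes the input is otherwise well-formed generalized UTF-8.
inline bool has_encoded_surrogate_pair(const std::string& s) {
    for (std::size_t i = 0; i + 6 <= s.size(); ++i) {
        auto b = [&](std::size_t k) {
            return static_cast<unsigned char>(s[i + k]);
        };
        if (b(0) == 0xED && (b(1) & 0xF0) == 0xA0 &&
            b(3) == 0xED && (b(4) & 0xF0) == 0xB0) {
            return true;
        }
    }
    return false;
}

// Each half passes the check on its own; the concatenation does not.
inline void demo() {
    const std::string hi = encode_surrogate(0xD83D);  // bytes ED A0 BD
    const std::string lo = encode_surrogate(0xDE00);  // bytes ED B8 80
    assert(!has_encoded_surrogate_pair(hi));
    assert(!has_encoded_surrogate_pair(lo));
    assert(has_encoded_surrogate_pair(hi + lo));
}
```

Any code that appends, truncates, or splices such strings would need a check like this (or a re-normalisation step) after every operation, which is the extra knowledge I'm worried about leaking into the rest of the system.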