On 21/06/2017 08:13, Artyom Beilis wrote:
1. Conversion will always produce **valid UTF-8/UTF-16** regardless of the validity of the source, unlike the current behavior, which returns an error or sets an error status.
2. Instead of failing the conversion and returning an error, invalid characters will be replaced with U+FFFD (REPLACEMENT CHARACTER), similar to the behavior of WinAPI.
So you will not get an Invalid UTF-16 -> quasi-UTF-8 -> Invalid UTF-16 path, but you will be able to complete the path as: Invalid UTF-16 -> Valid UTF-8 with substitutions -> Valid UTF-16
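As a rough illustration of the substitution behavior described in the two points above, here is a minimal Python sketch. It is not the library's implementation; the helper name `to_valid_utf8` and the sample bytes are made up for the example, and Python's built-in `errors="replace"` handler stands in for whatever replacement policy the library ends up using.

```python
def to_valid_utf8(raw: bytes) -> bytes:
    """Return valid UTF-8, replacing malformed input with U+FFFD.

    Hypothetical helper: relies on Python's "replace" error handler,
    which substitutes U+FFFD for bytes that cannot be decoded.
    """
    text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")


# 0xFF can never appear in well-formed UTF-8.
raw = b"report-\xff.txt"
fixed = to_valid_utf8(raw)

# The result always decodes strictly, so downstream code never sees an error.
fixed.decode("utf-8")
print(fixed)
```

The conversion never fails, but the output no longer carries enough information to recover the original bytes.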
Isn't the problem case the one where you get an arbitrary block of bytes (UTF-8-ish on POSIX, UTF-16-ish on Windows) as a filename from some other API (e.g. readdir), convert it to real UTF-8 for internal use (e.g. manipulation, display), and then go back to the OS to actually use that filename, only to get an unexpected "file not found" because it didn't round-trip? I don't know of a good solution for this other than never converting any paths and always working in the native encoding of the OS, though.
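To make the round-trip concern concrete, here is a small Python sketch. The filename bytes and the function names are invented for the example. It compares a lossy conversion using U+FFFD with Python's `surrogateescape` handler (PEP 383), which is one existing way of keeping undecodable bytes recoverable. I am not suggesting it is the right answer here, only showing the trade-off.

```python
def lossy_round_trip(raw: bytes) -> bytes:
    """Convert to valid UTF-8 with U+FFFD substitution, then back to bytes."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


def lossless_round_trip(raw: bytes) -> bytes:
    """Smuggle undecodable bytes through as lone surrogates (PEP 383)."""
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


# A Latin-1 encoded name as readdir might return it; 0xE9 is not valid UTF-8 here.
raw_name = b"caf\xe9.txt"

# The substituted name differs from what is on disk, so opening it would fail.
print(lossy_round_trip(raw_name) == raw_name)     # False

# The escaped form round-trips, at the cost of not being valid Unicode.
print(lossless_round_trip(raw_name) == raw_name)  # True

escaped = raw_name.decode("utf-8", errors="surrogateescape")
try:
    escaped.encode("utf-8")  # a strict encoder rejects the lone surrogate
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print(strict_ok)                                   # False
```

So the choice seems to be between an internal string that is always valid but may name a different file, and one that round-trips but has to be kept away from anything expecting strict UTF-8.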