[Boost.Asio] Segmentation fault when ::getaddrinfo() returns EAI_SYSTEM without system error code set (errno = 0) - is this a bug?
Hello Boost users,

I experienced a segmentation fault inside Boost in the following scenario:

o There was an illegal DNS entry
o When resolving that illegal DNS entry on a Yocto-based embedded Linux distribution, ::getaddrinfo() returned with the error code EAI_SYSTEM but errno was set to 0
o The function translate_addrinfo_error(int) in boost/asio/detail/impl/socket_ops.ipp translates this into boost::system::error_code(0, boost::asio::error::get_system_category())
o This is then interpreted as a success by the calling function and the other output parameters of ::getaddrinfo() are accessed which resulted in the segmentation fault

My question would be whether this should be considered a bug in the Boost ASIO library? One could argue that the system should not return EAI_SYSTEM with errno not being set properly. But one could also argue that returning EAI_SYSTEM indicates that something went wrong and that the output parameters are not to be accessed.

Nils
------
Nils Frielinghaus
Can you please post a small, self-contained program that reproduces the segfault?
I understand that it would be very desirable to have a small self-contained program reproducing it, but I am struggling a bit to provide this, as the issue occurs in the interaction with the OS and the DNS.
I am currently unsure how to wrap this into a simple reproducer.
My test program looks something like this, with the DNS having an incorrect entry for "incorrectdnsentry.test.net" mapping to "dnsentry.test.net/" (the slash is not legal here):
This set-up will produce a segfault on a Yocto-based Linux distribution.
#include <iostream>
#include
What can be easily demonstrated, though, is that the translate_addrinfo_error function converts EAI_SYSTEM with errno=0 (which is triggered by the above set-up) into a Boost error code that shows success.
So this program will output "Success".
This is a long standing design quirk of error_code which we have fixed in proposed status_code, which is hoped to supersede error_code in a future C++ standard. Niall
On Tue, 24 Aug 2021 at 16:36, Niall Douglas via Boost-users wrote:
> On 24/08/2021 13:57, Nils Frielinghaus via Boost-users wrote:
>> What can be easily demonstrated though, is, that the translate_addrinfo_error function converts EAI_SYSTEM with errno=0 (which is triggered by the above set-up) into a Boost error code that shows success. So this program will output "Success".
>
> This is a long standing design quirk of error_code which we have fixed in proposed status_code, which is hoped to supersede error_code in a future C++ standard.
I think you are talking about a different problem. The issue here is that https://pubs.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html says "[EAI_SYSTEM] A system error occurred; the error code can be found in errno." and apparently there is an implementation not honouring this contract. This is happening way before there is any error_code.
On 24/08/2021 14:47, Cristian Morales Vega wrote:
> I think you are talking about a different problem. The issue here is that https://pubs.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html says "[EAI_SYSTEM] A system error occurred; the error code can be found in errno." and apparently there is an implementation not honouring this contract. This is happening way before there is any error_code.
In proposed status_code we supply a getaddrinfo_code, so getaddrinfo codes get their own status code domain. EAI_SYSTEM compares equal to errc::resource_unavailable_try_again:
https://github.com/ned14/status-code/blob/master/include/getaddrinfo_code.hp...
One can't make a functional getaddrinfo_code for error_code because one of the getaddrinfo() code values is zero, which error_code treats as success. This is the quirk which we fixed.
Niall
On Tue, 24 Aug 2021 at 17:25, Niall Douglas via Boost-users wrote:
> In proposed status_code we supply a getaddrinfo_code, so getaddrinfo codes get their own status code domain. EAI_SYSTEM compares equal to errc::resource_unavailable_try_again:
Not sure how I feel about that. I also have just noticed that while POSIX says "[EAI_SYSTEM] A system error occurred; the error code can be found in errno.", which IMHO clearly says errno must be != 0, https://man7.org/linux/man-pages/man3/getaddrinfo.3.html says "EAI_SYSTEM Other system error, check errno for details.". I guess the latter could accept errno == 0, meaning "no details".
On 24/08/2021 16:15, Cristian Morales Vega wrote:
> Not sure how I feel about that.
>
> I also have just noticed that while POSIX says "[EAI_SYSTEM] A system error occurred; the error code can be found in errno.", which IMHO clearly says errno must be != 0, https://man7.org/linux/man-pages/man3/getaddrinfo.3.html says "EAI_SYSTEM Other system error, check errno for details.". I guess the latter could accept errno == 0, meaning "no details".
Ah, nice bug catch. I've logged it to https://github.com/ned14/status-code/issues/34. Once fixed, we'll also carry errno as payload, and if getaddrinfo returned EAI_SYSTEM, then getaddrinfo_code will present its errno for comparison instead (status_code allows arbitrary payloads, so this is easy).
Niall
Earlier in the cited document https://pubs.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html it reads "A zero return value for getaddrinfo() indicates successful completion; a non-zero return value indicates failure." This suggests to me that it is not correct to consider the return value EAI_SYSTEM with errno = 0 as a successful return value and that it would be better to handle this as a system error with unidentified cause.
On Wed, 25 Aug 2021 at 17:25, Nils Frielinghaus via Boost-users wrote:
> This suggests to me that it is not correct to consider the return value EAI_SYSTEM with errno = 0 as a successful return value and that it would be better to handle this as a system error with unidentified cause.
Yes, treating it as a success would be wrong. The question is whether it should be treated as:
- A valid error
- A post-condition violation in getaddrinfo
Who would be willing or entitled to make this determination?
On Mon, 9 Aug 2021 at 16:42, Nils Frielinghaus via Boost-users wrote:
> But one could also argue that returning EAI_SYSTEM indicates that something went wrong and that the output parameters are not to be accessed.
Sure, but... what would you do instead? You could try to report some kind of error. But without evidence of this being a very common issue I would not bother; at most I would put an assertion in there... crashing anyway. By ASIO handling this, returning a real error, it would be passing the problem to the users of ASIO, who now need to handle it. IMHO the issue in that C library (NSS plugin?) simply needs to be fixed.
participants (4)
- Cristian Morales Vega
- Niall Douglas
- Nils Frielinghaus
- Vinnie Falco