Le 27/05/2017 à 01:02, Andrzej Krzemienski via Boost a écrit :
Hi Everyone, In order to test Boost.Outcome, I applied it to the problem I had to solve at work a while ago. For people who do not know what Boost.Outcome is for, this could be an introduction. For me it can be an opportunity to obtain feedback from Niall and Vicente, If I am using `expected` correctly, and they can get some feedback from me.
Thanks Andrzej for working on this. Having someone else that use the library in concrete cases is always a good thing.
Here is the task. I get an std::string as input: it is a data structure encoded in ASCII text:
"1-10|11-20|21-99;1-10|10-50|51-99"
Semicolons separate "distributions". Within distribution, vertical bars separate "ranges". Within ranges, dashes separarate integer numbers. There are some additional constrains. like ranges within distribution cannot overlap, no gaps between ranges are allowed, and a couple more.
I need to validate if the input matches the required format, and build a corresponding data structure. But the vaidation cannot be just binary valid-invalied answer: I want to give user a description of what exactly is wrong with her input, as detailed as possible.
So, I have function parse, that internally calls parse_distribution, which in turn calls parse_range. Each of them can return different parsing error that needs to be propagated up.
My original implementation used "out" function parameters for signalling errors. Code was unnecessarily long and it was easy to forget to check results from functions, but I didn't want to use exceptions to indicate validation failures, because there is nothin exceptional about it. You expect input validation to fail -- quite often.
This is a typical use case for the Parser Monad.
Now, this is a more philosophic question: when to use (or not) exceptions. One citerion I sometimes use is, "do I want my debugger to automatically break if this function reports faiure?"
So I didn't use exceptions for input validation, but the ugliness of the code forced me to go back to rewrite it with exceptions. The code was shorter, and "saer" in the sense that I could not forget ot pass the failure up the stack. I understand your concern.
Now with Boost.Outcome I have a third option. It is almost as concise as exceptions, but does not use exceptions.
Here is my implementation:
https://github.com/akrzemi1/__sandbox__/blob/master/outcome_practical_exampl...
Note a couple of things. I am creating a number of temporay strings under way, they all can throw bad_alloc. I am fine with that. I am not affraid of throwing an exception, it is only the validation failures that I want to report with `expected`. For other failures I want exceptions.
I like the separation of concerns here. You expects to have some input errors, but not be out of memory.
This is input validation. It does not have to be super fast.
In the public API of this file, I will not be exposing the information about parsing failues. `process` will consume the error information and just output something. So, the choice of `E` in `expected
` is internal to a file, and I can choose something specific to my use case. I provid my own type, which encodes different erroneous situations in an enum, and two ints for additional details. ``` struct BadInput { enum { no_distrib = 1, range_count_not_3, distrib_starts_with_no_1, distrib_ends_with_no_99, gap_between_ranges, overlap_between_ranges, inverted_range, no_2_numbers_in_range, not_a_number, } type; int val1; int val2; }; ``` It is quite similar to Outcome's error_code_extended.
Now, I also wanted to have an std::string member inside, but `expected` forbids it: It says as per LWG requirements `E` in `expected
` must not kave a throwing copy constructor. Is that right? In my use case it wouldn't harm anyone: if copying of `expected` throws, entire function `process` is exited, with a bad_alloc, which is fine. No `expected` with partial state will be ever observed. Do we need this restriction?
The question is what exception safety do you want. If we want that original is not changed after a exception while doing an assignment, we have no choice. We could get it with double-buffering, but I don't want to go this way now. How many would like an additional expected that use double_buffering to support all the operations and the never-empty warranties? However, we could support types that throw but we will not support assignment and emplace. I don't see any problem with copy/move constructors. I'll check the wording so that we restrict only when we use the operations that need the restriction. Niall, could you check from your side what operations need the nothrow contraints? Note that we can ensure almost the never-empty-warranty for optional_expected (if we consider empty_t as an alternative).
Also, at some point I will be returning an `expected
`, which makes me uneasy. Returning vectors wrapped in something: will I get runtime penalty for it? So, I naturally arrived at a pattern that Niall described in the other thread: ``` out::expected
parse (...) { out::expected ans {out::value}; vector<T> & vec = ans.value(); // will if statement be ellided?} // fill vec
return ans; } ```
(BTW, this constant out::value is not documented in the docs, unlike out::value_t.)
It has been mentioned a number of times that an optimizing compiler can remove a redundant if-statement. I know that the if-statement inside `value()` is redundant, but in this particular case, will compiler know it to? Nobody has called `has_value()` before.
You don't need. Use `*ans`. You know hat you have a value.
Is this
out::expected
Also, the following piece of code illustrates how concise the implementation of a single function can be:
``` // I use a shortet alias for macro BOOST_OUTCOME_TRYX. // It is not portable, though. It only works on Clang and GCC. #define TRY(ex) BOOST_OUTCOME_TRYX(ex)
out::expected
parse_range (const std::string& input) { tokenizer numbers_s {input, sep_3}; auto it = numbers_s.begin(), itEnd = numbers_s.end(); int count = std::distance(it, itEnd); if (count != 2) return out::make_unexpected(BadInput{BadInput::no_2_numbers_in_range, count});
Range r {TRY(parse_int(*it++)), TRY(parse_int(*it++))};
Humm, side effects inside an expression :(. UB? Better to define local variables and ensure sequential parsing.
if (r.lo > r.hi) return out::make_unexpected(BadInput{BadInput::inverted_range, r.lo, r.hi});
return r; } ```
It is almost as short as if I used exceptions, except for the TRY macro. Note in particular this line:
``` Range r {TRY(parse_int(*it++)), TRY(parse_int(*it++))}; ```
`parse_int()` returns `out::expected
`.Now, `TRY(parse_int(...))` is an expression and means, "if parse_int() failed immedaitely return function `parse_range` with the same BadInput, otherwise use its `value()` to initialize `r`. What do you think. Am I using the library correctly?
With some minor exception I've signaled, yes, I believe it is a good example. It will be worth comparing with the code that use a parser monad approach. But this is out of the scope of this thread. I will see with you in private. Best, Vicente