updated version of safe integer library
I've updated my safe integer library. Pointers to the most recent documentation and code can be found at www.blincubator.com . Or one can use
documentation: http://htmlpreview.github.io/?https://github.com/robertramey/safe_numerics/m...
repository: https://github.com/robertramey/safe_numerics
Since the last version I've:
- added more tests
- added more examples and explanations of them
- added facilities to enforce a guarantee of zero runtime penalty
- added a new section to the documentation covering the above
- added more documentation regarding the library implementation
- and other stuff.
There is still a little more to do. But there is nothing that I know about that would prevent users from using this library right now to implement guaranteed correct integer arithmetic in their applications. Of course I'm interested in any feedback and/or observations anyone wants to offer. Robert Ramey
On 22 Dec 2015, at 17:09, Robert Ramey
wrote: I've updated my safe integer library. Pointers to the most recent documentation and code can be found at www.blincubator.com . Or one can use
documentation http://htmlpreview.github.io/?https://github.com/robertramey/safe_numerics/m...
Just a quick one to say that in the "problem" section of the introduction there is a typo. You wrote INT_MIN - x when you meant INT_MIN - y. Which rather makes your point that these tests are difficult and error-prone when done manually! Regards, Pete
On 12/22/15 2:50 PM, Pete Bartlett wrote:
On 22 Dec 2015, at 17:09, Robert Ramey
wrote: I've updated my safe integer library. Pointers to the most recent documentation and code can be found at www.blincubator.com . Or one can use
documentation http://htmlpreview.github.io/?https://github.com/robertramey/safe_numerics/m...
Just a quick one to say that in the "problem" section of the introduction there is a typo. You wrote INT_MIN - x when you meant INT_MIN - y.
Which rather makes your point that these tests are difficult and error-prone when done manually!
Very funny. Robert Ramey
On 12/22/2015 9:09 AM, Robert Ramey wrote:
Of course I'm interested any feedback and/or observations anyone want's to offer.
Hey there,
First let me say that I haven't tried out your library yet, but from what I've read I think it's great.
So I stumbled upon the comments in your default constructor:

// default constructor
constexpr explicit safe_base() {
    // this permits creation of invalid instances. This is in line
    // with C++ built-ins but violates the premises of the whole library.
    // choices are:
    // do nothing - violates the premise of the library that all safe
    //   objects are valid
    // initialize to a valid value - violates C++ behavior of types
    // add an "initialized" flag - preserves/fixes the above, but
    //   doubles "overhead"
    // still pending on this.
}

We've been having this debate over in the "[smart_ptr] Interest in the missing smart pointer (that can target the stack)" thread. It seems to have become a big thread; the relevant posts are near the bottom. I think we ultimately decided that both options should be provided. So for example safe<int> might have default initialization and almost_safe<int> might not. And even in the case where there is default initialization, there's the question of whether you should still require that the value be explicitly set before the first use. (This can be checked for and enforced in debug mode.)
While writing some security-conscious applications I've been incidentally writing a small library of safer substitutes for some C++ data types (https://github.com/duneroadrunner/SaferCPlusPlus). Among the substitutes it includes are those for int and size_t. Rather than doing comprehensive range checking like your library does, my types do range checking only where I have the feeling it's particularly important and worth the performance cost. That's basically just when converting to different integer types, and in operator-=() for size_t. Would your library support limiting its range checks for performance reasons?
I guess the goal of my library is roughly to enable C++ programmers to, if they choose, write their code with "language safety" approaching that of Java's. But your library suggests that C++ could maybe even surpass Java as the choice for safe, secure and correct applications. So I wonder about the motivating cases for your library. I mean, was it designed to address specific use cases, or is it meant to be used more generally? I think your library is so novel to me, I don't have a good grasp on the range of application types that might end up using it. I might guess though, that some of the applications using your library to ensure against out of range arithmetic results might also benefit from using something like my library to ensure against, for example, invalid memory access or out of range vector element access. I guess what I'm wondering is, whether you think your library is an intrinsically independent one addressing a specific domain, or ultimately should be part of a larger set of tools for facilitating safety/security/correctness in C++? Noah
On 2/3/16 12:26 PM, Noah wrote:
On 12/22/2015 9:09 AM, Robert Ramey wrote:
Of course I'm interested any feedback and/or observations anyone want's to offer.
Hey there,
First let me say that I haven't tried out your library yet, but from what I've read I think it's great.
So I stumbled upon the comments in your default constructor:
// default constructor
constexpr explicit safe_base() {
    // this permits creation of invalid instances. This is in line
    // with C++ built-ins but violates the premises of the whole library.
    // choices are:
    // do nothing - violates the premise of the library that all safe
    //   objects are valid
    // initialize to a valid value - violates C++ behavior of types
    // add an "initialized" flag - preserves/fixes the above, but
    //   doubles "overhead"
    // still pending on this.
}
We've been having this debate over in the "[smart_ptr] ...
This is a fundamental question. The library has multiple goals:
a) safety as a requirement that cannot be worked around
b) minimal runtime overhead
c) drop-in replacement for intrinsic integers
d) make the library idiot-simple to use
These goals conflict in some cases. In most of those I've been able to resolve the conflicts to my satisfaction. In many cases these conflicts are resolved by the use of a policy class which permits the library user to make the trade-off. But in the case of construction I couldn't really find a resolution within the bounds of the time that I was willing to spend on it. So I punted for later. I don't think it's insurmountable; it just requires more thought than first meets the eye. I'm not motivated to spend much more time on the library unless I get feedback indicating that people interested in it are actually using it.
So for example safe<int> might have default initialization and almost_safe<int> might not. And even in the case where there is default initialization, there's the question of whether you should still require that the value be explicitly set before the first use. (This can be checked for and enforced in debug mode.)
So you're appreciating the possibilities and the trade-offs. Right now I don't have a lot to add to the comment. Sometimes a little feedback from users is all that's needed to make the resolution obvious.
While writing some security conscious applications I've been incidentally writing a small library of safer substitutes for some C++ data types (https://github.com/duneroadrunner/SaferCPlusPlus). Among the substitutes it includes are those for int and size_t. Rather than doing comprehensive range checking like your library does, my types just do range checking only where I have the feeling it's particularly important and worth the performance cost. That's basically just when converting to different integer types, and in operator-=() for size_t.
It sounds like it's the same library. I just pursued it to the bitter end where TMP takes you. This makes it a lot more complex - but guarantees that every operation is checked if and only if this checking is necessary. It turns out that using ranged types can make runtime checking unnecessary in many or most cases. The documentation shows many examples where no runtime checking is necessary. In cases where it is necessary, small changes in user code can make runtime checking unnecessary - all this while preserving safety. There are also examples where one can use the library to trap at compile time any operation which could possibly require a runtime check. This points to minor changes (e.g. changing a datatype) so that one absolutely knows that no operation can fail at runtime - thereby making the inclusion of exception handling code unnecessary. It's all in the examples in the documentation.
Would your library support limiting it's range checks for performance reasons?
I'm not sure what you mean here. Range checks are only
included when it's possible to generate an error. So
safe
I guess the goal of my library is roughly to enable C++ programmers to, if they choose, write their code with "language safety" approaching that of Java's.
I think our libraries have the same goal. I think I just invested more time in mine.
But your library suggests that C++ could maybe even surpass Java as the choice for safe, secure and correct applications.
I don't know about Java nor what it provides. When using this library for one's integer types, every operation is guaranteed to do one of the following:
a) return an arithmetically correct result
b) trap with a compile time error
c) throw an exception
So I wonder about the motivating cases for your library. I mean, was it designed to address specific use cases, or is it meant to be used more generally?
The statement immediately above describes its scope. The documentation includes seven examples of situations where the library might be useful. I think your library is so novel to me, I don't have a good
grasp on the range of application types that might end up using it.
Again - the scope is very general. It's intended as a drop-in replacement for any integer type in any program which:
a) must be demonstrably correct - this excludes the usage of unchecked operations
b) must detect every user error - this requires exhaustive checking of things like user input, copies, implicit conversions, etc.
c) must be as efficient as possible subject to the constraints above - this excludes interpreter-like solutions, most of which aren't safe anyway
d) must have readable code which can be verified to execute the desired algorithm - this excludes subroutine libraries with functions like add(int i, int j), etc., which require re-writing the whole program in terms of these special functions rather than C/C++ arithmetic expressions. Manually "compiling" C/C++ expressions in terms of such functions is tedious, time-consuming and error-prone. Yet this is currently the only recommendation made by experts in writing safety-critical code.
I might guess though, that some of the applications using your library to ensure against out of range arithmetic results might also benefit from using something like my library to ensure against,
From what you've said, I think the libraries address the same problem. I just think I spent more time on mine. I didn't do it on someone else's payroll, so I could afford (or not) to spend whatever it takes to take it to its logical end point.
for example, invalid memory access or
This is not addressed by the safe_integer library.
out of range vector element access.
There is an example which illustrates this.
I guess what I'm wondering is, whether you think your library is an intrinsically independent one addressing a specific domain, or ultimately should be part of a larger set of tools for facilitating safety/security/correctness in C++?
Everything can be part of a larger set of tools. Writing safe, efficient, guaranteed-correct programs requires additional tools besides safe integer:
a) safe float - this project is underway, but I haven't heard much about its progress lately.
b) dimensions and units - uses the type system to enforce dimensional analysis and correct conversion between unit systems. This is indispensable for writing correct engineering/scientific programs but almost never used. Boost has a good library for doing this but, unfortunately, the documentation is so opaque as to make the library all but unusable.
c) thread, exception and memory allocation safety - these problems are already well addressed by the C++ compiler and libraries.
C++ is suited like no other language to write these tools.
But there is one more thing ... Writing correct code is not considered a major problem by most programmers and organizations which depend on code. Code that works most of the time is considered good enough. In spite of current problems like unintended acceleration, failures in ABS systems, blown missile launches, and crashing Mars probes - this is not considered a problem. Two papers were submitted to CppCon 2015 on safe integers (one by me and one on bounded integers - a similar topic) and neither was considered interesting to the reviewers or to potential attendees. I suspect they are right - and that's a problem - a huge problem. This problem affects all computer languages. There has never been an industrial-strength solution to this problem - until now. And the response has been .... I've requested twice that this library be added to the Boost review queue - no action. I've requested twice that a simpler version of this library, formulated as a proposal, be added to the list of standard library proposals - again, no action. It's just not important to most of the world. I'm not bitter (though it might seem that I am - I'm really not). I'm just disgusted (which is an entirely different thing).
OK - I've had my fun, I'll let you go now. Robert Ramey
Writing correct code is not considered a major problem by most programmers and organizations which depend on code. Code that works most of the time is considered good enough.
Well, good enough is perfect in most projects. However, I do like simple drop-in wrappers that prevent stupid mistakes from ever compiling, or that abort on overflow. About the construction topic, I think that a good compromise could be to choose a safe default (always initialize to 0) but to allow one to be explicitly unsafe:

safe<int> i;                       // i == 0
safe<int> j(boost::uninitialized); // undefined

It happens that the developer knows that initialization will be done later, or has already been done (mapped memory, for example). Cheers,
On 2/3/16 4:00 PM, Raphaël Londeix wrote:
Writing correct code is not considered a major problem by most programmers and organizations which depend on code. Code that works most of the time is considered good enough.
Well, good enough is perfect in most projects. However, I do like simple drop-in wrappers that prevent stupid mistakes from ever compiling, or that abort on overflow.
About the construction topic, I think that a good compromise could be to choose a safe default (always initialize to 0) but to allow one to be explicitly unsafe:
safe<int> i;                       // i == 0
safe<int> j(boost::uninitialized); // undefined
It happens that the developer knows that initialization will be done later, or has already been done (mapped memory for example).
The problem with this is that the usage of safe<int> changes the meaning of the program. Example: one has the following program:

int i; // i not initialized
....   // program has weird behavior

In order to find the cause of the weird behavior, someone makes the following change:

safe<int> i; // i now initialized to 0
...          // program has no weird behavior

This means that usage of safe<int> hides errors - which is even worse than before. I'm actually most inclined to require an initialization. In this case the attempted fix would be:

safe<int> i; // compile time error
...          // program doesn't compile

so in order to use safe integer one is forced to be explicit:

safe<int> i = 0; // compiles
...              // program has no weird behavior

Then someone says - take out the safe stuff - it's slowing things down! (true or untrue doesn't matter). So the fix would be:

int i = 0; // i initialized to 0
...        // program has no weird behavior

The only problem is that someone is going to say: "Wait - I don't need initialization! It's non-optimal." He might be right, but I doubt it matters. But then safe integer isn't exactly equivalent to int any more - it has different behavior - and I hate it when this happens. So the best would be to include an initialization bit inside safe<int> ... uh-oh, another howl. Robert Ramey
On 2/3/2016 4:41 PM, Robert Ramey wrote:
The only problem is that someone is going to say: "Wait - I don't need initialization! It's non-optimal." He might be right, but I doubt it matters. But then safe integer isn't exactly equivalent to int any more - it has different behavior - and I hate it when this happens. So the best would be to include an initialization bit inside safe<int> ... uh-oh, another howl.
What's wrong with having an initialization bit in debug mode only? Are we worried about "hiding errors" outside of debug mode? Or are we worried about code that makes assumptions about the size of the int data type?
The problem with this is that the usage of safe<int> changes the meaning of the program.
int i; // i not initialized .... // program has weird behavior
In this particular case, the fact that the program was relying on UB means
that the program had no meaning at all. So it adds a meaning maybe :)
However I see your point ; you expect people to do something like:
#ifdef DEBUG
typedef safe<int> my_int;
#else
typedef int my_int;
#endif
But, you could instead market the following usage:
#ifdef DEBUG
typedef safe
On 2/4/16 1:08 AM, Raphaël Londeix wrote:
The problem with this is that the usage of safe<int> changes the meaning of the program.
int i; // i not initialized .... // program has weird behavior
In this particular case, the fact that the program was relying on UB means that the program had no meaning at all. So it adds a meaning maybe :)
However I see your point ; you expect people to do something like:
#ifdef DEBUG typedef safe<int> my_int; #else typedef int my_int; #endif
Note that the safe<T> is really safe
Note that the safe<T> is really safe
There are several policies of each kind to choose from. One of the exception policies is ignore - but I don't think it currently actually works.
Yes, I know. The second part of my response was more about the fact that you actually have some policies, rather than about their exact names. My point was that one of them (or a combination of them) does not incur any overhead.
What I actually expect is:
a) some weird bug can't be found.
b) in desperation some intern just replaces all the ints with safe<T>
c) the problem is discovered - the safe... stuff is backed out and everyone (who is old enough) has a beer to congratulate themselves on how smart they are, and the product is shipped.
d) in some cases, someone might unintentionally leave the safe stuff in - since shipping is already way overdue.
I can't help it but I find your view very depressing! Why not allow one to remove all the runtime checks? This would let people use your code with the guarantee that it is always possible to remove the overhead, if need be. Designing a library as a tool that you remove to ship your real product does not really help. I shamelessly quote myself:
IMHO, it makes no sense to design your library to allow users *not* using it.
e) Or maybe some crusty old geezer who is tired of fixing this stuff after doing it 50 times will game the system by doing:
i) change all the int... to my_safe<int> ii) insert
template<typename T> using my_safe = safe<T>;
then in one place in the program he can switch settings for all his integer types - without using macros.
That's exactly what I did, one typedef to rule them all. The ifdef DEBUG was only there to demonstrate a possible usage where checks are only enabled in debug builds.
It's also possible I could use policies to optionally include initialization checking - which I see as relatively expensive.
If you're saying that you might check, via a policy, whether a value has been initialized before being used, then I would argue against it - and that was my whole point. I think that a safe builtin emulator should always be initialized, or *explicitly* left uninitialized[1]. That should have been the correct default for C++. I'm not sure how you will receive this but, as far as I remember, in D all numeric types are initialized to 0 (or to the value given to the constructor), yet you can always remove that runtime cost by being explicit about it. If I remember correctly, the syntax is something like this:

int i = void;

But, with your library, it could be:

boost::safe<int> i = boost::uninitialized; // or boost::none, or boost::whatever

Which should raise an eyebrow in a code review, and has the advantage of being damn clear. Cheers,
[1] This is the only point I am trying to make
On 2/3/2016 3:18 PM, Robert Ramey wrote:
Would your library support limiting it's range checks for performance reasons?
I'm not sure what you mean here.
Sorry, I meant for cases where performance is more of a priority: can people have range checking enabled for conversions, but disabled for arithmetic operations (except for maybe operator-=() in the size_t type)? I assume that for arithmetic operations a lot of the range checking would occur at run-time. Am I assuming correctly?
I might guess though, that some of the applications using your library to ensure against out of range arithmetic results might also benefit from using something like my library to ensure against,
From what you've said, I think the libraries address the same problem.
Oh sorry, I wasn't clear. The main point of my library isn't the replacements for integers. Probably the main element of my library is a safe drop-in replacement for native pointers. And also an almost completely safe implementation of std::vector and its iterators. I was thinking of dumping the integer classes from my library and just using yours.
But there is one more thing ...
Writing correct code is not considered a major problem by most programmers and organizations which depend on code. Code that works most of the time is considered good enough. In spite of current problems like unintended acceleration, failures in ABS systems, blown missile launches, and crashing Mars probes - this is not considered a problem. Two papers were submitted to CppCon 2015 on safe integers (one by me and one on bounded integers - a similar topic) and neither was considered interesting to the reviewers or to potential attendees. I suspect they are right - and that's a problem - a huge problem. This problem affects all computer languages. There has never been an industrial-strength solution to this problem - until now. And the response has been .... I've requested twice that this library be added to the Boost review queue - no action. I've requested twice that a simpler version of this library, formulated as a proposal, be added to the list of standard library proposals - again, no action.
It's just not important to most of the world. I'm not bitter (though it might seem that I am - I'm really not). I'm just disgusted (which is an entirely different thing).
Amen brother :)

So maybe our libraries actually target slightly different domains. Your library targets the "correctness" issue, where my library targets more the "language safety" issue. Your library wants to make sure ABS systems work reliably. My library is more targeted at reducing the incidence of hackers taking over online banking servers (or your IP camera) through web server vulnerabilities. (While, as much as possible, maintaining performance.)

You suggest, and you would know better than me, that your library has no audience because people just don't care. But I wonder if it actually would have an audience if there was more awareness. If it had better marketing. I suspect that my library might also face an awareness issue in that its natural customer is not currently a C++ programmer. Security-conscious applications often aren't written in C++ because of its reputation as an "unsafe" language. But if there were a "safe" C++ library - a single, standard, one-stop library for all your "safe" C++ needs, a library that was truly useful - that library might have a better chance of achieving wider recognition. Don't you think? I mean, maybe in this case there's an intermediate step before becoming part of Boost. And that is becoming part of a proven, useful "safe" C++ library. "Correct" being a subcategory of "safe".

Also, while the performance cost may limit the (perceived) use cases for your library in deployed applications, the performance cost may be less of an issue in debug and test modes. So it might help if it was also marketed as a test library. Just brainstorming here.

So I guess I'm wondering if you have any interest in general C++ language safety, or just in the correctness aspect. Or are you of the position that C++ language safety, outside of correctness, is already a solved issue?
On 2/3/16 9:05 PM, Noah wrote:
On 2/3/2016 3:18 PM, Robert Ramey wrote:
Would your library support limiting it's range checks for performance reasons?
I'm not sure what you mean here.
Sorry, I meant for cases where performance is more of a priority: can people have range checking enabled for conversions, but disabled for arithmetic operations (except for maybe operator-=() in the size_t type)? I assume that for arithmetic operations a lot of the range checking would occur at run-time. Am I assuming correctly?
Yep - you're assuming too much.
a) lots of operations don't need checking:

int8_t i, j;
i + j  // can never overflow due to C++ type promotion rules

int l, m;
l + m  // can overflow

But the latter case can be "fixed" by changing the promotion policy so that, rather than int, the result is calculated in a larger intermediate type. Etc., etc. It turns out that many cases can't fail anyway, and others can be made safe with only minor changes. This is why it's impossible to do this without a type-aware library such as this one.
Oh sorry, I wasn't clear. The main point of my library isn't the replacements for integers. Probably the main element of my library is a safe drop-in replacement for native pointers. And also an almost completely safe implementation of std::vector and it's iterators. I was thinking of dumping the integer classes from my library and just using yours.
or maybe making a safe_pointer library. Feel free to borrow anything from the safe_integer library which might be useful.
So maybe our libraries actually target slightly different domains. Your library targets the "correctness" issue, where my library targets more the "language safety" issue. Your library wants to make sure ABS systems work reliably. My library is more targeted at reducing the incidence of hackers taking over online banking servers (or your ip camera) through web server vulnerabilities. (While, as much as possible, maintaining performance.)
OK - I can't really understand what your library does without studying it.
You suggest, and you would know better than me, that your library has no audience because people just don't care. But I wonder if it actually would have an audience if there was more awareness. If it had better marketing.
Maybe, we'll see. Maybe a few more big crashes might help.
I suspect that my library might also face an awareness issue in that its natural customer is not currently a C++ programmer. Security conscious applications often aren't written in C++ because of it's reputation as an "unsafe" language. But if there were a "safe" C++ library, a single, standard, one-stop library for all your "safe" C++ needs, a library that was truly useful, that library might have a better chance of achieving wider recognition. Don't you think?
maybe - it wouldn't hurt
I mean maybe in this case there's an intermediate step before becoming part of boost. And that is becoming part of a proven, useful "safe" C++ library. "Correct" being a subcategory of "safe". Also, while the performance cost may limit the (perceived) use cases for you library in deployed applications, the performance cost may be less of an issue in debug and test modes. So it might help if it was also marketed as a test library. Just brainstorming here.
LOL - aren't we all. Actually, I've worked on a fair number of embedded systems - which is my main inspiration for this. In the real world, there is huge pressure to "make it work". I get this. Then it is "tested" until the bugs stop appearing. It seems cost effective. And perhaps it is. Because, to make a safe C++/C program now would require writing C code in the style of assembler, checking all intermediate operations and doing conversions by hand. So the choice is pretty simple - ignore the issue or don't deliver the system. The ultimate customer doesn't seem to care. When the Mars probe crashed into Mars - wasting several hundred million dollars - due to a problem in the conversion between English and metric units - an extensive investigation was undertaken. The final result was ... it was nobody's fault. To me this is incomprehensible. It seems a facet of the "hacker mentality" which is widely lauded these days. I think a big part of the problem is that I'm just too old and living in the distant past.
So I guess I'm wondering if you have any interest in general C++ language safety, or just in the correctness aspect. Or are you of the position that C++ language safety, outside of correctness, is already a solved issue?
I want to be able to write and read a program which clearly states what it is supposed to do, and to know that it will do exactly that or tell me why it can't. What I don't want is to write/read a program where no one can understand what it does. I don't want it to return an incorrect result at all. Among other things that means:
a) if you write i + j you will get either the correct arithmetical result or some sort of exception that you are expected to handle.
b) if you allocate an object ....
c) if you do anything at all - it either works as written or handles some sort of
d) if I write x + y (floating point) I get either the expected result or a reason why it can't be returned. This is a very difficult goal to realize. Damian Vicino is working on this. I don't know his current progress.
I believe that C++ already addresses some of this - memory allocation, multi-threading, etc. But other parts are only addressed in an ad hoc way - I want to see that changed. Robert Ramey
participants (5)
- Damian Vicino
- Noah
- Pete Bartlett
- Raphaël Londeix
- Robert Ramey