I believe that modules are so important that you should drop everything and aim to release a modular version of Boost. I believe this because I used Boost’s serialization to write the General Theory of Databases in C++ (I already had that theory in C# and Java). While that succeeded, I then upgraded it to use C++ Modules. I couldn’t get Boost’s header-based system working with the modular form of the databases, so I had to drop Boost altogether.
I believe that if Boost is to remain relevant it must be expressed in Modular Form immediately. You should target a release a week or two from now. I sent Boost a starting module of about 15,000 lines of code. It compiles OK but needs work to link. You can pack the entire Boost library into a single .ixx file. You should target Microsoft’s compiler first, then expand support as other compilers implement modules.
That is, you should aim to make a quantum leap to ISO C++ 20 standard immediately.
Cheers,
Benedict Bede McNamara,
1st Class Honours, Pure Mathematics.
From: boost-request@lists.boost.org
Sent: Friday, 13 May 2022 12:24 PM
To: boost@lists.boost.org
Subject: Boost Digest, Vol 6704, Issue 1
The boost archives may be found at: http://lists.boost.org/Archives/boost/
Today's Topics:
1. Re: MySql review (Ruben Perez)
2. Re: Review of Boost.MySql (Ruben Perez)
3. Re: Future of C++ and Boost (Andrey Semashev)
4. Re: Future of C++ and Boost (Gavin Lambert)
5. Re: Boost MySQL Review Starts Today (Alan de Freitas)
----------------------------------------------------------------------
Message: 1
Date: Thu, 12 May 2022 20:50:22 +0200
From: Ruben Perez
Here is my review of Ruben Perez's proposed MySql library.
Hi Phil, thank you for taking your time to write a review.
Background
----------
I have previously implemented C++ wrappers for PostgreSQL and SQLite, so I have some experience of what an SQL API can look like. I know little about ASIO.
I have also recently used the AWS SDKs for C++ and Javascript to talk to DynamoDB; this has async functionality, which is interesting to compare.
I confess some minor disappointment that MySql, rather than PostgreSQL or SQLite, is the subject of this first Boost database library review, since those others have liberal licences that are closer to Boost's own licence than MySql (and MariaDB). But I don't think that should be a factor in the review.
Trying the library
------------------
I have tried using the library with
- g++ 10.2.1, Arm64, Debian Linux
- ASIO from Boost 1.74 (Debian packages)
- Amazon Aurora MySql-compatible edition
I've written a handful of simple test programs. Everything works as expected. Compilation times are a bit slow but not terrible.
The remainder of this review approximately follows the structure of the library documentation.
Introduction
------------
I note that "Ease of use" is claimed as the first design goal, which is good.
I think I failed to make the scope of the library clear enough in this aspect. The library is supposed to be pretty low level and close to the protocol, and not an ORM. I list ease of use here in the sense that:
* I have tried to abstract as much of the oddities of the protocol as possible (e.g. text and binary encodings).
* The library takes care of SSL as part of the handshake, vs. having the user take care of it.
* The library provides helper connect() and close() functions to make things easier.
* The object model is as semantic as I have been able to achieve, vs. having a connection object and standalone functions.
* The value class offers stuff like conversions to make some use-cases simpler.
I guess I listed that point in comparison to Beast or Asio, which are even lower level. Apologies if it caused confusion.
I feel that some mention should be made of the existing C / C++ APIs and their deficiencies. You should also indicate whether or not the network protocol you are using to communicate with the server is a "public" interface with some sort of stability guarantee. (I guess maybe it is, if it is common to MySql and MariaDB.)
Updated https://github.com/anarthal/mysql/issues/50 on comparison with other APIs. The network protocol is public and documented (although the documentation is pretty poor). It's indeed a pretty old protocol that is not being extended right now, and it's widely used by a lot of clients today, so there is very little risk there.
Tutorial
--------
The code fragments should start with the necessary #includes, OR you should prominently link to the complete tutorial source code at the start.
Raised https://github.com/anarthal/mysql/issues/71 to track it.
You say that "this tutorial assumes you have a basic familiarity with Boost.Asio". I think that's unfortunate. It should be possible for someone to use much of the library's functionality knowing almost nothing about ASIO. Remember your design goal of ease-of-use. In fact, it IS possible to follow the tutorial with almost no knowledge of ASIO because I have just done so.
If you are to really take advantage of the library (i.e. use the asynchronous API), you will need some Asio familiarity. I'd say a very basic understanding is enough (i.e. knowing what a io_context is). If you think this comment is misleading, I can remove it. But I don't think this is the right place to provide a basic Asio tutorial.
You have this boilerplate at the start of the tutorial:
boost::asio::io_context ctx;
boost::asio::ssl::context ssl_ctx(boost::asio::ssl::context::tls_client);
boost::mysql::tcp_ssl_connection conn(ctx.get_executor(), ssl_ctx);
boost::asio::ip::tcp::resolver resolver(ctx.get_executor());
auto endpoints = resolver.resolve(argv[3], boost::mysql::default_port_string);
boost::mysql::connection_params params(
    argv[1], // username
    argv[2]  // password
);
conn.connect(*endpoints.begin(), params);
// I guess that should really be doing something more
// intelligent than just trying the first endpoint, right?
The way to go here is providing an extra overload for connection::connect. Raised https://github.com/anarthal/mysql/issues/72 to track it.
I would like to see a convenience function that hides all of that:
auto conn = boost::mysql::make_connection( ...params... );
I guess this will need to manage a global, private, ctx object or something.
If you take a look at any other Asio-based program, the user is always in charge of creating the io_context, and usually in charge of creating the SSL context, too. If you take a look at this Boost.Beast example, you will see similar stuff: https://www.boost.org/doc/libs/1_79_0/libs/beast/example/http/client/sync-ss... I'm not keen on creating a function that both resolves the hostname and connects the connection, as I think it encourages doing more name resolution than is really required (you usually have one server but multiple connections). I may be wrong though, so I'd like to know what the rest of the community thinks on this.
.port = 3306, // why is that a string in yours?
It is not "mine", it's just how Asio works. Please have a look at https://www.boost.org/doc/libs/1_79_0/doc/html/boost_asio/reference/ip__basi...
make_connection("mysql://admin:12345@hostname:3306/dbname");
I guess you're suggesting that make_connection also perform the name resolution, the physical connect and the MySQL handshake? I'm not against this kind of URL-based way of specifying parameters. I've used it extensively in other languages. May be worth reconsidering it once Vinnie's Boost.Url gets accepted.
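To make the URL idea concrete, here is a deliberately minimal sketch of what parsing such a connection string could look like. `conn_parts` and `parse_mysql_url` are hypothetical names, not part of the proposed library; a real implementation would want percent-decoding, IPv6 literals, and proper validation (or simply Boost.Url once accepted):

```cpp
#include <optional>
#include <string>

// Hypothetical parsed form of "mysql://user:pass@host:port/db".
struct conn_parts {
    std::string user, password, host, port, database;
};

// Minimal sketch: no percent-decoding, no IPv6 literals, no validation.
inline std::optional<conn_parts> parse_mysql_url(const std::string& url) {
    const std::string scheme = "mysql://";
    if (url.rfind(scheme, 0) != 0) return std::nullopt;
    std::string rest = url.substr(scheme.size());

    conn_parts p;
    if (auto slash = rest.find('/'); slash != std::string::npos) {
        p.database = rest.substr(slash + 1);
        rest = rest.substr(0, slash);
    }
    auto at = rest.find('@');
    if (at == std::string::npos) return std::nullopt;
    std::string userinfo = rest.substr(0, at);
    std::string hostport = rest.substr(at + 1);

    if (auto colon = userinfo.find(':'); colon != std::string::npos) {
        p.user = userinfo.substr(0, colon);
        p.password = userinfo.substr(colon + 1);
    } else {
        p.user = userinfo;
    }
    if (auto colon = hostport.find(':'); colon != std::string::npos) {
        p.host = hostport.substr(0, colon);
        p.port = hostport.substr(colon + 1);
    } else {
        p.host = hostport;
        p.port = "3306"; // MySQL default port
    }
    return p;
}
```

A hypothetical make_connection overload could then accept the string form and forward the parsed parts to the existing connect machinery.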
Now... why the heck does your connection_params struct use string_views? That ought to be a Regular Type, with Value Semantics, using std::strings. Is this the cult of not using strings because "avoid copying above all else"?
I may have been a little too enthusiastic about optimization here.
Another point about the connection parameters: you should provide a way to supply credentials without embedding them in the source code. You should aim to make the secure option the default and the simplest to use. I suggest that you support the ~/.my.cnf and /etc/my.cnf files and read passwords etc. from there, by default. You might also support getting credentials from environment variables or by parsing the command line. You could even have it prompt for a password.
I don't know of any database access library that does this. The official Python connector gets the password from a string. I think this is mixing concerns. Having the password passed as a string has nothing to do with having it embedded in the source code. Just use std::getenv, std::stdin or whatever mechanism your application needs and get a string from there, then pass it to the library. All the examples read the password from argv. Additionally, having passwords in plain text files like ~/.my.cnf and /etc/my.cnf is considered bad practice in terms of security, I wouldn't encourage it.
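The approach Ruben describes - keep credentials out of the source and out of the library's concern - can be sketched in a few lines. `env_password` is a hypothetical helper and `MYSQL_PASSWORD` an assumed variable name, not anything the library defines:

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Sketch: fetch the password from the environment and hand the
// resulting string to the library. env_password is hypothetical.
inline std::optional<std::string> env_password(
    const char* var = "MYSQL_PASSWORD") {
    const char* v = std::getenv(var);  // nullptr when the var is unset
    if (!v) return std::nullopt;       // let the caller decide what to do
    return std::string(v);
}
```

The caller then passes the resulting string wherever the library expects a password, so the secure-storage policy stays in the application rather than in the driver.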
Does MySQL support authentication using SSL client certs? I try to use this for PostgreSQL when I can. If it does, you should try to support that too.
AFAIK you can make the server validate the client's certificate (that doesn't require extra library support), but you still have to pass a password.
About two thirds of the way through the tutorial, it goes from "Hello World" to retrieving "employees". Please finish the hello world example with code that gets the "Hello World" string from the results and prints it.
My bad, that's a naming mistake - it should be named hello_resultset, instead. It's doing the right thing with the wrong variable name. Updated https://github.com/anarthal/mysql/issues/71
Queries
-------
I encourage you to present prepared queries first in the documentation and to use them almost exclusively in the tutorial and examples.
It can definitely make sense.
You say that "client side query composition is not available". What do you mean by "query composition"? I think you mean concatenating strings together'); drop table users; -- to form queries, right? Is that standard MySql terminology? I suggest that you replace the term with something like "dangerous string concatenation".
Yes, I mean that.
In any case, that functionality *is* available, isn't it! It's trivial to concatenate strings and pass them to your text query functions. You're not doing anything to block that. So what you're really saying is that you have not provided any features to help users do this *safely*. I think that's a serious omission. It would not be difficult for you to provide an escape_for_mysql_quoted_string() function, rather than having every user roll their own slightly broken version.
Definitely not trivial (please have a look at MySQL source code) but surely beneficial, see below. Tracked by https://github.com/anarthal/mysql/issues/69.
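For the curious, the core of what such a helper has to do can be sketched briefly, though Ruben's point stands: a correct version must also be aware of the connection character set and modes like NO_BACKSLASH_ESCAPES, which is why it is not trivial. `escape_for_mysql_quoted_string` is the hypothetical name from the review, not an existing function:

```cpp
#include <string>

// Naive sketch: escapes the characters MySQL's C API escapes inside
// single-quoted string literals. A real implementation must consult
// the connection charset and NO_BACKSLASH_ESCAPES - see the caveat above.
inline std::string escape_for_mysql_quoted_string(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '\'':   out += "\\'";  break;
            case '"':    out += "\\\""; break;
            case '\\':   out += "\\\\"; break;
            case '\0':   out += "\\0";  break;
            case '\n':   out += "\\n";  break;
            case '\r':   out += "\\r";  break;
            case '\x1a': out += "\\Z";  break;  // Ctrl-Z, special on Windows
            default:     out += c;      break;
        }
    }
    return out;
}
```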
IIRC, in PostgreSQL you can only use prepared statements for SELECT, UPDATE, INSERT and DELETE statements; if you want to do something like
ALTER TABLE a ALTER COLUMN c SET DEFAULT = ?
or
CREATE VIEW v as SELECT * FROM t WHERE c = ?
You are right, these cases aren't covered by prepared statements. https://github.com/anarthal/mysql/issues/69 tracks it.
tcp_ssl_prepared_statement is verbose. Why does the prepared statement type depend on the underlying connection type?
Because it implements I/O operations (execute() and close()), which means that it needs access to the connection object, thus becoming a proxy object.
I have to change it if I change the connection type?! If that's unavoidable, I suggest putting a type alias in the connection type:
connection_t conn = ....;
connection_t::prepared_statement stmt(.....);
Raised https://github.com/anarthal/mysql/issues/73
Does MySql allow numbered or named parameters? SQLite supports ?1 and :name; I think PostgreSQL uses $n. Queries with lots of parameters are error-prone if you just have ?. If MySql does support this, it would be good to see it used in some of the examples.
Not AFAIK, just regular positional placeholders.
Invoking the prepared statement seems unnecessarily verbose. Why can't I just write
auto result = my_query("hello", "world", 42);
Because this invokes a network operation. By Asio convention, you need a pair of sync functions (error codes and exceptions) and at least an async function, which is named the same as the sync function but with an "async_" prefix. I'm not against this kind of signature, building on top of what there already is:
statement.execute("hello", "world", 42);
statement.async_execute("hello", "world", 42, use_future);
Which saves you a function call. Raised https://github.com/anarthal/mysql/issues/74
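The sync half of that convention can be illustrated with a toy, protocol-free sketch (all names hypothetical; the async_execute variant taking a completion token is omitted):

```cpp
#include <stdexcept>
#include <string>
#include <system_error>

// Toy operation following the Asio convention: one overload reports
// failures through an error_code, the other throws system_error.
inline int execute(const std::string& query, std::error_code& ec) {
    if (query.empty()) {
        ec = std::make_error_code(std::errc::invalid_argument);
        return 0;
    }
    ec.clear();
    return 1;  // pretend one row was affected
}

inline int execute(const std::string& query) {
    std::error_code ec;
    int rows = execute(query, ec);
    if (ec) throw std::system_error(ec, "execute");
    return rows;
}
```

The throwing overload is just a thin wrapper over the error_code one, which is how Asio and Beast structure their own sync APIs.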
I also added Query variants where the result is expected to be
- A single value, e.g. a SELECT COUNT(*) statement.
- Empty, e.g. INSERT or DELETE.
- A single row.
- Zero or one rows.
- A single column.
I think this can be useful. I've updated https://github.com/anarthal/mysql/issues/22 to track this.
I don't see anything about the result of an INSERT or UPDATE. PostgreSQL tells me the number of rows affected, which I have found useful to return to the application.
Please have a look at https://anarthal.github.io/mysql/mysql/resultsets.html#mysql.resultsets.comp...
resultset, row and value
------------------------
I'm not enthusiastic about the behaviour nor the names of these types:
Resultset is what MySQL calls it. It's not my choice.
- resultset is not a set of results. It's more like a sequence of rows. But more importantly, it's lazy; it's something like an input stream, or an input range. So why not actually make it an input range, i.e. make it "a model of the input_range concept". Then we could write things like:
auto result = ...execute query...
for (auto&& row : result) {
    ...
}
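A mock of what an input-range resultset could look like - no networking, all names hypothetical - using a single-pass iterator that pulls the next row lazily, the way the protocol delivers them (requires C++17 for differing begin/end types):

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Mock lazy resultset: rows are "fetched" one at a time from a source,
// and iteration is single-pass, like an input range. No networking.
class mock_resultset {
    std::vector<std::string> rows_;  // stands in for the server
    std::size_t next_ = 0;
public:
    explicit mock_resultset(std::vector<std::string> rows)
        : rows_(std::move(rows)) {}

    std::optional<std::string> fetch_one() {  // one protocol round-trip
        if (next_ == rows_.size()) return std::nullopt;
        return rows_[next_++];
    }

    struct iterator {
        mock_resultset* rs = nullptr;
        std::optional<std::string> current;
        std::string& operator*() { return *current; }
        iterator& operator++() { current = rs->fetch_one(); return *this; }
        bool operator!=(std::nullptr_t) const { return current.has_value(); }
    };
    iterator begin() { return {this, fetch_one()}; }
    std::nullptr_t end() { return nullptr; }
};
```

Each `++` maps naturally onto a sync read_one; Ruben's question below about the async world is exactly where this shape gets harder.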
How does this translate to the async world?
- row: not bad, it does actually represent a row; it's a shame it's not a regular type though.
- value: it's not a value! It doesn't have value semantics!
If the library gets rejected I will likely make values owning (regular).
I'm also uncertain that a variant for the individual values is the right solution here. All the values in a column should have the same type, right? (Though some can be null.) So I would make row a tuple. Rather than querying individual values for their type, have users query the column.
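The reviewer's idea - typing the column once rather than each value - might look like this hypothetical sketch, where a row is a tuple and nullability is expressed per column with std::optional:

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <tuple>

// Hypothetical statically-typed row: each column has one C++ type for
// the whole resultset, and nullable columns are std::optional<T>.
using employee_row = std::tuple<std::int64_t,            // id
                                std::string,             // name
                                std::optional<double>>;  // salary (nullable)

inline std::string describe(const employee_row& r) {
    const auto& [id, name, salary] = r;
    return name + "#" + std::to_string(id) +
           (salary ? "" : " (salary unknown)");
}
```

Column types would have to be known at compile time, which is the trade-off issue 60 discusses.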
Are you talking about something like this? https://github.com/anarthal/mysql/issues/60
It seems odd that MySQL small integers all map to C++ 64-bit types.
It is done like this to prevent the variant from having too many alternatives - I don't think having that would add much value to the user. If I implement something like https://github.com/anarthal/mysql/issues/60, each int type will be mapped to its exact type.
I use NUMERIC quite a lot in PostgreSQL; I don't know if the MySql type is similar. I would find it inconvenient that it is treated as a string. Can it not be converted to a numeric type, if that is what the user wants?
MySQL treats NUMERIC and DECIMAL the same, as exact numeric types. What C++ type would you put this into? float and double are not exact, so they don't fit.
I seem to get an assertion if I fail to read the resultset (resultset.hpp:70). Could this throw instead?
This assertion is not related to this. It's just checking that the resultset has a valid connection behind it, and is not a default constructed (invalid) resultset.
Or, should the library read and discard the unread results in this case?
Having a look into the Python implementation, it gives the option to do that. I think we can do a better job here. Tracked by https://github.com/anarthal/mysql/issues/14
But the lack of protocol support for multiple in-flight queries immediately becomes apparent. It almost makes me question the value of the library - what's the point of the async support, if we then have to serialise the queries?
As I pointed out in another email, it's a combination of lack of protocol support and lack of library support. Apologies if the documentation is not clear in this aspect. I think there is value in it, though, so you don't need to create 5000 threads to manage 5000 connections. The fact that the official MySQL client has added a "nonblocking" mode seems a good argument.
Should the library provide this serialisation for us? I.e. if I async_execute a query while another is in progress, the library could wait for the first to complete before starting the second.
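Such library-side serialisation could, in principle, be a small FIFO of pending operations: each execute either starts immediately or is queued until the in-flight one completes. A protocol-free sketch of the idea (all names hypothetical, completion handlers elided):

```cpp
#include <deque>
#include <string>
#include <vector>

// Sketch of serialising queries over a protocol that allows only one
// operation in flight. Completing one query starts the next queued one.
class serialized_executor {
    std::deque<std::string> pending_;
    bool in_flight_ = false;

    void start(std::string q) {
        in_flight_ = true;
        started.push_back(std::move(q));  // stands in for writing the packet
    }
public:
    std::vector<std::string> started;  // observable, for the example only

    void async_execute(std::string query) {
        if (in_flight_) { pending_.push_back(std::move(query)); return; }
        start(std::move(query));
    }
    void on_complete() {  // would be invoked from the read handler
        in_flight_ = false;
        if (!pending_.empty()) {
            auto q = std::move(pending_.front());
            pending_.pop_front();
            start(std::move(q));
        }
    }
};
```

In a real Asio implementation the queue would live on the connection's executor and each queued entry would carry its completion handler.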
I would go for providing that bulk interface I talk about in other emails.
Or, should the library provide a connection pool? (Does some other part of ASIO provide connection pool functionality that can be used here?)
Asio doesn't provide that AFAIK. It is definitely useful functionality, tracked by https://github.com/anarthal/mysql/issues/19
Transactions
------------
I have found it useful to have a Transaction class:
{
    Transaction t(conn);   // Issues "BEGIN"
    .... run queries ....
    t.commit();            // Issues "COMMIT"
}   // t's dtor issues "ROLLBACK" if we have not committed.
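Fleshing that pseudocode out into a compilable sketch - the `exec` callback stands in for whatever query function the connection provides, and all names are hypothetical:

```cpp
#include <functional>
#include <string>

// RAII transaction guard: BEGIN on construction, ROLLBACK on scope
// exit unless commit() ran. exec stands in for the real query call.
class transaction {
    std::function<void(const std::string&)> exec_;
    bool committed_ = false;
public:
    explicit transaction(std::function<void(const std::string&)> exec)
        : exec_(std::move(exec)) { exec_("BEGIN"); }

    void commit() { exec_("COMMIT"); committed_ = true; }

    ~transaction() {
        // A throwing ROLLBACK in a destructor would call terminate(),
        // so failures are swallowed here - one of the hard questions
        // with this design.
        if (!committed_) try { exec_("ROLLBACK"); } catch (...) {}
    }
    transaction(const transaction&) = delete;
    transaction& operator=(const transaction&) = delete;
};
```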
Again, how would this work in the async world? How does the destructor handle communication failures when issuing the ROLLBACK?
Klemens Morgenstern makes the point that MySql is a trademark of Oracle. Calling this "Boost.MySql" doesn't look great to me. How can you write "The Boost MySql-compatible Database Library" more concisely?
I'm not very original at naming, as you may have already noticed. Using Boost.Delfin was proposed at some point, but Boost.Mysql definitely expresses its purpose better.
Overall, I think this proposal needs a fair amount of API re-design and additional features to be accepted, and should be rejected at this time. It does seem to be a good start though!
Thanks to Ruben for the submission.
Thank you for sharing your thoughts, I think there is a lot of useful information here.
------------------------------
Message: 2
Date: Thu, 12 May 2022 21:02:48 +0200
From: Ruben Perez
Indeed. But even your imagined batch interface only works well for queries/selects, while for inserts (or updates), the client does not need to just send a small textual SQL query, but potentially a bunch of data for the rows too. A true pipeline allows sending the rows for a second insert while the first insert is being processed.
It would work for inserts too, as values are either part of the query string or part of the statement execute packet. In both cases, part of the request.
Such a mode may be useful for schema creation OTOH. We have large schemas, with hundreds of tables, indexes, triggers, etc... Done from C++ code client-side, not via a DBA manually executing on the server using SQL files and the native CLI. For that use-case, the ability to send many DDLs in a single batch would definitely save on the round-trips to the server. We try hard to minimize roundtrips!
Thanks for sharing this use case, I definitely wasn't aware of it and seems a reason towards implementing multi-statement.
You don't need to necessarily use Google's protobufs library. There's https://github.com/mapbox/protozero for example, and similarly, a from-scratch implementation to just encode-decode a specific protocol can also be written.
I wasn't aware of this. Thanks.
The server sends several resultsets after that. I haven't focused a lot on this because it sounded risky (in terms of security) for me.
Not sure what's risky here. Maybe I'm missing something.
I was just imagining users concatenating queries. May be a misconception, yours is a legitimate use case.
Note that PostgreSQL's COPY is not file-specific, and a first-class citizen at the protocol level, depending on a pure binary format (with text mode too). I use COPY with STDIN / STDOUT "files", i.e. I prepare memory buffers and send them; and read memory buffers, and decode them. No files involved.
Unfortunately, that does not seem to be the case for MySQL. You issue the LOAD DATA statement via a regular query packet, and the server returns you another packet with the file path it wants. You then read it in the client and send it to the server in another packet. I've made it work with CSV files, and I'd say it's the only format allowed, AFAIK.
Maybe our use-case of very large data (with large blobs), *and* very numerous smaller data, that's often loaded en-masse, by scientific desktop applications, and increasingly mid-tier services for web-apps, is different from the more common ecommerce web-app use-case of many people.
It is good to hear different use cases than the regular web server we all have in mind. It will help me during further development.
When I evaluate DB performance, I tend to concentrate on "IO" performance, in terms of throughput and latency, independent of the speed of the SQL engine itself. There's nothing I can do about the latter, while the way one uses the Client API (or underlying protocol features) is under my control. So I do mostly INSERTs and SELECTs, with and without WHERE clauses (no complex JOINs, CTEs, etc...).
Because we are in the scientific space, we care about both many small rows (millions, of a few bytes to KBs each at most), and a few (hundreds / thousands) much larger rows with blobs (with MBs to GBs sizes). The truly large "blobs" (files) are kept outside the DB, since mostly read only (in the GBs to TBs sizes each, that can accumulate to 2.6 PB for all a client's data I heard just yesterday for example).
I'll also compare inserting rows 1-by-1, with and without prepared statements, to inserting multi-rows per-statement (10, 100, 1000 at a time), to the "bulk" interface (COPY for PostgreSQL, LOCAL for MySQL, Direct-Path load in OCI). For example with SQLite: https://sqlite.org/forum/forumpost/baf9c444d9a38ca6e59452c1c568044aaad50bbaa... (SQLite has no "bulk" interface and doesn't need one, since it is "embedded" and thus "zero-latency".)
For PostgreSQL, we also compared text vs binary modes (for binds and resultsets).
For blobs, we compare throughput of reading and writing large blobs, whole and in part, with different access patterns (like continuous ranges, or scattered inside).
A very important use-case for us, for minimizing round-trips, is how to load a subset of rows, given a list of primary keys (typically a surrogate key, like an integer or a uuid). For that, we bind a single array-typed value for the WHERE clause's placeholder to the query, and read the resultset, selecting 1%, 5%, 10%, etc... of the rows from the whole table. (selecting each row individually, even with a prepared query, is terribly slow).
Thanks for sharing. It will definitely help me during benchmarking.
Regards,
Ruben.
------------------------------
Message: 3
Date: Thu, 12 May 2022 23:39:41 +0300
From: Andrey Semashev
On May 12, 2022, at 11:34 AM, Robert Ramey via Boost wrote:
On 5/12/22 11:30 AM, Robert Ramey via Boost wrote:
On 5/12/22 9:55 AM, John Maddock via Boost wrote:
wow - that would be a big one. codecvt is the fundamental component to support wchar is it not? Does this mean that wchar is gone also? If so what replaced it? etc....
FWIW - I don't see any notice of such deprecation here: https://en.cppreference.com/w/cpp/header/codecvt
It's in green - at the top: codecvt_utf8 https://en.cppreference.com/w/cpp/locale/codecvt_utf8 (C++11)(deprecated in C++17)
Also here:
http://eel.is/c++draft/depr.locale.stdcvt#depr.codecvt.syn
------------------------------
Message: 4
Date: Fri, 13 May 2022 13:20:24 +1200
From: Gavin Lambert
On 5/12/22 9:55 AM, John Maddock wrote:
Watch out - all of <codecvt> is deprecated in C++17, I think you're relying only on <locale> and may be OK though...
wow - that would be a big one. codecvt is the fundamental component to support wchar is it not? Does this mean that wchar is gone also? If so what replaced it? etc....
wchar_t is still around (although char8_t and std::u8string are the new
hotness), it's just the conversion functions that are deprecated. I
guess you're just not supposed to convert anything any more.
More seriously, AFAIK there's no plans to actually remove it
until/unless an actual replacement gets standardised. But I think in
the meantime they'd rather you use something like the ICU library for
conversions instead.
Although it wouldn't surprise me if, in not wanting to take a dependency
on an external library (and not wanting to continue using an officially
deprecated standard function), a lot of libraries/apps will write their
own subtly-broken conversion routines instead...
------------------------------
Message: 5
Date: Thu, 12 May 2022 23:24:19 -0300
From: Alan de Freitas
Will the library bring additional out-of-the-box utility to Boost?
The library is very good news considering our recent discussions about the future of Boost, where providing more protocol implementations comes up frequently. I wish more people would make this kind of contribution.
What is your evaluation of the potential usefulness of the library?
Others have questioned the benefit of the library when compared to sqlpp11 or any wrapper around the C API. The main difference is that other libraries are high-level, but this is a discussion still worth having from the point of view of users.

I was thinking about the transition cost from Boost.MySQL to any other SQL database, since many applications have the requirement/necessity of allowing different SQL databases. In sqlpp11, the user can just change the backend. The user could use Boost.MySQL as an sqlpp11 backend and that would have the same effect. However, I think Rubén mentioned this is not possible at some point. I'm not sure whether this is just for the async functionality. In the same line, I wonder if a library for Postgres or Sqlite would be possible with a similar API, which could also solve the problem, although I'm not sure anyone would be willing to implement that. If we did, we could have the convenience of sqlpp11 and the Asio async functionalities of Boost.Mysql for other DBs.

The library really provides ease-of-use, when we consider what it provides and how low-level it is. However, unlike in other libraries like Boost.Beast, Boost.MySql users might not be sold on the Asio way of doing things. Applications that require access to databases might be making sparse database requests where the Asio asynchronous model is not as useful. Highlighting these differences in the docs is important. Asio takes some time to learn, and I guess for a user not used to Asio, having to understand Asio first does not sound like ease of use. The docs could focus on the protocol before moving on to the asynchronous functionality.

I'm also a little worried about the maintainability of the library and protocol changes and how this could impact Boost as a whole. Should we really announce it as compatible with MariaDB? What direction would the library take if they diverge? How often does the protocol change or get extended?
Is Ruben going to stick around if the protocol changes? How hard would it be for someone to understand the code and implement extensions? Can a user be sure it's always going to provide the same features and be as reliable as the C API? I don't have the answers to these questions, but it's something that got me wondering. I guess this kind of question is going to come up for any library that is related to a protocol.

I don't know if the name "MySql" can be used for the library, as it belongs to Oracle. I'm not saying it can't. I'm really saying I don't know. I'm not a lawyer and I don't understand the implications here. But this should be considered, investigated, and evidence should be provided. The library is also compatible with MariaDB and the name "MySql" might not reflect that. Maybe there's a small probability it might be compatible with some other similar DB protocol derived from MySql in the future?

As others have mentioned, the protocol is strictly sequential for a single connection, and this might have some implications for the asynchronous operations the library provides.
- No two asynchronous MySql query reads can happen concurrently. While this still has value among other Asio operations, like a server that needs the DB eventually, the user needs to be careful about that. Maybe it would be safer if all MySql operations were on some special kind of strand. Or maybe the library could provide some "mysql::transaction_strand" functionality to help ensure this invariant for individual queries in the future.
- A second implication is that some applications might find the asynchronous functionalities in Boost.Mysql not as useful as asynchronous functionalities in other protocols, like the ones in Boost.Beast. This depends on how their applications are structured. Since this is the main advantage over the C API, these users may question the value of the library, and the documentation should discuss this more explicitly.
- These implications could become irrelevant if the library provides some kind of functionality to enable a non-blocking mode. I have no idea how the MySql client achieves that.

## API
What is your evaluation of the design? Will the choice of API abstraction model ease the development of software that must talk to a MySQL database?
I like how the API is very clean compared to the C API, even when including
the asynchronous functionality. This would be a reason for using the
library, even if I only used the synchronous functions.
I'm worried about the lack of possibility of reusing memory for the
results, as the interface depends on vector. This is not the usual Asio
pattern. These vectors look even weirder in the asynchronous callbacks:
- People have their containers/buffers and I would assume reading into some
kind of existing row buffer would be the default interface, as is the case
with other Asio read functions. In other words, read_many and read_all
should work more like read_one.
- Not returning these vectors is the common pattern in Asio: the initiating
function receives a buffer for storage and the callback returns how many
elements were read. Note that the buffer size already delimits how many
elements we should read.
- If we return vectors, async operations would need to instantiate the
vector with the custom allocator for the operation. The callback wouldn't
use std::vector<T> then; it would be a std::vector with the operation's
allocator type.
What is your evaluation of the implementation?
Did you try to use the library? With what compiler? Did you have any problems?

No problems at all. GCC 11 and MSVC 19.

I haven't analyzed the implementation very profoundly. I skimmed through the source code and couldn't find anything problematic. It would be useful if the experts could inspect the Asio composed ops more deeply.

CMakeLists.txt:
- I believe the CMakeLists.txt script is not in the format of other boost libraries in boost/libs, so it won't work with the super-project as it is.
- The example/CMakeLists.txt script refers to BOOST_MYSQL_INTEGRATION_TESTS. I don't think examples can be considered integration tests.

Examples:
- The examples are very nice, especially the one with coroutines. They are also very limited: they are all about the same text queries, which shouldn't even be used in favor of prepared statements.
- Many examples about continuation styles are not very useful, because this is more of an Asio feature than a library feature. The library feature, so to speak, is supporting Asio tokens properly. The continuation styles could be exemplified in the exposition with some small snippets for users not used to Asio, without the documentation losing any value.
- Some examples are simple enough and don't require the reader to know the rest of the exposition. They are like a quick look into the library. These could come at the beginning, as in the Asio tutorials and the Beast quick look section.
- The first sync example could be simpler, involving just a hello world before moving on to other operations.
- The page about the docker container should specify that the username and password are "root" and "".

Tests:
- Some unit tests take a ***very*** long time - enough to make coffee and a sandwich - and they seem to not add a lot of value in terms of coverage. For instance, "mysql/test/unit/detail/protocol/date.cpp(72): info: check '1974- 1-30' has passed" - going through all possible dates multiple times took a long time.

## Documentation
What is your evaluation of the documentation?

The documentation is complete. The main points that differentiate the library are:

- it's a complete rewrite of the protocol,
- it's low-level, and
- it's based on Boost.Asio.

The documentation should emphasize these points as much as possible, especially the first one. This should be in the introduction, the motivation, slogans, logos, and wherever people can see it easily. The documentation should also provide arguments and evidence that these design goals are a good idea, as often discussed when the topic is the value of this library. Why is it worth rewriting the protocol? For what use cases is such a low-level library useful? Why should a person who already uses other libraries or the C API care about Asio now? Something that should also be highlighted is the difference between this library and other higher-level libraries, in particular, naming names.

Minor issues:

- There's no link in the documentation to the protocol specification. It would be interesting to know what the reference specification is, or whether the protocol was inferred somehow. Is there any chance this protocol might change? What about divergences between MySQL and MariaDB? How stable is the protocol? For what range of versions does it work? What's the policy when it changes?
- Some links are broken (for instance, linking to https://anarthal.github.io/boost-mysql/index.html).
- "All async operations in this library support per-operation cancellation". It's important to highlight that this is per-operation in the Asio sense of an operation, but not in the MySQL sense, because the MySQL connection is invalid after that.
- "Boost.MySql has been tested with the following versions of MySQL". MariaDB is not a version of MySQL.
- Prepared statements should come first in the examples, to highlight them as the default pattern.
- The documentation refers to topics that haven't been explained yet. Maybe "value" could be explained after "row", "row" after "resultset", and "resultset" after "queries".
- The section "Text Queries" is quite small in comparison to other sections. It could include some examples and snippets like other sections do.
- "The following completion tokens can be used in any asyncrhonous operation within Boost.Mysql" -> "Any completion token...".
- "When they fail, they throw a boost::system::system_error exception". Don't these functions just set the proper error_code, as usual with Asio and Beast?
- The "MySQL to C++ mapping reference" section should use a table.
- A small subsection on transactions would be helpful, even if there's no library functionality to help with that.
- The documentation should include some comparisons that are not obvious to potential users: C/C++ APIs, the advantages of the Asio async model, and benchmarks if possible.

## Conclusion
How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I spent one day on this review. I read all the documentation, ran the tests, experimented with the examples, and had a reasonable look at the implementation.
Are you knowledgeable about the problem domain?
I'm reasonably educated about databases but not an expert. I've been working a lot with Asio.
Are there any immediate improvements that could be made after acceptance, if acceptance should happen?
While it's important to have a general variant type for row values, a simpler interface for tuples of custom types would be very welcome and would simplify things considerably, while also avoiding allocations, since columns always have the same types. This feature is almost a given: users nearly always know their column types at compile time, and this demand is too recurrent in applications to ignore.
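To make the suggestion concrete, here is a minimal, library-independent sketch of what such a typed-row interface could look like. The `value`, `row`, and `as_tuple` names are hypothetical stand-ins, not Boost.MySQL API; the real library would convert from its own dynamically typed row representation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the library's dynamically typed value and row.
using value = std::variant<std::int64_t, double, std::string>;
using row = std::vector<value>;

namespace detail {
template <typename... Ts, std::size_t... Is>
std::tuple<Ts...> as_tuple_impl(const row& r, std::index_sequence<Is...>)
{
    // Extract each column with the static type the caller declared.
    return std::tuple<Ts...>{std::get<Ts>(r[Is])...};
}
} // namespace detail

// Sketch of the suggested interface: convert a dynamically typed row into a
// tuple of the column types the user already knows at compile time.
template <typename... Ts>
std::tuple<Ts...> as_tuple(const row& r)
{
    assert(r.size() == sizeof...(Ts));
    return detail::as_tuple_impl<Ts...>(r, std::index_sequence_for<Ts...>{});
}
```

A real implementation would presumably report type mismatches through an `error_code` rather than the `std::bad_variant_access` that `std::get` throws here, and could parse directly into the tuple to avoid the intermediate variants entirely.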
Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.
I believe it should be conditionally accepted, with the same conditions stated in other reviews: allowing for memory reuse in the read_* functions and fixing the "value" type. Best, On Tue, May 10, 2022 at 04:14, Richard Hodges via Boost < boost@lists.boost.org> wrote:
Dear All,
The Boost formal review of the MySQL library starts Today, taking place from May 10th, 2022 to May 19th, 2022 (inclusive) - We are starting one day after the announced date and extending the period by one day to compensate.
The library is authored by Rubén Pérez Hidalgo (@anarthal in the CppLang slack).
Documentation: https://anarthal.github.io/mysql/index.html
Source: https://github.com/anarthal/mysql/
The library is built on the bedrock of Boost.Asio and provides both synchronous and asynchronous client connectors for the MySQL database system.
Boost.MySQL is written from the ground up, implementing the entire protocol with no external dependencies beyond the Boost library. It is compatible with MariaDB.
Connectivity options include TCP, SSL and Unix Sockets.
For async interfaces, examples in the documentation demonstrate full compatibility with all Asio completion handler styles, including:
Callbacks:- https://anarthal.github.io/mysql/mysql/examples/query_async_callbacks.html
Futures :- https://anarthal.github.io/mysql/mysql/examples/query_async_futures.html
Boost.Coroutine :- https://anarthal.github.io/mysql/mysql/examples/query_async_coroutines.html
C++20 Coroutines :-
https://anarthal.github.io/mysql/mysql/examples/query_async_coroutinescpp20....
Rubén has also implemented the Asio protocols for deducing default completion token types :-
https://anarthal.github.io/mysql/mysql/examples/default_completion_tokens.ht...
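All of the continuation styles above drive the same underlying operation; only the completion token supplied by the caller changes. As a rough, library-independent illustration of that idea (this is a stdlib-only sketch, not Boost.MySQL or Asio source; `async_double` is an invented toy operation), one async initiation can be consumed either through a callback or through a future:

```cpp
#include <functional>
#include <future>
#include <memory>
#include <thread>

// A toy async operation: computes x*2 on another thread and delivers the
// result through whatever continuation the caller supplied.
void async_double(int x, std::function<void(int)> handler)
{
    std::thread([x, h = std::move(handler)] { h(x * 2); }).detach();
}

// Future-returning adapter over the same operation, playing the role that
// asio::use_future plays for real Asio-based libraries: the operation is
// written once, and the "token" decides how completion is delivered.
std::future<int> async_double(int x)
{
    auto p = std::make_shared<std::promise<int>>();
    auto f = p->get_future();
    async_double(x, [p](int r) { p->set_value(r); });
    return f;
}
```

In real Asio code this dispatch is done generically via `async_result` and completion tokens such as `asio::use_future` and `asio::use_awaitable`, so a library that initiates its operations correctly gets every continuation style for free.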
Reviewing a database connector in depth will require setting up an instance of a MySQL database. Fortunately most (all?) Linux distributions carry a MySQL and/or MariaDB package. MySQL community edition is available for download on all platforms here: https://dev.mysql.com/downloads/
Rubén has spent quite some time in order to bring us this library candidate. The development process has no doubt been a journey of discovery into Asio, its concepts and inner workings. I am sure he has become a fount of knowledge along the way.
From a personal perspective, I was very happy to be asked to manage this review. I hope it will be the first of many more reviews of libraries that tackle business connectivity problems without further dependencies beyond Boost, arguably one of the most trusted foundation libraries available.
Please provide in your review information you think is valuable to understand your choice to ACCEPT or REJECT Boost.MySQL as a Boost library. Please be explicit about your decision (ACCEPT or REJECT).
Some other questions you might want to consider answering:
- Will the library bring additional out-of-the-box utility to Boost?
- What is your evaluation of the implementation?
- What is your evaluation of the documentation?
- Will the choice of API abstraction model ease the development of software that must talk to a MySQL database?
- Are there any immediate improvements that could be made after acceptance, if acceptance should happen?
- Did you try to use the library? With which compiler(s)? Did you have any problems?
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
- Are you knowledgeable about the problem domain?
More information about the Boost Formal Review Process can be found at: http://www.boost.org/community/reviews.html
The review is open to anyone who is prepared to put in the work of evaluating and reviewing the library. Prior experience in contributing to Boost reviews is not a requirement.
Thank you for your efforts in the Boost community. They are very much appreciated.
Richard Hodges - review manager of the proposed Boost.MySQL library
Rubén is often available on CppLang Slack and of course by email, should you require any clarification not covered by the documentation, as am I.
--
Richard Hodges
hodges.r@gmail.com
tg: @rhodges
office: +44 2032 898 513
mobile: +376 380 212
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
--
Alan Freitas
https://github.com/alandefreitas

------------------------------

End of Boost Digest, Vol 6704, Issue 1