Hi all,

I'm Ruben, the author of Boost.MySQL, and I want to present (and maybe get some feedback on) a project I'm developing to showcase how to do async work with Boost on the server side.

This started because I want to write a connection_pool class for MySQL. It's a complex feature with a lot of trade-offs, and I wish I had more field experience to guide me when making the required decisions. We currently have great libs for async networking (Asio, Beast, Redis, MySQL). They've got examples and docs, but I think they sometimes fall short. Users often get lost, since setting up an async server is not trivial, and we offer a lot of flexibility that can sometimes stun newcomers.

So I've decided to blend these two needs together and create a project that allows Boost authors to try new things in a realistic environment, while demonstrating to users how they can use Boost to build an app. (Note: this is **not** a Boost library, and never will be.)

Code: https://github.com/anarthal/servertech-chat/
Docs: https://anarthal.github.io/servertech-chat/
Live demo: http://13.48.215.34/

It's still a work in progress. Although I've implemented some features, I've tried to focus on the skeleton, getting testing, deployments and C++ best practices in place. I plan on adding more in the future, like showing how you can call 3rd party REST APIs using Boost. One cool feature is that, if you have an AWS account, you can fork the repo and get a live server in minutes (see https://anarthal.github.io/servertech-chat/03-fork-modify-deploy.html).

I actually envision this as a series of projects, showcasing different applications and needs. I originally wrote this document (https://docs.google.com/document/d/1ZQrod1crs8EaNLLqSYIRMacwR3Rv0hC5l-gfL-jO...) describing how I saw it. If any other author wants to step forward and create their own ServerTech application, please let me know.

I'd like to have some (hopefully constructive) feedback from you. I'd be thankful if you gave the live demo a try. And if someone's interested, there's a ton of work to do, so let me know.

Note: if you've scanned through the code and you're asking "where is the connection_pool" - nowhere yet, that's still to come.

Thanks,
Ruben.
On Tue, Sep 5, 2023 at 12:43 PM Ruben Perez via Boost <boost@lists.boost.org> wrote:
I'm Ruben, the author of Boost.MySQL, and I want to present (and maybe get some feedback about) a project I'm developing to showcase how to do async stuff with Boost server-side.
... So I've decided to [...] create a project that allows Boost authors to try new things in a realistic environment, while demonstrating to users how they can use Boost to build an app.
Very nice. From https://github.com/anarthal/servertech-chat/#architecture, describing the example app:

The server is based on Boost.Beast, asynchronous (it uses stackful coroutines) and single-threaded.
This is a good/appropriate introduction for programmers: they get to experience developing something that has the power of an asynchronous application but the simplicity of a single-threaded one, without the need for locks.

A more advanced app, and what I would like to see personally, is an example and architectural discussion on design patterns involving how best to handle server requests that require more time/resources that may not be appropriate for a single-threaded server (e.g. a database server.) From a high-level perspective, my current thinking on this is:

- Handle fast requests in the main single-threaded Boost.Asio event loop (assuming they don't access resources locked by the next bullet point).
- Handle longer requests by delegating to a separate thread pool that triggers a callback when done, without blocking the single-threaded event loop. The threads in the separate thread pool do "traditional locking" via mutexes, read/write locks, etc. (A rough sketch follows at the end of this message.)

Are there more modern approaches/techniques?

In either case, a discussion about these issues would be valuable IMHO after the introductory example.

Thank you,
Matt
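P.S. To make the second bullet point a bit more concrete, here is a very rough sketch of the kind of thing I have in mind. The request/response types and the process_heavy_request()/send_response() functions are hypothetical placeholders, not anything from the project:

    #include <boost/asio/io_context.hpp>
    #include <boost/asio/post.hpp>
    #include <boost/asio/thread_pool.hpp>
    #include <utility>

    // request, response, process_heavy_request() and send_response() are
    // hypothetical application types/functions, used only for illustration.
    void handle_heavy_request(boost::asio::io_context& main_loop,
                              boost::asio::thread_pool& workers,
                              request req)
    {
        // Hand the heavy work to a worker thread so the event loop stays responsive
        boost::asio::post(workers, [&main_loop, req = std::move(req)]() {
            // Runs on a worker thread; may block and use traditional locking
            response res = process_heavy_request(req);

            // "Callback when done": hop back to the single-threaded event loop
            boost::asio::post(main_loop, [res = std::move(res)]() {
                send_response(res);
            });
        });
    }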
On Wed, Sep 6, 2023 at 5:11 PM Matt Pulver via Boost <boost@lists.boost.org> wrote:
On Tue, Sep 5, 2023 at 12:43 PM Ruben Perez via Boost <boost@lists.boost.org> wrote:
showcase how to do async stuff with Boost server-side.
A more advanced app, and what I would like to see personally, is an example and architectural discussion on design patterns involving how best to handle server requests that require more time/resources that may not be appropriate for a single-threaded server (e.g. a database server.)
I agree. This is also my case.

While I can understand that Ruben focuses on using async libs "all the way", unfortunately I don't have that luxury (or, honestly, the knowledge). In fact, some of the "protocols" / APIs I must support allow requesting large amounts of data, so I don't want a single request from one client to block all other clients.

When everything is async, including talking to remote RDBMSs, maybe that one big request gets "suspended", allowing other (independent) requests to make progress; but in my case, with a non-async DB layer to talk to, I don't think a single-threaded server is a viable solution.

So while your initiative is great, and I'll try to study it, getting into a "less pure" hybrid app example, mixing async *and* sync (typically talking to DBs that did not get your async Boost.MySQL or Boost.Redis treatment), as Matt mentioned, would definitely help me. FWIW.

In any case, thank you Ruben for trying to teach the rest of us the "async way" with Boost. --DD
I agree. This is also my case.
While I can understand that Ruben focuses on using async libs "all the way", unfortunately I don't have that luxury (or, honestly, the knowledge). In fact, some of the "protocols" / APIs I must support allow requesting large amounts of data, so I don't want a single request from one client to block all other clients.
As I mentioned in a previous email (seems like mailman hasn't generated a URL I can link to), my approach would be to offload the individual pieces of sync or heavy work to a thread_pool. The main event loop would still be single-threaded, dispatching those pieces to the thread pool. I've added code demonstrating this approach to https://github.com/anarthal/servertech-chat/issues/44.
When everything is async, including talking to remote RDBMSs, maybe that one big request gets "suspended", allowing other (independent) requests to make progress; but in my case, with a non-async DB layer to talk to, I don't think a single-threaded server is a viable solution.
Would you mind sharing which protocols or RDBMSs with a sync-only interface you are interested in? These could make good C++ libraries in the future. I envision a future where almost every system has an async interface, as already happens in Python or Node. Until that future comes, the thread_pool solution seems the best.

Thanks,
Ruben.
A more advanced app, and what I would like to see personally, is an example and architectural discussion on design patterns involving how best to handle server requests that require more time/resources that may not be appropriate for a single-threaded server (e.g. a database server.) From a high-level perspective, my current thinking on this is:
I guess you mean any protocol that does not have an async library, or a resource-intensive task such as image processing? If there is a specific task or protocol you'd like to see, please do mention it. Even if it does not fit in the chat application architecture, we can always use it as an idea for another Servertech app.
- Handle fast requests in the main single-threaded Boost.Asio event loop (assuming they don't access resources locked by the next bullet point).
- Handle longer requests by delegating to a separate thread pool that triggers a callback when done, without blocking the single-threaded event loop. The threads in the separate thread pool do "traditional locking" via mutexes, read/write locks, etc.
Are there more modern approaches/techniques?
I think this goes the right way, but I'd try to encapsulate it in a class that exposes an async interface. So let's say your troublesome call is `get_db_customer`, which is a third party, sync function that may block for a long time. I'd go for something like:
    class db_client
    {
        // configure this with the number of threads you want
        boost::asio::thread_pool pool_;

    public:
        customer get_customer(boost::asio::yield_context yield)
        {
            // A channel is a structure to communicate between coroutines.
            // concurrent_channel is thread-safe
            boost::asio::experimental::concurrent_channel<void(boost::system::error_code, customer)>
                ch{yield.get_executor(), 1};
            // Run the blocking call in the thread pool and send the result through the channel
            boost::asio::post(pool_, [&ch] {
                ch.async_send(boost::system::error_code(), get_db_customer(), boost::asio::detached);
            });
            // Suspend this coroutine (not the event loop) until the result arrives
            return ch.async_receive(yield);
        }
    };
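For reference, the snippet above needs <boost/asio/thread_pool.hpp>, <boost/asio/post.hpp>, <boost/asio/detached.hpp>, <boost/asio/spawn.hpp> and <boost/asio/experimental/concurrent_channel.hpp>. Calling it from a stackful coroutine would then look roughly like this (just a sketch; the real server wiring is omitted):

    boost::asio::io_context ioc;
    db_client db;

    // Spawn a coroutine; inside it, the blocking DB call looks like a plain async op
    boost::asio::spawn(ioc, [&db](boost::asio::yield_context yield) {
        customer c = db.get_customer(yield);  // suspends the coroutine, not the event loop
        // ... use c to build the response ...
    }, boost::asio::detached);

    ioc.run();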
On Thu, Sep 7, 2023 at 7:31 AM Ruben Perez wrote:
A more advanced app, and what I would like to see personally, is an example and architectural discussion on design patterns involving how best to handle server requests that require more time/resources that may not be appropriate for a single-threaded server (e.g. a database server.) From a high-level perspective, my current thinking on this is:
I guess you mean any protocol that does not have an async library, or a resource-intensive task such as image processing? If there is a specific task or protocol you'd like to see, please do mention it.
The example I have in mind is an actual database server (not client) which uses Boost.Asio to handle SELECT, UPDATE, CREATE TABLE SQL statements, etc. The challenges posed by a database server are quite general and will apply to many applications, and I would guess that any developer considering using Boost.Asio for handling asynchronous I/O operations will have questions along the lines of:

- How should I handle longer requests that are delegated to a thread pool, especially with respect to resource contention? E.g. an UPDATE query to table A should lock table A so that asynchronous SELECT queries on A don't read inconsistent data. A few ideas that come to mind are:
  - Use traditional mutexes and lock guards (https://en.cppreference.com/w/cpp/thread/lock_guard) for each table.
  - Use unique locks (https://en.cppreference.com/w/cpp/thread/unique_lock) on writes, and shared locks (https://en.cppreference.com/w/cpp/thread/shared_lock) for reads.
  - Use a locking manager to avoid deadlocks (if scoped_lock (https://en.cppreference.com/w/cpp/thread/scoped_lock) isn't sufficient).
  - A completely different paradigm: Multiversion Concurrency Control (https://www.postgresql.org/docs/current/mvcc-intro.html), as PostgreSQL uses, in which changes are done to a separate version of the data. (This is quite a general concept and doesn't just apply to databases.) When completed, the version upgrade can be done in the main Boost.Asio thread and thus avoid locking (with caveats).

Are there more modern/sophisticated techniques to deal with these issues? Or, if they are outside the scope of Boost.Asio, it is still worth mentioning them so that we know to deal with them at the right level in the architecture. These are the types of considerations/discussions I am interested in.
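To make the shared/unique lock idea above a bit more concrete, a per-table reader/writer lock might look roughly like this (a sketch only; the table type and its members are made up for illustration):

    #include <cstddef>
    #include <shared_mutex>
    #include <string>
    #include <utility>
    #include <vector>

    struct table
    {
        std::shared_mutex mtx;           // one lock per table
        std::vector<std::string> rows;   // stand-in for the actual storage

        // Many concurrent SELECTs can hold the shared lock at the same time
        std::vector<std::string> select_all()
        {
            std::shared_lock lock(mtx);
            return rows;
        }

        // An UPDATE takes the exclusive lock, excluding readers and other writers
        void update_row(std::size_t i, std::string value)
        {
            std::unique_lock lock(mtx);
            rows.at(i) = std::move(value);
        }
    };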
The example I have in mind is an actual database server (not client) which uses Boost.Asio to handle SELECT, UPDATE, CREATE TABLE sql statements, etc.
The challenges posed by a database server are quite general and will apply to many applications, and I would guess that any developer considering using Boost.Asio for handling asynchronous I/O operations will have questions along the lines of:
- How should I handle longer requests that are delegated to a thread pool, especially with respect to resource contention? E.g. An UPDATE query to table A should lock table A so that asynchronous SELECT queries on A don't read inconsistent data.
Why would asio be a good idea for anything but the IO in a database server?
The example I have in mind is an actual database server (not client) which uses Boost.Asio to handle SELECT, UPDATE, CREATE TABLE sql statements, etc.
I'd say this is out of scope of the BoostServerTech project, at least as I had it in mind. A database server is a quite specialized and complex piece of software. I lack the time and knowledge to write such an example. And I think it would target a relatively small part of our community. That said, if you or any other person feels like doing it, you're welcome to step in.
The challenges posed by a database server are quite general and will apply to many applications, and I would guess that any developer considering using Boost.Asio for handling asynchronous I/O operations will have questions along the lines of:
How should I handle longer requests that are delegated to a thread pool, especially with respect to resource contention? E.g. An UPDATE query to table A should lock table A so that asynchronous SELECT queries on A don't read inconsistent data.
A few ideas that come to mind are:
- Use traditional mutexes and lock guards for each table.
- Use unique locks on writes, and shared locks for reads.
- Use a locking manager to avoid deadlocks (if scoped_lock isn't sufficient).
- A completely different paradigm: Multiversion Concurrency Control, as PostgreSQL uses, in which changes are done to a separate version of the data. (This is quite a general concept and doesn't just apply to databases.) When completed, the version upgrade can be done in the main Boost.Asio thread and thus avoid locking (with caveats).
Are there more modern/sophisticated techniques to deal with these issues? Or if they are outside the scope of Boost.Asio, it is still worth mentioning them so that we know to deal with them at the right level in the architecture.
The issues you'll find here are more related to concurrency control and data structures than to asynchronous I/O. I'd say this is mostly out of Asio's scope.

Regards,
Ruben.
The issues you'll find here are more related to concurrency control and data structures than to asynchronous I/O. I'd say this is mostly out of Asio's scope.
Out of Asio's scope, sure, but not necessarily out of scope for someone looking at ServerTech and wanting to use it as the foundation for their next business application. If the goal of the project is to educate and sell Asio, there are a lot of questions that need to be answered that are technically outside the scope of Asio's own goals.

- Christian
participants (5)

- Christian Mazakas
- Dominique Devienne
- Klemens Morgenstern
- Matt Pulver
- Ruben Perez