[fiber] on schedulers and wakeups
Hi Oliver,
I'm looking at the boost.fiber scheduling customization options. I have a few comments.
My first concern is that scheduler is a global (per thread) property. This makes sense, but it limits its usability in libraries, unless the library completely owns a thread. It would be nice if schedulers were schedulable entities themselves, so that they could be nested (more at the end).
Also the description of the scheduler interface does not specify any thread safety requirements. I assume that at least awakened must be thread safe as the scheduling might be caused by a signal coming from another thread. Any requirements should be specified properly. This leads to two additional points.
First of all, there should be a way to retrieve the scheduler associated with a fiber: I haven't looked at the source, but the association must exist internally, otherwise cross thread scheduling wouldn't work.
Second, there does not seem to be a way to allow signaled fibers to run in the context of the signaling thread. This seems an important optimization. I understand this is an explicit decision, as currently it is not possible to portably migrate fibers, but the option should be left to the user if they know it is safe in their setup. Possibly require 'awakened(fiber*x)' to call x->get_scheduler()->awakened(x) if it does not support running the fiber. My preference is to add a queryable boolean flag to each schedulable entity that states whether it is allowed to move from its scheduler (this makes a difference when you have multiple nested schedulers). Among other things this would also prevent work stealing. The flag would by default be set to prevent migration of course.
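The 'awakened' delegation could look like this, as a sketch (can_run_here() and the ready queue are hypothetical names, just to illustrate the idea):

    void my_algorithm::awakened(fiber * x) {
        if (!can_run_here(x))                 // x is pinned elsewhere, or migration is unsafe
            x->get_scheduler()->awakened(x);  // hand the fiber back to its owning scheduler
        else
            ready.push_back(x);               // safe to run it in this scheduler
    }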
Now, what do nested schedulers give you? In addition to composability, you can have the equivalent of an asio strand without explicit mutual exclusion. Let's say you have a bunch of fibers that all access the same resource; you do not care where they run, as long as they never run concurrently. By binding all of them to the same scheduler, the guarantee is implicit.
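For example (a sketch; nested_scheduler and the scheduler-taking fiber constructor are hypothetical, they do not exist in Boost.Fiber today):

    nested_scheduler strand;             // fibers bound here never run concurrently
    fiber f1( strand, access_resource);
    fiber f2( strand, access_resource);
    // f1 and f2 may run on whichever thread currently runs the strand,
    // but the strand resumes at most one of them at a time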
It should be possible to implement nested schedulers by having each fiber set the current scheduler pointer if it differs and store the previous one. On a scheduling event, if the current scheduler runs out of runnables, the signal/yield code would restore the previous scheduler and try again. Of course handling priorities and non-strict FIFO scheduling might be more complex. Note that the multi-scheduler logic needs to be implemented by Boost.Fiber itself.
Finally, it might be possible (but I haven't tested this) to make fibers migratable on platforms where it normally isn't safe by allocating an actual thread for each one. The thread would be stopped as soon as it has been created, but a thread control block plus TLS entries would have been allocated. The existing context switch would need to be augmented with the ability to swap the thread context pointer. HTH, -- gpd
+1 to everything! Regards Hartmut --------------- http://boost-spirit.com http://stellar.cct.lsu.edu
2015-09-08 13:34 GMT+02:00 Giovanni Deretta
My first concern is that scheduler is a global (per thread) property. This makes sense, but it limits its usability in libraries, unless the library completely owns a thread.
the scheduler is only entered if a function from boost.fiber is called; code outside of fiber code is unaffected. How does it limit the usability? It would be nice if schedulers were schedulable
entities themselves, so that they could be nested (more at the end).
in the current design each thread has one scheduler (the scheduler is hidden). a schedulable entity is a fiber context (the fiber's stack); each boost::fibers::fiber is attached to one fiber context (detaching a fiber means decoupling it from the context). the scheduler maintains several queues (waiting, ready, ...) containing fiber contexts depending on their state (waiting, ready, ...). the scheduler_algorithm (customizable) defines how ready contexts are sorted (round-robin, priority-queue, ...) for resumption. if a fiber context becomes ready, the scheduler calls scheduler_algorithm::awakened() and passes the fiber context to the algorithm. in order to resume the next context, the scheduler calls sched_algorithm::pick_next(). if a scheduler were schedulable, it would have to be a context - a scheduler would then schedule itself. I'm uncertain how your schedulable schedulers would fit in this pattern (probably not very well).
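the customization point looks roughly like this (signatures paraphrased from the description above, not verbatim from the library):

    struct sched_algorithm {
        virtual ~sched_algorithm() {}

        // called by the scheduler when a fiber context becomes ready
        virtual void awakened( fiber_context * f) = 0;

        // called by the scheduler to select the context to resume next
        virtual fiber_context * pick_next() = 0;
    };

a round-robin algorithm simply pushes the context in awakened() and pops one in pick_next().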
Also the description of the scheduler interface does not specify any thread safety requirements. I assume that at least awakened must be thread safe as the scheduling might be caused by a signal coming from another thread. Any requirements should be specified properly. This leads to two additional points.
in the context of migrating fibers between threads, yes. boost.fiber entered the review claiming that fiber migration is not supported, thus the thread-safety requirements are not mentioned.
First of all, there should be a way to retrieve the scheduler associated with a fiber:
yes, each fiber has a pointer to its (thread-local) scheduler
Second, there does not seem to be a way to allow signaled fibers to run in the context of the signaling thread. This seems an important optimization. I understand this is an explicit decision, as currently it is not possible to portably migrate fibers, but the option should be left to the user if they know it is safe in their setup. Possibly require 'awakened(fiber*x)' to call x->get_scheduler()->awakened(x) if it does not support running the fiber.
signaling a fiber is done via an atomic (owned by the fiber) - if thread t1 signals fiber f2 running in thread t2, the scheduler of t2 encounters the changed state of f2 and resumes f2. sched_algorithm::awakened() does not run/resume the fiber; instead it tells the scheduler that the passed fiber is ready to run. awakened(fiber*x) -> x->get_scheduler()->awakened(x) would create a loop, because awakened(fiber*x) is called from the scheduler that owns the fiber context and x->get_scheduler() is a pointer to this scheduler. fiber::set_ready() is used to signal the fiber - the function can be called from code running in the same thread or in another thread.
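roughly like this (illustrative names, not the actual members):

    struct fiber_context {
        std::atomic< bool > ready{ false };  // owned by the fiber
    };

    // may be called from the same thread or from another thread
    void set_ready( fiber_context * f) {
        f->ready.store( true, std::memory_order_release);
    }

    // the scheduler owning f later observes ready == true and passes f
    // to sched_algorithm::awakened()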
My preference is to add a queryable boolean flag to each schedulable entity that states whether it is allowed to move from its scheduler (this makes a difference when you have multiple nested schedulers). Among other things this would also prevent work stealing. The flag would by default be set to prevent migration of course.
one of the previous versions of boost.fiber had the property thread_affinity for this purpose
Now, what do nested schedulers give you? In addition to composability, you can have the equivalent of an asio strand without explicit mutual exclusion. Let's say you have a bunch of fibers that all access the same resource; you do not care where they run, as long as they never run concurrently. By binding all of them to the same scheduler, the guarantee is implicit.
wouldn't composable schedulers need to be explicitly called by user code (start/stop of scheduling)?
On Tue, Sep 8, 2015 at 7:21 PM, Oliver Kowalke wrote: 2015-09-08 13:34 GMT+02:00 Giovanni Deretta:
My first concern is that scheduler is a global (per thread) property. This makes sense, but it limits its usability in libraries, unless the library completely owns a thread.
the scheduler is only entered if a function from boost.fiber is called; code outside of fiber code is unaffected. How does it limit the usability?
Two different libraries can't use boost.fiber if they require a custom scheduler or they need to run their own threads. Basically the scheduler is a scarce resource and limits composability. Of course code that doesn't know about fibers isn't affected.
It would be nice if schedulers were schedulable entities themselves, so that they could be nested (more at the end).
in the current design each thread has one scheduler (the scheduler is hidden). a schedulable entity is a fiber context (the fiber's stack); each boost::fibers::fiber is attached to one fiber context (detaching a fiber means decoupling it from the context).
how do you detach a fiber from a context? What does it mean? Did you mean detach a fiber+context from a scheduler?
the scheduler maintains several queues (waiting, ready, ...) containing fiber contexts depending on their state (waiting, ready, ...).
why are the wait queues part of the scheduler itself? Shouldn't they be a property of the waitable primitive?
the scheduler_algorithm (customizable) defines how ready contexts are sorted (round-robin, priority-queue, ...) for resumption. if a fiber context becomes ready, the scheduler calls scheduler_algorithm::awakened() and passes the fiber context to the algorithm. in order to resume the next context, the scheduler calls sched_algorithm::pick_next().
ok, that matches my understanding.
if a scheduler were schedulable, it would have to be a context - a scheduler would then schedule itself. I'm uncertain how your schedulable schedulers would fit in this pattern (probably not very well).
One option is scheduling schedulable entities. Something like:

    struct schedulable {
        schedulable * next, * previous;
        void (*run)( schedulable * self);  // or use a virtual
    };

but there are of course advantages in knowing that each schedulable is a context, as you can swapcontext to it.
Also the description of the scheduler interface does not specify any thread safety requirements. I assume that at least awakened must be thread safe as the scheduling might be caused by a signal coming from another thread. Any requirements should be specified properly. This leads to two additional points.
in the context of migrating fibers between threads, yes. boost.fiber entered the review claiming that fiber migration is not supported, thus the thread-safety requirements are not mentioned.
yes, but a running fiber bound to thread 1 can still signal a waiting fiber bound to thread 2 (via a condition variable for example), so there must be some thread-safe mechanism for transitioning a waiting fiber to runnable, i.e. a way to add it to the ready queue. I thought that 'awakened' was supposed to do that.
Ok, I went and looked at the code. I see that on wakeup, the context is simply marked as ready. The main scheduler loop, after each scheduling event, goes through the wait list and moves ready contexts to the main thread. It sleeps for a specific interval if there are no runnable contexts.
This has multiple issues. First of all, the periodic sleeping is extremely bad. If 'run' can't do anything right now, it shouldn't sleep a fixed interval; it should instead block waiting for an external signal (in some applications, busy waiting could be an option). Second, scanning the wait list for ready contexts at every rescheduling doesn't really scale; for such a simple scheduler, scheduling should be strictly O(1).
Timers have similar issues.
This is how I would expect a scheduling algorithm to work:
    context {
        atomic<...> ...;                  // ready/signaled state; remainder truncated in the archive
        scheduler * preferred_scheduler;  // referenced by signal() below
        ...
    };

    context *   this_context;    // currently running
    scheduler * this_scheduler;  // current scheduler

    scheduler {
        intrusive_slist<context>        ready;
        intrusive_slist<context>        ready_next;
        atomic_intrusive_slist<context> remote_ready;
        event_count                     ec;
        atomic<bool>                    done;  // can be used to interrupt the scheduler

        // this is the idle task, itself a fiber. Should run in the original thread stack
        void idle() {
            while (true) {
                auto n = get_next();
                while (n == 0) {
                    int ticket = ec.prepare_wait();
                    if ((n = get_next())) { ec.retire_wait(); break; }
                    ec.wait();
                }
                ready_next.push_back(this_context);
                switch_context(n, this_context);
            }
        }

        // make a context ready
        void signal(context * waiter) {
            if (waiter->preferred_scheduler != this_scheduler) {
                // cross thread
                waiter->preferred_scheduler->remote_add(waiter);
            } else {
                // no preference or local
                this_scheduler->ready_next.push_back(waiter);
                // alternatively push this_context to back of queue and yield to waiter immediately
            }
        }
        ...
    };
2015-09-09 0:19 GMT+02:00 Giovanni Piero Deretta: On Tue, Sep 8, 2015 at 7:21 PM, Oliver Kowalke wrote: 2015-09-08 13:34 GMT+02:00 Giovanni Deretta:
My first concern is that scheduler is a global (per thread) property. This makes sense, but it limits its usability in libraries, unless the library completely owns a thread.
the scheduler is only entered if a function from boost.fiber is called; code outside of fiber code is unaffected. How does it limit the usability?
Two different libraries can't use boost.fiber if they require a custom scheduler or they need to run their own threads. Basically the scheduler is a scarce resource and limits composability.
OK, good point! One of the design decisions for boost.fiber was that the scheduler itself is not visible (the user code does not instantiate it); the library should work similar to std::thread (os-scheduler not visible ...). of course, if this decision is given up, the user creates (on stack or heap) a scheduler, and if a new fiber is created the code must specify to which scheduler the fiber belongs:

    my_scheduler ms;
    fiber f( ms, fn, arg1, arg2);
    f.join();

in the context of migrating fibers between threads: in the current design only single fibers are moved between schedulers running in different threads - you suggest that the scheduler itself is moved between threads?!
It would be nice if schedulers were schedulable entities themselves, so that they could be nested (more at the end).
in the current design each thread has one scheduler (the scheduler is hidden). a schedulable entity is a fiber context (the fiber's stack); each boost::fibers::fiber is attached to one fiber context (detaching a fiber means decoupling it from the context).
how do you detach a fiber from a context? What does it mean? Did you mean detach a fiber+context from a scheduler?
boost::fibers::fiber has a member pointer to boost::fibers::context; boost::fibers::scheduler manages boost::fibers::context*. detaching means boost::fibers::fiber releases its pointer to boost::fibers::context. the scheduler still manages (scheduling/lifetime) the detached context (boost::fibers::context is a control structure residing on the top of the fiber-stack).
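roughly (an illustrative sketch, not the actual code):

    struct fiber {
        context * ctx_;  // control structure on top of the fiber-stack

        void detach() {
            ctx_ = nullptr;  // release the pointer; the scheduler keeps
                             // managing (scheduling/lifetime) the context
        }
    };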
Oliver Kowalke writes: 2015-09-09 0:19 GMT+02:00 Giovanni Piero Deretta: On Tue, Sep 8, 2015 at 7:21 PM, Oliver Kowalke wrote: 2015-09-08 13:34 GMT+02:00 Giovanni Deretta:
My first concern is that scheduler is a global (per thread) property. This makes sense, but it limits its usability in libraries, unless the library completely owns a thread.
the scheduler is only entered if a function from boost.fiber is called; code outside of fiber code is unaffected. How does it limit the usability?
Two different libraries can't use boost.fiber if they require a custom scheduler or they need to run their own threads. Basically the scheduler is a scarce resource and limits composability.
OK, good point! One of the design decisions for boost.fiber was that the scheduler itself is not visible (the user code does not instantiate it). the library should work similar to std::thread (os-scheduler not visible ...).
The point is that the scheduler is not completely invisible as it can be replaced. As the default scheduler is very basic, replacing it becomes pretty much a requirement.
of course if this decision is given up, the user creates (on stack or heap) a scheduler and if a new fiber is created the code must specify to which scheduler the fiber belongs to.
my_scheduler ms;
fiber f( ms, fn, arg1, arg2);
Specifying an explicit scheduler would be a nice addition, but if not specified, it should default to the current scheduler for the thread.
in the context of migrating fibers between threads:
- in the current design only single fibers are moved between schedulers running in different threads - you suggest that the scheduler itself is moved between threads?!
As I suggested elsethread, a child scheduler would appear to the parent scheduler as another fiber, so if that fiber is migrated, the whole scheduler is.
It would be nice if schedulers were schedulable entities themselves, so that they could be nested (more at the end).
in the current design each thread has one scheduler (the scheduler is hidden). a schedulable entity is a fiber context (the fiber's stack); each boost::fibers::fiber is attached to one fiber context (detaching a fiber means decoupling it from the context).
how do you detach a fiber from a context? What does it mean? Did you mean detach a fiber+context from a scheduler?
boost::fibers::fiber has a member pointer to boost::fibers::context; boost::fibers::scheduler manages boost::fibers::context*. detaching means boost::fibers::fiber releases its pointer to boost::fibers::context. the scheduler still manages (scheduling/lifetime) the detached context (boost::fibers::context is a control structure residing on the top of the fiber-stack).
Ok, I think I understand; I was thinking about fiber the abstract concept, while you of course meant the boost::fibers::fiber class, which can be detached like an std::thread. Although now I can't see how detaching is relevant to the nested scheduler discussion... -- gpd
Oliver Kowalke writes: 2015-09-09 0:19 GMT+02:00 Giovanni Piero Deretta: On Tue, Sep 8, 2015 at 7:21 PM, Oliver Kowalke wrote: 2015-09-08 13:34 GMT+02:00 Giovanni Deretta:
My first concern is that scheduler is a global (per thread) property. This makes sense, but it limits its usability in libraries, unless the library completely owns a thread.
the scheduler is only entered if a function from boost.fiber is called; code outside of fiber code is unaffected. How does it limit the usability?
Two different libraries can't use boost.fiber if they require a custom scheduler or they need to run their own threads. Basically the scheduler is a scarce resource and limits composability.
OK, good point! One of the design decisions for boost.fiber was that the scheduler itself is not visible (the user code does not instantiate it). the library should work similar to std::thread (os-scheduler not visible ...).
The point is that the scheduler is not completely invisible as it can be replaced. As the default scheduler is very basic, replacing it becomes pretty much a requirement.
of course if this decision is given up, the user creates (on stack or heap) a scheduler and if a new fiber is created the code must specify to which scheduler the fiber belongs to.
my_scheduler ms;
fiber f( ms, fn, arg1, arg2);
Specifying an explicit scheduler would be a nice addition, but if not specified, it should default to the current scheduler for the thread.
Hmmm, I would prefer if the scheduler and fiber interfaces were left orthogonal. A fiber is just a callable, its API shouldn't have a notion of where it is run. I'd like to see something like this instead:
// asynchronous execution
T fn(Arg...) {...}
fiber_executor exec;
future<T> f = executor_traits<fiber_executor>::async_execute(exec, fn); // call truncated in the archive; completion assumed, following N4406's executor_traits interface
in the context of migrating fibers between threads:
- in the current design only single fibers are moved between schedulers running in different threads - you suggest that the scheduler itself is moved between threads?!
As I suggested elsethread, a child scheduler would appear to the parent scheduler as another fiber, so if that fiber is migrated, the whole scheduler is.
It would be nice if schedulers were schedulable entities themselves, so that they could be nested (more at the end).
in the current design each thread has one scheduler (the scheduler is hidden). a schedulable entity is a fiber context (the fiber's stack); each boost::fibers::fiber is attached to one fiber context (detaching a fiber means decoupling it from the context).
how do you detach a fiber from a context? What does it mean? Did you mean detach a fiber+context from a scheduler?
boost::fibers::fiber has a member pointer to boost::fibers::context; boost::fibers::scheduler manages boost::fibers::context*. detaching means boost::fibers::fiber releases its pointer to boost::fibers::context. the scheduler still manages (scheduling/lifetime) the detached context (boost::fibers::context is a control structure residing on the top of the fiber-stack).
Ok, I think I understand; I was thinking about fiber the abstract concept, while you of course meant the boost::fibers::fiber class, which can be detached like an std::thread.
Although now I can't see how detaching is relevant to the nested scheduler discussion...
Regards Hartmut --------------- http://boost-spirit.com http://stellar.cct.lsu.edu
Hartmut Kaiser writes: Giovanni Piero Deretta: Oliver Kowalke writes:
of course if this decision is given up, the user creates (on stack or heap) a scheduler and if a new fiber is created the code must specify to which scheduler the fiber belongs to.
my_scheduler ms;
fiber f( ms, fn, arg1, arg2);
Specifying an explicit scheduler would be a nice addition, but if not specified, it should default to the current scheduler for the thread.
Hmmm, I would prefer if the scheduler and fiber interfaces were left orthogonal. A fiber is just a callable,
A fiber itself is not just a callable though; it is logically a sequence of callables (each callable is the next continuation at wait and yield points). At each reschedule point, the fiber needs to know which executor to use for the next continuation. The same issue happens with a plain 'executed' function object that needs to execute another continuation: it either needs to know the executor explicitly, or there must be a default (possibly thread-specific) one.

There are obviously advantages to using the n4406 executor interface as a scheduler, especially as something like that is likely to be standardized. On the other hand, a specialized fiber scheduler has its advantages: for example, it never needs memory allocation to schedule a continuation, as it can always use intrusive hooks to concatenate context objects which are guaranteed to persist; also, knowing that the task you are yielding to is a fiber has some advantages, as you can directly swap-context to it.

It should always be possible to adapt an n4406-like executor to work as a fiber scheduler. Boost.Fiber might provide a generic adapter. -- gpd
Hartmut Kaiser writes: Giovanni Piero Deretta: Oliver Kowalke writes:
of course if this decision is given up, the user creates (on stack or heap) a scheduler and if a new fiber is created the code must specify to which scheduler the fiber belongs to.
my_scheduler ms;
fiber f( ms, fn, arg1, arg2);
Specifying an explicit scheduler would be a nice addition, but if not specified, it should default to the current scheduler for the thread.
Hmmm, I would prefer if the scheduler and fiber interfaces were left orthogonal. A fiber is just a callable,
A fiber itself is not just a callable though; it is logically a sequence of callables (each callable is the next continuation at wait and yield points). At each reschedule point, the fiber needs to know which executor to use for the next continuation.
If a fiber is not a callable, then exposing it using a std::thread-compatible interface does not really make sense.
The same issue happens with a plain 'executed' function object that needs to execute another continuation: it either needs to know the executor explicitly, or there must be a default (possibly thread-specific) one.
I don't see a reason not to expose the executor used to schedule the fiber. Something like auto exec = this_fiber::get_executor()
There are obviously advantages to using n4406 executor interface as a scheduler, especially as something like that is likely to be standardized.
N4406 is not a done deal. I expect it will be a compromise between N4406 and N4414. However I like N4406 much better as it gives a nice abstract interface (and we have implemented it ;-); all of HPX's higher-level parallelization constructs are implemented on top of this).
On the other hand a specialized fiber scheduler has its advantages, for example it never needs memory allocation to schedule a continuation as it can always use intrusive hooks to concatenate context objects which are guaranteed to persist; also knowing that the task you are yielding to is a fiber has some advantages as you can directly swap-context to it.
It should always be possible to adapt an n4406-like executor to work as a fiber scheduler. Boost.Fiber might provide a generic adapter.
Nod, all it needs is a specialization of the executor_traits<>. Regards Hartmut --------------- http://boost-spirit.com http://stellar.cct.lsu.edu
Hartmut Kaiser writes: Giovanni Piero Deretta: Hartmut Kaiser writes: Giovanni Piero Deretta: Oliver Kowalke writes:
of course if this decision is given up, the user creates (on stack or heap) a scheduler and if a new fiber is created the code must specify to which scheduler the fiber belongs to.
my_scheduler ms;
fiber f( ms, fn, arg1, arg2);
Specifying an explicit scheduler would be a nice addition, but if not specified, it should default to the current scheduler for the thread.
Hmmm, I would prefer if the scheduler and fiber interfaces were left orthogonal. A fiber is just a callable,
A fiber itself is not just a callable though; it is logically a sequence of callables (each callable is the next continuation at wait and yield points). At each reschedule point, the fiber needs to know which executor to use for the next continuation.
If a fiber is not a callable, then exposing it using a std::thread-compatible interface does not really make sense.
I can't parse that. Certainly I wouldn't expect boost::fibers::fiber to be a callable; then again, neither is std::thread.
The same issue happens with a plain 'executed' function object that needs to execute another continuation: it either needs to know the executor explicitly, or there must be a default (possibly thread-specific) one.
I don't see a reason not to expose the executor used to schedule the fiber. Something like
auto exec = this_fiber::get_executor()
oh, sure, I'm completely in favor of that; but I also think that there should be a notion of an optional preferred fiber executor[1] (or scheduler), which will be used in preference to the thread-local executor for plain yields and wakeups (for example, a fiber::condition_variable::signal needs to know where to schedule the woken-up fibers, and the thread-local executor might not be the correct place). [1] currently in Boost.Fiber this exists, and it is not optional. -- gpd
A fiber itself is not just a callable though; it is logically a sequence of callables (each callable is the next continuation at wait and yield points). At each reschedule point, the fiber needs to know which executor to use for the next continuation.
If a fiber is not a callable, then exposing it using a std::thread-compatible interface does not really make sense.
I can't parse that. Certainly I wouldn't expect boost::fibers::fiber to be a callable; then again, neither is std::thread.
I didn't make myself clear, sorry. What I meant is that a fiber 'represents a callable', not 'is a callable'. So does std::thread.
The same issue happens with a plain 'executed' function object that needs to execute another continuation: it either needs to know the executor explicitly, or there must be a default (possibly thread-specific) one.
I don't see a reason not to expose the executor used to schedule the fiber. Something like
auto exec = this_fiber::get_executor()
oh, sure, I'm completely in favor of that; but I also think that there should be a notion of an optional preferred fiber executor[1] (or scheduler), which will be used in preference to the thread-local executor for plain yields and wakeups (for example, a fiber::condition_variable::signal needs to know where to schedule the woken-up fibers, and the thread-local executor might not be the correct place).
[1] currently in Boost.Fiber this exists, and it is not optional.
We're in agreement here. Regards Hartmut --------------- http://boost-spirit.com http://stellar.cct.lsu.edu
2015-09-09 0:19 GMT+02:00 Giovanni Piero Deretta
the scheduler maintains several queues (waiting, ready, ...) containing fiber contexts depending on their state (waiting, ready, ...).
why are the wait queues part of the scheduler itself? Shouldn't they be a property of the waitable primitive?
the scheduler has a wait queue, as does each waitable primitive. the wait queue inside the scheduler is especially required for detached fibers at termination (~scheduler()).
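for illustration, a waitable primitive carrying its own wait queue might look like this (a sketch reusing the types from the pseudocode upthread; suspend_current() is a hypothetical helper that picks the next ready fiber and switches to it):

    struct condition_variable {
        intrusive_slist<context> waiters;  // the primitive's own wait queue

        void wait() {
            waiters.push_back(this_context);  // park the current fiber here
            suspend_current();                // switch to the next ready fiber
        }

        void notify_one() {
            if (context * w = waiters.pop_front())
                this_scheduler->signal(w);    // hand the woken fiber to its scheduler
        }
    };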
if a scheduler were schedulable, it would have to be a context - a scheduler would then schedule itself. I'm uncertain how your schedulable schedulers would fit in this pattern (probably not very well).
One option is scheduling schedulable entities. Something like
    struct schedulable {
        schedulable * next, * previous;
        void (*run)( schedulable * self);  // or use a virtual
    };
but there are of course advantages in knowing that each schedulable is a context, as you can swapcontext to it.
the scheduler simply knows which fiber can be resumed next. if fiber f1 has to suspend (maybe it joins another fiber), it calls the scheduler to select the next fiber f2 and transfers execution control to fiber f2 (context switch). the process of selecting the next fiber and calling the context switch runs inside fiber f1; that means that we have a call chain between fibers. I'm uncertain what schedulable::run() should execute if the schedulable is a scheduler and a context switch is called (which doesn't make sense to me).
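that is, roughly (an illustrative sketch, not the actual code):

    // runs entirely on f1's stack: f1 selects f2 and switches to it directly
    void suspend_current() {
        context * f2 = pick_next();        // via sched_algorithm::pick_next()
        switch_context(f2, this_context);  // direct switch: f1 -> f2
        // when some other fiber eventually switches back, f1 resumes here
    }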
Ok, I went and looked at the code. I see that on wakeup, the context is simply marked as ready. The main scheduler loop, after each scheduling event, goes through the wait list and moves ready contexts to the main thread.
not main thread - it moves fibers that have been signaled as ready from the waiting queue to the ready queue (that is what sched_algorithm::awakened() is used for). sched_algorithm::pick_next() selects a fiber from its internal ready-queue.
This is how I would expect a scheduling algorithm to work:
<snip> I have to think about how your suggestion could be applied to the current code
Oliver Kowalke writes: 2015-09-09 0:19 GMT+02:00 Giovanni Piero Deretta:
the scheduler maintains several queues (waiting, ready, ...) containing fiber contexts depending on their state (waiting, ready, ...).
why are the wait queues part of the scheduler itself? Shouldn't they be a property of the waitable primitive?
the scheduler has a wait queue, as does each waitable primitive. the wait queue inside the scheduler is especially required for detached fibers at termination (~scheduler()).
if a scheduler were schedulable, it would have to be a context - a scheduler would then schedule itself. I'm uncertain how your schedulable schedulers would fit in this pattern (probably not very well).
One option is scheduling schedulable entities. [...]
but there are of course advantages in knowing that each schedulable is a context, as you can swapcontext to it.
the scheduler simply knows which fiber can be resumed next. if fiber f1 has to suspend (maybe it joins another fiber), it calls the scheduler to select the next fiber f2 and transfers execution control to fiber f2 (context switch). the process of selecting the next fiber and calling the context switch runs inside fiber f1; that means that we have a call chain between fibers.
I'm uncertain what schedulable::run() should execute if the schedulable is a scheduler and a context switch is called (which doesn't make sense to me).
That's up to the scheduler. But the base class is not really necessary. A nested scheduler's schedule loop can be run from another fiber (let's call it the scheduler fiber). To do that, it would save the original (parent) thread-local scheduler pointer and replace it with the nested scheduler. When control reaches back to the scheduler fiber, it would restore the original scheduler and yield. If there are no parent schedulers, the scheduler fiber is simply the underlying thread's main fiber; the parent pointer is null and the scheduler wouldn't be replaced. All I ask is to remove the assumption that the scheduler is fixed. If you think about it, the system already has two schedulers: the kernel scheduler and the boost.fiber scheduler. The boost.fiber scheduler runs inside a kernel-level fiber (a.k.a. a thread).
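A sketch of that scheduler fiber (illustrative; get_next() and done are as in the pseudocode upthread):

    void scheduler_fiber(scheduler * nested) {
        scheduler * parent = this_scheduler;  // save the thread-local scheduler pointer
        this_scheduler = nested;              // install the nested scheduler
        while (!nested->done) {
            context * n = nested->get_next();
            if (n == 0) break;                // out of runnables
            switch_context(n, this_context);
        }
        this_scheduler = parent;              // restore the original scheduler and yield;
                                              // with no parent, this is the thread's main fiber
    }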
Ok, I went and looked at the code. I see that on wakeup, the context is simply marked as ready. The main scheduler loop, after each scheduling event, goes through the wait list and moves ready contexts to the main thread.
not main thread - it moves fibers that have been signaled as ready from the waiting queue to the ready queue (that is what sched_algorithm::awakened() is used for). sched_algorithm::pick_next() selects a fiber from its internal ready-queue.
sorry it was a typo, s/main thread/ready queue/. -- gpd
2015-09-09 0:19 GMT+02:00 Giovanni Piero Deretta
    context *   this_context;    // currently running
    scheduler * this_scheduler;  // current scheduler

    scheduler {
        intrusive_slist<context>        ready;
        intrusive_slist<context>        ready_next;
        atomic_intrusive_slist<context> remote_ready;
        event_count                     ec;
        atomic<bool>                    done;  // can be used to interrupt the scheduler

        // this is the idle task, itself a fiber. Should run in the original thread stack
        void idle() {
            while (true) {
                auto n = get_next();
                while (n == 0) {
                    int ticket = ec.prepare_wait();
                    if ((n = get_next())) { ec.retire_wait(); break; }
                    ec.wait();
                }
                ready_next.push_back(this_context);
                switch_context(n, this_context);
            }
        }
<snip> the scheduler with its idle() function would introduce extra context switches, because idle() enqueues the scheduler to the ready_next-queue before it switches to the next fiber. it seems that the scheduler (fiber) is resumed after each (worker)-fiber. boost.fiber tries to prevent this - it switches only between (worker)-fibers; the scheduler functions only as a store for the next fibers.
2015-09-10 5:35 GMT+02:00 Oliver Kowalke
the scheduler with its idle() function would introduce extra context switches, because idle() enqueues the scheduler to the ready_next-queue before it switches to the next fiber. it seems that the scheduler (fiber) is resumed after each (worker)-fiber. boost.fiber tries to prevent this - it switches only between (worker)-fibers; the scheduler functions only as a store for the next fibers.
the scheduler from boost.fiber has a similar 'idle()' function - if a fiber has to suspend (yield, wait, ...) it directly calls this scheduler function (without a context switch).
Oliver Kowalke writes:
the scheduler with its idle() function would introduce extra context switches, because idle() enqueues the scheduler to the ready_next-queue before it switches to the next fiber. it seems that the scheduler (fiber) is resumed after each (worker)-fiber. boost.fiber tries to prevent this - it switches only between (worker)-fibers; the scheduler functions only as a store for the next fibers.
The idea is that yield and friends would switch to the idle fiber only when they reach the end of the ready queue. To be clear: there are two ready queues, ready and next_ready. Yield pops from ready and pushes into next_ready. ready is never empty, as the last element in the list is always the idle fiber. When the idle fiber is run, it moves the content of next_ready into ready and pushes itself at the end of ready.

So yes, if you have N ready tasks yielding, there are N+1 context switches (but not 2*N) per iteration. The cost of the additional idle task is amortized over N tasks. On the other hand, yield becomes simply a push+pop+switch (see the sketch below); all corner cases (empty queue, handling off-thread signals, pumping the timer queue) are handled inside the idle task.

Of course, a completely equivalent setup is to have an idle function and call it at the end of the ready queue. I do not claim that the idle fiber solution is superior; although I find it elegant, it was not really the core of my criticism of the current scheduler. -- gpd
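For concreteness, the yield described above in terms of the earlier pseudocode (an illustrative sketch):

    void yield() {
        context * n = ready.pop_front();     // never empty: the idle fiber sits at the back
        ready_next.push_back(this_context);  // the current fiber runs again next epoch
        switch_context(n, this_context);     // push + pop + switch, nothing else
    }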
2015-09-11 10:48 GMT+02:00 Giovanni Deretta
So yes, if you have N ready tasks yielding, there are N+1 context switches (but not 2*N) per iteration. The cost of the additional idle task is amortized over N tasks.
but the idle-fiber pushes itself to ready_next before switch_context() is called - that would mean that the idle-fiber is called every second time
2015-09-11 11:02 GMT+02:00 Oliver Kowalke: 2015-09-11 10:48 GMT+02:00 Giovanni Deretta:
So yes, if you have N ready tasks yielding, there are N+1 context switches (but not 2*N) per iteration. The cost of the additional idle task is amortized over N tasks.
but the idle-fiber pushes itself to ready_next before switch_context() is called - that would mean that the idle-fiber is called every second time
OK - you are right, please ignore my posting
2015-09-11 10:48 GMT+02:00 Giovanni Deretta
The idea is that yield and friends would switch to the idle fiber only when they reach the end of the ready queue.
the idle-fiber only executes the function idle() - why not simply execute idle() instead of switch_context(n, this_context); at the end of scheduler::yield()? what are the reasons that idle() must run on an extra fiber-stack?
Oliver Kowalke writes: 2015-09-11 10:48 GMT+02:00 Giovanni Deretta:
The idea is that yield and friends would switch to the idle fiber only when they reach the end of the ready queue.
the idle-fiber only executes the function idle() - why not simply execute idle() instead of switch_context(n, this_context); at the end of scheduler::yield()? what are the reasons that idle() must run on an extra fiber-stack?
As I said elsewhere, there is no fundamental reason and, although I consider the idle fiber a better solution, I would be perfectly fine with a scheduler that doesn't have such a thing and simply calls the idle function when appropriate.

Note that unconditionally calling idle at the end of yield is not necessarily ideal though, as the idea is that it might execute more expensive operations that you want to do only at the end of an 'epoch' (i.e. when all ready fibers have executed once). Adding a conditional test in yield opens up the possibility of a misprediction. At that point the cost of an additional fiber switch is minimal.

The fact is, often there is an existing idle fiber anyway: if you spawn a dedicated thread to run a scheduler, the thread has an implicit fiber which would go otherwise unused; if you are running on top of another scheduler (for example boost::asio::io_service or one of the proposed executors), the idle fiber is simply the context of the underlying scheduler callback; in this case, after control reaches back to the idle fiber, it is appropriate to return control to the underlying io_service and reschedule another callback (with asio::post for example; see the sketch below).

Nested scheduler support would fall out naturally and almost transparently on top of this model, together with the ability to temporarily override the current thread-local scheduler. A nested scheduler idle fiber would appear as just another fiber in the parent scheduler loop.

To be clear, the major concerns I have with the current scheduler design are:
- lack of proper cross-scheduler wakeup.
- unconditional sleep when the scheduler is empty.
- the handling of waiting tasks (including the clock wait queue).
All three issues are tightly interwoven. Idle tasks and nested schedulers would be nice to have for me, but not deal breakers. -- gpd
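As an illustration of the io_service case mentioned above (a sketch; run_one_epoch() is a hypothetical scheduler entry point, while io_service::post is real Boost.Asio):

    #include <boost/asio/io_service.hpp>

    void pump(boost::asio::io_service & io, scheduler & s) {
        s.run_one_epoch();                       // plays the role of the idle fiber's body
        if (!s.done)
            io.post([&io, &s]{ pump(io, s); });  // return control to asio, then reschedule
    }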
2015-09-11 15:14 GMT+02:00 Giovanni Deretta
As I said elsewhere, there is no fundamental reason and, although I consider the idle fiber a better solution, I would be perfectly fine with a scheduler that doesn't have such a thing and simply calls the idle function when appropriate.
I'll try to figure out what the benefits/disadvantages would be.
Note that unconditionally calling idle at the end of yield is not necessarily ideal though, as the idea is that it might execute more expensive operations that you want to do only at the end of an 'epoch' (i.e. when all ready fibers have executed once). Adding a conditional test in yield opens up the possibility of a misprediction. At that point the cost of an additional fiber switch is minimal.
agreed - a terminated fiber would release its resources earlier
The fact is, often there is an existing idle fiber anyway: if you spawn a dedicated thread to run a scheduler, the thread has an implicit fiber which would go otherwise unused; if you are running on top of another scheduler (for example boost::asio::io_service or one of the proposed executors), the idle fiber is simply the context of the underlying scheduler callback; in this case, after control reaches back to the idle fiber, it is appropriate to return control to the underlying io_service and reschedule another callback (with asio::post for example);
OK - I'll take this into account for
- lack of proper cross-scheduler wakeup.
- unconditional sleep when the scheduler is empty.
- the handling of waiting tasks (including the clock wait queue).
your concerns are already addressed in another branch
Idle tasks, nested schedulers would all be nice to have for me, but not deal breakers.
I'll try a version with an idle fiber per scheduler
Oliver Kowalke writes: 2015-09-11 15:14 GMT+02:00 Giovanni Deretta:
- lack of proper cross-scheduler wakeup.
- unconditional sleep when the scheduler is empty.
- the handling of waiting tasks (including the clock wait queue).
your concerns are already addressed in another branch
Is this the 'signal' branch on github? If possible, I would like to hold my final review until this work is, if not completed, at least in a state where the design going forward is clear. -- gpd
2015-09-11 15:56 GMT+02:00 Giovanni Deretta
Oliver Kowalke writes: 2015-09-11 15:14 GMT+02:00 Giovanni Deretta:
- lack of proper cross-scheduler wakeup.
- unconditional sleep when the scheduler is empty.
- the handling of waiting tasks (including the clock wait queue).
your concerns are already addressed in another branch
Is this the 'signal' branch on github? If possible, I would like to hold my final review until this work is, if not completed, at least in a state where the design going forward is clear.
yes, branch 'signal' (but not completed yet)
    // make a context ready
    void signal(context * waiter) {
        if (waiter->preferred_scheduler != this_scheduler) {
            // cross thread
            waiter->preferred_scheduler->remote_add(waiter);
        } else {
            // no preference or local
            this_scheduler->ready_next.push_back(waiter);
            // alternatively push this_context to back of queue and yield to waiter immediately
        }
    }
pushing a ready fiber to the scheduler from a waitable (mutex, condition-variable, ...) might be a better solution than signaling via an atomic. I'll branch and test your suggestion. thx
participants (4)
- Giovanni Deretta
- Giovanni Piero Deretta
- Hartmut Kaiser
- Oliver Kowalke