I have spent almost an hour debugging the following issue. I have a
"main" orthogonal region with a couple of states and an "error" region
with RUNNING and PAUSED states, like this:
struct PAUSED : public msm::front::interrupt_state<new_job> {};
struct RUNNING : public msm::front::state<> {};
There are transitions PAUSED --new_job--> RUNNING and
RUNNING --failed--> PAUSED.
Then, in one of the main states, I have
struct MAIN_STATE1 : public msm::front::state<> {
    template <class Event, class FSM>
    void on_entry(const Event& e, FSM& fsm) {
        // error_condition stands for whatever check detects the failure
        if (error_condition)
            fsm.process_event(failed{});
    }
};
and in the FSM I have
using initial_state = mpl::vector<...>;  // main region first, error region second
Now, when new_job is processed and MAIN_STATE1::on_entry is called, the
machine is in (MAIN_STATE1, PAUSED). In this state, process_event(failed{})
is ignored and not queued because PAUSED is an interrupt state. Then the
next orthogonal region is processed and the machine goes into (MAIN_STATE1,
RUNNING), which breaks the rest of the logic around which the FSM is
designed...
Adding
using active_state_switch_policy = msm::active_state_switch_after_exit;
to the machine did not help in any way.
Reordering the states in initial_state fixed the problem, but this is
a rather unsatisfactory fix because it doesn't solve the underlying
problem of transition atomicity. The behavior is counter-intuitive and
sits on the border between a bug and a lack of clear documentation.
Specifically:
1. State switch should be atomic. With active_state_switch_after_exit,
I had *expected* that first all exit handlers of the current state
are called, then the state is atomically switched to the new state,
and then all entry handlers are called. What I have *observed* is
that the on_entry handler of a state in one orthogonal region is
called before the other orthogonal region has switched *its* state.
Thus, with orthogonal regions, the machine can be in an "impossible"
intermediate combination of states. Because not all side effects have
run yet, this affects every action or guard that acts on state or
FSM data, as the next point illustrates:
2. process_event eagerly evaluates is_event_handling_blocked_helper
for each orthogonal region in order. Instead, the event should be
queued unconditionally, and only after the transition has completed
should the machine dequeue events and check whether processing is
blocked etc. This, I think, would at least make process_event
appear atomic with respect to the state switch.
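
To make point 2 concrete, here is a toy model in plain C++. It is not
MSM code and does not reproduce MSM's implementation. It only imitates
the behavior described above: two regions stepped in a fixed order, an
interrupt-state check, and a queue for re-entrant events. All names
(ToyMachine, Policy, IDLE, run_scenario, check_scenarios) are made up
for this sketch, and the error condition is assumed to always hold.

```cpp
#include <cassert>
#include <deque>

// Toy model, NOT Boost.MSM: every type and function here is hypothetical.
enum class Event { new_job, failed };
enum class MainState { IDLE, MAIN_STATE1 };   // IDLE is an assumed start state
enum class ErrorState { PAUSED, RUNNING };
enum class Policy { eager_check, queue_first };

struct ToyMachine {
    MainState main_state = MainState::IDLE;
    ErrorState error_state = ErrorState::PAUSED;
    bool main_region_first;   // models the order of states in initial_state
    Policy policy;
    bool dispatching = false;
    std::deque<Event> pending;
    int dropped = 0;

    ToyMachine(bool main_first, Policy p)
        : main_region_first(main_first), policy(p) {}

    // PAUSED is modelled as an interrupt state that only lets new_job through.
    bool blocked(Event e) const {
        return error_state == ErrorState::PAUSED && e != Event::new_job;
    }

    void process_event(Event e) {
        // eager_check: test the interrupt state at call time, before queueing.
        if (policy == Policy::eager_check && blocked(e)) { ++dropped; return; }
        // Re-entrant call from inside a transition: defer the event.
        if (dispatching) { pending.push_back(e); return; }
        dispatching = true;
        dispatch(e);
        while (!pending.empty()) {
            Event next = pending.front();
            pending.pop_front();
            dispatch(next);
        }
        dispatching = false;
    }

    void dispatch(Event e) {
        // The blocked check here runs only between complete transitions.
        if (blocked(e)) { ++dropped; return; }
        if (main_region_first) { step_main(e); step_error(e); }
        else                   { step_error(e); step_main(e); }
    }

    void step_main(Event e) {
        if (main_state == MainState::IDLE && e == Event::new_job) {
            main_state = MainState::MAIN_STATE1;
            // Stand-in for MAIN_STATE1::on_entry detecting an error.
            process_event(Event::failed);
        }
    }

    void step_error(Event e) {
        if (error_state == ErrorState::PAUSED && e == Event::new_job)
            error_state = ErrorState::RUNNING;
        else if (error_state == ErrorState::RUNNING && e == Event::failed)
            error_state = ErrorState::PAUSED;
    }
};

ToyMachine run_scenario(bool main_first, Policy p) {
    ToyMachine m(main_first, p);
    m.process_event(Event::new_job);
    return m;
}

void check_scenarios() {
    // Main region first, eager check: failed is dropped, region ends RUNNING.
    ToyMachine a = run_scenario(true, Policy::eager_check);
    assert(a.error_state == ErrorState::RUNNING && a.dropped == 1);
    // Error region first (the reordering workaround): failed is processed.
    ToyMachine b = run_scenario(false, Policy::eager_check);
    assert(b.error_state == ErrorState::PAUSED && b.dropped == 0);
    // Main region first, queue-first: failed is processed as well.
    ToyMachine c = run_scenario(true, Policy::queue_first);
    assert(c.error_state == ErrorState::PAUSED && c.dropped == 0);
}
```

Calling check_scenarios() from main runs all three cases. In this model
only the eager check with the main region first loses the event; the
reordering workaround and the queue-first policy both end in
(MAIN_STATE1, PAUSED).
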
Which leads me to conclude that MSM is seductively declarative on the
surface, but that it's rather tricky operationally.