> What you are asking for is more or less possible, but what do you plan on doing with this information?
Thanks for asking! After thinking more about the metric, it no longer seems
helpful.
There were two scenarios I wanted to improve:
1. Estimating service capacity. For example, if io_context load goes above
80%, we should add new nodes to avoid latency spikes. But if we measure
io_context load once per second, then 80% could mean 800 ms of continuous
busy time followed by 200 ms of idle. During those busy 800 ms there may be
large latency spikes (up to 800 ms): the io_context is overloaded, but the
metric does not show it.
2. Investigating user-facing latency issues. Knowing that the io_context
was overloaded would be very helpful, but the metric may not show it.
Scenario #2 is partially solved by the metric you suggested before (except
for cases with very short operations that start and end between metric
measurements).
For scenario #1, I have no ideas so far.
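One direction that might help with scenario #1, as a rough sketch only: instead of one utilization sample per second, split time into fixed sub-buckets (say 100 ms) and track busy time per bucket, so the worst bucket exposes bursts that the 1-second average hides. `LoadMeter`, the bucket size, and the plain microsecond timestamps below are all assumptions for illustration; actually feeding it real busy intervals (e.g. by timing handler batches around run_one()) is left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch: accumulate event-loop busy time into fixed
// sub-buckets. Timestamps are plain microsecond counts so the example
// stays deterministic; a real version would use a monotonic clock.
class LoadMeter {
public:
    explicit LoadMeter(std::int64_t bucket_us) : bucket_us_(bucket_us) {}

    // Record that the loop was busy running handlers in [begin_us, end_us),
    // splitting the interval across bucket boundaries as needed.
    void add_busy(std::int64_t begin_us, std::int64_t end_us) {
        while (begin_us < end_us) {
            std::int64_t idx = begin_us / bucket_us_;
            std::int64_t bucket_end = (idx + 1) * bucket_us_;
            std::int64_t chunk = std::min(end_us, bucket_end) - begin_us;
            if (static_cast<std::size_t>(idx) >= busy_.size())
                busy_.resize(static_cast<std::size_t>(idx) + 1, 0);
            busy_[static_cast<std::size_t>(idx)] += chunk;
            begin_us += chunk;
        }
    }

    // Average utilization over [0, horizon_us) -- the coarse metric.
    double average(std::int64_t horizon_us) const {
        std::int64_t total = 0;
        for (std::int64_t b : busy_) total += b;
        return static_cast<double>(total) / static_cast<double>(horizon_us);
    }

    // Worst single-bucket utilization -- catches an 800 ms burst inside a
    // second that average() would report as only 80%.
    double peak() const {
        std::int64_t worst = 0;
        for (std::int64_t b : busy_) worst = std::max(worst, b);
        return static_cast<double>(worst) / static_cast<double>(bucket_us_);
    }

private:
    std::int64_t bucket_us_;
    std::vector<std::int64_t> busy_; // busy microseconds per bucket
};
```

For a second that is fully busy during its first 800 ms, average() over the whole second reports 0.8, while peak() with 100 ms buckets reports 1.0, which is exactly the blind spot described above.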
Regards,
Dmitry
On Mon, Jul 3, 2023 at 15:43, Vinnie Falco wrote:
> On Fri, Jun 30, 2023 at 7:31 AM Dmitry wrote:
> > I would like to measure io_context load before it became overloaded to estimate capacity.
>
> What you are asking for is more or less possible, but what do you plan on doing with this information?