Hi,

I'm working on the GSoC Pipeline project, which is based on the N3534 proposal [1]. Work in progress can be found on GitHub [2].

A simple example of using the pipeline:

    pipeline::from(input_container) | transformation1 | t2 | t3 | output_container;

When this pipeline is run, items are read from `input_container`, processed by the transformations and written to `output_container`. Each segment should be applied in parallel; that is the whole point of the pipeline. However, scheduling this work is not trivial. Quoting the proposal:

"The current pipeline framework uses one thread for each stage of the pipeline. To limit the use of resources, it should be possible to run with fewer threads, using work-stealing or work-sharing techniques."

We are wondering whether we could improve on this. Let's assume a thread pool of a single thread and the example pipeline above. On run(), ideally the thread would read *some* items from `input_container`, apply the transformations to them and push the results to `output_container`; that is, spend some time on each transformation, then yield and pick up the next one. However, doing this implies that the transformations are reentrant. This additional constraint must be considered carefully. This behavior is implemented in the `development` branch [2]; a rough illustration of the idea (not the actual library code) is appended after the references.

Aside from the two solutions above, Vicente J. Botet Escriba suggested the following idea: let each thread work on a single transformation until its queue gets closed (no more input), then move on to the next one. This is easy to implement and scales to as many threads as there are segments. On the other hand, it kills the performance of the "online" use case:

    pipeline::from(read_message_from_socket) | process_message | send_message;

It is not deterministic when the first segment will end, so the pipeline will hang and won't produce any output. Also, in a slightly less strict scenario, where the end of the input is known, the latency could simply be too high.

To summarize, the following options are on the table:

1. Dedicate a thread to each segment (what to do with a fixed-size thread pool?).
2. Constrain the transformations to be reentrant and interleave them on the available threads.
3. Run each transformation from beginning to end, as long as there is input to be processed, before moving on to the next one.

We kindly ask the recipients to share their ideas or opinions, ideally together with a matching use case.

Thanks,
Benedek

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3534.html
[2]: https://github.com/erenon/pipeline/tree/development
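
P.S. To make the yield-based scheduling (option 2) more concrete, here is a rough, single-threaded sketch. It is not the code in the repository; `Segment`, `run_batch` and the fixed batch size are invented for illustration only. A single worker services the segments in round-robin fashion, draining at most a small batch from each stage's queue before moving on; because the same transformation object is re-entered on later passes (and, with a larger pool, possibly concurrently), the transformations have to be reentrant.

    // Illustration only: one worker, round-robin over the segments.
    #include <cstddef>
    #include <deque>
    #include <functional>
    #include <iostream>
    #include <vector>

    struct Segment
    {
        std::deque<int> input;              // items waiting for this stage
        std::function<int(int)> transform;  // user-supplied transformation
    };

    // Process at most `batch_size` items of `seg`, pushing results downstream.
    // Returns true if any work was done.
    bool run_batch(Segment& seg, std::deque<int>& downstream, std::size_t batch_size)
    {
        std::size_t done = 0;
        while (done < batch_size && !seg.input.empty())
        {
            downstream.push_back(seg.transform(seg.input.front()));
            seg.input.pop_front();
            ++done;
        }
        return done > 0;
    }

    int main()
    {
        std::deque<int> output;

        std::vector<Segment> segments(3);
        segments[0].transform = [](int x) { return x + 1; };
        segments[1].transform = [](int x) { return x * 2; };
        segments[2].transform = [](int x) { return x - 3; };

        for (int i = 0; i < 10; ++i) segments[0].input.push_back(i);

        // The single "thread": visit every segment, do a small batch of work
        // on each, and stop once a full pass produces no work at all.
        bool work_left = true;
        while (work_left)
        {
            work_left = false;
            for (std::size_t i = 0; i < segments.size(); ++i)
            {
                std::deque<int>& downstream =
                    (i + 1 < segments.size()) ? segments[i + 1].input : output;
                work_left |= run_batch(segments[i], downstream, /*batch_size=*/4);
            }
        }

        for (int x : output) std::cout << x << ' ';
        std::cout << '\n';
    }

The real scheduler would of course have to cope with blocking input (the socket case above), closed queues and multiple workers; the sketch only shows why a segment may be picked up again before it has consumed all of its input.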