Re: [boost] [process] Formal Review starts today, 27 October

12 Nov 2016


      On Thu, 10 Nov 2016 10:05:00 +0100
Klemens Morgenstern <klemens.morgenstern@gmx.net> wrote:
...
...
What I noticed in the source is that some of the properties require
special logic because they are manipulating other properties
instead of the "executor". For example, the `shell` property is
manipulating the path and arg properties but nothing else.
Earlier I mentioned on working out a loose way of specifying a
`property`. I think such a specification would not consider
properties that only manipulate other properties to be a property
(since it could do that separately from the `child` constructor).  
Well that depends on how you model it, I consider the question if a
[...]  
 [...]  
 [...]  
process is launched through the shell or not to be a property of this;
it must be set prior to launching and it affects how it runs. There
is no need to model this exactly after the excutor; that's is
actually not possible.
Yes, whether a process is being launched in a shell must be set before
starting the process, but this is irrelevant. Is `shell` a property of
the `child` or a property of the path / arguments pair? In other words,
does the internal logic of the `child` constructor actually need to know
whether its launching a shell or just some other executable? The
implementation would suggest that `child` does not need to know.
...
Also it does not modify a property, but rather modifies an
initializer, which is build from several properties.
So do I understand you correctly, that you would want this syntax?
bp::system(cmd="foo");
bp::system(shell="foo");
If so - why? Why should'n you be able to launch the same process on
it's own or through the shell? It just makes more sense to me to
enable the shell as an additional feature.
You can launch the process on its own or through the shell with that
syntax. Its just that `shell` is modifying the path + args properties
given to the `child`.
...
...
[...]  
 [...]  
 [...]
...
Yes, and I disagreed with your analysis. I recall the primary
objection was poor performance - specifically that space for each
type of possible property was needed. This is not true, the binding
approach (at least how I saw it), would return a new callable type
after each binding. In other words, the binding was conceptually
calling `push_back` on a fusion like tuple. Moves + ref-qualifiers
meant that dynamic memory would not copied in many cases. So
`launcher("foo").group(...).stdout(...)` would return the
equivalent of `tuple<launcher, group, stdout>` while moving
`std::string` memory as needed.
After thinking about it, all that I (and I think Bjorn and Niall)
were trying to accomplish is separating the path and arguments from
the remainder of the properties. If the path and arguments were in
a single property, `shell` and `cmd` could return that property
instead of requiring custom logic internally.  
Well that is one reason, though I also assumed, that you wouldn't 
construct a tuple like that.
The primary reason I really dislike the idea, is that every
conceivable property would need to be in a member defninition of this
tuple, and that is horrible.
The current implementation isn't without boilerplate - traits and
processing hooks for fusion::for_each are needed for the properties
too. Although I do agree that launching the process inside of the
`child` constructor is preferable since the destructor should terminate
if not detached or joined. However, I think some of the properties need
some work.
...
...
...
How is it broken? Look at the code snippet, the second function call
takes a variadic argument list like the `child` constructor. The
string arguments are the only thing "bound" to an object. So
appending another property would be done like:
launch("ls, "-la", "/")(stdout(my_pipe1), stdin(my_pipe2));  
Oh ok, sry, misunderstood that one. But then why is that any better? 
Just fyi: since you can pass environment variable to the
[...]  
 [...]  
 [...]  
 [...]  
property-list, this doesn't even have to be the same executable.
...
That style is simply separating string arguments into a separate
variadic function call. Or another way of looking at it - there is a
single `property` for specifying the path and args that has a fixed
positional location.  
What about the cmd style, i.e. bp::system("gcc --version"); ?
[...]  
 [...]
...
Why would it immediately terminate the process? Or do you mean a
non-joined `child` should terminate the process like
`std::thread`?  
Exactly. Also this would produce a deadlock if waiting in ~child and
foo throws.
bopstream os;
bp::child c("foo", std_in < os);
os << foo() << endl; //boom, deadlock.
...
Admittedly, I did forget about that case - its probably best to
follow `std::thread` closely and terminate the current process if
not `join`ed or `detach`ed. I don't there is anything wrong with a
function returning a `child`, provided the behavior of ignoring the
`child` is well documented and defined. However, there is little
difference to the code snippet above and:
bp::child c1(bp::exec("ls", "-la", "/"), bp::stdout(my_pipe1));
   bp::child c2(bp::shell("ls", "-la", "/"), bp::stdout(my_pipe2));
   bp::child c3("ls", {"-la, "/"}, bp::stdout(my_pipe3));
Where the constructor in `c3` forwards to the constructor in `c1`;
`c2` uses the `bp::exec` object returned by `bp::shell`; `c1` uses
the `bp::exec` property to set the path and args of the executor.
Ok, so I consider exe/args, cmd and shell different properties, but
in the end they all get put into one initializer. 
(https://github.com/klemens-morgenstern/boost-process/blob/develop/include/bo...)
So if anything this could be made public, though at the time the
values get put into the exe-args initializrs they are already parsed
and formatted. I really don't see the benefit.
The benefit is it removes the special internal logic for a few specific
properties and makes what the code is _actually_ doing more clear. I've
also been pushing for a more formal specification of `properties`
because some of the IO variants are doing too much in `child`. The
constructor for `std::thread` does not throw once the thread is
created, but what about some of the IO properties for `process::child`?
Several of the `on_exec`/`on_success` hooks are doing various calls that
can throw - does `child` block internally in the constructor until the
other process completes? Does it detach? Terminate? And why is so much
being done inside of the `child` constructor when the `child` should
only need to know the pipe handles for communication. I'm mainly
thinking of the code for piping to future or mutable buffer.
...
[...]  
 [...]
...
Requiring a `{}` hardly seems detrimental to me, but I guess people
have different tastes. I hope you at least consider positional
arguments for the path and args. For the remainder - I agree the
variadic constructor is probably preferable since its behavior can
be closer to std::thread.  
The only reason you can't use that is that the parameter list is 
forwarded internally, hence the type-deduction fails. That syntax is 
already valid:
Yes I know that the initializer list deduction would fail. Which is why
I suggested the process arguments be positional.
...
bp::system("foo", std::vector<std::string>{"--arg", "bar"});
or
auto args = {"--arg", "bar"};
bp::system("foo", args);
Though I just checked my code, and that would work for 
`std::initializer_list<char*>` but not for
`std::initializer_list<const char*>`, so I'll add a bugfix and test
for the latter.
Lee

Re: [boost] [process] Formal Review starts today, 27 October

Lee Clagett