Re: [ANN] Format Lite
"Pavel Vozenilek"
"Jonathan Turkanis" wrote:
// specialization for debug formatting
template<> void serialize(....) {
I believe this specialization is illegal. You could write
void serialize(formatted_text_oarchive&ar, const unsigned)
but I can't say whether this will work (I seem to remember Robert saying somewhere in the documentation that he was relying on the fact that non-templates are better matches than templates.)
Yes, this (nontemplated overloaded function) will work. I just wrote down an idea in haste.
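A minimal sketch of the overload-resolution rule being relied on here: when a function template and a non-template function match a call equally well, the non-template is preferred. The types below (Widget, other_archive, and an empty formatted_text_oarchive stand-in) and the std::string return type are assumptions made only so the result can be observed; they are not Boost.Serialization's actual interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for archive types (not the real Boost classes).
struct formatted_text_oarchive {};
struct other_archive {};

struct Widget {
    // Generic member template, used for any archive type.
    template<class Archive>
    std::string serialize(Archive&, const unsigned) { return "generic"; }

    // Non-template overload for one specific archive type.
    // For a formatted_text_oarchive argument both candidates match exactly,
    // and overload resolution prefers the non-template function.
    std::string serialize(formatted_text_oarchive&, const unsigned) {
        return "formatted";
    }
};

// Usage: the archive's static type selects the overload.
inline void overload_example() {
    Widget w;
    formatted_text_oarchive fa;
    other_archive oa;
    assert(w.serialize(fa, 0u) == "formatted");
    assert(w.serialize(oa, 0u) == "generic");
}
```

No explicit specialization is needed with this arrangement, which sidesteps the legality question raised above.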
I have two separate ideas for formatting libraries:
- one lightweight, which I posted, for input and output of ranges and tuple-like objects
- one for output only, which allows much more customization; I see this as an inverse of Spirit
Your suggestion looks similar to the second (except that you want to support input), so let me sketch my idea (which has improved since I sketched it here last time), and then ask some questions about yours.
Formatted input would be optional (and maybe not practical).
I do not understand what the advantages of the "lightweight" approach are (except compile time).
The aim of Format Lite was to present a facility which would be a candidate for standardization, filling a need that was expressed at the most recent library working group meeting. Towards this end,
1. I tried to keep the implementation as small as possible
2. I introduced just one new function template -- punctuate() -- in the public interface
3. I used the same syntax currently recommended for formatting user-defined types (overloading iostreams operators >> and <<)
4. I introduced no new class templates in the public interface
5. I introduced no new concepts. (Strictly speaking, I use Single Pass Range and Extensible Range, but these could be replaced by the standard library container concepts.)
If people like the interface (and so far there's not much evidence), I think Format Lite would stand a reasonable chance of making it into TR2. To get Serialization standardized would require a much bigger push, IMO, although I'd like to see it happen -- perhaps with additional language support.
Is switching between the lightweight and heavyweight solutions easy?
Yes, since formatting with Format Lite would still be the default when a Style
provides no specific formatting options for a given range or tuple-like type. In
the following
vector< string > v = list_of( ... );
ostream out;
styled_ostream
[snip]
The advantages of this approach are:
- an arbitrary amount of contextual information, such as indentation and numbering, can be stored in the styled_stream and accessed directly by formatters
- arbitrary user-defined types can be formatted non-intrusively
- flexible formatting is built-in for sequences and tuple-like types (and user-defined types can choose to present themselves as sequences or tuple-like types to take advantage of this feature.)
I feel a formatting archive derived from a boost::archive stream could be made with the same features.
Good.
The advantages I see:
- the whole infrastructure of Boost.Serialization is available and ready, and it handles all situations like cycles. format_lite could concentrate on just formatting.
This is a big plus, obviously. (However, I remember Robert saying he preferred to keep formatting and serialization separate.)
Formatting, as I see it, would just use Serialization as infrastructure. There would be no impact on Serialization from Formatting.
You mean no changes to the library code -- we would just define additional archive concepts and types?
- formatting directives can be "inherited" from "higher level of data" to "lower levels". Newly added data would not need formatting of its own by default. Change on higher level would propagate itself "down".
Can you explain how this works?
I mean a trick with RAII (it's not really a Serialization feature):
void serialize(formatting_archive& ar, const unsigned)
{
    // change currently used formatting style
    formatting_setter raii(ar, "... formatting directives...");
    ar & data; // <<== new formatting style will be used
    // destructor of raii object will revert formatting back
}
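The RAII trick can be sketched as follows. The names formatting_archive and formatting_setter come from the snippet above, but everything inside them is an assumption for illustration: here the "style" is a plain string held by the archive, and operator& simply records which style was active when a value was written.

```cpp
#include <cassert>
#include <string>

// Hypothetical archive: holds the current style and accumulates output.
class formatting_archive {
public:
    std::string style = "default";
    std::string out;

    // Record the value together with the style active at this moment.
    formatting_archive& operator&(int value) {
        out += "[" + style + ":" + std::to_string(value) + "]";
        return *this;
    }
};

// Hypothetical guard: installs a style, restores the previous one on exit.
class formatting_setter {
public:
    formatting_setter(formatting_archive& ar, const std::string& new_style)
        : ar_(ar), saved_(ar.style) { ar_.style = new_style; }
    ~formatting_setter() { ar_.style = saved_; }
    formatting_setter(const formatting_setter&) = delete;
    formatting_setter& operator=(const formatting_setter&) = delete;
private:
    formatting_archive& ar_;
    std::string saved_;
};

struct Inner {
    int data;
    void serialize(formatting_archive& ar, const unsigned) {
        formatting_setter raii(ar, "hex");  // style changes here
        ar & data;                          // written with the new style
    }                                       // destructor restores old style
};

// Usage: values written before and after Inner keep the outer style.
inline std::string raii_example() {
    formatting_archive ar;
    Inner inner{7};
    ar & 1;
    inner.serialize(ar, 0u);
    ar & 2;
    return ar.out;
}
```

Because the guard restores the saved style in its destructor, the change also unwinds correctly if writing a nested object throws.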
I see. I think this is a characteristic of all schemes where stylistic info is stored in the stream or stream-like object.
- indentation for pretty printing could be handled (semi)automatically by formatting archive.
Would this involve modifying the archive interface? I'd like a formatter for a given type (or an overloaded serialize function) to be able to access these properties directly.
Yes, a formatting archive could have any additional interface.
I see. But no changes to existing archive types.
- multiple formatting styles could be provided for any class.
It would be one formatting style for each archive type for which serialize has been specialized, correct? Would this allow styles for various types to be mixed freely?
Yes, the serialize() function would be specialized.
What I meant to ask can be illustrated by an example. Suppose you have two classes, Duck and Goose. Duck and Goose each have two associated formatting styles. The choice of styles should be independent, so we would need four archive types to handle the various combinations. Now my question is: would Duck need four specializations of serialize, or just two? In my system, formatting options for Duck and Goose could be added to a Style independently; I want to know if overloading serialize can handle this.
I see three ways to customize output:
1. The formatting archive has its own configuration for how to output data. This keeps the overall style coherent and should be enough for most uses.
2. A specialization of serialize() could change the formatting style. This may be used to fine-tune the output here or there.
3. Specializations of serialize() may generate different outputs altogether.
E.g. if you have archives:
class high_level_formatting_archive {...}
class all_details_formatting_archive { ... }
you can omit details in
void serialize(high_level_formatting_archive& ar, const unsigned);
and use them all in
void serialize(all_details_formatting_archive& ar, const unsigned);
I think this (option 3) is not possible now with format_lite.
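A minimal sketch of option 3, assuming the two archive classes named above do nothing more than collect text in a string. The Account type and its fields are hypothetical, chosen only to show one overload omitting details and the other emitting them.

```cpp
#include <cassert>
#include <string>

// Hypothetical archives: each one just accumulates text.
struct high_level_formatting_archive { std::string out; };
struct all_details_formatting_archive { std::string out; };

// Hypothetical user type with one serialize overload per archive type.
struct Account {
    std::string owner;
    int id;

    // Summary view: the id is omitted.
    void serialize(high_level_formatting_archive& ar, const unsigned) {
        ar.out += owner;
    }

    // Detailed view: every field is written.
    void serialize(all_details_formatting_archive& ar, const unsigned) {
        ar.out += owner + " (id " + std::to_string(id) + ")";
    }
};

// Usage: the archive's static type selects the output.
inline void per_archive_example() {
    Account acct{"alice", 42};
    high_level_formatting_archive summary;
    all_details_formatting_archive full;
    acct.serialize(summary, 0u);
    acct.serialize(full, 0u);
    assert(summary.out == "alice");
    assert(full.out == "alice (id 42)");
}
```

Each additional archive type needs its own overload in every class that wants a distinct output for it.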
In fact, it only supports 1. That's part of what makes it 'lite'. If a class already provides standard library inserters and extractors (corresponding to 2, above), those provided by Format Lite will not be called. I have a couple of questions:
1. Is your idea flexible enough to allow pairs (a,b) to be formatted with the elements in reverse order?
2. If a type defines a member function serialize, can it be bypassed altogether in favor of an end-user supplied formatting style?
My inclination is to keep formatting separate from serialization, though, because they have different aims. If you believe they can be integrated, without making serialization harder to learn or sacrificing flexibility of formatting options, it should definitely be considered.
I see Serialization as just a vehicle: ready, handy, and almost invisible to Formatting users.
Simple data structures are very easy with Serialization and should be as easy as with format_lite now.
If a user tries to format tricky structures (e.g. pImpl), he would need to dig into the Serialization docs, but at least there will be a chance to make the whole thing work. Serialization goes to great lengths to work under any situation and configuration.
This sounds quite reasonable, provided it is sufficiently flexible. I wonder how much of the Serialization infrastructure is really needed, though. Detecting cycles is definitely not something I want to reimplement; OTOH, I'm not sure it's needed for pretty-printing. I haven't looked at the Serialization implementation, but I did read the Java serialization specification several years ago. IIRC, when an object was encountered for the second time, some sort of placeholder would be inserted in the stream referencing the already serialized data. I assume the Serialization library does something like this. Would this really be desirable for human-readable output? Perhaps the formatting library should concern itself only with cycle-free data structures.
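To make the placeholder idea concrete, here is a rough sketch of what such output could look like for a cyclic structure. It is not taken from Boost.Serialization or the Java specification; the Node type, the tracking_printer class, and the "#n" / "<ref #n>" notation are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical singly linked node; next may point back into the chain.
struct Node {
    std::string name;
    Node* next;
};

// Hypothetical printer that assigns each object an id on first encounter
// and emits a placeholder when the same address is seen again.
class tracking_printer {
public:
    std::string print(const Node* n) {
        std::string out;
        while (n) {
            auto it = ids_.find(n);
            if (it != ids_.end()) {
                // Already printed: refer back instead of looping forever.
                out += "<ref #" + std::to_string(it->second) + ">";
                return out;
            }
            const std::size_t id = ids_.size();
            ids_.emplace(n, id);
            out += "#" + std::to_string(id) + ":" + n->name + " -> ";
            n = n->next;
        }
        out += "end";
        return out;
    }
private:
    std::unordered_map<const Node*, std::size_t> ids_;
};

// Usage: a two-node cycle a -> b -> a.
inline std::string cycle_example() {
    Node a{"a", nullptr};
    Node b{"b", &a};
    a.next = &b;
    tracking_printer printer;
    return printer.print(&a);
}
```

For the two-node cycle this yields "#0:a -> #1:b -> <ref #0>", which shows the trade-off raised above: the output terminates, but the reader has to follow id references by eye.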
/Pavel
Jonathan