Dear Craig,
--- In Boost-Users@y... #757, Craig Hicks
wrote: A graph can represent associations between two nodes by means of the edge (or edges) which connect those two nodes. The nodes are, needless to say, distinct objects. It's up to the programmer to design a node data type which can capture the necessary information to distinguish between nodes. Obviously what I'm saying is obvious, so I think it's my turn to say "I'm missing your point", and ask for clarification.
...
I don't know who's missing what, but I could have been more clear before and will attempt to be so now. I'll also provide a somewhat realistic example, which is what you originally asked for (in message #594).
Consider an application used by a school for keeping track of students, courses, teachers, rooms, books, meals, buses and other objects that are part of the school.
The application uses a Student class:
class Student { int id; string name; string address; Date birthdate; ... };
The Student class has a lot of data members, and there are a lot of (member and free) functions for handling students. Indeed, there is an extensive graphical user interface that allows the user to add, modify, delete and find students maintained by the application.
The application also uses a Course class:
class Course { int number; string name; string description; string instructor; string room; string meeting_times; ... };
Like the Student class, the Course class has a lot of data members, and there are a lot of (member and free) functions for handling course. And there is an extensive graphical user interface that allows the user to add, modify, delete and find courses maintained by the application.
What I want you to imagine is that there are two independent and well-developed classes that do not yet have an association between them. One might even imagine -- as is actually realistic -- that the two classes are parts of different applications written by different programmers.
Of course, requirements change and software evolves. At some point, there is a requirement to associate students and courses: to keep track of which students are enrolled in which classes. Obviously, this is a many-to-many association.
As I understand it, in order to use the BGL -- indeed to use any graph library with which I am familiar -- one would have to derive both the Student and Course classes from some common base class:
class SchoolObject { ... };
class Student : public SchoolObject { ... };
class Course : public SchoolObject { ... };
Then one could create a graph for which each node is a SchoolObject.
I was thinking more along the following lines: // reusable code for RTTI based multiclass association class NodeBase { /* common node interface */ }; template <class T> class NodeDerived : public NodeBase; { T *pT; }; // application code for multiclass association typedef NodeDerived<Student> NodeStudent; typedef NodeDerived<Course> NodeCourse; As you can see, I am thinking of a design where all the code that can gets factored into a seperate module for class-generic RTTI-based multiclass association. First and foremost, because this is RTTI, there is no compile time type checking of the derived types. But there is run-time checking, which can also pushed into the reusable class-generic RTTI-based code.
While this approach might work, it is unnatural. A Student is not a Course, and there really is no such concept as a "SchoolObject" in the application domain. Indeed, such an approach would allow links (graph edges) between students and students, or between courses and courses -- neither of which makes sense. Thus there would need to be source code to explicitly forbid such illogical links.
There would need to be such source code, but the hard work could be pushed into the reusable class-generic RTTI-based code.
There are two main problems with using the BGL (or any general-purpose graph library with which I am familiar).
First, a mathematically idealized graph generally does not distinguish between nodes of different types: Any node can be linked to any other node via an edge. The same can be said of computer implementations of graphs. In brief, a graph is typically an association between a type (class) and itself.
Yes, mathematically, one can consider bipartite graphs in which nodes can be classified into different classes. But the graph libraries I have seen do not implement this idea as a graph involving two different (C++) classes of nodes.
Second, the graph libraries I am familiar with emphasize abstract nodes linked by edges. The nodes and edges may each have data associated with them (the BGL does this using property maps), but the data associated with nodes and edges is considered secondary. In effect, the idea of a graph makes the association primary and the participating class(es) of nodes (objects) secondary.
But in typical applications -- as I try to indicate in my example -- the emphasis is the object. The classes of objects (students and courses) are primary and are predetermined. One does not have the flexibility to recode these classes to conform to, say, the BGL, the Graph Template Library (GTL) or some other graph library. The association between the classes is a relatively minor detail, and associations can be added, modified or removed from an application perhaps more frequently than the classes of objects themselves.
I am looking for a library that allows me to take the preexisting Student and Course classes and to add an Enrollment association between the two with but minimal modification of existing source code.
Are you saying it would be useful to have a bipartite graph (or N-partite graph) with each part holding separate types? Is that how Tilevich's association library worked? I can't recall.
Yes, as I note above, I am saying it would be useful to have a bipartite graph, i.e., a graph with nodes in two different classes.
Yes, this is how Tilevich does it in the Sept. 2001 C/C++ Users Journal article. He gives an example involving employees and projects, which is conceptually the same as my example involving students and courses:
ManyMany
Projects;
No question, Tilevich's code is superior in the case of two fixed classes, for those applications whose requirements are completely satisfied by said code.
I dare say the BGL (or any C++ based graph) could handle nodes with different data types via inheritance (all nodes contain a pointer to base data class).
Yes, I think it could. But as I note above, it is unnatural to derive fundamentally different classes from some artificial base class.
Moreover, in so doing, you allow for unnatural edges linking, say, a student to a student rather than to a course.
Regardless, the issue of classes is one of syntax. My statement was meant semantically, without regard to syntax.
1. Do you agree with the statement "Any implementation of an many-many assoication container must have an underlying graph lurking within (semantically speaking)"?
Yes, I agree, if one allows that the underlying graph may be a bipartite graph that has nodes in two different classes.
2. Do you agree with the statement "The BGL is capable of being the underlying mechanism for implementing a many-many class (e.g., one with the I/F defined by Tilevich's interface) ?
I'm fairly certain about 1, and think 2 is correct, but am not certain especially since I don't have Tilevich's class at hand right now.
Yes, it is capable of being the underlying mechanism.
But it is also true that it is possible to write an object-oriented program with inheritance, polymorphism and virtual functions in plain C. It is possible, but it is not very natural or convenient.
The point of the STL -- and I believe of Boost extensions of the STL -- is to make common but involved tasks *easy* by providing templates that are efficient and easy to use.
I will second that "*easy*" statement !!! Providing a "configuration", an interface that specializes a container to make it usable for a particular purpose counts as making it easy. That is definitely a requirement if BGL were to be used to to implement an association class.
The 3rd question is, if it can be implemented with BGL, is it worth it? You'd be able to reuse the exchangeability of data structures. You'd have a host of algorithms available. Anything new you developed might be reused for graphs in general. You'd have more time to concentrate on the I/F.
This question gets at my point about graphs emphasizing the association, i.e., the edges, whereas most C++ application classes that participate in associations are emphasized more than than the associations between them.
To put it a different way, much of the functionality that a graph library provides -- beyond being a mechanism for implementing the association -- is not desired.
If the user-programmer can't see it (hidden by interface), does it exist?
Would one do a depth-first or breadth-first search of the graph students and courses? What would that mean and what use would it be?
Would one want to find connected components, i.e., clusters of students and courses that form a kind of school-within-a-school? One could do that, but it would not be high on the list of requirements for a school management application.
What is desired is being able to access "neighbors": all the courses a particular student is enrolled in or all the students enrolled in a particular course. But oned oes *not* need an entire graph library to achieve this low-level functionality. One just needs an association library.
And that is what I seek: An efficient, flexible, easy-to-use association library that allows me to add, modify and delete different kinds of associations (1-to-1, 1-to-many, many-to-many, unidirectional, bidirectional, etc.) between classes A, B, C, etc., in which some associations are between a class and itself (hence a general graph) but other associations are between distinct classes.
Mostly I want to agree with you as far as the following is true: You have an application all of whose requirements, (present and envisioned future), are satisfied by Telivich's library. Otherwise, you might want to think about the cost/benefits of an RTTI version (with templates) versus a template only version.
Craig, thank you for your good questions and for giving me the opportunity to think out loud.
Thanks, Rob Zako
P.S. So far we have been discussing only *binary* associations. If we were to discuss ternary and higher-order associations, which link 3 or more objects together, then the idea of an association as a graph would break down.
Good point. Craig Hicks