Hi Adam, thank your for your extensive feedback. I'll try to address all questions.
One projects can be a lot of work and you propose to work on 2 at the same time. Are you certain you'd like to do it? Or is it maybe because both projects are somehow related, i.e. one is needed by the other?
There is a relation in the sense that I think that a fast (not necessarily Delaunay) triangulation algorithm and a triangle concept are beneficial for a random point generator that generates points uniformly distributed inside a polygon (and this is probably true for higher dimensions as well). For a large number of points, I found that directly sampling to triangles outperforms a rejections sampler. Other than that, my proposal to combine two projects is based on a conservative estimate of the total amount of work. It took roughly a day to implement an initial, imperfect variant of s-hull (for the programming competency test) and the triangulation algorithms commonly described in literature seem to be intuitively accessible enough to be implemented properly in a reasonable timeframe. The random samplers are, in my opinion, a simpler problem than the triangulation, so they can be completed afterward but still within the GSoC. I am certain that I can produce a good quality implementation with proper tests and documentation of the proposed features in the GSoC work period with reasonable effort.
What coordinate systems are you planning to support? Cartesian, geographic, etc. In case you decided to implement algorithm for one coordinate system, would the same algortihm theoretically work with different ones or would different algorithm be required?
Regarding the random generators: Generally, uniform sampling can be done in geographic coordinates using an equal-area transformation from the cylinder to the sphere, and I have considered that in the proposal (last paragraph before milestones). However, I imagine that in GIS applications one is usually interested in very small subsets of the sphere so rejection_sampling would be prohibitively expensive. In the programming competency test, I implemented a more efficient sampler for convex polygons by decomposing it into triangles and sample uniformly on triangles. I suppose the most obvious approaches would be to do the same thing for polygons on spheres (I've found a straightforward way to do that in literature: https://www.graphics.cornell.edu/pubs/1995/Arv95c.pdf ) or to at least chose some box or polygon in cartesian coordinates so that its image (under an equal area transformation) covers the interesting area of the sphere and not too much more, take uniform samples in it, map them to the sphere, and reject them if necessary. I'd be thankful for input from potential users of randomly generated points in geographic (or other) coordinate systems if there is more to consider. More on randomly generated points below. Regarding triangulation: In my literature research for the proposal, I concentrated on Delaunay triangulations in cartesian coordinates. I've looked into the spherical case and there is a lot of literature for that as well, for example, this paper on adapting Fortune's algorithm to spheres ( http://www.e-lc.org/tmp/Xiaoyu__Zheng_2011_12_05_14_35_11.pdf ). I've seen very intuitive approaches to that using complex hull algorithms and projections to the sphere. I don't know how well and with how much effort the algorithm I've linked in my proposal translates to triangulations on spheres but I will think about it. Do you know how much interest there would be in Delaunay triangulations on spheres and what the usual use case would be? If someone enters 3 points does he want a decomposition of the sphere into 4 triangles or would one usually be interested in the triangulation of a small subset of the sphere?
2. Triangulation
So triangulation is basically a triangle mesh represented by arrays of vertices and indexes, correct? If that's the case then maybe we need more general mesh concept?
I suppose it could be generalized. Edges and vertices would remain the same thing but faces could be arbitrary rings with more than three adjacent faces.
If the result of triangulation is a mesh then why do you need triangle concept as 3-point ring? Wouldn't a view/proxy used internally in your algorithm and storing iterators/pointers to vertices in the mesh be enough?
I imagine triangles are useful primitives in general and it would be beneficial to have the concept to specialize some existing algorithms for them (like area) so that their implementation can make use of the additional constraint of the triangle being a ring with only three points. I agree that internally in the algorithms for Delaunay triangulation, triangles would be represented by views with iterators to vertices but I thought that would not make a difference because those views and triangles that actually store vertices could both fulfill the same concept.
You mentioned Boost.Graph. Would your algorithm depend on this library? If possible we should avoid dependencies. If it doesn't depend then we could still have concepts compatible with other libraries so the same types can be easily adapted to work with both.
I would not depend on it because the algorithm linked in the proposal (as far as I can see) does not use any complicated graph operations. It would be nice to be compatible with BGL because I've already had cases in which I had to perform more complicated graph operations on meshes.
left-of-predicate in Boost.Geometry is called "side strategy" and is implemented for all supported coordinate systems.
I completely overlooked that one. I see that it is designed to optionally supports higher precision coordinate types. I will look further into that.
in-circle-predicate however is not. It also looks like more complex operation since you have to find a center of the circle and then check if a point is inside it. In Boost.Geometry this would be divided into centroid() and distance(). So my guess is that the in- circle-predicate should be divided into more basic building blocks. However I'm not sure if the center of the circle is the same as barycenter in non-cartesian coordinate systems. Furthermore Boost.Geometry doesn't have non-cartesian centroid yet. Even if you do not support geographic triangulation right now we should do theoretical work and prepare the interface for future extensions, i.e. know what basic coordinate-system-dependant operations will be needed in the future.
I implemented it in that two-stage way in my implementation of Delaunay Triangulation for the programming competency test. However, I found that this seems to be usually done in a single step (see https://www.cs.cmu.edu/~quake/robust.html ), I guess for performance and stability reasons. I noticed that with larger problem sizes (1'000'000 points and more) my Delaunay triangulation would not terminate for some runs (I generate the problems randomly) and some experiments (I did not look too deeply into it) suggest that my two-staged, non-robust in-circle-predicate is to blame here. I would trust the literature there and implement a simple approximate predicate that works for non-degenerate problems and an adaptive predicate that will be a little slower but still work for degenerate cases.
In case you wanted to work on centroid too, geographic centroid could be yet another whole project for GSoC, we were actually considering to add it. ;)
I am afraid that would lead to overstretching my proposal. ;)
3. Randomization
Just an idea about the generalization of the interface. I can imagine that someone may want to generate points in the interior of a polygon and someone else may want points in the interior or on the boundary, etc. With current design you'd have to add more function names for various conbinations, correct?
I would not create special generators for all conceivable cases. But I think handling some special cases is very benefecial. The one I produced for the programming competency test for project 4 uses a trivial fan triangulation and uses that to produce random samples directly rather than sampling from the bounding box and rejecting based on the within-predicate. I did not commit it but I did write a simple rejection sampler as well and found it to be much slower (I think about 2-3 times), even though the convex polygons are comparatively simple geometries to check within for and they cover a large portion of their bounding box so only a good portion of samples would not be rejected.
AFAIU the rejection_generator would be a special case of a generator capable to take spatial predicates defining returned points. In this case returning points that are let's say within(A) && ! covered_by(B) (so in the interior of A and in the exterior of B). So another possibility is to have a generator taking general spatial predicates. Furthermore, the above (points in interior/boundary/exterior) could also be achieved with this approach.
I totally agree with the suggestion that arbitrary spatial predicates should be supported for rejection sampling.
Note also that performing randomization for arbitrary geometries would be supported with this interface, also e.g. N random points within multipoint.
I agree but the probability of a uniformly distributed point lying inside a multipoint (except for a very coarse space) is vanishingly small. I think that point generation should at least be specialized by the type of domain (discrete, lines, volumes).
It is also possible to decide to go this way but do not implement all elements right now. In such case we could think about the interface and be prepared for extensions in the future. E.g. for now support only something like this:
namespace bg = boost::geometry; bg::rng::generate(bg::rng::within(A), output);
I like that interface. I would maybe drop the rng, because it's conceivable to have generators in the future for deterministic point generation as well, for example to create some regular lattice or nodes on edges. Kind regards, Tinko Bartels -- Sent from: http://boost.2283326.n4.nabble.com/Boost-Dev-f2600599.html