[gsoc19] Seeking a mentor for Boost.Geometry project related to triangulation and random sampling
Hi,
I am interested in participating in this year's Google Summer of Code with a contribution to Boost.Geometry. I am looking for mentors and I understand that potential mentors for Boost.Geometry-related projects are Vissarion Fysikopoulos and Adam Wulkiewicz.
My proposal is a project that is based on project 3 and 4 for Boost.Geometry in the Wiki. I propose to add concepts for triangles, triangulation, and point-generators for random sampling, as well as corresponding models and an algorithm for Delaunay-triangulation with parametrizable geometric predicates.
The first draft of my proposal can be found here: https://docs.google.com/document/d/17M44Hpn0KMuECztCbkvTYIeN2EiXbMAUsVni22Kv...
My implementation of the tasks in the competency tests can be found here: https://tinko92.github.io/boost_geometry_GSoC_2019_project_3_4_test/
I'd be thankful for any feedback.
Kind regards, Tinko Bartels
Hi Tinko, Tinko Bartels Via Boost wrote:
Hi,
I am interested in participating in this year's Google Summer of Code with a contribution to Boost.Geometry. I am looking for mentors and I understand that potential mentors for Boost.Geometry-related projects are Vissarion Fysikopoulos and Adam Wulkiewicz.
Thanks for your interest. Yes we're potential mentors for Boost.Geometry.
My proposal is a project that is based on project 3 and 4 for Boost.Geometry in the Wiki. I propose to add concepts for triangles, triangulation, and point-generators for random sampling, as well as corresponding models and an algorithm for Delaunay-triangulation with parametrizable geometric predicates.
The first draft of my proposal can be found here: https://docs.google.com/document/d/17M44Hpn0KMuECztCbkvTYIeN2EiXbMAUsVni22Kv...
My implementation of the tasks in the competency tests can be found here: https://tinko92.github.io/boost_geometry_GSoC_2019_project_3_4_test/
I'd be thankful for any feedback.
The proposal looks good and I have a few questions.
1. General
One project can be a lot of work and you propose to work on two at the same time. Are you certain you'd like to do it? Or is it maybe because both projects are somehow related, i.e. one is needed by the other?
What coordinate systems are you planning to support? Cartesian, geographic, etc.? In case you decided to implement an algorithm for one coordinate system, would the same algorithm theoretically work with different ones or would a different algorithm be required?
2. Triangulation
So triangulation is basically a triangle mesh represented by arrays of vertices and indexes, correct? If that's the case then maybe we need a more general mesh concept?
If the result of triangulation is a mesh, then why do you need a triangle concept as a 3-point ring? Wouldn't a view/proxy used internally in your algorithm and storing iterators/pointers to vertices in the mesh be enough?
You mentioned Boost.Graph. Would your algorithm depend on this library? If possible we should avoid dependencies. If it doesn't depend on it, we could still have concepts compatible with other libraries so the same types can be easily adapted to work with both.
The left-of predicate in Boost.Geometry is called a "side strategy" and is implemented for all supported coordinate systems. The in-circle predicate, however, is not. It also looks like a more complex operation, since you have to find the center of the circle and then check if a point is inside it. In Boost.Geometry this would be divided into centroid() and distance(). So my guess is that the in-circle predicate should be divided into more basic building blocks. However, I'm not sure if the center of the circle is the same as the barycenter in non-cartesian coordinate systems. Furthermore, Boost.Geometry doesn't have a non-cartesian centroid yet. Even if you do not support geographic triangulation right now, we should do the theoretical work and prepare the interface for future extensions, i.e. know what basic coordinate-system-dependent operations will be needed in the future.
In case you wanted to work on centroid too, geographic centroid could be yet another whole project for GSoC, we were actually considering to add it. ;)
3. Randomization
Just an idea about the generalization of the interface. I can imagine that someone may want to generate points in the interior of a polygon and someone else may want points in the interior or on the boundary, etc. With the current design you'd have to add more function names for various combinations, correct? In Boost.Geometry we use the standard topological definitions, e.g. here: https://en.wikipedia.org/wiki/DE-9IM. So one possibility is to have one function and take parameters defining where your random points should be, in the interior, boundary or exterior of a geometry, but...
AFAIU the rejection_generator would be a special case of a generator capable of taking spatial predicates defining the returned points, in this case returning points that are, let's say, within(A) && ! covered_by(B) (so in the interior of A and in the exterior of B). So another possibility is to have a generator taking general spatial predicates. Furthermore, the above (points in interior/boundary/exterior) could also be achieved with this approach. Note also that performing randomization for arbitrary geometries would be supported with this interface, e.g. N random points within a multipoint.
It is also possible to decide to go this way but not implement all elements right now. In such a case we could think about the interface and be prepared for extensions in the future. E.g. for now support only something like this:
namespace bg = boost::geometry; bg::rng::generate(bg::rng::within(A), output);
Just a thought. Adam
Hi Tinko, Adam,
thanks for the interest in the project!
In general I find your proposal good but a bit overambitious.
I am mainly covered by Adam's response but I have a few comments.
1. Apart from the two projects (triangulation and random generators), your proposal includes one more, namely "robust geometric predicates", which could be a GSoC project on its own. This is not described in the proposal, but what I understand is something like this: http://www.cs.cmu.edu/~quake/robust.html (there are many solutions for robust predicates, I just named one), to be done for both the "orientation" (aka side) and "in-circle" predicates and for all 3 coordinate systems.
2. The center of the circle passing through three points is NOT the centroid (barycenter) of those points! The former can lie outside of the triangle formed by the three points. Also, it is not necessary to split the in-circle predicate into "simpler" functions; at least for cartesian this is very well studied. For other coordinate systems it needs some work/research.
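For reference, the well-studied cartesian in-circle test can be written as a single 3x3 determinant. A minimal, non-robust sketch with a plain illustrative struct (not Boost.Geometry types):

// Standard cartesian in-circle predicate as a single determinant, without
// any robustness guarantees. Names and types are illustrative only.
struct pt { double x, y; };

// Returns > 0 if d lies inside the circumcircle of (a, b, c), < 0 if outside,
// 0 if on it -- assuming (a, b, c) is oriented counter-clockwise.
double in_circle(pt a, pt b, pt c, pt d)
{
    double adx = a.x - d.x, ady = a.y - d.y;
    double bdx = b.x - d.x, bdy = b.y - d.y;
    double cdx = c.x - d.x, cdy = c.y - d.y;
    double ad2 = adx * adx + ady * ady;
    double bd2 = bdx * bdx + bdy * bdy;
    double cd2 = cdx * cdx + cdy * cdy;
    return adx * (bdy * cd2 - cdy * bd2)
         - ady * (bdx * cd2 - cdx * bd2)
         + ad2 * (bdx * cdy - cdx * bdy);
}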
What I propose is to either focus on one topic (e.g. Delaunay triangulations) and go further in coordinate system support etc., or do Delaunay and random generators only for cartesian. In any case you can include more topics as extras. In other words, there will be no problem if you do more than you propose.
Best,
Vissarion
Hi Vissarion, sorry for the double post; I only received your e-mail after posting my response to Adam.
Hi Tinko, Adam,
thanks for the interest in the project!
In general I find your proposal good but a bit overambitious.
I understand, and since this is the second piece of advice in that direction, I will try to narrow down the proposal.
I am mainly covered by Adam's response but I have a few comments.
1. Apart from the two projects (triangulation and random generators) your proposal includes one more, namely "robust geometric predicates", which could be a GSoC project on its own. This is not described in the proposal but what I understand is something like this: http://www.cs.cmu.edu/~quake/robust.html (there are many solutions for robust predicates, I just named one), to be done for both the "orientation" (aka side) and "in-circle" predicates and for all 3 coordinate systems.
This (what Shewchuk described as Adaptive A in his short paper "Robust Adaptive Floating-Point Geometric Predicates") was indeed the solution that I had in mind.
What I propose is to either focus on one topic (e.g. Delaunay triangulations) and go further in coordinate system support etc. or do Delaunay and random generators only for cartesian. In any case you can include more topics as extras. In other words there will be no problem if you do more than you propose.
I understand. Both (Delaunay triangulation for more than cartesian coordinates, and Delaunay + random generators for cartesian) are interesting. I have a slight preference towards doing Delaunay triangulation and random generators for cartesian coordinates and adding extras if the work is done early in the GSoC. In that case, I would specify in my proposal that I propose these things for cartesian coordinates. This is just a slight preference, though, and if the maintainers would deem Delaunay for cartesian and other coordinates more useful, I would do that and rewrite my proposal accordingly. Thank you for your feedback as well and kind regards, Tinko Bartels
Hi Adam, thank you for your extensive feedback. I'll try to address all questions.
One project can be a lot of work and you propose to work on two at the same time. Are you certain you'd like to do it? Or is it maybe because both projects are somehow related, i.e. one is needed by the other?
There is a relation in the sense that I think that a fast (not necessarily Delaunay) triangulation algorithm and a triangle concept are beneficial for a random point generator that generates points uniformly distributed inside a polygon (and this is probably true for higher dimensions as well). For a large number of points, I found that directly sampling on triangles outperforms a rejection sampler. Other than that, my proposal to combine two projects is based on a conservative estimate of the total amount of work. It took roughly a day to implement an initial, imperfect variant of s-hull (for the programming competency test), and the triangulation algorithms commonly described in the literature seem intuitively accessible enough to be implemented properly in a reasonable timeframe. The random samplers are, in my opinion, a simpler problem than the triangulation, so they can be completed afterward but still within the GSoC. I am certain that I can produce a good-quality implementation with proper tests and documentation of the proposed features in the GSoC work period with reasonable effort.
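As an illustration of the direct approach, a minimal sketch of uniform sampling inside a single triangle; plain structs are used for brevity and the reflection of (u, v) keeps the point inside the triangle:

#include <random>

struct pt { double x, y; };

// Uniform sample in triangle (a, b, c): draw u, v uniformly in [0, 1] and
// reflect the pair if it falls outside the lower-left half of the unit square.
template <typename Rng>
pt uniform_in_triangle(pt a, pt b, pt c, Rng& rng)
{
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double u = unit(rng), v = unit(rng);
    if (u + v > 1.0) { u = 1.0 - u; v = 1.0 - v; }  // reflect back into the triangle
    return { a.x + u * (b.x - a.x) + v * (c.x - a.x),
             a.y + u * (b.y - a.y) + v * (c.y - a.y) };
}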
What coordinate systems are you planning to support? Cartesian, geographic, etc. In case you decided to implement an algorithm for one coordinate system, would the same algorithm theoretically work with different ones or would a different algorithm be required?
Regarding the random generators: Generally, uniform sampling can be done in geographic coordinates using an equal-area transformation from the cylinder to the sphere, and I have considered that in the proposal (last paragraph before the milestones). However, I imagine that in GIS applications one is usually interested in very small subsets of the sphere, so rejection sampling would be prohibitively expensive. In the programming competency test, I implemented a more efficient sampler for convex polygons by decomposing them into triangles and sampling uniformly on the triangles. I suppose the most obvious approaches would be to do the same thing for polygons on spheres (I've found a straightforward way to do that in the literature: https://www.graphics.cornell.edu/pubs/1995/Arv95c.pdf ) or to at least choose some box or polygon in cartesian coordinates so that its image (under an equal-area transformation) covers the interesting area of the sphere and not too much more, take uniform samples in it, map them to the sphere, and reject them if necessary. I'd be thankful for input from potential users of randomly generated points in geographic (or other) coordinate systems if there is more to consider. More on randomly generated points below.
Regarding triangulation: In my literature research for the proposal, I concentrated on Delaunay triangulations in cartesian coordinates. I've looked into the spherical case and there is a lot of literature for that as well, for example, this paper on adapting Fortune's algorithm to spheres ( http://www.e-lc.org/tmp/Xiaoyu__Zheng_2011_12_05_14_35_11.pdf ). I've seen very intuitive approaches to that using convex hull algorithms and projections to the sphere. I don't know how well and with how much effort the algorithm I've linked in my proposal translates to triangulations on spheres, but I will think about it. Do you know how much interest there would be in Delaunay triangulations on spheres and what the usual use case would be? If someone enters 3 points, do they want a decomposition of the sphere into 4 triangles, or would one usually be interested in the triangulation of a small subset of the sphere?
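For reference, a minimal sketch of the cylinder-to-sphere equal-area idea mentioned above for the whole unit sphere: longitude and z are drawn uniformly and z is mapped to latitude, which preserves uniformity of area. Plain standard-library types, illustrative only:

#include <cmath>
#include <random>
#include <utility>

// Uniform point on the unit sphere via the cylindrical equal-area map:
// longitude uniform in [-180, 180), z uniform in [-1, 1], latitude = asin(z).
// Returns (longitude, latitude) in degrees.
template <typename Rng>
std::pair<double, double> uniform_on_sphere(Rng& rng)
{
    std::uniform_real_distribution<double> lon_deg(-180.0, 180.0);
    std::uniform_real_distribution<double> z(-1.0, 1.0);
    double const pi = 3.14159265358979323846;
    double lat_deg = std::asin(z(rng)) * 180.0 / pi;
    return { lon_deg(rng), lat_deg };
}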
2. Triangulation
So triangulation is basically a triangle mesh represented by arrays of vertices and indexes, correct? If that's the case then maybe we need more general mesh concept?
I suppose it could be generalized. Edges and vertices would remain the same thing but faces could be arbitrary rings with more than three adjacent faces.
If the result of triangulation is a mesh then why do you need triangle concept as 3-point ring? Wouldn't a view/proxy used internally in your algorithm and storing iterators/pointers to vertices in the mesh be enough?
I imagine triangles are useful primitives in general and it would be beneficial to have the concept to specialize some existing algorithms for them (like area) so that their implementation can make use of the additional constraint of the triangle being a ring with only three points. I agree that internally in the algorithms for Delaunay triangulation, triangles would be represented by views with iterators to vertices but I thought that would not make a difference because those views and triangles that actually store vertices could both fulfill the same concept.
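As a small illustration of the kind of specialization such a concept would enable, the cartesian area of a triangle reduces to half of a cross product, with no loop over ring segments (plain struct for illustration, not Boost.Geometry API):

#include <cmath>

struct pt { double x, y; };

// Area of the triangle (a, b, c): half the absolute value of the cross
// product of the two edge vectors -- no iteration over ring segments needed.
double triangle_area(pt a, pt b, pt c)
{
    return 0.5 * std::fabs((b.x - a.x) * (c.y - a.y)
                         - (b.y - a.y) * (c.x - a.x));
}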
You mentioned Boost.Graph. Would your algorithm depend on this library? If possible we should avoid dependencies. If it doesn't depend then we could still have concepts compatible with other libraries so the same types can be easily adapted to work with both.
I would not depend on it because the algorithm linked in the proposal (as far as I can see) does not use any complicated graph operations. It would be nice to be compatible with BGL because I've already had cases in which I had to perform more complicated graph operations on meshes.
left-of-predicate in Boost.Geometry is called "side strategy" and is implemented for all supported coordinate systems.
I completely overlooked that one. I see that it is designed to optionally support higher-precision coordinate types. I will look further into that.
in-circle-predicate however is not. It also looks like more complex operation since you have to find a center of the circle and then check if a point is inside it. In Boost.Geometry this would be divided into centroid() and distance(). So my guess is that the in-circle-predicate should be divided into more basic building blocks. However I'm not sure if the center of the circle is the same as barycenter in non-cartesian coordinate systems. Furthermore Boost.Geometry doesn't have non-cartesian centroid yet. Even if you do not support geographic triangulation right now we should do theoretical work and prepare the interface for future extensions, i.e. know what basic coordinate-system-dependent operations will be needed in the future.
I implemented it in that two-stage way in my implementation of Delaunay triangulation for the programming competency test. However, I found that this seems to be usually done in a single step (see https://www.cs.cmu.edu/~quake/robust.html ), I guess for performance and stability reasons. I noticed that with larger problem sizes (1'000'000 points and more) my Delaunay triangulation would not terminate for some runs (I generate the problems randomly), and some experiments (I did not look too deeply into it) suggest that my two-stage, non-robust in-circle predicate is to blame here. I would trust the literature there and implement a simple approximate predicate that works for non-degenerate problems and an adaptive predicate that will be a little slower but still work for degenerate cases.
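For illustration, a minimal sketch of such a filtered ("adaptive") predicate, shown here for the orientation (side) test: evaluate in double, accept the sign when the result exceeds an error bound, and fall back to higher precision otherwise. The error-bound constant roughly follows Shewchuk's bound for doubles, and the long double fallback is only a placeholder for a proper exact stage; both are illustrative assumptions:

#include <cmath>

// Filtered orientation (side) predicate: fast double evaluation plus an
// error-bound check, falling back to higher precision only when uncertain.
int side_filtered(double ax, double ay, double bx, double by,
                  double cx, double cy)
{
    double detleft  = (bx - ax) * (cy - ay);
    double detright = (by - ay) * (cx - ax);
    double det = detleft - detright;

    // Conservative bound, roughly (3 + 16 eps) * eps for double precision.
    double errbound = 3.3306690738754716e-16
                    * (std::fabs(detleft) + std::fabs(detright));
    if (det > errbound)  return  1;   // certainly a left turn
    if (det < -errbound) return -1;   // certainly a right turn

    // Uncertain case: recompute with more precision (stand-in for an exact stage).
    long double d = (static_cast<long double>(bx) - ax) * (static_cast<long double>(cy) - ay)
                  - (static_cast<long double>(by) - ay) * (static_cast<long double>(cx) - ax);
    return (d > 0) - (d < 0);
}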
In case you wanted to work on centroid too, geographic centroid could be yet another whole project for GSoC, we were actually considering to add it. ;)
I am afraid that would lead to overstretching my proposal. ;)
3. Randomization
Just an idea about the generalization of the interface. I can imagine that someone may want to generate points in the interior of a polygon and someone else may want points in the interior or on the boundary, etc. With the current design you'd have to add more function names for various combinations, correct?
I would not create special generators for all conceivable cases, but I think handling some special cases is very beneficial. The one I produced for the programming competency test for project 4 uses a trivial fan triangulation to produce random samples directly, rather than sampling from the bounding box and rejecting based on the within-predicate. I did not commit it, but I did write a simple rejection sampler as well and found it to be much slower (I think about 2-3 times), even though convex polygons are comparatively simple geometries to check within for and they cover a large portion of their bounding box, so most samples would not be rejected.
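A minimal sketch of that fan-triangulation sampler for a convex polygon: pick a triangle of the fan with probability proportional to its area via std::discrete_distribution, then sample uniformly inside it. Plain structs, illustrative only; it assumes at least three vertices:

#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct pt { double x, y; };

// Fan triangulation around poly[0]: choose triangle (poly[0], poly[i], poly[i+1])
// with probability proportional to its area, then sample uniformly inside it.
template <typename Rng>
pt uniform_in_convex_polygon(std::vector<pt> const& poly, Rng& rng)
{
    std::vector<double> areas;
    for (std::size_t i = 1; i + 1 < poly.size(); ++i)
    {
        areas.push_back(0.5 * std::fabs(
            (poly[i].x - poly[0].x) * (poly[i + 1].y - poly[0].y)
          - (poly[i].y - poly[0].y) * (poly[i + 1].x - poly[0].x)));
    }
    std::discrete_distribution<std::size_t> pick(areas.begin(), areas.end());
    std::size_t i = pick(rng) + 1;

    // Uniform sample inside the chosen triangle (reflection keeps u + v <= 1).
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double u = unit(rng), v = unit(rng);
    if (u + v > 1.0) { u = 1.0 - u; v = 1.0 - v; }
    pt const &a = poly[0], &b = poly[i], &c = poly[i + 1];
    return { a.x + u * (b.x - a.x) + v * (c.x - a.x),
             a.y + u * (b.y - a.y) + v * (c.y - a.y) };
}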
AFAIU the rejection_generator would be a special case of a generator capable to take spatial predicates defining returned points. In this case returning points that are let's say within(A) && ! covered_by(B) (so in the interior of A and in the exterior of B). So another possibility is to have a generator taking general spatial predicates. Furthermore, the above (points in interior/boundary/exterior) could also be achieved with this approach.
I totally agree with the suggestion that arbitrary spatial predicates should be supported for rejection sampling.
Note also that performing randomization for arbitrary geometries would be supported with this interface, also e.g. N random points within multipoint.
I agree but the probability of a uniformly distributed point lying inside a multipoint (except for a very coarse space) is vanishingly small. I think that point generation should at least be specialized by the type of domain (discrete, lines, volumes).
It is also possible to decide to go this way but do not implement all elements right now. In such case we could think about the interface and be prepared for extensions in the future. E.g. for now support only something like this:
namespace bg = boost::geometry; bg::rng::generate(bg::rng::within(A), output);
I like that interface. I would maybe drop the rng, because it's conceivable to have generators for deterministic point generation in the future as well, for example to create some regular lattice or nodes on edges. Kind regards, Tinko Bartels
Hi, some more comments here.
One project can be a lot of work and you propose to work on two at the same time. Are you certain you'd like to do it? Or is it maybe because both projects are somehow related, i.e. one is needed by the other?
There is a relation in the sense that I think that a fast (not necessarily Delaunay) triangulation algorithm and a triangle concept are beneficial for a random point generator that generates points uniformly distributed inside a polygon (and this is probably true for higher dimensions as well). For a large number of points, I found that directly sampling on triangles outperforms a rejection sampler.
As far as I know there are (at least) 3 ways to generate random points: (a) rejection-acceptance, (b) triangulation + simplex (triangle) sampling, (c) random walks. You found (a) slower, but I guess this depends on how you implement membership for the geometry (given a point, answer whether it is inside or outside the geometry). With a fast membership test, rejection sampling should be very competitive. So I think it makes sense to implement both (a) and (b) (and maybe some (c) methods that would be useful for sampling on the boundary) and test for which cases one is faster than the other. Finally, since you mention high dimensions: there, (a) and (b) are very slow as the dimension grows (the size of the triangulation has an exponential dependence on the dimension), so (c) is the only way, but this is another story...
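For comparison, a minimal sketch of approach (a) against a Boost.Geometry polygon, using bg::envelope for the bounding box and bg::within as the membership test; the helper name and overall structure are illustrative, not a proposed interface:

#include <cstddef>
#include <random>
#include <vector>
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/geometries/polygon.hpp>
#include <boost/geometry/geometries/box.hpp>

namespace bg = boost::geometry;
using point   = bg::model::d2::point_xy<double>;
using polygon = bg::model::polygon<point>;
using box     = bg::model::box<point>;

// Approach (a): draw candidates uniformly from the envelope of the polygon
// and keep those for which the membership test (bg::within) succeeds.
template <typename Rng>
std::vector<point> sample_by_rejection(polygon const& poly, std::size_t n, Rng& rng)
{
    box env;
    bg::envelope(poly, env);
    std::uniform_real_distribution<double> dx(bg::get<bg::min_corner, 0>(env),
                                              bg::get<bg::max_corner, 0>(env));
    std::uniform_real_distribution<double> dy(bg::get<bg::min_corner, 1>(env),
                                              bg::get<bg::max_corner, 1>(env));
    std::vector<point> out;
    while (out.size() < n)
    {
        point candidate(dx(rng), dy(rng));
        if (bg::within(candidate, poly))  // accept only points in the interior
            out.push_back(candidate);
    }
    return out;
}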
Other than that, my proposal to combine two projects is based on a conservative estimate of the total amount of work. It took roughly a day to implement an initial, imperfect variant of s-hull (for the programming competency test) and the triangulation algorithms commonly described in literature seem to be intuitively accessible enough to be implemented properly in a reasonable timeframe. The random samplers are, in my opinion, a simpler problem than the triangulation, so they can be completed afterward but still within the GSoC. I am certain that I can produce a good quality implementation with proper tests and documentation of the proposed features in the GSoC work period with reasonable effort.
What coordinate systems are you planning to support? Cartesian, geographic, etc. In case you decided to implement an algorithm for one coordinate system, would the same algorithm theoretically work with different ones or would a different algorithm be required?
Regarding the random generators: Generally, uniform sampling can be done in geographic coordinates using an equal-area transformation from the cylinder to the sphere, and I have considered that in the proposal (last paragraph before milestones).
However, I imagine that in GIS applications one is usually interested in very small subsets of the sphere, so rejection sampling would be prohibitively expensive. In the programming competency test, I implemented a more efficient sampler for convex polygons by decomposing them into triangles and sampling uniformly on the triangles.
I suppose the most obvious approaches would be to do the same thing for polygons on spheres (I've found a straightforward way to do that in the literature: https://www.graphics.cornell.edu/pubs/1995/Arv95c.pdf ) or to at least choose some box or polygon in cartesian coordinates so that its image (under an equal-area transformation) covers the interesting area of the sphere and not too much more, take uniform samples in it, map them to the sphere, and reject them if necessary.
I'd be thankful for input from potential users of randomly generated points in geographic (or other) coordinate systems if there is more to consider. More on randomly generated points below.
Regarding triangulation: In my literature research for the proposal, I concentrated on Delaunay triangulations in cartesian coordinates. I've looked into the spherical case and there is a lot of literature for that as well, for example, this paper on adapting Fortune's algorithm to spheres ( http://www.e-lc.org/tmp/Xiaoyu__Zheng_2011_12_05_14_35_11.pdf ). I've seen very intuitive approaches to that using convex hull algorithms and projections to the sphere. I don't know how well and with how much effort the algorithm I've linked in my proposal translates to triangulations on spheres, but I will think about it.
Do you know how much interest there would be in Delaunay triangulations on spheres and what the usual use case would be? If someone enters 3 points does he want a decomposition of the sphere into 4 triangles or would one usually be interested in the triangulation of a small subset of the sphere?
Delaunay triangulations on a sphere can be used for maritime zone determination (but I am not the right person to speak about applications ;) For a more robust approach you can have a look at this paper: https://link.springer.com/chapter/10.1007%2F978-3-642-13193-6_39 and this implementation in Python: https://github.com/tylerjereddy/py_sphere_Voronoi I discuss Delaunay and Voronoi together without splitting the discussion, since the two are equivalent, and I think that your Delaunay implementation should also make it possible for the user to compute the Voronoi diagram (probably two interfaces that share the same implementation). Best, Vissarion
Tinko Bartels Via Boost wrote:
I completely overlooked that one. I see that it is designed to optionally support higher-precision coordinate types. I will look further into that.
The whole library supports a user-defined calculation type.
in-circle-predicate however is not. It also looks like more complex operation since you have to find a center of the circle and then check if a point is inside it. In Boost.Geometry this would be divided into centroid() and distance(). So my guess is that the in-circle-predicate should be divided into more basic building blocks. However I'm not sure if the center of the circle is the same as barycenter in non-cartesian coordinate systems. Furthermore Boost.Geometry doesn't have non-cartesian centroid yet. Even if you do not support geographic triangulation right now we should do theoretical work and prepare the interface for future extensions, i.e. know what basic coordinate-system-dependent operations will be needed in the future. I implemented it in that two-stage way in my implementation of Delaunay triangulation for the programming competency test. However, I found that this seems to be usually done in a single step (see https://www.cs.cmu.edu/~quake/robust.html ), I guess for performance and stability reasons. I noticed that with larger problem sizes (1'000'000 points and more) my Delaunay triangulation would not terminate for some runs (I generate the problems randomly) and some experiments (I did not look too deeply into it) suggest that my two-stage, non-robust in-circle predicate is to blame here.
I would trust the literature there and implement a simple approximate predicate that works for non-degenerate problems and an adaptive predicate that will be a little slower but still work for degenerate cases.
Vissarion pointed out in another email that I made a mistake with the centroid being the center of the circle. Sorry about that. The two steps would be envelope(triangle, sphere_on_surface) and covered_by(point, sphere_on_surface). I wrote sphere_on_surface because we also have nsphere (n-dimensional sphere) in the extensions and AFAIU this would be something different. In particular, in geographic the shape of the sphere on the surface of the ellipsoid would be different and we wouldn't be able to represent it as a center point and radius. He also seems to agree with you that this should be done in one step. I have nothing against it if it's proven to be faster. My main concern was obviously the reusability of the code and the complexity of the interface. A 1-stage operation would require a new type of strategy, but that's ok.
Note also that performing randomization for arbitrary geometries would be supported with this interface, also e.g. N random points within multipoint. I agree but the probability of a uniformly distributed point lying inside a multipoint (except for a very coarse space) is vanishingly small. I think that point generation should at least be specialized by the type of domain (discrete, lines, volumes).
If the interface allowed it and predicates were passed they would define in which topological parts the randomized points would be. So if one wanted to get points within multipoint like this: generate(N, within(multipoint), output) the algorithm would return N points from a multipoint (in the interior of multipoint, see: https://en.wikipedia.org/wiki/DE-9IM). Note that it doesn't mean it would have to be rejection sampler because we can imagine a case when the predicate expression is analysed, new geometry is generated from it and then sampling is done in this geometry, e.g.: generate(N, within(polyA) && ! within(polyB), output) could first calculate difference(polyA, polyB, polyC) and then generate points that are in polyC (with special handling of boundaries because the boundary of polyB would be allowed). Not all predicates could be allowed, we could limit which kinds of geometries and predicates can be passed together, etc.
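A small sketch of that idea with existing Boost.Geometry calls: materialize the geometry described by the predicate expression once with bg::difference and hand the result to whichever point generator is chosen (the generator itself is left out, since it does not exist yet):

#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/geometries/polygon.hpp>
#include <boost/geometry/geometries/multi_polygon.hpp>

namespace bg = boost::geometry;
using point         = bg::model::d2::point_xy<double>;
using polygon       = bg::model::polygon<point>;
using multi_polygon = bg::model::multi_polygon<polygon>;

// Materialize the geometry described by within(polyA) && ! within(polyB).
// The result (plus special handling of polyB's boundary) is the domain a
// point generator would then sample from.
multi_polygon domain_for_a_minus_b(polygon const& polyA, polygon const& polyB)
{
    multi_polygon polyC;
    bg::difference(polyA, polyB, polyC);
    return polyC;
}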
It is also possible to decide to go this way but do not implement all elements right now. In such case we could think about the interface and be prepared for extensions in the future. E.g. for now support only something like this:
namespace bg = boost::geometry; bg::rng::generate(bg::rng::within(A), output); I like that interface. I would maybe drop the rng, because it's conceivable to have generators in the future for deterministic point generation as well, for example to create some regular lattice or nodes on edges.
The problem is the name "within", which already exists in the boost::geometry namespace as a relational operation and also in the boost::geometry::index namespace as an rtree spatial predicate. So we'd have to find a different name or put it in a different namespace.
One more general comment at the end, related to all projects and non-cartesian coordinate systems. In general, the approach in the library is to perform calculations directly in the coordinate system of the data or strategy, so we do not transform between them in the process. If the user wants to transform, e.g. geographic to cartesian, perform an operation and then transform back, he can do it. Spatial reference system transformations are implemented but not documented. See: https://github.com/boostorg/geometry/tree/develop/include/boost/geometry/srs https://github.com/boostorg/geometry/tree/develop/test/srs The problem with transformations is that they are contextual, i.e. there is no one transformation with a standard set of parameters which would work for all possible geographic data. It depends on the use case and should be chosen by the user. AFAIR you mentioned earlier projecting from the sphere to a cylinder. The greater the polygon/triangle, the more distorted the coordinates would be. This would also be true for the distribution of the random points; it would no longer be uniform. So we'd have to divide the problem into smaller triangles, perform separate projections for each of them, calculate in cartesian and then transform back the result. And it is still not clear what the accuracy of the operation should be, i.e. the size of the triangles and therefore their number (an accuracy-performance tradeoff). My (not informed) guess is that this also depends on the use case, so we end up with the user deciding about it anyway. Adam
I apologize for replying to myself, but I have one more side note. Adam Wulkiewicz wrote:
Tinko Bartels Via Boost wrote:
Note also that performing randomization for arbitrary geometries would be supported with this interface, also e.g. N random points within multipoint. I agree but the probability of a uniformly distributed point lying inside a multipoint (except for a very coarse space) is vanishingly small. I think that point generation should at least be specialized by the type of domain (discrete, lines, volumes).
If the interface allowed it and predicates were passed they would define in which topological parts the randomized points would be. So if one wanted to get points within multipoint like this:
generate(N, within(multipoint), output)
the algorithm would return N points from a multipoint (in the interior of multipoint, see: https://en.wikipedia.org/wiki/DE-9IM). Note that it doesn't mean it would have to be rejection sampler because we can imagine a case when the predicate expression is analysed, new geometry is generated from it and then sampling is done in this geometry, e.g.:
generate(N, within(polyA) && ! within(polyB), output)
could first calculate
difference(polyA, polyB, polyC)
and then generate points that are in polyC (with special handling of boundaries because the boundary of polyB would be allowed).
Not all predicates could be allowed, we could limit which kinds of geometries and predicates can be passed together, etc.
I probably overcomplicated this. If it's true that rejection sampling is slower in all cases, then we can move the responsibility for creating the geometry to the user. So we do not have to calculate the difference, support the concatenation of predicates with operator&&, etc., since the user can generate an arbitrary geometry and pass it to the generator. All that is needed is support for a single randomization predicate. So predicates like within, covered_by, touches, intersects are obvious, but how about disjoint? Does it make sense to generate a random point outside the geometry using the whole available exterior space? Not sure about cartesian, but in spherical and geographic I can imagine someone would like to do this. Adam
Hi Adam, hi Vissarion,
I've attempted to use your feedback to refine my proposal, and I hope that I addressed that feedback as well as the general Boost guidelines sufficiently.
I understand that projects 3 and 4 are opening up a considerable space of possible designs, so I think the most important thing would be to define and document a general interface. I'll start with project 4:
I think it would be sensible to design random geometries to match the existing interface for random sampling found in Boost and the STL. That would mean we have (pseudo-)random generators, random distributions, and free generate-functions. The point of generators is to create pseudo-random numbers, and the point of random distributions is to map them to values that conform to some distribution. We do not need new generators to create random geometries, we only need new distributions of geometries. Borrowing from the existing requirements for distributions in Boost ( https://www.boost.org/doc/libs/1_66_0/doc/html/boost_random/reference.html#b... ) and the STL, I propose, for a distribution class X, returning objects of type T, a param_type P, a value u of X and possibly const values x, y of X, p a value of type P, and a UniformRandomNumberGenerator e returning numbers of type U:
X::result_type returns T
X::param_type returns P
X(p) constructs a distribution from parameters
u.reset() ensures that the following returned values do not depend on the previous ones
u(e) returns a value of T
u(e, p) equivalent to X(p)(e)
x.param() gets the parameters
x.param(p) sets the parameters
x == y, x != y compares the distributions
os << x, is >> u writes the state and parameters to os or restores them from is.
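A minimal skeleton of a distribution satisfying these requirements might look as follows; the class and member names follow the proposal text, but the body is only a sketch and nothing here is existing Boost.Geometry API (equality and stream operators are omitted):

// Skeleton of a geometry distribution modeling the requirements listed above.
// The sampling logic is intentionally left out; names are illustrative.
template <typename Point, typename DomainGeometry>
class uniform_point_distribution
{
public:
    using result_type = Point;

    struct param_type
    {
        DomainGeometry domain;  // the only parameter of a uniform distribution
    };

    uniform_point_distribution() = default;
    explicit uniform_point_distribution(param_type const& p) : m_param(p) {}

    void reset() {}  // no state is kept between calls in this sketch

    param_type param() const { return m_param; }
    void param(param_type const& p) { m_param = p; }

    template <typename UniformRandomNumberGenerator>
    result_type operator()(UniformRandomNumberGenerator& e)
    {
        // map uniform random numbers drawn from e to a point in m_param.domain
        return result_type();
    }

    template <typename UniformRandomNumberGenerator>
    result_type operator()(UniformRandomNumberGenerator& e, param_type const& p)
    {
        return uniform_point_distribution(p)(e);
    }

    // operator==, operator!=, operator<< and operator>> omitted in this sketch.

private:
    param_type m_param;
};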
I've omitted min() and max() because these concepts cannot be directly translated from random numbers to geometries. Possible replacements could be to return points that are min_corner and max_corner of the envelope of possible values, or, more generally, there could be a domain()-member that returns the space of all possible values, but I'd be unsure how to interpret that for anything but points, and I'm not sure that this would be a useful interface.
For use with existing generators and the described concept of geometry distributions I propose two simple generate and generate_n functions that optionally take a unary predicate, as Adam suggested. This predicate would take one value of T and return true or false. I agree with Adam, specializing on those would be overcomplicating things, and I'll address uniform distributions below. I would simply run them on every generated value before putting it into the output range, for maximum flexibility. That way all predicates that Adam referenced could be supported.
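A sketch of what such a generate_n could look like under one reading of the above (keep drawing until n accepted values have been written); the names and signature are assumptions, not an existing interface:

#include <cstddef>

// Draw values from a geometry distribution and copy to the output those
// that satisfy the unary predicate, until n values have been accepted.
// Note: this loops forever if the predicate can never be satisfied.
template <typename Distribution, typename Generator,
          typename OutputIterator, typename Predicate>
OutputIterator generate_n(Distribution& dist, Generator& gen, std::size_t n,
                          OutputIterator out, Predicate pred)
{
    for (std::size_t accepted = 0; accepted < n; )
    {
        auto value = dist(gen);
        if (pred(value))   // run the predicate on every generated value
        {
            *out++ = value;
            ++accepted;
        }
    }
    return out;
}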
Vissarion, you mentioned random walks. This design would allow for non-independently distributed output as well, and the internal state could be reset and preserved as with RandomNumberDistributions. I hope that it also can be a basis for future distributions that could generate random geometries other than points (for example for testing).
Based on that concept, the implemented distributions for project 4 could look like this:
uniform_point_distribution
Hi Tinko,
On Sat, 23 Mar 2019 at 16:09, Tinko Bartels via Boost
Hi Adam, hi Vissarion,
I've attempted to use your feedback to refine my proposal, and I hope that I addressed that feedback as well as the general Boost guidelines sufficiently.
I understand that projects 3 and 4 are opening up a considerable space of possible designs, so I think, the most important thing would be to define and document a general interface. I'll start with project 4:
I think it would be sensible to design random geometries to match the existing interface for random sampling found in Boost and the STL. That would mean, we have (pseudo-)random generators, random distributions, and free generate-functions. The point of generators is to create pseudo-random numbers, and the point of random distribution is to map them to values that conform to some distribution. We do not need new generators to create random geometries, we only need new distributions of geometries. Borrowing from the existing requirements for distributions in Boost ( https://www.boost.org/doc/libs/1_66_0/doc/html/boost_random/reference.html#b... ) and STL, I propose
for a distribution class X, returning objects of type T, a param_type P, a value u of X and possibly const values x, y of X, p a value of type P, and a UniformRandomNumberGenerator e returning numbers of type U:
X::result_type returns T X::param_type returns P X(p) constructs a distribution from parameters u.reset() ensures that the following returned values do not depend on the previous ones u(e) returns a value of T u(e, p) equivalent to X(p)(e) x.param() gets the parameters x.param(p) sets the parameters x == y, x != y compares the distributions os << x, is >> u writes the state and parameters to os or restores them from is.
I've omitted min() and max() because these concepts cannot be directly translated from random numbers to geometries. Possible replacements could be to return points that are min_corner and max_corner of the envelope of possible values, or, more generally, there could be a domain()-member that returns the space of all possible values, but I'd be unsure how to interpret that for anything but points, and I'm not sure that this would be a useful interface.
For use with existing generators and the described concept of geometry distributions I propose two simple generate and generate_n functions that optionally take a unary predicate, as Adam suggested. This predicate would take one value of T and return true or false. I agree with Adam, specializing on those would be overcomplicating things, and I'll address uniform distributions below. I would simply run them on every generated value before putting it into the output range, for maximum flexibility. That way all predicates that Adam referenced could be supported. Vissarion, you mentioned random walks. This design would allow for non-independently distributed output as well, and the internal state could be reset and preserved as with RandomNumberDistributions. I hope that it also can be a basis for future distributions that could generate random geometries other than points (for example for testing).
Based on that concept, the implemented distributions for project 4 could look like this:
uniform_point_distribution
> where DomainTag is one of interior, boundary or exterior (exterior only makes sense when the exterior of the DomainGeometry has bounded measure, for example with polygons on spheres), DomainGeometry is the type of geometry that is supplied as domain and Point is the type of the output.
With respect to the concept above, T is Point, P is a struct holding an instance of DomainGeometry and reset does nothing.
@Adam, I tried to write it in a way that allows all the possibilities you mentioned but leaves the computation of the domain (which is the only parameter of a uniform distribution) to the user. The implementations can then deduce at compile-time from the template parameters that a uniform_point_distribution over the interior of an areal geometry requires sampling in some area, maybe by triangulation, perhaps by sampling from the envelope and rejecting based on the within-predicate; that a distribution over the exterior only makes sense on something like a sphere and is equivalent to sampling inside the reversed ring; that a distribution over a linear geometry requires sampling along a range of segments; and that a distribution over a multipoint could be done using uniform_int_distribution. I will keep your comment with regard to transformations in mind. Generally, transformations between spherical and cartesian coordinates will distort distributions, but in the particular case of uniform distributions, those will be preserved by equal-area transformations. The distortion of the polygon would be an issue though, in determining a suitable cover of it to sample from. If we get triangulation in spherical coordinates, we could use the simple algorithm described in this paper: https://dl.acm.org/citation.cfm?id=218500 . It describes an equal-area transformation from the unit square to a spherical triangle for uniform (and stratified) sampling in spherical triangles. I noticed, by the way, that Boost already supports generating random points on a sphere ( https://www.boost.org/doc/libs/1_69_0/doc/html/boost/random/uniform_on_spher... ).
Concerning project 3, I agree with Vissarion's suggestion to make a general polygonal mesh that could represent Voronoi diagrams and to design the algorithms in a way that allows the calculation of either one or both of the Delaunay triangulation and the Voronoi diagram. I've read the article which you provided about spherical Delaunay triangulations, and I've also looked into its reference 33 (Na, H.S., Lee, C.N., Cheong, O.: Voronoi diagrams on the sphere), in which the problem is solved by solving a corresponding problem in the plane. If the assessment of https://link.springer.com/chapter/10.1007%2F978-3-642-13193-6_39 is correct, providing a mostly shared implementation for Delaunay/Voronoi in the plane and on the sphere might be difficult and not desirable. As far as I can see, Boost currently also has no support for algebraic numbers.
What do you mean? What is the assessment you refer to?
Having reconsidered the proposal, I think a reasonable goal for a 3-month GSoC work period is to document and define the concepts above, implement uniform distributions for points for the interiors of all documented geometries at least in cartesian coordinates, possibly in the different spherical coordinate systems as well, implement a fast and robust Delaunay triangulator/Voronoi diagram generator for points in the plane in cartesian coordinates, and make an assessment on whether the algorithm is suitable for adaptation to Delaunay triangulation on spheres.
I'd be happy to hear further suggestions, and I'll try to refine the proposal until next week and update the competency test to illustrate the use of the interface described above.
In general, the sketch of the proposal looks good to me. Looking forward to reading your detailed draft. Best, Vissarion
Kind regards, Tinko Bartels
Hi Vissarion, hi Adam, On Tue, 2019-03-26 at 17:06 +0200, Vissarion Fisikopoulos wrote:
Hi Tinko,
What do you mean? What is the assessment you refer to?
The assessment that a well-designed specialized solution outperforms the more general ones (in their conclusion). That's a very general statement, of course; nevertheless, I'd be concerned that sharing the implementation for Delaunay triangulation between cartesian and spherical coordinates would make the code more complicated and more challenging to optimize, so that it can be competitive for either case. If I understood correctly (5 Implementation and Experiments, first paragraph), the implementation they based theirs on was already very well suited for adaptation to triangulation on the sphere but still required modification beyond using other geometric predicates. The algorithm I implemented for the competency test (based on s-hull) would also require modification beyond using other left-of and in-circle predicates; for example, the reverse of the convex hull would still need to be triangulated at the end, and when adding points to the triangulation, additional logic would need to be applied to determine "visible" edges. I am sure the proposed algorithm would also require some modifications to work on the sphere. I suggest that I'll only promise an implementation for cartesian coordinates in the proposal and try to make it work for spherical coordinates if enough time is left after completing all the work.
In general, the sketch of the proposal looks good to me. Looking forward to read your detailed draft.
Best, Vissarion
Here it is: https://docs.google.com/document/d/137y91owur6kOIe1aK229Ap1d8ASeL18vSbeZv_dP...
Over the weekend I've put the proposed interface to a first test and implemented a uniform distribution for some cases. The draft implementation can be used like this for some geometries g1, g2:
auto dist = bg::random::uniform_point_distribution