I greatly approve of your stress testing approach. Try turning off rounding of the random vertices and then we'll really see where epsilon methods fall short.
I did that and both GPC and Clipper perform without problems because higher order (complex) intersections almost never occur. However, when I round the random vertices to 10 (in a bounding area of 500x700), and intersect two 100 edge polygons, the frequency of higher order intersections rises to an average of 5. When I round the random vertices to 100, the frequency of higher order intersections is close to 1000 (and GPC fails 95% of the time). However, what really maximally stresses both libraries is rounding random points as I've done to 10 (or 100!) and then minimally rotating them. This generates very near but not exactly parallel, overlapping edges at higher order (complex) intersections. Both Clipper and GPC can struggle under these conditions, though Clipper does a little better. This is what I'm trying to improve at the moment.