We design a framework for fast constrained clustering of big data. Constrained clusterings have a strong relationship with diagrams, making them popular for many applications such as consolidating farmland, representing polycrystals, or generating superpixels. However, for large data sets, the underlying linear program is too large to solve in practice. To address this, we design coresets: small data sets that allow for a $(1+\epsilon)$-approximation of the clustering problem. We apply our framework to the representation of polycrystals, enabling us to compute accurate representations of three-dimensional materials quickly. Using the resulting representation, we build a new dynamic model, which reveals that the growth process converges towards simple diagram types. We also design a new superpixel generation algorithm, Power-SLIC, which is based on constrained clustering and outperforms existing state-of-the-art object-based superpixel algorithms. Combining Power-SLIC with our coresets, we obtain a significant runtime speed-up for high-resolution images.