Information Based Agglomerative Segmentation in Metric Spaces.

Chiaromonte F and Taylor J.
Journal of the Indian Society of Agricultural Statistics. May 2010; 64(1):33-44

In this article, we introduce an approach to agglomerate points in a metric space into spatially contiguous groups that preserve both the distance and the frequency structure of the data. This is achieved by using a traditional distance criterion to define candidate mergers, and then selecting among these candidates so as to maximize the mutual information between the pre- and post-merger partitions. Our information based agglomerative segmentation is particularly effective when grouping data that do not present spatially separated clusters, and can therefore be employed to reduce data complexity in a number of scientific applications. We illustrate the procedure using a simulated data structure and an application to the analysis of multi-species genomic alignment data.
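
The two-stage merger rule described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the distance criterion, the number of candidates, or the stopping rule, so single linkage, the parameter `n_candidates`, the target `n_groups`, and all function names below are hypothetical choices. The sketch relies on one standard fact: when the post-merger partition is a coarsening of the pre-merger partition, the mutual information between the two equals the entropy of the post-merger partition, so maximizing one maximizes the other.

```python
import math
from itertools import combinations


def entropy(sizes, n):
    """Shannon entropy (in nats) of a partition of n points with these group sizes."""
    return -sum((s / n) * math.log(s / n) for s in sizes)


def single_linkage(points, a, b):
    """Smallest Euclidean distance between a member of group a and one of group b.

    Assumption: single linkage stands in for the unspecified distance criterion.
    """
    return min(math.dist(points[i], points[j]) for i in a for j in b)


def merge_step(points, groups, n_candidates=3):
    """Perform one merger and return the new list of groups."""
    n = len(points)
    # Stage 1 (distance): keep the n_candidates closest pairs of groups.
    candidates = sorted(
        combinations(range(len(groups)), 2),
        key=lambda p: single_linkage(points, groups[p[0]], groups[p[1]]),
    )[:n_candidates]

    def post_entropy(pair):
        i, j = pair
        sizes = [len(g) for k, g in enumerate(groups) if k not in pair]
        sizes.append(len(groups[i]) + len(groups[j]))
        return entropy(sizes, n)

    # Stage 2 (information): the post-merger partition coarsens the pre-merger
    # one, so I(pre; post) = H(post); pick the candidate with the largest H(post).
    i, j = max(candidates, key=post_entropy)
    merged = groups[i] | groups[j]
    return [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]


def segment(points, n_groups, n_candidates=3):
    """Agglomerate singleton groups until only n_groups remain (hypothetical stopping rule)."""
    groups = [frozenset([i]) for i in range(len(points))]
    while len(groups) > n_groups:
        groups = merge_step(points, groups, n_candidates)
    return groups
```

For example, calling `segment([(0.0,), (1.0,), (2.0,), (3.0,)], n_groups=2)` on four equally spaced points yields two contiguous groups of two points each. With no spatially separated clusters, distance alone cannot distinguish the candidate mergers, and the information criterion favors the balanced split.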