SOM, MDS and Sammon's mapping

A. Naud

W. Duch

Dates: 1996 - 1997

Result(s):

The mappings obtained by the three methods were compared in output spaces of various dimensionalities, for several datasets. Performance was evaluated with a k-NN classifier using the leave-one-out method. Because SOM performs clustering as well as topology preservation, its results were much better than those of MDS and Sammon's mapping.
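The evaluation procedure can be sketched as follows (a minimal illustration, not the original evaluation code): each point is classified by a majority vote of its k nearest neighbours among all remaining points, and the fraction of correct votes is the leave-one-out accuracy.

```python
import numpy as np

def knn_loo_accuracy(points, labels, k=1):
    """Leave-one-out k-NN accuracy: classify each point by majority
    vote of its k nearest neighbours among the remaining points."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    n = len(points)
    # full pairwise squared-distance matrix
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # a point never votes for itself
    correct = 0
    for i in range(n):
        nn = np.argsort(d2[i])[:k]        # indices of the k nearest neighbours
        vals, counts = np.unique(labels[nn], return_counts=True)
        if vals[np.argmax(counts)] == labels[i]:
            correct += 1
    return correct / n
```

Running the same classifier on the original data and on each projection gives a common yardstick for how much class structure each mapping preserves.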

Problems:

1) For certain datasets, Sammon's mapping failed to converge when the output space dimension was too large (e.g. d > 3). The cause is the fixed step size of the learning process. The problem was avoided by replacing the fixed step size alpha with one computed at each iteration from the curvature of the minimized function, approximated by its second-order expansion.
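A curvature-based step of this kind can be sketched with the classical Newton-like update for Sammon's stress, where each coordinate moves by gradient / |second derivative| scaled by a damping factor mf (the function name, initialisation, and parameter values below are illustrative assumptions, not the original implementation):

```python
import numpy as np

def pairwise_dist(Z):
    """Full matrix of Euclidean distances between the rows of Z."""
    return np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1))

def sammon(X, d_out=2, iters=100, mf=0.3, seed=0):
    """Sammon's mapping with a curvature-adapted step: each coordinate
    moves by gradient / |diagonal second derivative|, damped by mf,
    instead of using a fixed step size alpha."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    off = ~np.eye(n, dtype=bool)                  # mask of i != j pairs
    D = pairwise_dist(X)                          # target distances
    c = D[off].sum() / 2.0                        # normalising constant
    rng = np.random.default_rng(seed)
    Y = rng.normal(size=(n, d_out))               # random initial layout
    stresses = []
    for _ in range(iters):
        d = pairwise_dist(Y)
        d = np.where(off, np.maximum(d, 1e-9), 1.0)   # guard divisions
        Ds = np.where(off, D, 1.0)
        delta = np.where(off, D - d, 0.0)
        stresses.append((delta ** 2 / Ds)[off].sum() / (2.0 * c))
        diff = Y[:, None, :] - Y[None, :, :]          # y_i - y_j
        # first derivative of the stress w.r.t. each output coordinate
        g = (-2.0 / c) * ((delta / (Ds * d))[..., None] * diff).sum(1)
        # diagonal of the second derivative (Sammon's 1969 formulas)
        brack = delta[..., None] - (diff ** 2 / d[..., None]) * (
            1.0 + delta[..., None] / d[..., None])
        h = (-2.0 / c) * (np.where(off, 1.0 / (Ds * d), 0.0)[..., None]
                          * brack).sum(1)
        Y = Y - mf * g / np.maximum(np.abs(h), 1e-9)  # curvature-scaled step
    return Y, stresses
```

Because the denominator |h| grows where the stress surface is sharply curved, the effective step shrinks there automatically, which is what removes the divergence seen with a fixed alpha.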

2) Both MDS and Sammon's mapping are based on the preservation of interpoint distances, so their time and memory complexity is O(N^2), where N is the number of points. This limits the number of points these methods are able to project; in practice, N <= 1000.
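The quadratic growth behind this limit is easy to make concrete: every iteration must visit all N(N-1)/2 interpoint distances.

```python
def n_pairs(n):
    """Number of interpoint distances an MDS or Sammon run must handle."""
    return n * (n - 1) // 2

# at N = 1000 each iteration already processes ~half a million distances
for n in (100, 1000, 10000):
    print(n, n_pairs(n))
```

Going from 1000 to 10000 points multiplies the per-iteration work by roughly a hundred, which is why the practical limit sits near N = 1000.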

3) In the MDS case, computing the Hessian matrix is much easier if the denominator of Kruskal's original Stress function is replaced by a similar one in which the output-space distances dij are replaced by the target distances d^ij, since the denominator then stays constant during minimization.
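Written out (with \hat{d}_{ij} the fixed target distances and d_{ij} the current output-space distances), the original Stress and the modified variant are:

```latex
% Kruskal's original Stress: the denominator depends on the configuration
S = \sqrt{\frac{\sum_{i<j} \bigl( d_{ij} - \hat{d}_{ij} \bigr)^2}
               {\sum_{i<j} d_{ij}^2}}

% Modified Stress: the denominator is a constant of the data
S' = \sqrt{\frac{\sum_{i<j} \bigl( d_{ij} - \hat{d}_{ij} \bigr)^2}
                {\sum_{i<j} \hat{d}_{ij}^2}}
```

Since the sum of squared target distances does not depend on the output configuration, differentiating S' twice with respect to the output coordinates only involves the numerator, which simplifies every Hessian entry.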

Working log:

Jan-May '97: Optimisation of the learning step size for MDS and Sammon's mapping.

Jun '97: Benchmark results. Check the effect of different loss functions?