Stress majorization

Stress majorization

Stress majorization is an optimization strategy used in multidimensional scaling (MDS) where, for a set of "n", "m"-dimensional data items, a configuration "X" of "n" points in "r(<sigma(X). Usually "r" is 2 or 3, i.e. the r imes n matrix "X" lists points in 2- or 3-dimensional Euclidean space so that the result may be visualised (i.e. an MDS plot). The function sigma is a loss or cost function that measures the squared differences between ideal (m-dimensional) distances and actual distances in "r"-dimensional space. It is defined as:

: sigma(X)=sum_{i

where w_{ij}ge 0 is a weight for the measurement between a pair of points (i,j), d_{ij}(X) is the euclidean distance between i and j and delta_{ij} is the ideal distance between the points (their separation) in the m-dimensional data space. Note that w_{ij} can be used to specify a degree of confidence in the similarity between points (e.g. 0 can be specified if there is no information for a particular pair).

A configuration X which minimises sigma(X) gives a plot in which points that are close together correspond to points that are also close together in the original m-dimensional data space.

There are many ways that sigma(X) could be minimised. For example, Kruskal [citation|last=Kruskal|first=J. B.|authorlink=Joseph Kruskal|title=Multidimensional scaling by optimizing goodness of fit to a nonmetrichypothesis|journal=Psychometrika|volume=29|pages=1–27|year=1964.] recommended an iterative steepest descent approach. However, a significantly better (in terms of guarantees on, and rate of, convergence) method for minimising stress was introduced by Jan de Leeuw.citation|last=de Leeuw|first=J.|contribution=Applications of convex analysis to multidimensional scaling|editor1=first=J. R.|editor1-last=Barra|editor2-first=F.|editor2-last=Brodeau|editor3-first=G.|editor3-last=Romie|editor4-first=B.|editor4-last=van Cutsem|title=Recent developments in statistics|pages=133-145|year=1977.] De Leeuw's "iterative majorization" method at each step minimises a simple convex function which both bounds sigma from above and touches the surface of sigma at a point Z, called the "supporting point". In convex analysis such a function is called a "majorizing" function. This iterative majorization process is also referred to as the SMACOF algorithm ("Scaling by majorizing a complicated function").

The SMACOF algorithm

The stress function sigma can be expanded as follows:

: sigma(X)=sum_{i

Note that the first term is a constant C and the second term is quadratic in X (i.e. for the Hessian matrix V the second term is equivalent to trX'VX) and therefore relatively easily solved. The third term is bounded by:

: sum_{i

where B(Z) has:

: b_{ij}=-frac{w_{ij}delta_{ij{d_{ij}(Z)} for d_{ij}(Z) e 0, i e j

and b_{ij}=0 for d_{ij}(Z)=0, i e j

and b_{ii}=-sum_{j=1,j e i}^n b_{ij}.

Proof of this inequality is by the Cauchy-Schwartz inequality, see Borgcitation|last1=Borg|first1=I.|last2=Groenen|first2=P.|title=Modern Multidimensional Scaling: theory and applications|publisher=Springer-Verlag|location=New York|year=1997.] (pp. 152--153).

Thus, we have a simple quadratic function au(X,Z) that majorizes stress:

: sigma(X)=C+,operatorname{tr}, X'VX - 2 ,operatorname{tr}, X'B(X)Xle C+,operatorname{tr}, X' V X - 2 ,operatorname{tr}, X'B(Z)Z = au(X,Z)

The iterative minimization procedure is then:

* at the kth step we set Zleftarrow X^{k-1}
* X^kleftarrow min_X au(X,Z)
* stop if sigma(X^{k-1})-sigma(X^{k}) otherwise repeat.

This algorithm has been shown to decrease stress monotonically (see de Leeuw).

Use in graph drawing

Stress majorization and algorithms similar to SMACOF also have application in the field of graph drawing. [citation|last1=Michailidis|first1=G.|last2=de Leeuw|first2=J.|title=Data visualization through graph drawing|journal=Computation Stat.|year=2001|volume=16|issue=3|pages=435–450.] [citation|first1=E.|last1=Gansner|first2=Y.|last2=Koren|first3=S.|last3=North|contribution=Graph Drawing by Stress Majorization|title=Proceedings of 12th Int. Symp. Graph Drawing (GD'04)|series=Lecture Notes in Computer Science|volume=3383|publisher=Springer-Verlag|pages=239–250|year=2004.] That is, one can find a reasonably aesthetically-appealing layout for a network or graph by minimizing a stress function over the positions of the nodes in the graph. In this case, the delta_{ij} are usually set to the graph-theoretic distances between nodes "i" and "j" and the weights w_{ij} are taken to be delta_{ij}^{-alpha}. Here, alpha is chosen as a trade-off between preserving long- or short-range ideal distances. Good results have been shown for alpha=2. [citation|last=Cohen|first=J.|title=Drawing graphs to convey proximity: an incremental arrangement method|journal=ACM Transactions on Computer-Human Interaction|volume=4|issue=3|year=1997|pages=197–229.]

References


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Force-based algorithms — Force based or force directed algorithms are a class of algorithms for drawing graphs in an aesthetically pleasing way. Their purpose is to position the nodes of a graph in two dimensional or three dimensional space so that all the edges are of… …   Wikipedia

  • Multidimensional scaling — (MDS) is a set of related statistical techniques often used in information visualization for exploring similarities or dissimilarities in data. MDS is a special case of ordination. An MDS algorithm starts with a matrix of item–item similarities,… …   Wikipedia

  • List of mathematics articles (S) — NOTOC S S duality S matrix S plane S transform S unit S.O.S. Mathematics SA subgroup Saccheri quadrilateral Sacks spiral Sacred geometry Saddle node bifurcation Saddle point Saddle surface Sadleirian Professor of Pure Mathematics Safe prime Safe… …   Wikipedia

  • Nonlinear dimensionality reduction — High dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lies on an embedded non linear manifold within… …   Wikipedia

  • List of numerical analysis topics — This is a list of numerical analysis topics, by Wikipedia page. Contents 1 General 2 Error 3 Elementary and special functions 4 Numerical linear algebra …   Wikipedia

  • List of mathematics articles (M) — NOTOC M M estimator M group M matrix M separation M set M. C. Escher s legacy M. Riesz extension theorem M/M/1 model Maass wave form Mac Lane s planarity criterion Macaulay brackets Macbeath surface MacCormack method Macdonald polynomial Machin… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”