Multidimensional Scaling Tool
Multidimensional Scaling (abbreviated MDS) is a method of separating univariate data based upon variance. Conceptually, MDS takes the dissimilarities, or distances, between items described in the data and generates a map of the items. The analyst often specifies the number of dimensions in this map before it is generated. Usually, the highest-variance dimension corresponds to the largest distances described in the data. The map is determined only by the distances between items, so the rotation and orientation of the map dimensions are not significant. MDS uses dimensional analysis similar to Principal Components. For more information see https://en.wikipedia.org/wiki/Multidimensional_scaling.
Two types of MDS are implemented in this tool: Classical MDS and Isometric MDS. Classical MDS is the simpler and faster approach. It generates a map by minimizing the error between the given distances between items and the Euclidean distances between those items on the map. Isometric MDS is slightly more complex. It takes the map produced by Classical MDS and adjusts it so that the map distances between item pairs are in the same largest-to-smallest order as in the original data. Isometric MDS is therefore useful when the exact distance units are less important than the ranking of which item pairs are farthest apart or closest together.
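The Classical MDS idea can be sketched with NumPy. This is a minimal illustration of the standard double-centering and eigendecomposition approach, not the tool's own implementation; the function name classical_mds and the sample data (four corners of a unit square) are assumptions made for the example.

```python
import numpy as np

def classical_mds(dist, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to get an inner-product (Gram) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; reorder so the largest come first.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale the leading eigenvectors by the square roots of their eigenvalues.
    # Tiny negative eigenvalues from rounding are clipped to zero first.
    top = np.clip(eigvals[:n_dims], 0.0, None)
    coords = eigvecs[:, :n_dims] * np.sqrt(top)
    return coords, eigvals

# Made-up data: distances between the four corners of a unit square.
s = np.sqrt(2.0)
square = [[0, 1, s, 1],
          [1, 0, 1, s],
          [s, 1, 0, 1],
          [1, s, 1, 0]]
coords, eigvals = classical_mds(square, n_dims=2)
```

The recovered coordinates reproduce the input distances, but the square may come back rotated or mirrored, which is why the orientation of the map carries no meaning.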
An example use of Classical MDS is taking the straight-line distances between cities across the USA and producing a map of the USA. An example use of Isometric MDS is producing a multidimensional food chart based on how similar or different the nutritional value is between food items, where the ranking of the distances is more important than specific coordinate units. These methods are often used in a marketing research context to obtain the number and nature of the perceptual dimensions used by customers to judge the similarity between different items.
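The rank-order property that Isometric MDS aims for can be checked with a few lines of Python. This is only a sketch of the criterion, not of the fitting algorithm; the helper name rank_order_preserved and the three-item data are made up, and the check assumes no two pairs have exactly the same dissimilarity.

```python
from itertools import combinations
from math import dist

def rank_order_preserved(dissim, coords):
    """Return True if map distances sort item pairs in the same order as dissim."""
    pairs = list(combinations(range(len(coords)), 2))
    by_input = sorted(pairs, key=lambda p: dissim[p[0]][p[1]])
    by_map = sorted(pairs, key=lambda p: dist(coords[p[0]], coords[p[1]]))
    return by_input == by_map

# Made-up dissimilarities for three items: A-B = 1, B-C = 3, A-C = 5.
dissim = [[0, 1, 5],
          [1, 0, 3],
          [5, 3, 0]]
good_map = [(0.0,), (1.0,), (3.0,)]  # map distances 1, 2, 3: same ranking
bad_map = [(0.0,), (3.0,), (1.0,)]   # A-B is now the largest map distance

print(rank_order_preserved(dissim, good_map))  # True
print(rank_order_preserved(dissim, bad_map))   # False
```

The first map counts as a match even though its distances (1, 2, 3) differ from the input values (1, 3, 5), because only the ordering of the pairs is compared.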
Gallery tool
This tool is not automatically installed with Alteryx Designer or the R tools. To use this tool, download it from the Alteryx Analytics Gallery.
Connect an input
A data stream configured in either of the following 2 ways:
- A 3-column stream with each row containing the names of an item pair and their dissimilarity.
- An MxM matrix with each column representing an item, each row representing an item, and each intersection representing the dissimilarity value. For more information see https://en.wikipedia.org/wiki/Distance_matrix.
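The two input layouts carry the same information. As an illustration only (this is not the tool's code, and the function name pairs_to_matrix and the sample values are hypothetical), the 3-column form can be turned into the matrix form like this, with a check that every pair has been supplied:

```python
def pairs_to_matrix(rows):
    """Build a symmetric dissimilarity matrix from (item_a, item_b, value) rows."""
    items = sorted({name for a, b, _ in rows for name in (a, b)})
    index = {name: i for i, name in enumerate(items)}
    n = len(items)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 0.0  # an item has zero dissimilarity to itself
    for a, b, value in rows:
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = float(value)
    # Collect any item pair that never appeared in the input rows.
    missing = [(items[i], items[j])
               for i in range(n) for j in range(i + 1, n)
               if matrix[i][j] is None]
    if missing:
        raise ValueError(f"missing dissimilarities for pairs: {missing}")
    return items, matrix

# Made-up pairwise dissimilarities for three items.
rows = [("A", "B", 3.0), ("A", "C", 4.0), ("B", "C", 5.0)]
items, matrix = pairs_to_matrix(rows)
# items  -> ['A', 'B', 'C']
# matrix -> [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
```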
Configure the tool
Use the Model Options tab to configure your model.
- Choose Input Type: Select whether to use the 3-column pairwise approach or distance matrix approach for input of dissimilarity information. You must define all pair distances in either case; otherwise an error is thrown.
- Number of Dimensions to Output: Select the number of dimensions that the map and data will contain in the Data and Plot outputs. Use the eigenvalue plot in the report to judge how much variance each dimension captures and to choose the best number of dimensions.
- Choose Multi-Dimensional Scaling Method: Choose between the Classical MDS and Isometric MDS algorithms.
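One common way to read eigenvalues when choosing the number of dimensions is to look at each eigenvalue's share of the total. The sketch below assumes that approach; the function name variance_share and the eigenvalues are made up for illustration and do not come from the tool's report.

```python
from itertools import accumulate

def variance_share(eigenvalues):
    """Return each eigenvalue's share of the total and the running (cumulative) share."""
    # Negative eigenvalues carry no usable variance here, so treat them as zero.
    positive = [max(v, 0.0) for v in eigenvalues]
    total = sum(positive)
    shares = [v / total for v in positive]
    cumulative = list(accumulate(shares))
    return shares, cumulative

# Hypothetical eigenvalues, largest first.
eigenvalues = [6.0, 3.0, 0.75, 0.25]
shares, cumulative = variance_share(eigenvalues)
```

With these made-up values, the first two dimensions account for about 90% of the total, which would suggest that a two-dimensional map is a reasonable choice.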
Use the Plot Options tab to set controls for the output plot.
- Comma separated list of dimensions to flip: The item coordinates of each dimension listed here are multiplied by -1. The MDS algorithms pick dimension polarity arbitrarily, so user input can sometimes help. For instance, a map of the USA created from distances between cities may come out mirrored relative to the known orientation.
- Bar Plot of Eigenvalues: This check box determines whether the eigenvalues and their explanation are included in the report output. The bar plot helps you choose the number of dimensions to keep in the map of the data, mainly by showing the point at which additional dimensions incorporate only noise or spurious data into the map.
- Replace item names with numbers in graph for visibility?: The map may contain too many items to tell one name from another. This check box determines whether all item names are converted into number IDs (for example, x1, x2, x3, ... x987, x988 instead of 'jack', 'jill', 'banana', and so on).
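The effect of the flip list can be sketched as follows. The function name flip_dimensions is hypothetical, and the sketch assumes that dimensions are numbered starting at 1; it is not the tool's own code.

```python
def flip_dimensions(coords, flip_list):
    """Multiply the listed 1-based dimensions of every item coordinate by -1."""
    flips = {int(tok) for tok in flip_list.split(",") if tok.strip()}
    return [
        [-v if dim in flips else v for dim, v in enumerate(row, start=1)]
        for row in coords
    ]

# Made-up coordinates for two items in two dimensions.
coords = [[1.0, 2.0], [-3.0, 0.5]]
flipped = flip_dimensions(coords, "2")
# flipped -> [[1.0, -2.0], [-3.0, -0.5]]
```

Flipping a dimension mirrors the map but leaves every distance between items unchanged, so the flipped map is an equally valid solution.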
Use the Graphics Options tab to set the controls for the graphical output.
- Plot size: Select inches or centimeters for the size of the graph.
- Graph resolution: Select the resolution of the graph in dots per inch: 1x (96 dpi); 2x (192 dpi); or 3x (288 dpi). Lower resolution creates a smaller file and is best for viewing on a monitor. Higher resolution creates a larger file with better print quality.
- Base font size (points): Select the size of the font in the graph.
View the output
Connect a Browse tool to each output anchor to view results.
- D anchor: [Data] Contains entries for each item and each dimension's coordinate value.
- P anchor: [Plot] Contains report outputs with graphic settings as declared in the tool configuration: (Optional) a table and graph depicting the variance of each dimension, with an explanation of what eigenvalues are; plots of each dimension pair (e.g., {1,2};{1,3};{1,4};{2,3};{2,4};{3,4}) with each item represented by name or (optionally) a numeric identifier.
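The dimension pairs listed for the P anchor are the unordered pairs of the output dimensions. A short sketch (the helper name dimension_pairs is hypothetical) shows how the number of plots grows with the number of dimensions:

```python
from itertools import combinations

def dimension_pairs(n_dims):
    """List every unordered pair of dimensions, numbered from 1."""
    return list(combinations(range(1, n_dims + 1), 2))

print(dimension_pairs(4))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Four dimensions give six pairs, and in general n dimensions give n(n-1)/2 pairs, so a large number of output dimensions produces a long report.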