stops {stops} | R Documentation |
A package for "structure optimized proximity scaling" (STOPS), a collection of methods that fit nonlinear distance transformations in multidimensional scaling (MDS) and trade off the fit against structure considerations to find optimal parameters or optimal configurations. This includes the three variants of cluster optimized proximity scaling (COPS). The package contains various functions, wrappers, methods and classes for fitting, plotting and displaying different MDS models in a STOPS framework, such as Torgerson scaling, SMACOF, Sammon mapping, elastic scaling, symmetric SMACOF, spherical SMACOF, sstress, rstress, powermds, power elastic scaling, power Sammon mapping, powerstress and isomap. All of these models can also be fit as plain MDS variants (i.e., without structuredness). The package further contains functions for optimization (adaptive LJ search and Bayesian optimization with treed Gaussian processes with jumps to linear models) and functions for various structuredness indices.
stops(dis, loss = c("strain", "stress", "smacofSym", "powerstress", "powermds", "powerelastic", "powerstrain", "elastic", "sammon", "sammon2", "smacofSphere", "powersammon", "rstress", "sstress", "isomap", "isomapeps", "bcstress", "lmds"), theta = 1, structures = c("cclusteredness", "clinearity", "cdependence", "cmanifoldness", "cassociation", "cnonmonotonicity", "cfunctionality", "ccomplexity", "cfaithfulness", "cregularity", "chierarchy", "cconvexity", "cstriatedness", "coutlying", "cskinniness", "csparsity", "cstringiness", "cclumpiness", "cinequality"), ndim = 2, weightmat = NULL, init = NULL, stressweight = 1, strucweight, strucpars, optimmethod = c("SANN", "ALJ", "pso", "Kriging", "tgp"), lower, upper, verbose = 0, type = c("additive", "multiplicative"), s = 5, initpoints = 10, itmax = 50, model, ...)
dis |
numeric matrix or dist object of a matrix of proximities |
loss |
which loss function to be used for fitting, defaults to stress |
theta |
parameters for the transformation functions. If the vector is shorter than the number of parameters of the MDS version, it gets recycled (see the corresponding stop_XXX function). If it is longer than the number of parameters of the MDS method, an error is thrown. If theta is missing entirely, it is set to 1 and recycled. |
structures |
what c-structuredness should be considered; if missing no structure is considered. |
ndim |
number of dimensions of the target space |
weightmat |
(optional) a matrix of nonnegative weights; defaults to 1 for all off diagonals |
init |
(optional) initial configuration |
stressweight |
weight to be used for the fit measure; defaults to 1 |
strucweight |
weight to be used for the cordillera; defaults to -1/length(structures) |
strucpars |
A list (possibly named by structure) holding each structure's parameters, either as vectors with named elements or as a list of lists for the structuredness indices, so its form is either
optimmethod |
What optimizer to use. Currently supported are Bayesian optimization with Gaussian Process priors and Kriging ("Kriging"), Bayesian optimization with treed Gaussian processes ("tgp"), Adaptive LJ Search ("ALJ"), Particle Swarm optimization ("pso"), simulated annealing ("SANN"). Defaults to ALJ version. |
lower |
The lower constraints of the search region. Needs to be a numeric vector of the same length as the parameter vector theta. |
upper |
The upper constraints of the search region. Needs to be a numeric vector of the same length as the parameter vector theta. |
verbose |
numeric value that controls how much information on the fitting process is printed; >2 is pretty verbose. |
type |
which aggregation for the multi objective target function? Either 'additive' (default) or 'multiplicative' |
s |
number of particles if pso is used |
initpoints |
number of initial points to fit the surrogate model for Bayesian optimization; default is 10 |
itmax |
maximum number of iterations; number of steps of Bayesian optimization if Kriging or tgp is used; default is 50. We recommend a higher number for ALJ (around 150) and a lower number for Bayesian optimization (around 20). Note that with tgp the actual number of evaluations of the MDS method is between itmax and 6*itmax, as tgp samples 1-6 candidates from the posterior and uses the best candidate. |
model |
a character specifying the surrogate model to use. For Kriging it specifies the covariance kernel for the GP prior; see |
... |
additional arguments to be passed to the optimization procedure |
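How theta, stressweight, strucweight and type interact can be illustrated with a small base R sketch. It does not call the package: recycle_theta() and aggregate_loss() are hypothetical helpers, and the two aggregation formulas are an assumption based on the additive and multiplicative forms described in Rusch et al. (2019), not a copy of the package code.

```r
# Hypothetical helper: recycle theta to the number of transformation
# parameters of the chosen loss, and fail if theta is too long.
recycle_theta <- function(theta = 1, npar) {
  if (length(theta) > npar) {
    stop("theta has more elements than the loss has parameters")
  }
  rep_len(theta, npar)
}

# Hypothetical helper: combine a badness-of-fit value with structuredness
# index values. Assumed forms:
#   additive:       stressweight * stress + sum(strucweight * struc)
#   multiplicative: stress^stressweight * prod(struc^strucweight)
aggregate_loss <- function(stress, struc, stressweight = 1,
                           strucweight = rep(-1 / length(struc), length(struc)),
                           type = c("additive", "multiplicative")) {
  type <- match.arg(type)
  if (type == "additive") {
    stressweight * stress + sum(strucweight * struc)
  } else {
    stress^stressweight * prod(struc^strucweight)
  }
}

# A single theta value recycled to three transformation parameters
theta3 <- recycle_theta(2, npar = 3)

# Two structuredness values with the default weights -1/length(structures)
loss_add <- aggregate_loss(stress = 0.2, struc = c(0.5, 0.3))
loss_mul <- aggregate_loss(stress = 0.2, struc = c(0.5, 0.3),
                           type = "multiplicative")
```

With negative structure weights, a configuration with more structure receives a lower aggregated loss, which is why the default strucweight is negative.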
The stops package provides five categories of important functions:
Models & Algorithms:
stops() ... which fits STOPS models as described in Rusch et al. (2019). By setting cordweight or strucweight to zero it can also be used to fit metric MDS for many different models, see below.
powerStressMin() ... a workhorse for fitting s-stress, r-stress (De Leeuw, 2014), Sammon mapping with power transformations (powersammon) and elastic scaling with power transformation (powerelastic). These models are most conveniently accessed via stops() with stressweight=1 and cordweight or strucweight=0, or via the dedicated functions named stop_foo, where foo is the method, again with stressweight=1 and strucweight=0. It uses the nested majorization algorithm for r-stress of De Leeuw (2014).
bcStressMin()... a workhorse for fitting Box-Cox stress (Chen & Buja, 2013).
lmds()... a workhorse for the local MDS of Chen & Buja (2008).
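To make the power transformations concrete, the following base R sketch evaluates an unnormalized power stress for a given configuration. It is an illustration under assumptions: the criterion is taken to be the weighted sum of squared differences between dissimilarities raised to lambda and fitted distances raised to kappa, the package may normalize or parameterize this differently, and raw_powerstress() is a hypothetical name. The configuration comes from stats::cmdscale() applied to the built-in eurodist data.

```r
# Hypothetical helper: unnormalized power stress of a configuration X.
# delta: dist object of dissimilarities; kappa, lambda: exponents for the
# fitted distances and the dissimilarities; w: optional weights.
raw_powerstress <- function(X, delta, kappa = 1, lambda = 1, w = 1) {
  d <- as.vector(dist(X))      # fitted Euclidean distances
  del <- as.vector(delta)      # observed dissimilarities, same pair order
  sum(as.vector(w) * (del^lambda - d^kappa)^2)
}

# Torgerson (classical) scaling from base R supplies a configuration
X0 <- cmdscale(eurodist, k = 2)

# kappa = lambda = 1 compares distances with dissimilarities directly
stress_raw <- raw_powerstress(X0, eurodist)

# kappa = lambda = 2 compares squared distances with squared dissimilarities
sstress_raw <- raw_powerstress(X0, eurodist, kappa = 2, lambda = 2)
```

Evaluating the same configuration under different exponents shows how the transformation parameters in theta change what counts as a good fit.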
Structuredness Indices: Various c-structuredness as c_foo(), where foo is the name of the structuredness. See Rusch et al. (2019).
Optimization functions:
ljoptim() ... an (adaptive) version of the Luus-Jaakola random search.
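The idea behind a Luus-Jaakola style random search can be sketched in a few lines of base R. This is a textbook-style illustration, not the package's ljoptim(): the function name lj_sketch(), its arguments and the shrink factor are assumptions, and the adaptive refinements of the package version are not reproduced.

```r
# Minimal Luus-Jaakola-style random search (not the package's ljoptim()).
# fn: objective to minimize; lower/upper: box constraints;
# shrink: factor applied to the sampling radius after an unsuccessful step.
lj_sketch <- function(fn, x0, lower, upper, itmax = 500, shrink = 0.95) {
  x <- x0
  fx <- fn(x)
  radius <- upper - lower                      # initial sampling range
  for (i in seq_len(itmax)) {
    cand <- x + runif(length(x), -radius, radius)
    cand <- pmin(pmax(cand, lower), upper)     # keep the candidate in the box
    fcand <- fn(cand)
    if (fcand < fx) {
      x <- cand                                # accept an improving candidate
      fx <- fcand
    } else {
      radius <- radius * shrink                # contract the search region
    }
  }
  list(par = x, value = fx)
}

set.seed(1)
quad <- function(p) sum((p - c(1, 2))^2)       # toy objective, minimum at (1, 2)
fit <- lj_sketch(quad, x0 = c(5, 5), lower = c(0, 0), upper = c(10, 10))
```

Because candidates are only accepted when they improve the objective, the returned value is never worse than the value at the starting point.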
Wrappers and convenience functions:
conf_adjust(): Procrustes adjustment of configurations
cmdscale(), sammon(): wrappers that return S3 objects
stop_smacofSym(), stop_sammon(), stop_cmdscale(), stop_rstress(), stop_powerstress(), stop_smacofSphere(), stop_sammon2(), stop_elastic(), stop_sstress(), stop_powerelastic(), stop_powersammon(), stop_powermds(), stop_isomap(), stop_isomapeps(), stop_bcstress(), stops_lmds(): stop versions of these MDS models.
stoploss() ... a function to calculate stoploss (Rusch et al., 2019)
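As an illustration of what a Procrustes adjustment of configurations does, the following base R sketch aligns one configuration to another by translation, rotation or reflection, and dilation. It is a textbook orthogonal Procrustes solution based on the singular value decomposition; procrustes_sketch() is a hypothetical helper and does not reflect the interface or the return value of conf_adjust().

```r
# Hypothetical helper: translate, rotate/reflect and scale `conf` so that it
# matches `target` as closely as possible in the least squares sense.
procrustes_sketch <- function(target, conf) {
  tc <- scale(target, scale = FALSE)     # center both configurations
  cc <- scale(conf, scale = FALSE)
  sv <- svd(crossprod(cc, tc))           # SVD of t(cc) %*% tc
  Q <- sv$u %*% t(sv$v)                  # optimal orthogonal transformation
  s <- sum(sv$d) / sum(cc^2)             # optimal dilation factor
  aligned <- s * cc %*% Q
  sweep(aligned, 2, colMeans(target), "+")   # move to the target's centroid
}

set.seed(42)
target <- matrix(rnorm(20), ncol = 2)
angle <- pi / 3
rot <- matrix(c(cos(angle), sin(angle), -sin(angle), cos(angle)), 2, 2)
conf <- 2 * target %*% rot + 5           # rotated, scaled and shifted copy
adjusted <- procrustes_sketch(target, conf)
```

Since conf is an exact similarity transformation of target, the adjusted configuration coincides with target up to numerical error. Such an alignment is useful when comparing MDS solutions, which are only determined up to rotation, reflection and translation.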
Methods: For most of the objects returned by the high-level functions, S3 classes and methods for standard generics are implemented, including print, summary, plot, plot3d and plot3dstatic.
References:
Rusch, T., Mair, P. & Hornik, K. (2019). Structure based hyperparameter selection for nonlinear dimension reduction: The Structure Optimized Proximity Scaling (STOPS) framework. Report 2019/1, Discussion Paper Series, Center for Empirical Research Methods, WU Vienna University of Economics and Business. Forthcoming.
Authors: Thomas Rusch, Lisha Chen, Jan de Leeuw, Patrick Mair, Kurt Hornik
Maintainer: Thomas Rusch
A list with the components
stoploss: the weighted loss value TBD
data(kinshipdelta)
strucpars<-list(list(c(epsilon=10,minpts=2)),NULL)
dissm<-as.matrix(kinshipdelta)

#STOPS with strain
resstrain<-stops(dissm,loss="strain",
    structures=c("cclusteredness","cdependence"),strucpars=strucpars,
    optimmethod="ALJ",lower=0,upper=10)
resstrain
summary(resstrain)
plot(resstrain)
plot(resstrain,"Shepard")

#STOPS with stress
resstress<-stops(dissm,loss="stress",
    structures=c("cclusteredness","cdependence"),strucpars=strucpars,
    optimmethod="ALJ",lower=0,upper=10)
resstress
summary(resstress)
plot(resstress)
plot(resstress,"Shepard")

#STOPS with powerstress
respstress<-stops(dissm,loss="powerstress",
    structures=c("cclusteredness","cdependence"),strucpars=strucpars,
    weightmat=dissm,optimmethod="ALJ",lower=c(0,0,1),upper=c(10,10,10))
respstress
summary(respstress)
plot(respstress)
plot(respstress,"Shepard")

#STOPS with bcstress
resbcstress<-stops(dissm,loss="bcstress",
    structures=c("cclusteredness","cdependence"),strucpars=strucpars,
    optimmethod="ALJ",lower=c(0,1,0),upper=c(10,10,10))
resbcstress
summary(resbcstress)
plot(resbcstress)
plot(resbcstress,"Shepard")

#STOPS with lmds
reslmds<-stops(dissm,loss="lmds",
    structures=c("cclusteredness","clinearity"),strucpars=strucpars,
    optimmethod="ALJ",lower=c(2,0),upper=c(10,2))
reslmds
summary(reslmds)
plot(reslmds)
plot(reslmds,"Shepard")

#STOPS with Isomap (the epsilon version)
resiso<-stops(dissm,loss="isomapeps",
    structures=c("cclusteredness","clinearity"),strucpars=strucpars,
    optimmethod="ALJ",lower=47,upper=120)
resiso
summary(resiso)
plot(resiso)
plot(resiso,"Shepard")

data(BankingCrisesDistances)
strucpar<-list(c(eps=10,minpts=2),NULL)
res1<-stops(BankingCrisesDistances[,1:69],loss="stress",verbose=0,
    structures=c("cclusteredness","clinearity"),strucpars=strucpar,
    lower=0,upper=10)
res1

strucpar<-list(list(alpha=1,C=15,var.thr=1e-5,eps=NULL),
    list(alpha=1,C=15,var.thr=1e-5,eps=NULL))
res1<-stops(BankingCrisesDistances[,1:69],loss="stress",verbose=0,
    structures=c("cfunctionality","ccomplexity"),strucpars=strucpar,
    lower=0,upper=10)
res1