Software

testOTM

[GitHub] [Vignette]

testOTM is an R package that computes multivariate ranks and quantiles defined through optimal transport theory. It also provides several applications of these statistics, most notably two-sample multivariate goodness-of-fit testing. The package can also visualize the optimal transport map between a uniform probability measure and any data set. The following interactive plot showcases an optimal transport map between $U[0,1]^3$ and a trivariate Gaussian sample:

An interactive optimal transport map between $U[0,1]^3$ and a (scaled) trivariate Gaussian sample of size $10$ (blue points). The cube (representing the support of $U[0,1]^3$) is partitioned into $10$ convex polyhedra, where every point within each polyhedron is transported to the corresponding sample point. The orange points are the centroids of the polyhedra and indicate the correspondence between sample points and the partitions.
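To make the picture concrete, here is a minimal one-dimensional sketch in Python (not the testOTM API; the function names `ot_map_1d` and `cell_centroids` are hypothetical and chosen only for illustration). In one dimension the optimal transport map from $U[0,1]$ to the empirical measure of a sample, under quadratic cost, is the empirical quantile function: the unit interval splits into $n$ cells of length $1/n$, and every point of the $k$-th cell is sent to the $k$-th smallest sample point. The cell midpoints play the role of the orange centroids in the plot. The sketch assumes distinct sample values.

```python
def ot_map_1d(sample):
    """Return the monotone map T from [0, 1] onto the sample points.

    Assumption for this sketch: sample values are distinct, so the
    k-th cell [k/n, (k+1)/n) is sent to the k-th smallest point.
    """
    xs = sorted(sample)
    n = len(xs)

    def transport(u):
        if not 0.0 <= u <= 1.0:
            raise ValueError("u must lie in [0, 1]")
        k = min(int(u * n), n - 1)  # index of the cell containing u
        return xs[k]

    return transport


def cell_centroids(sample):
    """Map each sample point to the midpoint (centroid) of its cell."""
    xs = sorted(sample)
    n = len(xs)
    return {x: (k + 0.5) / n for k, x in enumerate(xs)}


sample = [0.3, -1.2, 2.0, 0.7, -0.1]
transport = ot_map_1d(sample)
centroids = cell_centroids(sample)

for u in (0.05, 0.5, 0.99):
    print(f"T({u}) = {transport(u)}")
for x in sorted(centroids):
    print(f"sample point {x} <-> cell centroid {centroids[x]:.2f}")
```

In higher dimensions the cells are convex polyhedra rather than intervals, and finding them requires solving an optimization problem, which this sketch does not attempt.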

alocv

[GitHub] [arXiv] [Blog post]

alocv is an R package that efficiently implements the approximate leave-one-out (ALO) cross-validation strategy for common regressors. Leave-one-out cross-validation (LOOCV) is an appealing method for parameter tuning. However, its high computational cost (it requires fitting the model $n$ times) often makes it infeasible in practice. Our proposed method approximates the LOOCV estimates using only the full-data fit and significantly reduces the time needed for risk estimation.

Risk estimates from ALO and LOOCV for an elastic net ($n=1000$ and $p=200$, with a search grid of length $72$), and an RBF kernel SVM ($n=300$ and $p=50$, with a search grid of length $15$) on simulated data sets. ALO took only $0.9$ seconds for the former and $0.6$ seconds for the latter, while LOOCV took $32$ and $130$ seconds, respectively.
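The idea of recovering leave-one-out residuals from a single full-data fit can be illustrated with ridge regression, where the shortcut is a standard exact identity: the leave-one-out residual equals the full-fit residual divided by $1 - H_{ii}$, with $H = X(X^\top X + \lambda I)^{-1}X^\top$. The Python sketch below is not the alocv implementation; it uses ridge without an intercept as a stand-in, and all function names are hypothetical. For models such as the elastic net or SVM, the analogous formula is only approximate.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gram(X, lam):
    """Return X^T X + lam * I."""
    p = len(X[0])
    G = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    for a in range(p):
        G[a][a] += lam
    return G


def ridge_fit(X, y, lam):
    """Ridge coefficients (no intercept) for penalty lam."""
    p = len(X[0])
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    return solve(gram(X, lam), Xty)


def loo_residuals_refit(X, y, lam):
    """Brute-force LOOCV: refit the model n times."""
    res = []
    for i in range(len(X)):
        beta = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        res.append(y[i] - dot(X[i], beta))
    return res


def loo_residuals_shortcut(X, y, lam):
    """LOO residuals from the single full-data fit via leverages H_ii."""
    beta = ridge_fit(X, y, lam)
    G = gram(X, lam)
    res = []
    for x_i, y_i in zip(X, y):
        h_ii = dot(x_i, solve(G, x_i))  # x_i^T (X^T X + lam I)^{-1} x_i
        res.append((y_i - dot(x_i, beta)) / (1.0 - h_ii))
    return res


# Simulated data; the sizes and coefficients are arbitrary choices.
rng = random.Random(0)
n, p, lam = 40, 3, 0.5
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [dot(x, [1.0, -2.0, 0.5]) + rng.gauss(0, 0.1) for x in X]

slow = loo_residuals_refit(X, y, lam)
fast = loo_residuals_shortcut(X, y, lam)
max_gap = max(abs(a - b) for a, b in zip(slow, fast))
loo_risk = sum(r * r for r in fast) / n
print(f"max |refit - shortcut| = {max_gap:.2e}, LOO risk = {loo_risk:.4f}")
```

The refit version solves $n$ separate problems, while the shortcut reuses one fit and one Gram matrix, which is the kind of saving the timings above reflect.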
