Abstract
We introduce a new estimator for the vector of coefficients $\beta$ in the
linear model $y=X\beta+z$, where $X$ has dimensions $n\times p$ with $p$
possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation,
is the solution to
\[\min_{b\in\mathbb{R}^p}\frac{1}{2}\Vert y-Xb\Vert_{\ell_2}^2+\lambda_1\vert b\vert_{(1)}+\lambda_2\vert b\vert_{(2)}+\cdots+\lambda_p\vert b\vert_{(p)},\]
where $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0$ and
$\vert b\vert_{(1)}\ge\vert b\vert_{(2)}\ge\cdots\ge\vert b\vert_{(p)}$ are the
decreasing absolute values of the entries of $b$. This is a convex program and
we demonstrate a solution algorithm whose computational complexity is roughly
comparable to that of classical $\ell_1$ procedures such as the Lasso. Here,
the regularizer is a sorted $\ell_1$ norm, which penalizes the regression
coefficients according to their rank: the higher the rank (that is, the
stronger the signal), the larger the penalty. This is similar to the Benjamini
and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH),
which compares more significant $p$-values with more stringent thresholds. One
notable choice of the sequence $\{\lambda_i\}$ is given by the BH critical
values $\lambda_{\mathrm{BH}}(i)=z(1-i\cdot q/(2p))$, where $q\in(0,1)$ and
$z(\alpha)$ is the $\alpha$-quantile of a standard normal distribution. SLOPE aims to
provide finite sample guarantees on the selected model; of special interest is
the false discovery rate (FDR), defined as the expected proportion of
irrelevant regressors among all selected predictors. Under orthogonal designs,
SLOPE with $\lambda_{\mathrm{BH}}$ provably controls FDR at level $q$.
Moreover, it appears to have appreciable inferential properties under more
general designs $X$ while having substantial power, as demonstrated in a series
of experiments run on both simulated and real data.
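
To make the quantities in the abstract concrete, here is a minimal sketch in plain Python of the BH critical values, the sorted L1 penalty, the SLOPE objective, and the false discovery proportion (whose expectation is the FDR). The function names and the list-of-rows layout for X are illustrative assumptions, not taken from the paper or from any SLOPE software package. The sketch only evaluates these quantities; it does not implement the solution algorithm mentioned above.

```python
from statistics import NormalDist


def bh_lambdas(p, q):
    """BH critical values lambda_BH(i) = z(1 - i*q/(2p)) for i = 1, ..., p."""
    if not 0 < q < 1:
        raise ValueError("q must lie in (0, 1)")
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    return [z(1 - i * q / (2 * p)) for i in range(1, p + 1)]


def sorted_l1(b, lambdas):
    """Sorted L1 norm: sum_i lambda_i * |b|_(i), where |b|_(1) >= ... >= |b|_(p)."""
    magnitudes = sorted((abs(v) for v in b), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, magnitudes))


def slope_objective(y, X, b, lambdas):
    """0.5 * ||y - X b||_2^2 + sorted L1 penalty; X is a list of n rows of length p."""
    residuals = [yi - sum(xij * bj for xij, bj in zip(row, b))
                 for yi, row in zip(y, X)]
    return 0.5 * sum(r * r for r in residuals) + sorted_l1(b, lambdas)


def false_discovery_proportion(selected, true_support):
    """Share of selected indices that are irrelevant (0 when nothing is selected)."""
    selected = set(selected)
    if not selected:
        return 0.0
    return len(selected - set(true_support)) / len(selected)


# Hypothetical usage: four regressors, target FDR level q = 0.1.
lambdas = bh_lambdas(p=4, q=0.1)  # strictly decreasing, all positive
penalty = sorted_l1([1.0, -3.0, 2.0, 0.0], lambdas)
```

Because the largest weight is always paired with the largest absolute coefficient, the penalty does not depend on the order in which the entries of b are listed.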