Influence Diagnostics for High-Dimensional Lasso Regression

Authors

Rajaratnam, Bala
Roberts, Steven
Sparks, Doug
Yu, Honglin

Publisher

American Statistical Association

Abstract

The increased availability of high-dimensional data and the appeal of a “sparse” solution have made penalized likelihood methods commonplace. Arguably the most widely utilized of these methods is ℓ1 regularization, popularly known as the lasso. When the lasso is applied to high-dimensional data, observations are relatively few; thus, each observation can potentially have tremendous influence on model selection and inference. Hence, a natural question in this context is the identification and assessment of influential observations. We address this by extending the framework for assessing estimation influence in traditional linear regression, and demonstrate that it is equally, if not more, relevant for assessing model selection influence for high-dimensional lasso regression. Within this framework, we propose four new “deletion methods” for gauging the influence of an observation on lasso model selection: df-model, df-regpath, df-cvpath, and df-lambda. Asymptotic cut-offs for each measure are developed, even when p → ∞. We illustrate that in high-dimensional settings, individual observations can have a tremendous impact on lasso model selection. We demonstrate that application of our measures can help reveal relationships in high-dimensional real data that may otherwise remain hidden. Supplementary materials for this article are available online.

Source

Journal of Computational and Graphical Statistics

Access Statement

Open Access

License Rights

Creative Commons Attribution-NonCommercial-NoDerivatives License
