lasso
Regularized least-squares regression using lasso or elastic net algorithms
Syntax
B = lasso(X,Y)
[B,FitInfo] = lasso(X,Y)
[B,FitInfo] = lasso(X,Y,Name,Value)
Description
B = lasso(X,Y) returns fitted least-squares regression coefficients for a set of regularization coefficients Lambda.
[B,FitInfo] = lasso(X,Y) returns a structure containing information about the fits.
[B,FitInfo] = lasso(X,Y,Name,Value) fits regularized regressions with additional options specified by one or more Name,Value pair arguments.
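As a minimal sketch (the synthetic data and coefficient values here are illustrative, not from the documentation), a default call fits the whole regularization path at once:

rng default
X = randn(100,5);                          % 100 observations, 5 predictors
Y = X*[2; 0; 0; -3; 0] + randn(100,1);    % sparse true coefficients plus noise
[B, FitInfo] = lasso(X,Y);                % one column of B per value in FitInfo.Lambda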
Input Arguments
X | Numeric matrix with n rows, one row per observation, and p columns, one column per predictor variable. |
Y | Numeric response vector of length n, where n is the number of rows of X. |
Name-Value Pair Arguments
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
'Alpha' | Scalar value in the interval (0,1] representing the weight of lasso (L1) versus ridge (L2) optimization. Alpha = 1 represents lasso regression, Alpha close to 0 approaches ridge regression, and intermediate values represent elastic net optimization (see the sketch after this list). See Definitions. Default: 1 |
'CV' | Method lasso uses to estimate mean squared error: 'resubstitution' (use X and Y both to fit the models and to estimate error), a positive integer K (K-fold cross validation), or a cvpartition object. Default: 'resubstitution' |
'DFmax' | Maximum number of nonzero coefficients in the model. lasso returns results only for Lambda values that satisfy this criterion. Default: Inf |
'Lambda' | Vector of nonnegative Lambda values. See Definitions. Default: Geometric sequence of NumLambda values, the largest just sufficient to produce B = 0 |
'LambdaRatio' | Positive scalar, the ratio of the smallest to the largest Lambda value when you do not set Lambda. If you set LambdaRatio = 0, lasso generates a default sequence of Lambda values and replaces the smallest one with 0. Default: 1e-4 |
'MCReps' | Positive integer, the number of Monte Carlo repetitions for cross validation. Default: 1 |
'NumLambda' | Positive integer, the number of Lambda values lasso uses when you do not set Lambda. lasso can return fewer than NumLambda fits if the residual error of the fits drops below a threshold fraction of the variance of Y. Default: 100 |
'Options' | Structure that specifies whether to cross validate in parallel, and specifies the random stream or streams. Create the Options structure with statset. Option fields: UseParallel, UseSubstreams, and Streams. Default: serial computation, default random streams |
'PredictorNames' | Cell array of strings representing names of the predictor variables, in the order in which they appear in X. Default: {} |
'RelTol' | Convergence threshold for the coordinate descent algorithm (see Friedman, Tibshirani, and Hastie [3]). The algorithm terminates when successive estimates of the coefficient vector differ in the L2 norm by a relative amount less than RelTol. Default: 1e-4 |
'Standardize' | Boolean value specifying whether lasso scales X before fitting the models. Default: true |
'Weights' | Observation weights, a nonnegative vector of length n, where n is the number of rows of X. lasso scales Weights to sum to 1. Default: 1/n * ones(n,1) |
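As a sketch combining several of these options (the data and the specific parameter values are illustrative), the call below requests an elastic net fit over a user-supplied Lambda sequence with 10-fold cross validation, then plots the cross-validated MSE:

rng default                                 % illustrative synthetic data
X = randn(200,10);
Y = X(:,2) - 2*X(:,7) + randn(200,1);
[B, FitInfo] = lasso(X, Y, ...
    'Alpha', 0.5, ...                       % equal weighting of L1 and L2: elastic net
    'Lambda', logspace(-3,0,25), ...        % fixed Lambda sequence instead of the default
    'CV', 10);                              % 10-fold cross validation
lassoPlot(B, FitInfo, 'PlotType', 'CV')     % cross-validated MSE versus Lambda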
Output Arguments
B | Fitted coefficients, a p-by-L matrix, where p is the number of predictors (columns) in X and L is the number of Lambda values. Each column of B corresponds to one Lambda value. |
FitInfo | Structure containing information about the fits, including the Lambda sequence, the intercepts, the degrees of freedom, and the mean squared errors; with cross validation it also includes fields such as IndexMinMSE and Index1SE. |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% File: Lasso.m
% Author: Jinzhu Jia
%
% One simple (coordinate descent) implementation of the Lasso
% The objective function is:
% 1/2*sum((Y - X*beta - beta0).^2) + lambda * sum(abs(beta))
% A comparison against CVX code has been done
% Reference: To be uploaded
%
% Version 4.23.2010
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [beta0,beta] = Lasso(X,Y,lambda)
n = size(X,1);
p = size(X,2);
if size(Y,1) ~= n
    error('Lasso: the number of rows of X and Y do not match!')
end
beta0 = 0;
beta = zeros(p,1);
crit = 1;
step = 0;                       % iteration counter
while crit > 1e-5
    step = step + 1;
    obj_ini = 1/2*sum((Y - X*beta - beta0).^2) + lambda*sum(abs(beta));
    % Update the intercept: the unpenalized minimizer given the current beta
    beta0 = mean(Y - X*beta);
    % Cycle through the coordinates, soft-thresholding each one in turn.
    % The residual must include the intercept to match the objective above.
    for j = 1:p
        a = sum(X(:,j).^2);
        b = sum((Y - X*beta - beta0).*X(:,j)) + beta(j)*a;
        if lambda >= abs(b)
            beta(j) = 0;
        elseif b > 0
            beta(j) = (b - lambda)/a;
        else
            beta(j) = (b + lambda)/a;
        end
    end
    obj_now = 1/2*sum((Y - X*beta - beta0).^2) + lambda*sum(abs(beta));
    % Relative decrease of the objective between sweeps
    crit = abs(obj_now - obj_ini)/abs(obj_ini);
end
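A quick sanity check for Lasso.m (a sketch with illustrative data, not part of the original file) is to compare it against the built-in lasso at a single penalty value. The built-in solver minimizes 1/(2n)*sum((Y - X*b - b0).^2) + Lambda*sum(abs(b)) and standardizes X by default, so Lambda is set to lambda/n and 'Standardize' to false to match the objective above.

n = 50; p = 4;
rng(0);                                     % illustrative data
X = randn(n,p);
Y = X*[2; 0; -1; 0] + 0.1*randn(n,1);
lambda = 5;                                 % penalty on the unnormalized objective
[beta0, beta] = Lasso(X, Y, lambda);
[Bhat, Fit] = lasso(X, Y, 'Lambda', lambda/n, 'Standardize', false);
disp([beta Bhat])                           % coefficient vectors should agree closely
disp([beta0 Fit.Intercept])                 % intercepts should agree closely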
-------------------------------------------------------------------------
%% Lasso regularization example
% Copyright (c) 2011, The MathWorks, Inc.
%% Introduction to using LASSO
% This demo explains how to start using the lasso functionality introduced
% in R2011b. It is motivated by an example in Tibshirani's original paper
% on the lasso:
% Tibshirani, R. (1996). Regression shrinkage and selection via the lasso.
% J. Royal. Statist. Soc B., Vol. 58, No. 1, pages 267-288.
% The data set that we're working with in this demo is a wide
% dataset with correlated variables. This data set includes 8 different
% variables and only 20 observations. 5 out of the 8 variables have
% coefficients of zero. These variables have zero impact on the model. The
% other three variables have nonzero values and impact the model.
%% Clean up workspace and set random seed
clear all
clc
% Set the random number stream
rng(1981);
%% Creating data set with specific characteristics
% Create eight X variables.
% The mean of each variable will be equal to zero.
mu = [0 0 0 0 0 0 0 0];
% The variables are correlated with one another.
% The covariance matrix is specified as
i = 1:8;
matrix = abs(bsxfun(@minus,i',i));
covariance = repmat(.5,8,8).^matrix;
% Use these parameters to generate a set of multivariate normal random numbers
X = mvnrnd(mu, covariance, 20);
% Create a hyperplane that describes Y = f(X)
Beta = [3; 1.5; 0; 0; 2; 0; 0; 0];
ds = dataset(Beta);
% Add in a noise vector
Y = X * Beta + 3 * randn(20,1);
%% Use linear regression to fit the model
b = regress(Y,X);
ds.Linear = b;
%% Use a lasso to fit the model
[B, Stats] = lasso(X,Y, 'CV', 5);
disp(B)
disp(Stats)
%% Create a plot showing MSE versus lambda
lassoPlot(B, Stats, 'PlotType', 'CV')
%% Identify a reasonable set of lasso coefficients
% View the regression coefficients associated with Index1SE
ds.Lasso = B(:,Stats.Index1SE);
disp(ds)
%% Create a plot showing coefficient values versus L1 norm
lassoPlot(B, Stats)
%% Run a Simulation
% Preallocate some variables
MSE = zeros(100,1);
mse = zeros(100,1);
Coeff_Num = zeros(100,1);
Betas = zeros(8,100);
cv_Reg_MSE = zeros(1,100);
for i = 1 : 100
    X = mvnrnd(mu, covariance, 20);
    Y = X * Beta + randn(20,1);
    [B, Stats] = lasso(X,Y, 'CV', 5);
    % Pick an index between the minimum-MSE fit and the 1-SE fit
    Shrink = Stats.Index1SE - ceil((Stats.Index1SE - Stats.IndexMinMSE)/2);
    Betas(:,i) = B(:,Shrink) > 0;
    Coeff_Num(i) = sum(B(:,Shrink) > 0);
    MSE(i) = Stats.MSE(:, Shrink);
    % Cross validate an ordinary least-squares fit for comparison
    regf = @(XTRAIN, ytrain, XTEST)(XTEST*regress(ytrain,XTRAIN));
    cv_Reg_MSE(i) = crossval('mse',X,Y,'predfun',regf, 'kfold', 5);
end
Number_Lasso_Coefficients = mean(Coeff_Num);
disp(Number_Lasso_Coefficients)
MSE_Ratio = median(cv_Reg_MSE)/median(MSE);
disp(MSE_Ratio)