Multiple linear regression
b = regress(y,X)
[b,bint] = regress(y,X)
[b,bint,r] = regress(y,X)
[b,bint,r,rint] = regress(y,X)
[b,bint,r,rint,stats] = regress(y,X)
[...] = regress(y,X,alpha)
b = regress(y,X) returns
a p-by-1 vector b of coefficient
estimates for a multilinear regression of the responses in y on
the predictors in X. X is an n-by-p matrix
of p predictors at each of n observations. y is
an n-by-1 vector of observed responses. regress treats NaNs
in X or y as missing values,
and ignores them.
If the columns of X are linearly dependent, regress obtains
a basic solution by setting the maximum number of elements of b to
zero.
[b,bint] = regress(y,X) returns
a p-by-2 matrix bint of 95%
confidence intervals for the coefficient estimates. The first column
of bint contains lower confidence bounds for each
of the p coefficient estimates; the second column
contains upper confidence bounds.
If the columns of X are linearly dependent, regress returns
zeros in elements of bint corresponding to the
zero elements of b.
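A minimal sketch of this behavior, using a made-up rank-deficient design matrix (the data here are illustrative, not from the documentation):

```matlab
% Hypothetical example: column 3 duplicates column 2, so X is rank deficient
n = 10;
x = (1:n)';
X = [ones(n,1) x x];          % linearly dependent columns
y = 3 + 2*x + randn(n,1);
[b,bint] = regress(y,X);
b        % regress sets as many elements of b as possible to zero
bint     % rows corresponding to zeroed coefficients are [0 0]
```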
[b,bint,r] = regress(y,X) returns
an n-by-1 vector r of residuals.
[b,bint,r,rint] = regress(y,X) returns
an n-by-2 matrix rint of intervals
that can be used to diagnose outliers. If the interval rint(i,:) for
observation i does not contain zero, the corresponding
residual is larger than expected in 95% of new observations, suggesting
that the observation is an outlier.
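For example, the residual intervals can be scanned for those that exclude zero (this sketch assumes y and X are already defined as described above):

```matlab
[b,bint,r,rint] = regress(y,X);
% An observation is flagged when its interval does not contain zero,
% i.e. when the lower bound is positive or the upper bound is negative
outliers = find(rint(:,1) > 0 | rint(:,2) < 0);
```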
In a linear model, observed values of y are
random variables, and so are their residuals. Residuals have normal
distributions with zero mean but with different variances at different
values of the predictors. To put residuals on a comparable scale,
they are "Studentized," that is, they are divided by
an estimate of their standard deviation that is independent of their
value. Studentized residuals have t distributions
with known degrees of freedom. The intervals returned in rint are
shifts of the 95% confidence intervals of these t distributions,
centered at the residuals.
[b,bint,r,rint,stats] = regress(y,X) returns
a 1-by-4 vector stats that contains, in order,
the R2 statistic,
the F statistic and its p value,
and an estimate of the error variance.
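The elements of stats can be unpacked by position, as sketched below (assuming y and X are already defined; the model should include a constant column for the R2 and F statistics to be meaningful):

```matlab
[b,bint,r,rint,stats] = regress(y,X);
R2   = stats(1);   % coefficient of determination
F    = stats(2);   % F statistic for the full model
pval = stats(3);   % p value of the F statistic
s2   = stats(4);   % estimate of the error variance
```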
[...] = regress(y,X,alpha) uses
a 100*(1-alpha)% confidence level to compute bint and rint.
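For instance, passing alpha = 0.01 widens the intervals to a 99% confidence level (again assuming y and X are defined):

```matlab
[b,bint] = regress(y,X,0.01);   % bint now holds 99% confidence intervals
```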
Load data on cars; identify weight and horsepower as predictors,
mileage as the response:
load carsmall
x1 = Weight;
x2 = Horsepower;    % Contains NaN data
y = MPG;
Compute regression coefficients for a linear model with an interaction
term:
X = [ones(size(x1)) x1 x2 x1.*x2];
b = regress(y,X)    % Removes NaN data
b =
   60.7104
   -0.0102
   -0.1882
    0.0000
Plot the data and the model:
scatter3(x1,x2,y,'filled')
hold on
x1fit = min(x1):100:max(x1);
x2fit = min(x2):10:max(x2);
[X1FIT,X2FIT] = meshgrid(x1fit,x2fit);
YFIT = b(1) + b(2)*X1FIT + b(3)*X2FIT + b(4)*X1FIT.*X2FIT;
mesh(X1FIT,X2FIT,YFIT)
xlabel('Weight')
ylabel('Horsepower')
zlabel('MPG')
view(50,10)
 Chatterjee, S., and A. S. Hadi. "Influential
Observations, High Leverage Points, and Outliers in Linear Regression." Statistical
Science. Vol. 1, 1986, pp. 379–416.