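2.7 SAS Programs for Multiple Regression Analysis
Before reading what follows, please read Chapter 1, "Basic Operation of the SAS Software."
2.7.1 Computing the Multiple Regression Equation
The SAS program for a multiple regression equation is similar to the one for a simple (one-variable) regression equation; only the number of variables increases. It is therefore not described in detail here, and a single example is given.
Example 2.20 Compute the multiple regression equation of the wilting degree Y on the protein and proline contents in Table 2-23.
Solution: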
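options linesize = 76;               /* limit listing width to 76 columns */
data mulreg;
  infile 'a:\2-8data.dat';           /* path as given in the text; plain quotes required */
  input y r1 r7 r8 r15 l3 l9 pro;    /* response y followed by seven predictors */
run;
proc reg data = mulreg;
  model y = r1 r7 r8 r15 l3 l9 pro;  /* full model with all seven predictors */
run;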
The output is shown in Table 2-25.
Table 2-25 Multiple regression analysis for Example 2.20
The SAS System
Model: MODEL1
Dependent Variable: Y
Analysis of Variance

                     Sum of        Mean
Source      DF      Squares      Square    F Value    Prob>F

Model        7      0.01213     0.00173      5.532    0.0140
Error        8      0.00251     0.00031
C Total     15      0.01464

Root MSE     0.01770     R-square     0.8288
Dep Mean     0.99496     Adj R-sq     0.6790
C.V.         1.77883

Parameter Estimates

                     Parameter      Standard     T for H0:
Variable    DF        Estimate         Error   Parameter=0    Prob>|T|

INTERCEP     1        0.940788    0.02246040        41.887      0.0001
R1           1        0.000298    0.00019724         1.510      0.1695
R7           1    -0.000099683    0.00008626        -1.156      0.2812
R8           1    -0.000079812    0.00005456        -1.463      0.1816
R15          1     0.000060935    0.00008158         0.747      0.4765
L3           1     0.000090482    0.00006817         1.327      0.2211
L9           1        0.000106    0.00008214         1.287      0.2339
PRO          1       -0.004809    0.04792476        -0.100      0.9225

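In the table, R-square is the square of the multiple correlation coefficient. The regression equation can be read from the Parameter Estimate column.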
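As a quick check of how the summary statistics in Table 2-25 relate to one another, the lines below recompute R-square and Adj R-sq from the printed values and write out the fitted equation from the Parameter Estimate column. The symbols n = 16 (C Total DF plus one) and p = 7 (Model DF) are read from the DF column and are not named in the original text; small discrepancies come from rounding in the printed sums of squares.

```latex
\begin{align*}
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{C Total}}}
     = \frac{0.01213}{0.01464} \approx 0.829 \\
R^2_{\text{adj}} &= 1 - (1 - R^2)\,\frac{n-1}{n-p-1}
     = 1 - (1 - 0.8288)\cdot\frac{15}{8} \approx 0.6790 \\
\hat{Y} &= 0.940788 + 0.000298\,R1 - 0.000099683\,R7 - 0.000079812\,R8 \\
        &\quad + 0.000060935\,R15 + 0.000090482\,L3 + 0.000106\,L9 - 0.004809\,PRO
\end{align*}
```

Although the overall F test is significant (Prob>F = 0.0140), none of the seven individual coefficients has Prob>|T| below 0.05, which is one reason to consider selecting a subset of the variables.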
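2.7.2 Stepwise Regression Analysis
As introduced in Section 11.3.1, stepwise regression is a process of repeatedly entering variables into the equation and removing variables from it. The SAS program for stepwise regression therefore only requires adding the relevant options to the MODEL statement of the full regression.
Example 2.21 Perform a stepwise regression analysis on the data in Table 2-23.
Solution: Modify the PROC step of Example 2.20 as follows: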
proc reg;
  model y = r1 r7 r8 r15 l3 l9 pro / selection = stepwise
        slentry = 0.20 slstay = 0.20;   /* entry and stay levels both set to 0.20 */
run;
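The SELECTION= option in the MODEL statement specifies the model-selection method; stepwise regression is chosen here. The SLENTRY= option (or SLE=) specifies the significance level a variable must meet to enter the model; for the stepwise method its default is 0.15. The SLSTAY= option (or SLS=) specifies the significance level a variable must meet to stay in the model; for the stepwise method its default is also 0.15.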
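For comparison, the same MODEL statement accepts other values of SELECTION=. The sketch below is not part of the original example: it assumes the data set mulreg created in Example 2.20, and the labels forward and backward as well as the 0.20 thresholds are chosen only to mirror Example 2.21. Forward selection uses only the entry level, and backward elimination uses only the stay level.

```sas
/* Sketch only: assumes data set MULREG from Example 2.20 exists. */
proc reg data = mulreg;
  /* Forward selection: variables can only enter; SLE= is the entry level. */
  forward:  model y = r1 r7 r8 r15 l3 l9 pro / selection = forward  sle = 0.20;
  /* Backward elimination: starts from the full model; SLS= is the stay level. */
  backward: model y = r1 r7 r8 r15 l3 l9 pro / selection = backward sls = 0.20;
run;
quit;   /* PROC REG is interactive; QUIT ends the procedure */
```

When these options are omitted, the defaults depend on the method (0.50 for entry under FORWARD and 0.10 for staying under BACKWARD), so stating the levels explicitly makes the runs easier to compare.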
The output is shown in Table 2-26.
Table 2-26 Stepwise regression analysis for Example 2.21
The SAS System
Stepwise Procedure for Dependent Variable Y
Step 1 Variable R15 Entered R-square = 0.60429217 C(p) = 6.48903162

              DF    Sum of Squares    Mean Square         F    Prob>F

Regression     1        0.00884429     0.00884429     21.38    0.0004
Error         14        0.00579149     0.00041368
Total         15        0.01463578

               Parameter      Standard           Type II
Variable        Estimate         Error    Sum of Squares         F    Prob>F

INTERCEP      0.96898231    0.00757696        6.76555855   16354.6    0.0001
R15           0.00015140    0.00003274        0.00884429     21.38    0.0004

Bounds on condition number: 1, 1
------------------------------------------------------------------------------
Step 2 Variable R8 Entered R-square = 0.70914670 C(p) = 3.58981428

              DF    Sum of Squares    Mean Square         F    Prob>F

Regression     2        0.01037891     0.00518946     15.85    0.0003
Error         13        0.00425686     0.00032745
Total         15        0.01463578

               Parameter      Standard           Type II
Variable        Estimate         Error    Sum of Squares         F    Prob>F

INTERCEP      0.98232148    0.00913292        3.78821006   11568.8    0.0001
R8           -0.00010454    0.00004829        0.00153463      4.69    0.0496
R15           0.00011603    0.00003340        0.00395240     12.07    0.0041

Bounds on condition number: 1.31443, 5.257719
------------------------------------------------------------------------------
Step 3 Variable R1 Entered R-square = 0.75550496 C(p) = 3.42377348

              DF    Sum of Squares    Mean Square         F    Prob>F

Regression     3        0.01105740     0.00368580     12.36    0.0006
Error         12        0.00357838     0.00029820
Total         15        0.01463578

               Parameter      Standard           Type II
Variable        Estimate         Error    Sum of Squares         F    Prob>F

INTERCEP      0.95471821    0.02026903        0.66159029   2218.63    0.0001
R1            0.00024768    0.00016420        0.00067849      2.28    0.1573
R8           -0.00008465    0.00004793        0.00093014      3.12    0.1028
R15           0.00009292    0.00003536        0.00205865      6.90    0.0221

Bounds on condition number: 1.618286, 13.8859
------------------------------------------------------------------------------
The SAS System
All variables left in the model are significant at the 0.2000 level.
No other variable met the 0.2000 significance level for entry into the model.
Summary of Stepwise Procedure for Dependent Variable Y

      Variable        Number  Partial    Model
Step  Entered  Removed    In     R**2     R**2     C(p)         F   Prob>F

   1  R15                  1   0.6043   0.6043   6.4890   21.3796   0.0004
   2  R8                   2   0.1049   0.7091   3.5898    4.6866   0.0496
   3  R1                   3   0.0464   0.7555   3.4238    2.2753   0.1573

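The summary columns can be cross-checked against the step-by-step output: each Partial R**2 is the increase in Model R**2 at that step, and the F for the entering variable equals its Type II sum of squares divided by the error mean square of the same step. A short worked check follows, using only values printed in Table 2-26, together with the final equation read from Step 3.

```latex
\begin{align*}
\text{Step 2:}\quad & 0.70914670 - 0.60429217 = 0.10485453 \approx 0.1049 \\
\text{Step 3:}\quad & 0.75550496 - 0.70914670 = 0.04635826 \approx 0.0464 \\
F_{R1} &= \frac{0.00067849}{0.00029820} \approx 2.275 \\
\hat{Y} &= 0.95471821 + 0.00024768\,R1 - 0.00008465\,R8 + 0.00009292\,R15
\end{align*}
```

Because R1 entered with Prob>F = 0.1573, it is admitted under SLENTRY = 0.20 but would not meet the default level of 0.15; with the default levels the procedure would be expected to stop after Step 2, leaving only R15 and R8 in the equation.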
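By changing the values of SLE= and SLS= as needed, one can control the number of variables retained in the equation.
Besides the PROC REG procedure described above, regression analysis can also be carried out with PROC GLM, which is not covered here.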
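For readers who want to try PROC GLM, a minimal sketch is given here. It is an illustration under stated assumptions rather than part of the original text: it reuses the data set mulreg from Example 2.20 and fits only the three variables retained in Example 2.21. PROC GLM has no SELECTION= option, so the variable list has to be decided beforehand.

```sas
/* Sketch: refit the stepwise-selected model with PROC GLM.        */
/* Assumes data set MULREG was created as in Example 2.20.         */
proc glm data = mulreg;
  model y = r1 r8 r15;   /* continuous predictors only, so no CLASS statement */
run;
quit;                    /* PROC GLM is interactive; QUIT ends the procedure */
```

Since both procedures fit the same least-squares model, the parameter estimates should agree with those of Step 3 in Table 2-26.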