ONE-WAY ANALYSIS OF VARIANCE
EXAMPLE: DOUGHNUT FRYING (SNEDECOR AND COCHRAN, 1971)
USING THE STATGRAPHICS SOFTWARE
This experiment was carried out to determine whether the amount of fat
absorbed by doughnuts during frying depends on the type of fat used.
For each of the four fats, six batches of 24 doughnuts each were prepared.
TYPE    POIDS
1       164   172   168   177   156   195
2       178   191   197   182   185   177
3       175   193   178   171   163   176
4       155   166   149   164   170   168
DOUGHNUT FRYING - STATGRAPHICS

[Figure: Scatterplot by Level Code, poids (vertical axis, 140 to 200) against type (1 to 4)]

[Figure: Box-and-Whisker Plot, poids (horizontal axis, 140 to 200) by type (1 to 4)]
TABLE 1: Summary Statistics for poids
------------------------------------------------------------------------------------
type    Count   Average   Median   Variance   Standard deviation   Minimum   Maximum
------------------------------------------------------------------------------------
1       6       172,0     170,0    178,0      13,3417              156,0     195,0
2       6       185,0     183,5    60,4       7,77174              177,0     197,0
3       6       176,0     175,5    97,6       9,87927              163,0     193,0
4       6       162,0     165,0    67,6       8,22192              149,0     170,0
------------------------------------------------------------------------------------
Total   24      173,75    173,5    158,891    12,6052              149,0     197,0
TABLE 2: ANOVA Table for poids by type
Analysis of Variance
----------------------------------------------------------------------------
Source            Sum of Squares   Df   Mean Square   F-Ratio   P-Value
----------------------------------------------------------------------------
Between groups    1636,5           3    545,5         5,41      0,0069
Within groups     2018,0           20   100,9
----------------------------------------------------------------------------
Total (Corr.)     3654,5           23
The StatAdvisor
--------------
The ANOVA table decomposes the variance of poids into two
components: a between-group component and a within-group component.
The F-ratio, which in this case equals 5,40634, is a ratio of the
between-group estimate to the within-group estimate. Since the
P-value of the F-test is less than 0,05, there is a statistically
significant difference between the mean poids from one level of type
to another at the 95,0% confidence level. To determine which means
are significantly different from which others, select Multiple Range
Tests from the list of Tabular Options.
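The decomposition described above can be retraced by hand. The following is a minimal Python sketch, not the Statgraphics implementation: it recomputes the between-group and within-group sums of squares and the F-ratio from the raw data. The names `data` and `one_way_anova` are illustrative and do not come from the original.

```python
# Fat absorbed (poids) for each batch, keyed by fat type (values from the data table).
data = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}

def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    values = [x for g in groups.values() for x in g]
    grand_mean = sum(values) / len(values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        mean = sum(g) / len(g)
        # Between groups: squared distance of the group mean from the grand mean.
        ss_between += len(g) * (mean - grand_mean) ** 2
        # Within groups: squared distance of each value from its own group mean.
        ss_within += sum((x - mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(data)
print(ss_b, ss_w, df_b, df_w, round(f_ratio, 5))
```

The sketch stops at the F-ratio: the P-value requires the F distribution, which the Python standard library does not provide.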
TABLE 3: Variance Check
Variance tests
Cochran's C test:   0,441031   P-value = 0,359552
Bartlett's test:    1,09946    P-value = 0,625776
Hartley's test:     2,94702
Levene's test:      0,343405   P-value = 0,79422
The StatAdvisor
--------------
The three statistics displayed in this table test the null
hypothesis that the standard deviations of poids within each of the 4
levels of type is the same. Of particular interest are the two
P-values. Since the smaller of the P-values is greater than or equal
to 0,05, there is not a statistically significant difference amongst
the standard deviations at the 95,0% confidence level.
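Two of the statistics in the variance check are simple ratios of the group variances and can be verified directly. The sketch below assumes the usual textbook definitions (Cochran's C as the largest variance over the sum of the variances, Hartley's statistic as the largest variance over the smallest); that Statgraphics uses exactly these definitions is inferred from the matching numbers, not stated in the source.

```python
from statistics import variance

# Fat absorbed (poids) for each batch, keyed by fat type.
data = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}

# Sample variance (divisor n - 1) within each fat type.
variances = {k: variance(v) for k, v in data.items()}

# Cochran's C: share of the total variance carried by the most variable group.
cochran_c = max(variances.values()) / sum(variances.values())

# Hartley's statistic: ratio of the largest to the smallest group variance.
hartley = max(variances.values()) / min(variances.values())

print(round(cochran_c, 6), round(hartley, 5))
```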
TABLE 4: Table of Means for poids by type with 95,0 percent LSD intervals
--------------------------------------------------------------------------------
                          Stnd. error
type    Count   Mean      (pooled s)    Lower limit   Upper limit
--------------------------------------------------------------------------------
1       6       172,0     4,10081       165,951       178,049
2       6       185,0     4,10081       178,951       191,049
3       6       176,0     4,10081       169,951       182,049
4       6       162,0     4,10081       155,951       168,049
--------------------------------------------------------------------------------
Total   24      173,75
The StatAdvisor
--------------
This table shows the mean poids for each level of type. It also
shows the standard error of each mean, which is a measure of its
sampling variability. The standard error is formed by dividing the
pooled standard deviation by the square root of the number of
observations at each level. The table also displays an interval
around each mean. The intervals currently displayed are based on
Fisher's least significant difference (LSD) procedure. They are
constructed in such a way that if two means are the same, their
intervals will overlap 95,0% of the time. You can display the
intervals graphically by selecting Means Plot from the list of
Graphical Options. In the Multiple Range Tests, these intervals are
used to determine which means are significantly different from which
others.
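The standard error and the LSD interval limits can be reproduced as follows. This is a sketch under two assumptions that are not in the original: the critical value `t_crit = 2.086` is the two-sided 5% Student t value for 20 degrees of freedom read from a standard table, and the half-width formula t x SE / sqrt(2) is inferred from the fact that it reproduces the limits in the table (two intervals then overlap exactly when the difference of the means is smaller than the LSD).

```python
import math

mse = 100.9      # within-group mean square from the ANOVA table
n = 6            # batches per fat type
t_crit = 2.086   # assumption: two-sided 5% Student t value, 20 df, from a t table
means = {1: 172.0, 2: 185.0, 3: 176.0, 4: 162.0}

# Pooled standard error of a group mean: pooled s divided by sqrt(n).
std_error = math.sqrt(mse / n)

# Half-width chosen so that two intervals overlap exactly when |difference| < LSD.
half_width = t_crit * std_error / math.sqrt(2)

lsd_intervals = {k: (m - half_width, m + half_width) for k, m in means.items()}
for k, (low, high) in lsd_intervals.items():
    print(k, round(low, 3), round(high, 3))
```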
[Figure: Means and 95,0 Percent LSD Intervals, poids (vertical axis, 150 to 200) against type (1 to 4)]
TABLE 5: Multiple Range Tests for poids by type
--------------------------------------------------------------------------------
Method: 95,0 percent LSD
type    Count   Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4       6       162,0     X
1       6       172,0     XX
3       6       176,0      XX
2       6       185,0       X
--------------------------------------------------------------------------------
Contrast   Difference   +/- Limits
--------------------------------------------------------------------------------
1 - 2      *-13,0       12,0974
1 - 3       -4,0        12,0974
1 - 4       10,0        12,0974
2 - 3        9,0        12,0974
2 - 4      *23,0        12,0974
3 - 4      *14,0        12,0974
--------------------------------------------------------------------------------
* denotes a statistically significant difference.
The StatAdvisor
--------------
This table applies a multiple comparison procedure to determine
which means are significantly different from which others. The bottom
half of the output shows the estimated difference between each pair of
means. An asterisk has been placed next to 3 pairs, indicating that
these pairs show statistically significant differences at the 95,0%
confidence level. At the top of the page, 3 homogenous groups are
identified using columns of X's. Within each column, the levels
containing X's form a group of means within which there are no
statistically significant differences. The method currently being
used to discriminate among the means is Fisher's least significant
difference (LSD) procedure. With this method, there is a 5,0% risk
of calling each pair of means significantly different when the actual
difference equals 0.
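The LSD comparison amounts to checking every pairwise difference against a single limit. The sketch below is illustrative only; `t_crit = 2.086` is an assumed table value (two-sided 5% Student t, 20 degrees of freedom) and the variable names are not from the original.

```python
import math
from itertools import combinations

mse = 100.9      # within-group mean square from the ANOVA table
n = 6            # batches per fat type
t_crit = 2.086   # assumption: two-sided 5% Student t value, 20 df, from a t table
means = {1: 172.0, 2: 185.0, 3: 176.0, 4: 162.0}

# Least significant difference for two means based on n observations each.
lsd = t_crit * math.sqrt(2 * mse / n)

# A pair is flagged when the absolute difference of its means exceeds the LSD.
significant = [
    (a, b) for a, b in combinations(sorted(means), 2)
    if abs(means[a] - means[b]) > lsd
]

print(round(lsd, 3), significant)
```

With these inputs the limit is about 12.10 and the flagged pairs are 1 - 2, 2 - 4 and 3 - 4, the three contrasts marked with an asterisk.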
TABLE 6: Table of Means for poids by type with 95,0 percent Scheffe intervals
--------------------------------------------------------------------------------
                          Stnd. error
type    Count   Mean      (pooled s)    Lower limit   Upper limit
--------------------------------------------------------------------------------
1       6       172,0     4,10081       163,159       180,841
2       6       185,0     4,10081       176,159       193,841
3       6       176,0     4,10081       167,159       184,841
4       6       162,0     4,10081       153,159       170,841
--------------------------------------------------------------------------------
Total   24      173,75
The StatAdvisor
--------------
This table shows the mean poids for each level of type. It also
shows the standard error of each mean, which is a measure of its
sampling variability. The standard error is formed by dividing the
pooled standard deviation by the square root of the number of
observations at each level. The table also displays an interval
around each mean. The intervals currently displayed are based on
Scheffe's multiple comparison procedure. They are constructed in such
a way that if all the means are the same, all the intervals will
overlap at least 95,0% of the time. The Tukey or Bonferroni intervals
will usually be tighter. You can display the intervals graphically by
selecting Means Plot from the list of Graphical Options. In the
Multiple Range Tests, these intervals are used to determine which
means are significantly different from which others.
TABLE 7: Multiple Range Tests for poids by type
--------------------------------------------------------------------------------
Method: 95,0 percent Scheffe
type    Count   Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4       6       162,0     X
1       6       172,0     XX
3       6       176,0     XX
2       6       185,0      X
--------------------------------------------------------------------------------
Contrast   Difference   +/- Limits
--------------------------------------------------------------------------------
1 - 2       -13,0       17,682
1 - 3        -4,0       17,682
1 - 4        10,0       17,682
2 - 3         9,0       17,682
2 - 4       *23,0       17,682
3 - 4        14,0       17,682
The StatAdvisor
--------------
[...] means. An asterisk has been placed next to 1 pair, indicating that
this pair shows a statistically significant difference at the 95,0%
confidence level. [...] The method currently being
used to discriminate among the means is Scheffe's multiple comparison
procedure. With this method, there is no more than a 5,0% risk of
calling one or more pairs significantly different when their actual
difference equals 0. The Tukey or Bonferroni procedures will usually
be more powerful.
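The Scheffe limit for pairwise differences can be checked with the textbook formula sqrt((k - 1) F) x sqrt(2 MSE / n). In the sketch below, `f_crit = 3.09839` is the critical value of F printed in the SAS Scheffe test output of this document; the formula itself is an assumption, supported by the fact that it reproduces the 17,682 limit.

```python
import math
from itertools import combinations

mse = 100.9        # within-group mean square from the ANOVA table
n = 6              # batches per fat type
k = 4              # number of fat types
f_crit = 3.09839   # 5% critical value of F(3, 20), as printed in the SAS output
means = {1: 172.0, 2: 185.0, 3: 176.0, 4: 162.0}

# Scheffe limit for the difference between two means.
scheffe_limit = math.sqrt((k - 1) * f_crit) * math.sqrt(2 * mse / n)

significant = [
    (a, b) for a, b in combinations(sorted(means), 2)
    if abs(means[a] - means[b]) > scheffe_limit
]

print(round(scheffe_limit, 3), significant)
```

Only the pair 2 - 4, with a difference of 23, exceeds the limit, which matches the single asterisk in the table.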
[Figure: Means and 95,0 Percent Scheffe Intervals, poids (vertical axis, 150 to 200) against type (1 to 4)]
TABLE 8: Multiple Range Tests for poids by type, Bonferroni method
--------------------------------------------------------------------------------
Method: 95,0 percent Bonferroni
type    Count   Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4       6       162,0     X
1       6       172,0     XX
3       6       176,0     XX
2       6       185,0      X
--------------------------------------------------------------------------------
Contrast   Difference   +/- Limits
--------------------------------------------------------------------------------
1 - 2       -13,0       16,9756
1 - 3        -4,0       16,9756
1 - 4        10,0       16,9756
2 - 3         9,0       16,9756
2 - 4       *23,0       16,9756
3 - 4        14,0       16,9756
The StatAdvisor
--------------
[...] means. An asterisk has been placed next to 1 pair, indicating that
this pair shows a statistically significant difference at the 95,0%
confidence level. [...] The method currently being
used to discriminate among the means is Bonferroni's multiple
comparison procedure. With this method, there is a 5,0% risk of
calling one or more pairs significantly different when their actual
difference equals 0.
TABLE 9: Multiple Range Tests for poids by type, Tukey method
--------------------------------------------------------------------------------
Method: 95,0 percent Tukey HSD
type    Count   Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4       6       162,0     X
1       6       172,0     XX
3       6       176,0     XX
2       6       185,0      X
--------------------------------------------------------------------------------
Contrast   Difference   +/- Limits
--------------------------------------------------------------------------------
1 - 2       -13,0       16,2376
1 - 3        -4,0       16,2376
1 - 4        10,0       16,2376
2 - 3         9,0       16,2376
2 - 4       *23,0       16,2376
3 - 4        14,0       16,2376
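For Tukey's method the limit is the studentized range critical value times the standard error of a mean. The sketch below takes `q_crit = 3.958` from the SAS Tukey output of this document; because that value is rounded, the result (about 16.23) is close to, but not exactly, the 16,2376 shown by Statgraphics. The gap presumably comes from rounding or a slightly different approximation of the critical value, which is an assumption.

```python
import math
from itertools import combinations

mse = 100.9      # within-group mean square from the ANOVA table
n = 6            # batches per fat type
q_crit = 3.958   # studentized range critical value (4 means, 20 df) from the SAS output
means = {1: 172.0, 2: 185.0, 3: 176.0, 4: 162.0}

# Honestly significant difference: q times the standard error of a single mean.
hsd = q_crit * math.sqrt(mse / n)

significant = [
    (a, b) for a, b in combinations(sorted(means), 2)
    if abs(means[a] - means[b]) > hsd
]

print(round(hsd, 2), significant)
```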
TABLE 10: Kruskal-Wallis Test for poids by type
------------------------------------------------------------
type    Sample Size   Average Rank
------------------------------------------------------------
1       6             11,25
2       6             19,5
3       6             13,5833
4       6             5,66667
------------------------------------------------------------
Test statistic = 11,8322   P-Value = 0,00798013
The StatAdvisor
--------------
The Kruskal-Wallis test tests the null hypothesis that the medians
of poids within each of the 4 levels of type are the same. The data
from all the levels is first combined and ranked from smallest to
largest. The average rank is then computed for the data at each
level. Since the P-value is less than 0,05, there is a statistically
significant difference amongst the medians at the 95,0% confidence
level. To determine which medians are significantly different from
which others, select Box-and-Whisker Plot from the list of Graphical
Options and select the median notch option.
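The ranking procedure described above can be sketched in a few lines of Python. This is a minimal illustration using the standard Kruskal-Wallis formula with midranks for tied values and the usual tie correction; the function name `kruskal_wallis` is illustrative. It returns the average ranks and the corrected test statistic, but not the P-value, which needs a chi-square distribution.

```python
from collections import Counter

# Fat absorbed (poids) for each batch, keyed by fat type.
data = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}

def kruskal_wallis(groups):
    """Return (average rank per group, tie-corrected H statistic)."""
    values = sorted(x for g in groups.values() for x in g)
    n_total = len(values)
    counts = Counter(values)
    # First 1-based position of each distinct value in the sorted pooled data.
    first_pos = {}
    for pos, v in enumerate(values, start=1):
        first_pos.setdefault(v, pos)
    # Midrank: average of the positions occupied by equal values.
    rank = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}
    mean_ranks = {k: sum(rank[x] for x in g) / len(g) for k, g in groups.items()}
    h = 12 / (n_total * (n_total + 1)) * sum(
        len(g) * mean_ranks[k] ** 2 for k, g in groups.items()
    ) - 3 * (n_total + 1)
    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n_total ** 3 - n_total)
    return mean_ranks, h / correction

mean_ranks, h_stat = kruskal_wallis(data)
print(mean_ranks, round(h_stat, 4))
```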
[Figure: Kruskal-Wallis Test for poids by type, Box-and-Whisker Plot, poids (horizontal axis, 140 to 200) by type (1 to 4)]
EXAMPLE: DOUGHNUT FRYING
SAS SOFTWARE
Program
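options ls = 80 ;

/* One record per batch: batch number, fat absorbed, type of fat */
data BEIGNETS ;
  input numero $ matgra typemat ;
  cards ;
1 164 1
2 172 1
3 168 1
4 177 1
5 156 1
6 195 1
7 178 2
8 191 2
9 197 2
10 182 2
11 185 2
12 177 2
13 175 3
14 193 3
15 178 3
16 171 3
17 163 3
18 176 3
19 155 4
20 166 4
21 149 4
22 164 4
23 170 4
24 168 4
;
run ;

proc print data = BEIGNETS ;
run ;

/* One-way ANOVA, then multiple comparisons of the typemat means */
proc GLM data = BEIGNETS ;
  class typemat ;
  model matgra = typemat ;
  means typemat / BON CLM LINES ;
  means typemat / DUNCAN ;
  means typemat / LSD LINES CLM ;
  means typemat / SCHEFFE CLM LINES ;
  means typemat / TUKEY LINES ;
  /* Compares types 1 and 2 with types 3 and 4 */
  contrast ' ANIMAL vs VEGETAL ' typemat 1 1 -1 -1 ;
run ;

/* Kruskal-Wallis test based on Wilcoxon rank scores */
proc npar1way data = BEIGNETS WILCOXON ;
  class typemat ;
  var matgra ;
run ;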
DOUGHNUT FRYING - SAS SOFTWARE
USING THE GLM PROCEDURE

General Linear Models Procedure
Class Level Information

Class      Levels   Values
TYPEMAT    4        1 2 3 4

Number of observations in data set = 24

Dependent Variable: MATGRA

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3    1636.5000000     545.5000000   5.41      0.0069
Error              20   2018.0000000     100.9000000
Corrected Total    23   3654.5000000

R-Square    C.V.        Root MSE     MATGRA Mean
0.447804    5.781237    10.044899    173.75000

Source     DF   Type I SS      Mean Square   F Value   Pr > F
TYPEMAT    3    1636.5000000   545.5000000   5.41      0.0069

Source     DF   Type III SS    Mean Square   F Value   Pr > F
TYPEMAT    3    1636.5000000   545.5000000   5.41      0.0069
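The fit statistics in the GLM listing follow directly from the ANOVA table. The sketch below recomputes them using the standard definitions (R-Square as model sum of squares over corrected total, Root MSE as the square root of the error mean square, C.V. as 100 times Root MSE over the mean); the variable names are illustrative.

```python
import math

ss_model = 1636.5     # Model sum of squares
ss_error = 2018.0     # Error sum of squares
ss_total = 3654.5     # Corrected Total sum of squares
df_error = 20         # Error degrees of freedom
grand_mean = 173.75   # MATGRA Mean

r_square = ss_model / ss_total             # share of variation explained by TYPEMAT
root_mse = math.sqrt(ss_error / df_error)  # pooled within-group standard deviation
cv = 100 * root_mse / grand_mean           # coefficient of variation, in percent

print(round(r_square, 6), round(cv, 6), round(root_mse, 6))
```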
Bonferroni T Confidence Intervals for variable: MATGRA

Alpha= 0.05  Confidence= 0.95  df= 20  MSE= 100.9
Critical Value of T= 2.74
Half Width of Confidence Interval= 11.25417

                Simultaneous               Simultaneous
                Lower                      Upper
                Confidence                 Confidence
TYPEMAT    N    Limit          Mean        Limit
2          6    173.746        185.000     196.254
3          6    164.746        176.000     187.254
1          6    160.746        172.000     183.254
4          6    150.746        162.000     173.254
Bonferroni (Dunn) T tests for variable: MATGRA

NOTE: This test controls the type I experimentwise error rate, but
generally has a higher type II error rate than REGWQ.

Alpha= 0.05  df= 20  MSE= 100.9
Minimum Significant Difference= 16.976

Means with the same letter are not significantly different.

Bon Grouping    Mean       N    TYPEMAT
     A          185.000    6    2
     A
B    A          176.000    6    3
B    A
B    A          172.000    6    1
B
B               162.000    6    4
Duncan's Multiple Range Test for variable: MATGRA

NOTE: This test controls the type I comparisonwise error rate, not
the experimentwise error rate

Alpha= 0.05  df= 20  MSE= 100.9

Number of Means    2        3        4
Critical Range     12.10    12.70    13.08

Duncan Grouping    Mean       N    TYPEMAT
     A             185.000    6    2
     A
B    A             176.000    6    3
B
B    C             172.000    6    1
     C
     C             162.000    6    4
T Confidence Intervals for variable: MATGRA

Alpha= 0.05  Confidence= 0.95  df= 20  MSE= 100.9

                Lower                      Upper
                Confidence                 Confidence
TYPEMAT    N    Limit          Mean        Limit
2          6    176.446        185.000     193.554
3          6    167.446        176.000     184.554
1          6    163.446        172.000     180.554
4          6    153.446        162.000     170.554
T tests (LSD) for variable: MATGRA

NOTE: This test controls the type I comparisonwise error rate not the
experimentwise error rate.

Alpha= 0.05  df= 20  MSE= 100.9
Least Significant Difference= 12.097

T Grouping    Mean       N    TYPEMAT
     A        185.000    6    2
     A
B    A        176.000    6    3
B
B    C        172.000    6    1
     C
     C        162.000    6    4
Scheffe's Confidence Intervals for variable: MATGRA

Alpha= 0.05  Confidence= 0.95  df= 20  MSE= 100.9
Critical Value of F= 2.86608

                Simultaneous               Simultaneous
                Lower                      Upper
                Confidence                 Confidence
TYPEMAT    N    Limit          Mean        Limit
2          6    171.115        185.000     198.885
3          6    162.115        176.000     189.885
1          6    158.115        172.000     185.885
4          6    148.115        162.000     175.885
Scheffe's test for variable: MATGRA

NOTE: This test controls the type I experimentwise error rate but
generally has a higher type II error rate than REGWF for all
pairwise comparisons

Alpha= 0.05  df= 20  MSE= 100.9
Critical Value of F= 3.09839

Scheffe Grouping    Mean       N    TYPEMAT
     A              185.000    6    2
     A
B    A              176.000    6    3
B    A
B    A              172.000    6    1
B
B                   162.000    6    4
Tukey's Studentized Range (HSD) Test for variable: MATGRA

NOTE: This test controls the type I experimentwise error rate, but
generally has a higher type II error rate than REGWQ.

Alpha= 0.05  df= 20  MSE= 100.9
Critical Value of Studentized Range= 3.958

Tukey Grouping    Mean       N    TYPEMAT
     A            185.000    6    2
     A
B    A            176.000    6    3
B    A
B    A            172.000    6    1
B
B                 162.000    6    4
Contrast test

Contrast              DF   Contrast SS    Mean Square    F Value   Pr > F
ANIMAL vs VEGETAL     1    541.5000000    541.5000000    5.37      0.0313
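The contrast sum of squares can be verified from the group means and the coefficients 1 1 -1 -1 given in the CONTRAST statement. The sketch below uses the standard formula for a balanced design, SS = (sum of c_i x mean_i)^2 / (sum of c_i^2 / n); the variable names are illustrative.

```python
mse = 100.9   # error mean square from the ANOVA table
n = 6         # batches per fat type
means = {1: 172.0, 2: 185.0, 3: 176.0, 4: 162.0}
coef = {1: 1, 2: 1, 3: -1, 4: -1}   # coefficients from the CONTRAST statement

# Estimated value of the contrast: (mean 1 + mean 2) - (mean 3 + mean 4).
estimate = sum(coef[k] * means[k] for k in means)

# Contrast sum of squares with 1 degree of freedom, then its F value.
contrast_ss = estimate ** 2 / sum(c ** 2 / n for c in coef.values())
f_value = contrast_ss / mse

print(estimate, round(contrast_ss, 4), round(f_value, 2))
```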
N P A R 1 W A Y   P R O C E D U R E

Wilcoxon Scores (Rank Sums) for Variable MATGRA
Classified by Variable TYPEMAT

                Sum of         Expected    Std Dev       Mean
TYPEMAT    N    Scores         Under H0    Under H0      Score
1          6    67.500000      75.0        14.9869508    11.2500000
2          6    117.000000     75.0        14.9869508    19.5000000
3          6    81.500000      75.0        14.9869508    13.5833333
4          6    34.000000      75.0        14.9869508    5.6666667

Average Scores Were Used for Ties

Kruskal-Wallis Test (Chi-Square Approximation)
CHISQ = 11.832   DF = 3   Prob > CHISQ = 0.0080
