One-way analysis of variance. Example: frying doughnuts
ONE-WAY ANALYSIS OF VARIANCE
EXAMPLE: FRYING DOUGHNUTS (SNEDECOR AND COCHRAN, 1971)
USING THE STATGRAPHICS SOFTWARE

This experiment was carried out to determine whether the amount of fat absorbed by doughnuts during frying depends on the type of fat used. For each of the four fats, six batches of 24 doughnuts each were prepared.

Data (poids = fat absorbed by a batch, type = fat used):

type    poids
----------------------------------------
1       164  172  168  177  156  195
2       178  191  197  182  185  177
3       175  193  178  171  163  176
4       155  166  149  164  170  168

The Statgraphics listings use a decimal comma.

[Figure: Scatterplot by Level Code (poids versus type)]
[Figure: Box-and-Whisker Plot (poids by type)]

TABLE 1: Summary Statistics for poids

type    Count   Average   Median   Variance   Standard deviation   Minimum   Maximum
-------------------------------------------------------------------------------------
1       6       172,0     170,0    178,0      13,3417              156,0     195,0
2       6       185,0     183,5    60,4       7,77174              177,0     197,0
3       6       176,0     175,5    97,6       9,87927              163,0     193,0
4       6       162,0     165,0    67,6       8,22192              149,0     170,0
-------------------------------------------------------------------------------------
Total   24      173,75    173,5    158,891    12,6052              149,0     197,0

TABLE 2: ANOVA Table for poids by type

Analysis of Variance
----------------------------------------------------------------------------
Source            Sum of Squares    Df    Mean Square    F-Ratio    P-Value
----------------------------------------------------------------------------
Between groups    1636,5            3     545,5          5,41       0,0069
Within groups     2018,0            20    100,9
----------------------------------------------------------------------------
Total (Corr.)     3654,5            23

The StatAdvisor
---------------
The ANOVA table decomposes the variance of poids into two components: a between-group component and a within-group component. The F-ratio, which in this case equals 5,40634, is a ratio of the between-group estimate to the within-group estimate. Since the P-value of the F-test is less than 0,05, there is a statistically significant difference between the mean poids from one level of type to another at the 95,0% confidence level. To determine which means are significantly different from which others, select Multiple Range Tests from the list of Tabular Options.

TABLE 3: Variance Check

Variance tests
Cochran's C test:    0,441031    P-value = 0,359552
Bartlett's test:     1,09946     P-value = 0,625776
Hartley's test:      2,94702
Levene's test:       0,343405    P-value = 0,79422

The StatAdvisor
---------------
The four statistics displayed in this table test the null hypothesis that the standard deviations of poids within each of the 4 levels of type are the same. Of particular interest are the three P-values. Since the smallest of the P-values is greater than or equal to 0,05, there is not a statistically significant difference amongst the standard deviations at the 95,0% confidence level.

TABLE 4: Table of Means for poids by type with 95,0 percent LSD intervals
--------------------------------------------------------------------------------
                            Stnd. error
type     Count    Mean      (pooled s)     Lower limit    Upper limit
--------------------------------------------------------------------------------
1        6        172,0     4,10081        165,951        178,049
2        6        185,0     4,10081        178,951        191,049
3        6        176,0     4,10081        169,951        182,049
4        6        162,0     4,10081        155,951        168,049
--------------------------------------------------------------------------------
Total    24       173,75

The StatAdvisor
---------------
This table shows the mean poids for each level of type. It also shows the standard error of each mean, which is a measure of its sampling variability.
The standard error is formed by dividing the pooled standard deviation by the square root of the number of observations at each level. The table also displays an interval around each mean. The intervals currently displayed are based on Fisher's least significant difference (LSD) procedure. They are constructed in such a way that if two means are the same, their intervals will overlap 95,0% of the time. You can display the intervals graphically by selecting Means Plot from the list of Graphical Options. In the Multiple Range Tests, these intervals are used to determine which means are significantly different from which others.

[Figure: Means and 95,0 Percent LSD Intervals (poids by type)]

TABLE 5: Multiple Range Tests for poids by type
--------------------------------------------------------------------------------
Method: 95,0 percent LSD
type     Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4        6        162,0     X
1        6        172,0     XX
3        6        176,0      XX
2        6        185,0       X
--------------------------------------------------------------------------------
Contrast          Difference     +/- Limits
--------------------------------------------------------------------------------
1 - 2             *-13,0         12,0974
1 - 3              -4,0          12,0974
1 - 4              10,0          12,0974
2 - 3               9,0          12,0974
2 - 4             *23,0          12,0974
3 - 4             *14,0          12,0974
--------------------------------------------------------------------------------
* denotes a statistically significant difference.

The StatAdvisor
---------------
This table applies a multiple comparison procedure to determine which means are significantly different from which others. The bottom half of the output shows the estimated difference between each pair of means. An asterisk has been placed next to 3 pairs, indicating that these pairs show statistically significant differences at the 95,0% confidence level. At the top of the page, 3 homogeneous groups are identified using columns of X's. Within each column, the levels containing X's form a group of means within which there are no statistically significant differences. The method currently being used to discriminate among the means is Fisher's least significant difference (LSD) procedure. With this method, there is a 5,0% risk of calling each pair of means significantly different when the actual difference equals 0.

TABLE 6: Table of Means for poids by type with 95,0 percent Scheffe intervals
--------------------------------------------------------------------------------
                            Stnd. error
type     Count    Mean      (pooled s)     Lower limit    Upper limit
--------------------------------------------------------------------------------
1        6        172,0     4,10081        163,159        180,841
2        6        185,0     4,10081        176,159        193,841
3        6        176,0     4,10081        167,159        184,841
4        6        162,0     4,10081        153,159        170,841
--------------------------------------------------------------------------------
Total    24       173,75

The StatAdvisor
---------------
This table shows the mean poids for each level of type. It also shows the standard error of each mean, which is a measure of its sampling variability. The standard error is formed by dividing the pooled standard deviation by the square root of the number of observations at each level. The table also displays an interval around each mean. The intervals currently displayed are based on Scheffe's multiple comparison procedure. They are constructed in such a way that if all the means are the same, all the intervals will overlap at least 95,0% of the time. The Tukey or Bonferroni intervals will usually be tighter. You can display the intervals graphically by selecting Means Plot from the list of Graphical Options. In the Multiple Range Tests, these intervals are used to determine which means are significantly different from which others.
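As a rough cross-check of the listings above, the sketch below recomputes the ANOVA decomposition and the pooled standard error of a level mean directly from the 24 observations. It is an illustration in plain Python, not part of the Statgraphics or SAS output; the function name `one_way_anova` and the dictionary layout are assumptions made for this example.

```python
import math
from statistics import mean

# Fat absorbed per batch, keyed by fat type (values from the data table).
poids = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}

def one_way_anova(groups):
    """Return the between/within decomposition for a one-way layout."""
    values = [x for g in groups.values() for x in g]
    grand_mean = mean(values)
    # Between groups: squared distance of each level mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within groups: squared distance of each observation from its level mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between,
        "ss_within": ss_within, "df_within": df_within,
        "ms_within": ms_within, "f_ratio": ms_between / ms_within,
    }

anova = one_way_anova(poids)
# Pooled standard error of a level mean: pooled s / sqrt(n), with n = 6 per level.
std_error = math.sqrt(anova["ms_within"] / 6)

print(anova)
print(round(std_error, 5))
```

The within-groups mean square computed here is the pooled variance that every interval and multiple comparison in this document reuses.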
TABLE 7: Multiple Range Tests for poids by type
--------------------------------------------------------------------------------
Method: 95,0 percent Scheffe
type     Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4        6        162,0     X
1        6        172,0     XX
3        6        176,0     XX
2        6        185,0      X
--------------------------------------------------------------------------------
Contrast          Difference     +/- Limits
--------------------------------------------------------------------------------
1 - 2              -13,0         17,682
1 - 3               -4,0         17,682
1 - 4               10,0         17,682
2 - 3                9,0         17,682
2 - 4              *23,0         17,682
3 - 4               14,0         17,682
--------------------------------------------------------------------------------
* denotes a statistically significant difference.

The StatAdvisor
---------------
This table applies a multiple comparison procedure to determine which means are significantly different from which others. The bottom half of the output shows the estimated difference between each pair of means. An asterisk has been placed next to 1 pair, indicating that this pair shows a statistically significant difference at the 95,0% confidence level. At the top of the page, 2 homogeneous groups are identified using columns of X's. Within each column, the levels containing X's form a group of means within which there are no statistically significant differences. The method currently being used to discriminate among the means is Scheffe's multiple comparison procedure. With this method, there is no more than a 5,0% risk of calling one or more pairs significantly different when their actual difference equals 0. The Tukey or Bonferroni procedures will usually be more powerful.
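The Scheffe limits above can be reproduced approximately with the sketch below. It assumes the usual form of the Scheffe limit for a pairwise difference, sqrt((k - 1) F) * sqrt(MSE (1/n_i + 1/n_j)), and takes the critical F value as a constant from the SAS listing later in this document rather than computing it. The helper name `scheffe_limit` is hypothetical.

```python
import math
from itertools import combinations
from statistics import mean

# Fat absorbed per batch, keyed by fat type (values from the data table).
poids = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}

MSE = 100.9        # within-groups mean square from the ANOVA table
F_CRIT = 3.09839   # critical F (alpha = 0.05, df = 3 and 20) as printed in the SAS listing

def scheffe_limit(k, n_i, n_j, mse, f_crit):
    """Scheffe +/- limit for the difference between two of k level means."""
    return math.sqrt((k - 1) * f_crit) * math.sqrt(mse * (1 / n_i + 1 / n_j))

means = {level: mean(values) for level, values in poids.items()}
limit = scheffe_limit(len(poids), 6, 6, MSE, F_CRIT)

# A pair is flagged when the absolute difference of its means exceeds the limit.
significant = [
    (a, b) for a, b in combinations(sorted(means), 2)
    if abs(means[a] - means[b]) > limit
]

print(round(limit, 3), significant)
```

Because the design is balanced (6 batches per fat), one limit serves all six pairs; with unequal group sizes the limit would have to be computed pair by pair.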
[Figure: Means and 95,0 Percent Scheffe Intervals (poids by type)]

TABLE 8: Multiple Range Tests for poids by type, Bonferroni method
--------------------------------------------------------------------------------
Method: 95,0 percent Bonferroni
type     Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4        6        162,0     X
1        6        172,0     XX
3        6        176,0     XX
2        6        185,0      X
--------------------------------------------------------------------------------
Contrast          Difference     +/- Limits
--------------------------------------------------------------------------------
1 - 2              -13,0         16,9756
1 - 3               -4,0         16,9756
1 - 4               10,0         16,9756
2 - 3                9,0         16,9756
2 - 4              *23,0         16,9756
3 - 4               14,0         16,9756
--------------------------------------------------------------------------------
* denotes a statistically significant difference.

The StatAdvisor
---------------
This table applies a multiple comparison procedure to determine which means are significantly different from which others. The bottom half of the output shows the estimated difference between each pair of means. An asterisk has been placed next to 1 pair, indicating that this pair shows a statistically significant difference at the 95,0% confidence level. At the top of the page, 2 homogeneous groups are identified using columns of X's. Within each column, the levels containing X's form a group of means within which there are no statistically significant differences. The method currently being used to discriminate among the means is Bonferroni's multiple comparison procedure. With this method, there is a 5,0% risk of calling one or more pairs significantly different when their actual difference equals 0.
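The Bonferroni limit above can be sketched as follows, assuming the familywise 5% risk is split evenly over the 6 pairwise comparisons, so that each two-sided test uses the Student t quantile at 1 - 0,05/12 with 20 degrees of freedom. To stay within the standard library, the quantile is obtained by numerical integration and bisection; the helpers `t_pdf`, `t_cdf` and `t_quantile` are illustrative and are not the routines used by Statgraphics or SAS.

```python
import math
from itertools import combinations
from statistics import mean

# Fat absorbed per batch, keyed by fat type (values from the data table).
poids = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}
MSE, DF_ERROR, ALPHA = 100.9, 20, 0.05   # from the ANOVA table

def t_pdf(x, df):
    """Density of the Student t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, by Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    s = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + s * h / 3

def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bisection on t_cdf."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

pairs = list(combinations(sorted(poids), 2))            # 6 pairwise comparisons
t_crit = t_quantile(1 - ALPHA / (2 * len(pairs)), DF_ERROR)
limit = t_crit * math.sqrt(MSE * (1 / 6 + 1 / 6))       # 6 batches per fat

means = {level: mean(values) for level, values in poids.items()}
significant = [(a, b) for a, b in pairs if abs(means[a] - means[b]) > limit]

print(round(t_crit, 3), round(limit, 3), significant)
```

In practice a statistics library would supply the t quantile directly; the hand-rolled version is used here only to keep the example self-contained.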
TABLE 9: Multiple Range Tests for poids by type, Tukey method
--------------------------------------------------------------------------------
Method: 95,0 percent Tukey HSD
type     Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
4        6        162,0     X
1        6        172,0     XX
3        6        176,0     XX
2        6        185,0      X
--------------------------------------------------------------------------------
Contrast          Difference     +/- Limits
--------------------------------------------------------------------------------
1 - 2              -13,0         16,2376
1 - 3               -4,0         16,2376
1 - 4               10,0         16,2376
2 - 3                9,0         16,2376
2 - 4              *23,0         16,2376
3 - 4               14,0         16,2376
--------------------------------------------------------------------------------
* denotes a statistically significant difference.

TABLE 10: Kruskal-Wallis Test for poids by type

type     Sample Size    Average Rank
------------------------------------------------------------
1        6              11,25
2        6              19,5
3        6              13,5833
4        6              5,66667
------------------------------------------------------------
Test statistic = 11,8322   P-Value = 0,00798013

The StatAdvisor
---------------
The Kruskal-Wallis test tests the null hypothesis that the medians of poids within each of the 4 levels of type are the same. The data from all the levels are first combined and ranked from smallest to largest. The average rank is then computed for the data at each level. Since the P-value is less than 0,05, there is a statistically significant difference amongst the medians at the 95,0% confidence level. To determine which medians are significantly different from which others, select Box-and-Whisker Plot from the list of Graphical Options and select the median notch option.
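The ranking procedure described above can be sketched as follows. The example assumes the standard Kruskal-Wallis statistic with average ranks for ties and the usual tie correction, and it evaluates the chi-square tail with a closed form that holds only for 3 degrees of freedom. All function names are illustrative.

```python
import math

# Fat absorbed per batch, keyed by fat type (values from the data table).
poids = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}

def average_ranks(values):
    """Map each value to its rank (1 = smallest); ties share their average rank."""
    order = sorted(values)
    rank_of = {}
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and order[j] == order[i]:
            j += 1
        rank_of[order[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    return rank_of

def kruskal_wallis(groups):
    """Return the average rank per level and the tie-corrected H statistic."""
    values = [x for g in groups.values() for x in g]
    n_total = len(values)
    rank_of = average_ranks(values)
    rank_sums = {k: sum(rank_of[x] for x in g) for k, g in groups.items()}
    h = 12 / (n_total * (n_total + 1)) * sum(
        rank_sums[k] ** 2 / len(g) for k, g in groups.items()
    ) - 3 * (n_total + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    tie_sizes = [values.count(v) for v in set(values)]
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    mean_ranks = {k: rank_sums[k] / len(g) for k, g in groups.items()}
    return mean_ranks, h / correction

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

mean_ranks, h_stat = kruskal_wallis(poids)
p_value = chi2_sf_3df(h_stat)   # 4 levels, hence 3 degrees of freedom

print(mean_ranks, round(h_stat, 4), round(p_value, 5))
```

The data contain four tied pairs (164, 168, 177 and 178 each occur twice), which is why the tie correction makes a small difference to the statistic.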
[Figure: Box-and-Whisker Plot (poids by type), accompanying the Kruskal-Wallis test]

EXAMPLE: FRYING DOUGHNUTS
SAS SOFTWARE

Program

options ls = 80 ;
/* One observation per batch: batch number, fat absorbed, fat type */
data BEIGNETS ;
   input numero $ matgra typemat ;
   cards ;
1 164 1
2 172 1
3 168 1
4 177 1
5 156 1
6 195 1
7 178 2
8 191 2
9 197 2
10 182 2
11 185 2
12 177 2
13 175 3
14 193 3
15 178 3
16 171 3
17 163 3
18 176 3
19 155 4
20 166 4
21 149 4
22 164 4
23 170 4
24 168 4
;
run ;
proc print ;
run ;
/* One-way ANOVA, followed by multiple comparisons of the fat-type means */
proc GLM ;
   class typemat ;
   model matgra = typemat ;
   means typemat / BON CLM LINES ;
   means typemat / DUNCAN ;
   means typemat / LSD LINES CLM ;
   means typemat / SCHEFFE CLM LINES ;
   means typemat / TUKEY LINES ;
   /* Fats 1 and 2 against fats 3 and 4 */
   contrast ' ANIMAL vs VEGETAL ' typemat 1 1 -1 -1 ;
run ;
/* Kruskal-Wallis test based on Wilcoxon scores */
proc npar1way WILCOXON ;
   class typemat ;
   var matgra ;
run ;

USING THE GLM PROCEDURE

General Linear Models Procedure
Class Level Information

Class      Levels    Values
TYPEMAT    4         1 2 3 4

Number of observations in data set = 24

General Linear Models Procedure

Dependent Variable: MATGRA

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              3     1636.5000000      545.5000000    5.41       0.0069
Error              20    2018.0000000      100.9000000
Corrected Total    23    3654.5000000

R-Square    C.V.        Root MSE     MATGRA Mean
0.447804    5.781237    10.044899    173.75000

Source     DF    Type I SS       Mean Square    F Value    Pr > F
TYPEMAT    3     1636.5000000    545.5000000    5.41       0.0069

Source     DF    Type III SS     Mean Square    F Value    Pr > F
TYPEMAT    3     1636.5000000    545.5000000    5.41       0.0069

General Linear Models Procedure

Bonferroni T Confidence Intervals for variable: MATGRA

Alpha= 0.05  Confidence= 0.95  df= 20  MSE= 100.9
Critical Value of T= 2.74
Half Width of Confidence Interval= 11.25417

                Simultaneous                Simultaneous
                Lower                       Upper
                Confidence                  Confidence
TYPEMAT    N    Limit           Mean        Limit
2          6    173.746         185.000     196.254
3          6    164.746         176.000     187.254
1          6    160.746         172.000     183.254
4          6    150.746         162.000     173.254

Bonferroni (Dunn) T tests for variable: MATGRA

NOTE: This test controls the type I experimentwise error rate, but generally has a higher type II error rate than REGWQ.

Alpha= 0.05  df= 20  MSE= 100.9
Critical Value of T= 2.93
Minimum Significant Difference= 16.976

Means with the same letter are not significantly different.

Bon Grouping       Mean       N    TYPEMAT

         A         185.000    6    2
         A
    B    A         176.000    6    3
    B    A
    B    A         172.000    6    1
    B
    B              162.000    6    4

Duncan's Multiple Range Test for variable: MATGRA

NOTE: This test controls the type I comparisonwise error rate, not the experimentwise error rate.

Alpha= 0.05  df= 20  MSE= 100.9

Number of Means    2        3        4
Critical Range     12.10    12.70    13.08

Means with the same letter are not significantly different.
Duncan Grouping    Mean       N    TYPEMAT

         A         185.000    6    2
         A
    B    A         176.000    6    3
    B
    B    C         172.000    6    1
         C
         C         162.000    6    4

T Confidence Intervals for variable: MATGRA

Alpha= 0.05  Confidence= 0.95  df= 20  MSE= 100.9
Critical Value of T= 2.09
Half Width of Confidence Interval= 8.554146

                Lower                       Upper
                Confidence                  Confidence
TYPEMAT    N    Limit           Mean        Limit
2          6    176.446         185.000     193.554
3          6    167.446         176.000     184.554
1          6    163.446         172.000     180.554
4          6    153.446         162.000     170.554

T tests (LSD) for variable: MATGRA

NOTE: This test controls the type I comparisonwise error rate, not the experimentwise error rate.

Alpha= 0.05  df= 20  MSE= 100.9
Critical Value of T= 2.09
Least Significant Difference= 12.097

Means with the same letter are not significantly different.

T Grouping         Mean       N    TYPEMAT

         A         185.000    6    2
         A
    B    A         176.000    6    3
    B
    B    C         172.000    6    1
         C
         C         162.000    6    4

Scheffe's Confidence Intervals for variable: MATGRA

Alpha= 0.05  Confidence= 0.95  df= 20  MSE= 100.9
Critical Value of F= 2.86608
Half Width of Confidence Interval= 13.88495

                Simultaneous                Simultaneous
                Lower                       Upper
                Confidence                  Confidence
TYPEMAT    N    Limit           Mean        Limit
2          6    171.115         185.000     198.885
3          6    162.115         176.000     189.885
1          6    158.115         172.000     185.885
4          6    148.115         162.000     175.885

Scheffe's test for variable: MATGRA

NOTE: This test controls the type I experimentwise error rate, but generally has a higher type II error rate than REGWF for all pairwise comparisons.

Alpha= 0.05  df= 20  MSE= 100.9
Critical Value of F= 3.09839
Minimum Significant Difference= 17.681

Means with the same letter are not significantly different.

Scheffe Grouping   Mean       N    TYPEMAT

         A         185.000    6    2
         A
    B    A         176.000    6    3
    B    A
    B    A         172.000    6    1
    B
    B              162.000    6    4

Tukey's Studentized Range (HSD) Test for variable: MATGRA

NOTE: This test controls the type I experimentwise error rate, but generally has a higher type II error rate than REGWQ.

Alpha= 0.05  df= 20  MSE= 100.9
Critical Value of Studentized Range= 3.958
Minimum Significant Difference= 16.232

Means with the same letter are not significantly different.

Tukey Grouping     Mean       N    TYPEMAT

         A         185.000    6    2
         A
    B    A         176.000    6    3
    B    A
    B    A         172.000    6    1
    B
    B              162.000    6    4

Contrast test

Contrast             DF    Contrast SS    Mean Square    F Value    Pr > F
ANIMAL vs VEGETAL    1     541.5000000    541.5000000    5.37       0.0313

N P A R 1 W A Y   P R O C E D U R E

Wilcoxon Scores (Rank Sums) for Variable MATGRA
Classified by Variable TYPEMAT

                Sum of        Expected    Std Dev       Mean
TYPEMAT    N    Scores        Under H0    Under H0      Score
1          6    67.500000     75.0        14.9869508    11.2500000
2          6    117.000000    75.0        14.9869508    19.5000000
3          6    81.500000     75.0        14.9869508    13.5833333
4          6    34.000000     75.0        14.9869508    5.6666667

Average Scores Were Used for Ties

Kruskal-Wallis Test (Chi-Square Approximation)
CHISQ = 11.832   DF = 3   Prob > CHISQ = 0.0080
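The contrast line of the GLM output can be checked by hand. The sketch below assumes the usual single-degree-of-freedom formula, SS = L² / Σ(c_i² / n_i), where L is the contrast applied to the level means, with the coefficients 1 1 -1 -1 from the program. The function name `contrast_ss` is hypothetical.

```python
from statistics import mean

# Fat absorbed per batch, keyed by fat type (values from the data table).
poids = {
    1: [164, 172, 168, 177, 156, 195],
    2: [178, 191, 197, 182, 185, 177],
    3: [175, 193, 178, 171, 163, 176],
    4: [155, 166, 149, 164, 170, 168],
}

MSE = 100.9                             # error mean square from the GLM output
coeffs = {1: 1, 2: 1, 3: -1, 4: -1}     # coefficients of the CONTRAST statement

def contrast_ss(groups, coeffs):
    """Return the contrast estimate L and its sum of squares (1 df)."""
    estimate = sum(coeffs[k] * mean(g) for k, g in groups.items())
    scale = sum(coeffs[k] ** 2 / len(g) for k, g in groups.items())
    return estimate, estimate ** 2 / scale

estimate, ss = contrast_ss(poids, coeffs)
f_value = ss / MSE                      # compared with F on 1 and 20 df

print(estimate, ss, round(f_value, 2))
```

The contrast takes 541.5 of the 1636.5 between-groups sum of squares; the remaining 1095.0 belongs to the two other degrees of freedom among the four fats.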