1 00:00:05,840 --> 00:00:08,760 Most of us are familiar with comparing 2 00:00:08,760 --> 00:00:11,040 the average values between two groups. 3 00:00:11,040 --> 00:00:14,070 For example, the average teaching evaluation score 4 00:00:14,070 --> 00:00:15,810 for male instructors compared 5 00:00:15,810 --> 00:00:17,295 with that of female instructors. 6 00:00:17,295 --> 00:00:18,540 There are two groups, 7 00:00:18,540 --> 00:00:20,670 and we know that such comparisons 8 00:00:20,670 --> 00:00:22,530 are made using the t-test. 9 00:00:22,530 --> 00:00:25,110 But what if you're dealing with more than two groups? 10 00:00:25,110 --> 00:00:27,540 What if there are three, four, or more groups? 11 00:00:27,540 --> 00:00:29,055 In that case, 12 00:00:29,055 --> 00:00:31,975 we would use ANOVA, or analysis of variance, 13 00:00:31,975 --> 00:00:34,160 where our goal is to compare 14 00:00:34,160 --> 00:00:36,410 the means of more than two groups. 15 00:00:36,410 --> 00:00:39,665 To accomplish this, 16 00:00:39,665 --> 00:00:44,310 we return to our teaching evaluation data. 17 00:00:44,310 --> 00:00:46,235 In this data, 18 00:00:46,235 --> 00:00:48,110 we have a variable called age, 19 00:00:48,110 --> 00:00:49,520 where the age of the instructor 20 00:00:49,520 --> 00:00:52,190 is recorded in years. 21 00:00:52,190 --> 00:00:54,880 But we will discretize this age variable, 22 00:00:54,880 --> 00:00:56,855 that is, we will create three groups. 23 00:00:56,855 --> 00:00:58,670 Instructors who are 40 years and 24 00:00:58,670 --> 00:01:01,025 younger go in one group, 25 00:01:01,025 --> 00:01:05,040 those between 40 and 56.5 years of age 26 00:01:05,040 --> 00:01:06,335 are in another group, 27 00:01:06,335 --> 00:01:08,375 and those who are 57 years and older 28 00:01:08,375 --> 00:01:09,560 go in the third group. 
29 00:01:09,560 --> 00:01:12,409 So you have younger instructors, 30 00:01:12,409 --> 00:01:13,580 middle-aged instructors, and 31 00:01:13,580 --> 00:01:15,965 slightly older instructors, 32 00:01:15,965 --> 00:01:17,705 and the number of observations 33 00:01:17,705 --> 00:01:20,565 for each group is reported under 34 00:01:20,565 --> 00:01:22,790 N. What we also have here 35 00:01:22,790 --> 00:01:25,055 is the average teaching evaluation score for each group, 36 00:01:25,055 --> 00:01:26,870 which does not differ much. 37 00:01:26,870 --> 00:01:28,460 It's pretty much four for 38 00:01:28,460 --> 00:01:30,530 each group, and for the older professors it is 39 00:01:30,530 --> 00:01:32,540 slightly less at 3.9, along with 40 00:01:32,540 --> 00:01:34,700 the respective standard deviations. 41 00:01:34,700 --> 00:01:36,395 So we have three groups, and what 42 00:01:36,395 --> 00:01:38,270 we are interested in is to determine 43 00:01:38,270 --> 00:01:41,930 if these three averages for 44 00:01:41,930 --> 00:01:45,250 the three respective age categories 45 00:01:45,440 --> 00:01:49,360 are statistically the same or different, 46 00:01:49,360 --> 00:01:52,715 so we use the one-way analysis of variance, or ANOVA, 47 00:01:52,715 --> 00:01:55,745 and with ANOVA we use 48 00:01:55,745 --> 00:01:57,560 the F-distribution to compare 49 00:01:57,560 --> 00:01:59,030 the mean values for more than two groups. 50 00:01:59,030 --> 00:02:01,430 Our null hypothesis is that samples in 51 00:02:01,430 --> 00:02:03,320 all groups are drawn from populations 52 00:02:03,320 --> 00:02:04,870 with the same mean values. 53 00:02:04,870 --> 00:02:07,145 We fail to reject the null hypothesis 54 00:02:07,145 --> 00:02:10,550 if the P-value, or the significance, for 55 00:02:10,550 --> 00:02:13,235 the F-test is greater than 0.05, 56 00:02:13,235 --> 00:02:17,455 and we then infer equal means. 
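To make the F-test described here more concrete, the following is a minimal sketch of how a one-way ANOVA F-statistic and its P-value can be computed from the between-group and within-group sums of squares. The function name `one_way_anova` and the three small score lists are hypothetical, made up for illustration; they are not the teaching evaluation data from the video.

```python
import numpy as np
from scipy import stats


def one_way_anova(*groups):
    """Return (F, p) for a one-way ANOVA computed from sums of squares."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = np.concatenate(groups).mean()

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)

    # Upper-tail probability of the F-distribution gives the P-value
    p_value = stats.f.sf(f_stat, df_between, df_within)
    return f_stat, p_value


# Hypothetical evaluation scores for three age groups (illustrative only)
younger = [4.1, 3.9, 4.3, 4.0, 4.2]
middle = [4.0, 4.2, 3.8, 4.1, 3.9]
older = [3.9, 3.7, 4.0, 3.8, 4.1]

f_stat, p_value = one_way_anova(younger, middle, older)
print(f"F-statistic: {f_stat:.3f}, P-value: {p_value:.3f}")

if p_value > 0.05:
    print("Fail to reject the null hypothesis: no evidence the means differ.")
else:
    print("Reject the null hypothesis: at least one mean differs.")
```

In practice the same numbers come from `scipy.stats.f_oneway`; the hand computation is shown only to make the role of the F-distribution visible.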
57 00:02:17,455 --> 00:02:20,450 Let's say we are interested in determining if 58 00:02:20,450 --> 00:02:23,495 the beauty score for instructors differs by age. 59 00:02:23,495 --> 00:02:25,835 We have three groups: younger, 60 00:02:25,835 --> 00:02:28,295 middle-aged, and older professors. 61 00:02:28,295 --> 00:02:30,170 We have the summary statistics for 62 00:02:30,170 --> 00:02:32,210 the standardized beauty scores. 63 00:02:32,210 --> 00:02:35,990 We see that there is a difference: as the age goes up, 64 00:02:35,990 --> 00:02:39,140 the average value for the beauty score goes down. 65 00:02:39,140 --> 00:02:41,600 So let's run an ANOVA to see 66 00:02:41,600 --> 00:02:44,855 if the differences are statistically significant. 67 00:02:44,855 --> 00:02:47,690 Our null hypothesis will be, 68 00:02:47,690 --> 00:02:50,285 "Mean beauty scores for instructors don't differ with 69 00:02:50,285 --> 00:02:53,700 age," and the alternative hypothesis will be, 70 00:02:53,700 --> 00:02:56,095 "At least one of the means is different." 71 00:02:56,095 --> 00:02:59,315 First, the age group variable does not exist in our data. 72 00:02:59,315 --> 00:03:02,210 We will need to group, or bin, the continuous age data 73 00:03:02,210 --> 00:03:05,285 using the .loc indexer in Pandas. 74 00:03:05,285 --> 00:03:08,360 Then we use the f_oneway function in 75 00:03:08,360 --> 00:03:11,945 the SciPy stats module to perform the ANOVA test. 76 00:03:11,945 --> 00:03:15,385 We will then print out the F-statistic and the P-value. 77 00:03:15,385 --> 00:03:17,870 What we can see is that the P-value is 78 00:03:17,870 --> 00:03:21,995 4.32 times ten raised to the power of negative eight, 79 00:03:21,995 --> 00:03:24,775 and that is less than 0.05. 80 00:03:24,775 --> 00:03:27,695 We will reject the null hypothesis as there is 81 00:03:27,695 --> 00:03:29,270 significant evidence that at 82 00:03:29,270 --> 00:03:31,280 least one of the means differs. 
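The binning and testing steps just described could look roughly like the sketch below. It is not the code shown in the video: the DataFrame name `ratings_df`, the column names `age`, `beauty`, and `age_group`, the group labels, and the randomly generated data are all assumptions for illustration, so the printed values will not match the P-value quoted above. Ages here are whole numbers, so the middle group is written as "older than 40 and younger than 57".

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the teaching evaluation data (assumption, not real data)
rng = np.random.default_rng(0)
n = 300
ratings_df = pd.DataFrame({"age": rng.integers(29, 74, size=n)})
# Made-up standardized beauty score that tends to fall as age rises
ratings_df["beauty"] = -0.03 * (ratings_df["age"] - 48) + rng.normal(0, 0.8, size=n)

# Bin the continuous age variable into three groups with .loc
ratings_df.loc[ratings_df["age"] <= 40, "age_group"] = "40 years and younger"
ratings_df.loc[
    (ratings_df["age"] > 40) & (ratings_df["age"] < 57), "age_group"
] = "between 40 and 57 years"
ratings_df.loc[ratings_df["age"] >= 57, "age_group"] = "57 years and older"

# Pull out the beauty scores for each age group
younger = ratings_df.loc[ratings_df["age_group"] == "40 years and younger", "beauty"]
middle = ratings_df.loc[ratings_df["age_group"] == "between 40 and 57 years", "beauty"]
older = ratings_df.loc[ratings_df["age_group"] == "57 years and older", "beauty"]

# One-way ANOVA across the three groups
f_statistic, p_value = stats.f_oneway(younger, middle, older)
print(f"F-statistic: {f_statistic:.3f}, P-value: {p_value:.3g}")
```

Running the same three-group comparison on a teaching evaluation column instead of `beauty` only requires changing the column selected in the three `.loc` lookups.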
83 00:03:31,280 --> 00:03:33,230 If I do the same test for 84 00:03:33,230 --> 00:03:34,700 the teaching evaluation scores 85 00:03:34,700 --> 00:03:36,440 that we observe for the three groups, 86 00:03:36,440 --> 00:03:40,040 and run ANOVA on these three mean values, 87 00:03:40,040 --> 00:03:43,505 we find that the P-value is 0.295, 88 00:03:43,505 --> 00:03:46,145 which is greater than 0.05. 89 00:03:46,145 --> 00:03:50,030 We will fail to reject the null and infer equal means. 90 00:03:50,030 --> 00:03:52,205 That is, the three means 91 00:03:52,205 --> 00:03:54,470 are not statistically different. 92 00:03:54,470 --> 00:03:56,450 Here we have the analysis of 93 00:03:56,450 --> 00:03:58,685 variance performed on two samples. 94 00:03:58,685 --> 00:04:00,650 One is the beauty score. 95 00:04:00,650 --> 00:04:03,740 We notice that the difference in means for beauty scores 96 00:04:03,740 --> 00:04:05,015 between the three groups is 97 00:04:05,015 --> 00:04:07,090 significant, based on the significance value. 98 00:04:07,090 --> 00:04:08,960 This leads us to conclude that 99 00:04:08,960 --> 00:04:10,730 at least one mean is different, 100 00:04:10,730 --> 00:04:12,800 and we reject the null hypothesis 101 00:04:12,800 --> 00:04:15,125 that states equal means. 102 00:04:15,125 --> 00:04:17,990 Here, because the P-value for teaching 103 00:04:17,990 --> 00:04:20,210 evaluation scores between the three groups 104 00:04:20,210 --> 00:04:22,880 is greater than 0.05, 105 00:04:22,880 --> 00:04:25,730 we fail to reject the null hypothesis. 106 00:04:25,730 --> 00:04:28,085 We believe that these three means 107 00:04:28,085 --> 00:04:30,690 are statistically equal.