1 00:00:00,000 --> 00:00:04,775 Hello and welcome to Grouping Select Statement Result Sets. 2 00:00:04,775 --> 00:00:07,620 In this video, we will learn about some advanced techniques 3 00:00:07,620 --> 00:00:10,515 in retrieving data from a relational database table, 4 00:00:10,515 --> 00:00:13,670 and sorting, and grouping how the results set displays. 5 00:00:13,670 --> 00:00:15,260 At the end of this lesson, 6 00:00:15,260 --> 00:00:18,880 you will be able to explain how to eliminate duplicates from 7 00:00:18,880 --> 00:00:23,445 a result set and describe how to further restrict a result set. 8 00:00:23,445 --> 00:00:28,310 At times, a select statement result set can contain duplicate values. 9 00:00:28,310 --> 00:00:31,285 Based on our simplified library database model, 10 00:00:31,285 --> 00:00:33,329 in the author table example, 11 00:00:33,329 --> 00:00:37,580 the country column lists the two-letter country code of the author's country. 12 00:00:37,580 --> 00:00:40,000 If we select just the country column, 13 00:00:40,000 --> 00:00:42,220 we get a list of all of the countries. 14 00:00:42,220 --> 00:00:46,560 For example, select country from author order by 1. 15 00:00:46,560 --> 00:00:49,520 The order by clause sorts the result set. 16 00:00:49,520 --> 00:00:52,740 This result set lists the countries the authors belong to, 17 00:00:52,740 --> 00:00:55,055 sorted alphabetically by country. 18 00:00:55,055 --> 00:00:58,380 In this case, the result set displays 20 rows, 19 00:00:58,380 --> 00:01:00,870 one row for each of the 20 authors. 20 00:01:00,870 --> 00:01:03,695 But some of the authors come from the same country, 21 00:01:03,695 --> 00:01:06,575 so the result set contains duplicates. 22 00:01:06,575 --> 00:01:10,660 However, all we need is a list of countries the authors come from. 23 00:01:10,660 --> 00:01:14,090 So in this case, duplicates do not make sense. 24 00:01:14,090 --> 00:01:18,160 To eliminate duplicates, we use the keyword distinct. 25 00:01:18,160 --> 00:01:23,005 Using the keyword "distinct" reduces the result set to just six rows. 26 00:01:23,005 --> 00:01:27,730 But what if we wanted to also know how many authors come from the same country? 27 00:01:27,730 --> 00:01:32,130 So now we know that the 20 authors come from six different countries. 28 00:01:32,130 --> 00:01:37,030 But we might want to also know how many authors come from the same country. 29 00:01:37,030 --> 00:01:39,400 To display the result set listing 30 00:01:39,400 --> 00:01:42,355 the country and number of authors that come from that country, 31 00:01:42,355 --> 00:01:46,335 we add the "group by" clause to the select statement. 32 00:01:46,335 --> 00:01:49,090 The "group by" clause groups a result into 33 00:01:49,090 --> 00:01:52,205 subsets that has matching values for one or more columns. 34 00:01:52,205 --> 00:01:57,065 In this example, countries are grouped and then counted using the count function. 35 00:01:57,065 --> 00:02:00,585 Notice the column heading for the second column and the result set. 36 00:02:00,585 --> 00:02:02,770 The numeric value "2" displays as 37 00:02:02,770 --> 00:02:07,225 a column name because the column name is not directly available in the table. 38 00:02:07,225 --> 00:02:11,615 The second column in the result set was calculated by the count function. 39 00:02:11,615 --> 00:02:14,300 Instead of using the column named "2," 40 00:02:14,300 --> 00:02:17,115 we can assign a column name to the result set. 41 00:02:17,115 --> 00:02:19,920 We do this using the "as" keyword. 42 00:02:19,920 --> 00:02:22,850 In this example, we change the derived column name 43 00:02:22,850 --> 00:02:27,080 "2" to column name "Count" using the "as count" keyword. 44 00:02:27,080 --> 00:02:30,150 This helps clarify the meaning of the result set. 45 00:02:30,150 --> 00:02:33,650 Now that we have the count of authors from different countries, 46 00:02:33,650 --> 00:02:37,460 we can further restrict the number of rows by passing some conditions. 47 00:02:37,460 --> 00:02:42,210 For example, we can check if there are more than four authors from the same country. 48 00:02:42,210 --> 00:02:44,915 To set a condition to a "group by" clause, 49 00:02:44,915 --> 00:02:47,055 we use the keyword "having". 50 00:02:47,055 --> 00:02:51,135 The "having" clause is used in combination with the "group by" clause. 51 00:02:51,135 --> 00:02:55,885 It is very important to note that the "where" clause is for the entire result set, 52 00:02:55,885 --> 00:02:59,570 but the "having" clause works only with the "group by" clause. 53 00:02:59,570 --> 00:03:02,885 To check if there are more than four authors from the same country, 54 00:03:02,885 --> 00:03:05,375 we add the following to the select statement, 55 00:03:05,375 --> 00:03:08,710 having count country greater than four. 56 00:03:08,710 --> 00:03:11,120 Only countries that have five or more authors 57 00:03:11,120 --> 00:03:13,775 from that country are listed in the result set. 58 00:03:13,775 --> 00:03:18,750 In this example, those countries are China with six authors and India, 59 00:03:18,750 --> 00:03:21,050 also with six authors. 60 00:03:21,050 --> 00:03:24,100 Now you can explain how to eliminate duplicates from 61 00:03:24,100 --> 00:03:28,625 a result set and describe how to further restrict a result set. 62 00:03:28,625 --> 00:03:31,630 Thanks for watching this video.