1 00:00:07,910 --> 00:00:12,405 Welcome back. With nested data structures, 2 00:00:12,405 --> 00:00:17,565 we'll often want to traverse all the contents and print or extract some of the items. 3 00:00:17,565 --> 00:00:20,340 We do that by nested iteration, 4 00:00:20,340 --> 00:00:23,955 with one for loop indented inside another. 5 00:00:23,955 --> 00:00:27,390 Here we have a list called nested1. 6 00:00:27,390 --> 00:00:29,265 We've seen it before, 7 00:00:29,265 --> 00:00:32,560 and on line two, we iterate. 8 00:00:32,960 --> 00:00:37,035 We go through all of the items in nested1. 9 00:00:37,035 --> 00:00:46,635 We're going to call this the outer loop because it goes through the outer list. 10 00:00:46,635 --> 00:00:51,200 The loop variable or iterator variable x, 11 00:00:51,200 --> 00:00:55,660 will be bound to one of the items each time. 12 00:00:55,660 --> 00:01:00,625 So, the first time that we execute lines three through five, 13 00:01:00,625 --> 00:01:04,630 x will be bound to the list a, b, c. So, 14 00:01:04,630 --> 00:01:09,190 we'll have x bound to a, b, 15 00:01:09,190 --> 00:01:17,605 c, and then we execute line three through five, 16 00:01:17,605 --> 00:01:20,120 and we'll change x and make it be bound to d, 17 00:01:20,120 --> 00:01:23,405 e, and we'll execute lines three through five again. 18 00:01:23,405 --> 00:01:26,085 As part of the loop, 19 00:01:26,085 --> 00:01:30,680 I just print level1 on line three because we're going to have a lot of 20 00:01:30,680 --> 00:01:34,940 stuff printing as a way of making things look nice in the output. 21 00:01:34,940 --> 00:01:39,340 Then we're going to do an inner loop on lines four and five. 22 00:01:39,340 --> 00:01:42,650 So, on line four we started another iteration. 23 00:01:42,650 --> 00:01:46,160 We have another iterator variable or loop variable. 24 00:01:46,160 --> 00:01:53,430 It's called y, and y will get bound first to a, 25 00:01:53,430 --> 00:01:55,510 and then to b, and then to c. So, 26 00:01:55,510 --> 00:01:58,520 the first time we execute line five, 27 00:01:58,520 --> 00:02:01,405 y will be bound to a, 28 00:02:01,405 --> 00:02:03,690 and it'll print out a few spaces, 29 00:02:03,690 --> 00:02:07,875 level2: and then it'll print out a. 30 00:02:07,875 --> 00:02:10,890 We'll go on. Y will get a new value. 31 00:02:10,890 --> 00:02:15,060 It'll be b, and we'll print out level2, b. 32 00:02:15,060 --> 00:02:18,960 We get all the way through a, b, c, 33 00:02:18,960 --> 00:02:27,000 and then we're done with the for loop for the first value x bound to a, 34 00:02:27,000 --> 00:02:32,555 b, c. Then we go on to let x have a new value. 35 00:02:32,555 --> 00:02:34,400 X is going to be bound to d and e, 36 00:02:34,400 --> 00:02:37,225 and we do the inner loop again. 37 00:02:37,225 --> 00:02:39,545 When we do the inner loop again, 38 00:02:39,545 --> 00:02:44,550 y will get bound to d and then y will get down to e. So, 39 00:02:44,550 --> 00:02:49,810 let's see how the whole thing unfolds when we run it. 40 00:02:53,090 --> 00:02:58,305 So, you'll notice that line three; print level1. 41 00:02:58,305 --> 00:03:01,395 That gets executed three times. 42 00:03:01,395 --> 00:03:07,410 The reason is that we have three items in the outer list. 43 00:03:07,410 --> 00:03:10,930 So, we execute the outer loop three times. 44 00:03:10,930 --> 00:03:14,080 Each time we print level1. 45 00:03:14,080 --> 00:03:18,190 Then the first time we go through a, b, 46 00:03:18,190 --> 00:03:22,750 and c, so line five executes three times, 47 00:03:22,750 --> 00:03:25,090 once with y bound to a, 48 00:03:25,090 --> 00:03:26,440 once with y bound to b, 49 00:03:26,440 --> 00:03:31,540 and once with y bound to c. One thing that's important to keep track of 50 00:03:31,540 --> 00:03:34,640 is which variable names that we're 51 00:03:34,640 --> 00:03:39,050 using for the outer loop and which are we are using for the inner loop. 52 00:03:39,050 --> 00:03:42,920 You'll notice that the variable x is 53 00:03:42,920 --> 00:03:47,140 used as the iterator variable or loop variable in the outer loop, 54 00:03:47,140 --> 00:03:55,150 and then we see it again on line four when we're creating the inner iteration. 55 00:03:55,370 --> 00:04:01,100 That's a very common pattern because when we do this nested iteration, 56 00:04:01,100 --> 00:04:04,505 we're letting x be bound to each of the inner lists, 57 00:04:04,505 --> 00:04:07,565 and then in the inner loop, 58 00:04:07,565 --> 00:04:12,280 we're going to want to have an iteration through each of those lists. 59 00:04:12,280 --> 00:04:19,470 So, if you're ever tempted to say for x in nested1, 60 00:04:20,140 --> 00:04:25,850 and then for x in y. 61 00:04:25,850 --> 00:04:28,585 That's not going to work. 62 00:04:28,585 --> 00:04:37,235 The reason it's not going to work is that for x is defining the iterator variable. 63 00:04:37,235 --> 00:04:40,690 It's saying here's a new variable name that's going to get bound to something. 64 00:04:40,690 --> 00:04:46,850 What you really wanted in this inner spot is for y in x. 65 00:04:46,850 --> 00:04:55,010 You want the x to refer to a variable that's bound as part of running the outer loop. 66 00:04:55,010 --> 00:04:58,790 Here we have a traversal where we're supposed to accumulate 67 00:04:58,790 --> 00:05:02,940 something from the inner lists rather than just printing them. 68 00:05:03,380 --> 00:05:07,880 So, we're supposed to write code to create a new list that contains 69 00:05:07,880 --> 00:05:12,420 every person's last name and save that list as last names. 70 00:05:12,500 --> 00:05:16,565 So, each item in the outer list is itself 71 00:05:16,565 --> 00:05:21,170 a list and those lists contain last names like Turner, 72 00:05:21,170 --> 00:05:23,920 and Damon, and Wiig. 73 00:05:23,920 --> 00:05:27,405 This requires an accumulation pattern. 74 00:05:27,405 --> 00:05:30,985 What makes me think that it's going to be an accumulation? 75 00:05:30,985 --> 00:05:34,045 Well, we had created a new list, 76 00:05:34,045 --> 00:05:37,720 and it contains every person's last name. 77 00:05:37,720 --> 00:05:42,385 Those are kind of triggers that suggests to me that we're going to have an accumulation. 78 00:05:42,385 --> 00:05:49,640 So, to solve this, let's set up our usual template for an accumulation pattern. 79 00:05:51,210 --> 00:05:53,670 Since I'm accumulating a list, 80 00:05:53,670 --> 00:06:02,650 I start with an empty list for something in something. 81 00:06:02,750 --> 00:06:08,805 I'm going to refer to accum here, 82 00:06:08,805 --> 00:06:11,820 and at the end, I will have my list. 83 00:06:11,820 --> 00:06:14,505 Now instead of calling it accum, 84 00:06:14,505 --> 00:06:18,320 I'm going to call it last_names because they told us we're supposed to 85 00:06:18,320 --> 00:06:23,280 accumulate it into a variable called last_names. 86 00:06:26,380 --> 00:06:29,105 Now what am I going to iterate over? 87 00:06:29,105 --> 00:06:34,590 Well, I'm going to iterate over the whole giant list. 88 00:06:35,260 --> 00:06:43,270 So, that's called info and I can name my loop variable anything I want to. 89 00:06:43,270 --> 00:06:46,845 But since these are all entertainers, 90 00:06:46,845 --> 00:06:51,970 I'm going to call my variable entertainer. 91 00:06:53,720 --> 00:06:57,495 Now, what am I going to do with each of these entertainers? 92 00:06:57,495 --> 00:07:02,625 Well, I'm going to extract just the last name. 93 00:07:02,625 --> 00:07:06,760 This is a very well-structured piece of data because 94 00:07:06,760 --> 00:07:13,370 every inner list has the last name in position one. 95 00:07:13,370 --> 00:07:15,480 You'll notice it's the same always. 96 00:07:15,480 --> 00:07:19,960 It's always position one where we have the last name. 97 00:07:19,960 --> 00:07:27,010 That means we could always just do something with entertainer position one. 98 00:07:28,790 --> 00:07:33,865 What are we going to do? We're going to append 99 00:07:33,865 --> 00:07:40,245 that entertainer to the end of the existing list of last names. 100 00:07:40,245 --> 00:07:43,690 I see that I misspelled last_names. 101 00:07:47,720 --> 00:07:50,880 Just to help me see what's happening, 102 00:07:50,880 --> 00:07:53,265 I'm going to print last_names each time. 103 00:07:53,265 --> 00:07:55,160 So, when we execute this, 104 00:07:55,160 --> 00:07:58,080 I will see as the list accumulates. 105 00:08:05,720 --> 00:08:08,480 It looks like I accidentally deleted 106 00:08:08,480 --> 00:08:11,840 my print statement as I was getting rid of all the markings on the screen. 107 00:08:11,840 --> 00:08:15,870 Let me see if I can undo it and make it come back. 108 00:08:17,780 --> 00:08:20,430 Well, I can't make it come back. 109 00:08:20,430 --> 00:08:23,140 So, I'm going to type it again. 110 00:08:27,590 --> 00:08:31,820 Now we can see on each iteration when we get to line six I'm 111 00:08:31,820 --> 00:08:36,560 printing and the first time I only have Turner. 112 00:08:36,560 --> 00:08:42,000 The next time I have Turner with Damon appended, and so on. 113 00:08:42,320 --> 00:08:46,730 So, that's how I would go about answering an exercise like this. 114 00:08:46,730 --> 00:08:50,660 I'd first figure out how I'm going to need an accumulation pattern. 115 00:08:50,660 --> 00:08:52,910 I would make the little template for that. 116 00:08:52,910 --> 00:08:55,250 Then I would start filling in parts of the template, 117 00:08:55,250 --> 00:08:59,090 and I would have to come up with the insight that 118 00:08:59,090 --> 00:09:03,275 the last_name is always in the same position in each of the inner lists. 119 00:09:03,275 --> 00:09:08,660 Therefore, I can always refer to position square bracket one knowing that 120 00:09:08,660 --> 00:09:12,770 the entertainer variable was going to be bound to 121 00:09:12,770 --> 00:09:17,450 a different one of these inner lists each time I went through here. 122 00:09:17,450 --> 00:09:19,550 The first time I get to line five. 123 00:09:19,550 --> 00:09:23,405 It's bound to the first inner list and I get Turner. 124 00:09:23,405 --> 00:09:28,715 The second time entertainers bounded that second inner lists Matt Damon 1970, 125 00:09:28,715 --> 00:09:35,010 and we get Damon's last name appended to our list of last names. 126 00:09:38,890 --> 00:09:44,590 Now note that we don't have an inner loop in this problem. 127 00:09:44,590 --> 00:09:50,880 We have for entertainer in info but we don't have another for loop listed here. 128 00:09:52,570 --> 00:09:56,510 But we do have an outer loop and we have square brackets 129 00:09:56,510 --> 00:09:59,810 to extract an item from an inner list. 130 00:09:59,810 --> 00:10:02,600 So, we have an outer loop here on line four, 131 00:10:02,600 --> 00:10:06,350 and then we have another square brackets on line five 132 00:10:06,350 --> 00:10:10,920 to extract a name from one of the inner lists. 133 00:10:10,920 --> 00:10:13,820 If we're extracting things from two levels into 134 00:10:13,820 --> 00:10:19,205 a nested data structure like Turner and Damon are getting extracted from info, 135 00:10:19,205 --> 00:10:26,675 we will always need a combination of two iterations and square bracket index operations. 136 00:10:26,675 --> 00:10:37,510 We could have info square bracket one, square bracket zero. 137 00:10:42,790 --> 00:10:48,005 So, that would be using two square brackets, and if I do that, 138 00:10:48,005 --> 00:10:51,140 I'm going to get rid 139 00:10:51,140 --> 00:10:55,030 of the print statement on line six so that we don't get confused by it in the output. 140 00:10:55,030 --> 00:10:57,960 I will just get Matt directly. 141 00:10:57,960 --> 00:11:00,255 So, if I want to extract Matt, 142 00:11:00,255 --> 00:11:02,250 or I can extract Damon by doing 143 00:11:02,250 --> 00:11:08,720 [1] [1] or I 144 00:11:08,720 --> 00:11:14,375 can have a for loop as I do on line four with a square bracket on line five. 145 00:11:14,375 --> 00:11:19,180 I could have two for loops. 146 00:11:29,220 --> 00:11:32,600 Then I'll print a whole lot of stuff, 147 00:11:33,480 --> 00:11:37,255 but it's all the inner values, Tina, 148 00:11:37,255 --> 00:11:41,630 Turner, 1939 and so on. 149 00:11:41,880 --> 00:11:45,280 So, that's two four loops, 150 00:11:45,280 --> 00:11:47,470 or two square brackets. 151 00:11:47,470 --> 00:11:50,950 We've already seen a four loop with an inner square bracket. 152 00:11:50,950 --> 00:11:53,485 We can also have a weird thing where we could say, 153 00:11:53,485 --> 00:12:01,330 for val in infosquare bracket one print val. 154 00:12:01,330 --> 00:12:15,550 I'm just going to 155 00:12:15,550 --> 00:12:17,170 write pass here so that it does; 156 00:12:17,170 --> 00:12:20,665 we don't want to do anything in that four loop. 157 00:12:20,665 --> 00:12:24,145 So here, on lines 12 and 13, 158 00:12:24,145 --> 00:12:28,300 I am only looking at val square bracket one, 159 00:12:28,300 --> 00:12:30,580 that's the inner list, Matt, Damon, 160 00:12:30,580 --> 00:12:33,985 1970, actor, and I'm printing that. 161 00:12:33,985 --> 00:12:36,670 I'm printing each of those values. 162 00:12:36,670 --> 00:12:37,960 So, in any case, 163 00:12:37,960 --> 00:12:40,600 if I'm trying to get values like Tina, and Turner, and Matt, 164 00:12:40,600 --> 00:12:43,480 and Damon out of this nested data structure, 165 00:12:43,480 --> 00:12:45,370 I have to do some combination. 166 00:12:45,370 --> 00:12:49,010 I either need two square brackets, 167 00:12:49,160 --> 00:12:53,055 or I need two four loops, 168 00:12:53,055 --> 00:13:01,270 or I need one four loop and a square bracket. 169 00:13:01,270 --> 00:13:05,100 One four loop and a square bracket. 170 00:13:05,100 --> 00:13:06,845 Two levels of nesting, 171 00:13:06,845 --> 00:13:12,280 some combination of two four loops and square brackets. 172 00:13:12,280 --> 00:13:15,400 Here's another problem where we accumulate. 173 00:13:15,400 --> 00:13:19,030 In this case, we do need nested iteration because we're 174 00:13:19,030 --> 00:13:22,840 not always going to extract the same thing from each inner list. 175 00:13:22,840 --> 00:13:26,650 Our task is to use nested iteration to save 176 00:13:26,650 --> 00:13:31,840 every string containing b into a new list named b strings. 177 00:13:31,840 --> 00:13:34,765 So, this is another accumulation pattern. 178 00:13:34,765 --> 00:13:40,000 We're going to traverse this nested data structure finding each of 179 00:13:40,000 --> 00:13:48,710 the vegetables and fruits and accumulating all of the ones that contain the letter b. 180 00:13:50,340 --> 00:13:53,560 So, this is an accumulation pattern, 181 00:13:53,560 --> 00:13:59,245 b_strings is eventually going to be a list of words that all contain the letter b, 182 00:13:59,245 --> 00:14:02,770 like bananas and blueberries but not lemons. 183 00:14:02,770 --> 00:14:09,800 Initially, it's empty, and I'm going to iterate through the outer list. 184 00:14:10,980 --> 00:14:15,685 So, for each of these lists and the outer list, 185 00:14:15,685 --> 00:14:22,735 I'm going to check whether the items in it are b words. 186 00:14:22,735 --> 00:14:26,710 So, I'm going to say for word in 187 00:14:26,710 --> 00:14:35,540 list, b_strings.append of word. 188 00:14:37,440 --> 00:14:41,320 When I finish, I'll print out b_strings. 189 00:14:41,320 --> 00:14:45,250 Now, this isn't quite right because I'm going to actually 190 00:14:45,250 --> 00:14:49,405 append all of the words even when they don't contain the letter b. 191 00:14:49,405 --> 00:14:52,390 So, I've got apples when I shouldn't have apples. 192 00:14:52,390 --> 00:14:55,030 But I do now have just a simple list, 193 00:14:55,030 --> 00:14:56,050 not a nested list. 194 00:14:56,050 --> 00:15:01,000 I just have all of the items that appeared anywhere at the second level in the list, 195 00:15:01,000 --> 00:15:03,445 and now they're in this flat list. 196 00:15:03,445 --> 00:15:07,945 So, what do I have to do instead of automatically appending every word? 197 00:15:07,945 --> 00:15:14,330 I have to append it only if it has some property, if something something. 198 00:15:22,770 --> 00:15:26,484 If the word has some nice property, 199 00:15:26,484 --> 00:15:29,230 I will append it, and otherwise, I won't bother. 200 00:15:29,230 --> 00:15:34,210 Now, what do I want to put into this IF expression? 201 00:15:34,210 --> 00:15:38,875 At this point, when people start learning about nested iteration, 202 00:15:38,875 --> 00:15:44,365 they sometimes forget about the nice built-in things that Python has given us. 203 00:15:44,365 --> 00:15:47,575 I'm just first going to remind you and then I'll show you the hard way, 204 00:15:47,575 --> 00:15:49,360 but first, I'll show you the easy way. 205 00:15:49,360 --> 00:15:52,645 If we want to check on whether a word has the letter b, 206 00:15:52,645 --> 00:15:54,865 we have the InOperator. 207 00:15:54,865 --> 00:15:56,725 So, if b is in the word, 208 00:15:56,725 --> 00:16:01,090 then we append it, and otherwise, we don't append it. 209 00:16:01,090 --> 00:16:03,430 Now, we managed to pass the test, 210 00:16:03,430 --> 00:16:04,540 we just got bananas, 211 00:16:04,540 --> 00:16:09,320 and blueberries, and green beans, and so on. 212 00:16:10,290 --> 00:16:15,310 Now there's a lot of ways that you can go wrong when you start writing this kind of code, 213 00:16:15,310 --> 00:16:17,455 you might be tempted to say things like, 214 00:16:17,455 --> 00:16:22,090 append list instead of append word, then what do you get? 215 00:16:22,090 --> 00:16:26,170 You get a bunch of lists rather than getting single words. 216 00:16:26,170 --> 00:16:29,590 You might be tempted to append the letter b instead, 217 00:16:29,590 --> 00:16:33,430 then you just get a list of the letter b a whole bunch of times, 218 00:16:33,430 --> 00:16:38,860 but what we really want is to append the word, if we do this. 219 00:16:38,860 --> 00:16:45,730 By the way, it gets easier if you use mnemonic variable names like Word. 220 00:16:45,730 --> 00:16:50,990 If I had called this y and this one x, 221 00:16:51,030 --> 00:16:55,750 it'd be a lot harder to remember what exactly I ought to be printing here. 222 00:16:55,750 --> 00:17:01,450 I'll be more tempted to do something like appending x for b, 223 00:17:01,450 --> 00:17:07,360 and say x instead of for b and y. I should be saying b and y; 224 00:17:07,360 --> 00:17:09,565 this gets a lot more complicated. 225 00:17:09,565 --> 00:17:12,470 I can get the right answer, 226 00:17:12,900 --> 00:17:15,160 but it's a lot harder. 227 00:17:15,160 --> 00:17:24,039 It was a lot easier when I used mnemonic names for lists and L for word in list, 228 00:17:24,039 --> 00:17:26,920 if b is in the word, 229 00:17:26,920 --> 00:17:32,720 then accumulate that word into our final value. 230 00:17:34,260 --> 00:17:38,020 Now, what's the complicated way that I sometimes see students 231 00:17:38,020 --> 00:17:41,455 do things if they don't remember that we have this nice InOperator? 232 00:17:41,455 --> 00:17:49,640 Well, they start to iterate a third time for character in word. 233 00:17:50,010 --> 00:17:55,000 If character equals b, 234 00:17:55,000 --> 00:18:01,360 then we append the word. 235 00:18:01,360 --> 00:18:07,030 That almost works if you don't remember about the InOperator- almost works, 236 00:18:07,030 --> 00:18:11,440 but it fails when a word has more than one b in it. 237 00:18:11,440 --> 00:18:14,380 You'll see bananas came out fine, 238 00:18:14,380 --> 00:18:20,050 but blueberries got in here twice because it got in there once for the first b, 239 00:18:20,050 --> 00:18:24,205 and a second time when we got another b in the word blueberries. 240 00:18:24,205 --> 00:18:28,000 So, there's a way to get around this if you remember 241 00:18:28,000 --> 00:18:36,535 the fancy break command that ends the inner iteration, the inner four loop. 242 00:18:36,535 --> 00:18:42,445 This will break us out of the innermost four loop that we're in. 243 00:18:42,445 --> 00:18:44,620 So, this actually should work I think. 244 00:18:44,620 --> 00:18:47,605 Yes, it does, but it's a complicated way. 245 00:18:47,605 --> 00:18:52,675 The simple way is to just check 246 00:18:52,675 --> 00:18:59,060 here if b is in word. 247 00:19:14,160 --> 00:19:21,595 So, that's nested iteration to accumulate some of the values from the inner lists. 248 00:19:21,595 --> 00:19:24,020 We'll see you next time.