1 00:00:08,060 --> 00:00:13,380 Splitting and joining are two other useful methods on strings and lists. 2 00:00:13,380 --> 00:00:19,155 So, split takes a string and turns it into a list of substrings of that string. 3 00:00:19,155 --> 00:00:23,530 So, let's suppose that we have a string, leaders and best. 4 00:00:30,860 --> 00:00:33,675 So, here we have the string, 5 00:00:33,675 --> 00:00:41,955 and let's suppose that we call dot split on that string. 6 00:00:41,955 --> 00:00:43,970 What dot split does, 7 00:00:43,970 --> 00:00:46,570 if we call it with nothing as an argument, 8 00:00:46,570 --> 00:00:49,405 it first looks for all the spaces in the string. 9 00:00:49,405 --> 00:00:51,940 So, here it finds a space, 10 00:00:51,940 --> 00:00:53,945 and here it finds a space, 11 00:00:53,945 --> 00:00:57,840 and it cuts the string along the spaces. 12 00:00:57,840 --> 00:01:04,240 So, what it does is turn what was one string into three different strings, 13 00:01:04,240 --> 00:01:07,175 and it divides them along the spaces. 14 00:01:07,175 --> 00:01:12,790 So, the end result of this expression is going to be a list, 15 00:01:12,790 --> 00:01:17,090 where the first item is leaders, 16 00:01:20,120 --> 00:01:24,250 the second item is and, 17 00:01:25,250 --> 00:01:29,230 and the third item is best. 18 00:01:31,490 --> 00:01:34,810 One thing to note here is that split actually 19 00:01:34,810 --> 00:01:37,680 does get rid of the spaces in between all of these. 20 00:01:37,680 --> 00:01:40,870 So, this is leaders with no space afterwards and 21 00:01:40,870 --> 00:01:44,930 no space before or after "and", or before and after "best". 22 00:01:44,930 --> 00:01:50,425 So, split essentially takes a string and splits it up into different words, 23 00:01:50,425 --> 00:01:52,600 if it's called with no arguments. 24 00:01:52,600 --> 00:01:55,360 So, let's try this out in code.
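As a minimal sketch of the example just described, calling split with no argument might look like the following in Python. The variable names `phrase` and `words` are assumptions for illustration and are not taken from the lecture. One detail worth knowing: with no argument, Python's split cuts on any run of whitespace (spaces, tabs, newlines), not only on single spaces.

```python
# Hypothetical variable names; the string is the one used in the narration.
phrase = "leaders and best"

# With no argument, split() cuts on whitespace and drops it from the pieces.
words = phrase.split()

print(words)       # ['leaders', 'and', 'best']
print(len(words))  # 3
```

None of the three items carries a leading or trailing space, which matches the point made in the narration.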
25 00:01:55,360 --> 00:01:58,100 So, suppose that we have this string, 26 00:01:58,100 --> 00:02:04,185 song equals "The Rain in Spain" and we say words equals song dot split. 27 00:02:04,185 --> 00:02:06,720 Then when we call song dot split, 28 00:02:06,720 --> 00:02:12,915 the value of this expression is going to split song along every space. 29 00:02:12,915 --> 00:02:16,790 So, we're going to get a new list that has four items, The, 30 00:02:16,790 --> 00:02:21,530 and then Rain in Spain dot dot dot. 31 00:02:21,530 --> 00:02:26,170 So, when we print out words then we should get a four item list. 32 00:02:26,170 --> 00:02:32,240 Now, we can also call dot split with an argument to specify what we want to split along. 33 00:02:32,240 --> 00:02:36,170 So, let's suppose that we actually called dot split with 34 00:02:36,170 --> 00:02:41,060 the argument e. Whatever argument we pass in here in this case e, 35 00:02:41,060 --> 00:02:43,985 specifies what we actually want to split along. 36 00:02:43,985 --> 00:02:47,510 So, here if we say we want to split along every e, 37 00:02:47,510 --> 00:02:49,835 if we split the string, leaders and best, 38 00:02:49,835 --> 00:02:54,295 then that's like crossing through every e that's in the string, 39 00:02:54,295 --> 00:02:58,370 and then the result is that we get a list where the first item is l, 40 00:02:58,370 --> 00:03:00,275 the second item is ad, 41 00:03:00,275 --> 00:03:05,305 the third item is rs and, and so on. 42 00:03:05,305 --> 00:03:07,965 So, let's try that out in code. 43 00:03:07,965 --> 00:03:10,605 So here, we have the same song. 44 00:03:10,605 --> 00:03:16,170 So, "The Rain in Spain" and let's suppose that we say song.split ("ai"). 45 00:03:16,170 --> 00:03:18,160 So, when we do that, 46 00:03:18,160 --> 00:03:21,270 then we're going to search for every ai. 47 00:03:22,180 --> 00:03:27,450 So, we find this ai, then this ai. 
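The calls discussed in this part can be sketched as below. The exact value of `song` is an assumption reconstructed from the narration (in particular the capitalization and the trailing dots), and the names `e_pieces` and `ai_pieces` are hypothetical.

```python
# Assumed value: capitalization and the trailing "..." are guessed from the narration.
song = "The rain in Spain..."

# No argument: split along whitespace, giving a four-item list.
words = song.split()
print(words)      # ['The', 'rain', 'in', 'Spain...']

# With an argument, that string is the separator and is removed from the result.
e_pieces = "leaders and best".split("e")
print(e_pieces)   # ['l', 'ad', 'rs and b', 'st']

# The separator can be longer than one character.
ai_pieces = song.split("ai")
print(ai_pieces)  # ['The r', 'n in Sp', 'n...']
```

When a separator is passed explicitly, the spaces are no longer treated specially, so they stay inside the pieces.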
48 00:03:27,550 --> 00:03:32,115 What we should get back is a list with three items, the, 49 00:03:32,115 --> 00:03:34,815 space, r, n, space, 50 00:03:34,815 --> 00:03:37,530 in, space, and so on. 51 00:03:37,530 --> 00:03:41,100 You see that that's exactly what we get. 52 00:03:41,540 --> 00:03:44,220 Now, the opposite of split, 53 00:03:44,220 --> 00:03:48,060 which splits a string along lines, is join. 54 00:03:48,060 --> 00:03:52,515 Join takes a list of strings and joins it into one long string. 55 00:03:52,515 --> 00:03:55,170 So, split is like chopping with a knife, 56 00:03:55,170 --> 00:03:58,655 and join is like joining it back together with glue. 57 00:03:58,655 --> 00:04:07,375 So, if we say I want to use this as the glue to join the items in this list, 58 00:04:07,375 --> 00:04:12,980 what we get as the value of this overall expression is every item in this list, 59 00:04:12,980 --> 00:04:18,545 leaders and best kind of glued back together with this string. 60 00:04:18,545 --> 00:04:22,845 So, we get leaders slash and slash best. 61 00:04:22,845 --> 00:04:25,540 So, let's try that out in code. 62 00:04:27,770 --> 00:04:30,780 So here, we have a list of words, 63 00:04:30,780 --> 00:04:32,685 red, blue, and green. 64 00:04:32,685 --> 00:04:35,350 We specify a variable glue. 65 00:04:35,350 --> 00:04:40,760 This variable glue is what's going to come in-between these items when we call dot join. 66 00:04:40,760 --> 00:04:43,075 So, if we say glue.join(wds), 67 00:04:43,075 --> 00:04:46,930 and assign that to s. Then when we print out s, 68 00:04:46,930 --> 00:04:51,540 then we should expect to see red semicolon blue semicolon green. 69 00:04:51,540 --> 00:04:53,355 So, let's test that out. 70 00:04:53,355 --> 00:04:56,490 I'm just going to comment out the rest of this code. 71 00:04:56,660 --> 00:04:59,660 So, you can see that what we got was 72 00:04:59,660 --> 00:05:03,970 the semicolon gluing back together the items in words.
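A short sketch of join as described above. The names `wds`, `glue`, and `s` follow the narration; `slashed` is a hypothetical name added for the slash example.

```python
# The list and glue string from the narration.
wds = ["red", "blue", "green"]
glue = ";"

# join is called on the glue string and takes the list as its argument.
s = glue.join(wds)
print(s)        # red;blue;green

# The slash example: the glue appears only between items, never at the ends.
slashed = "/".join(["leaders", "and", "best"])
print(slashed)  # leaders/and/best
```

Note that join is a method on the glue string rather than on the list, which is why the call reads `glue.join(wds)`. Splitting the result on the same glue gives back the original list here, since none of the words contains a semicolon.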
73 00:05:03,970 --> 00:05:06,610 But one thing to note is that calling 74 00:05:06,610 --> 00:05:09,835 dot join doesn't affect the value of the list itself. 75 00:05:09,835 --> 00:05:14,285 So, if we print out words after having called glue.join(wds), 76 00:05:14,285 --> 00:05:16,970 then we still get out a list that has three items, 77 00:05:16,970 --> 00:05:19,140 red, blue, and green. 78 00:05:20,360 --> 00:05:26,420 Of course, we can also use a multi character string to actually join the words together. 79 00:05:26,420 --> 00:05:28,730 So, if I use three stars to join words, 80 00:05:28,730 --> 00:05:30,470 then I get red star star star, 81 00:05:30,470 --> 00:05:32,720 blue star star star green. 82 00:05:32,720 --> 00:05:36,710 I can even use an empty string to glue back the words together. 83 00:05:36,710 --> 00:05:39,830 So, if I use an empty string dot join words, 84 00:05:39,830 --> 00:05:41,030 then I get red, blue, 85 00:05:41,030 --> 00:05:43,700 green all concatenated together. 86 00:05:43,700 --> 00:05:46,285 So, let's go through some problems. 87 00:05:46,285 --> 00:05:50,900 So, this question asks us to create a variable output that's assigned to a list, 88 00:05:50,900 --> 00:05:53,285 whose elements are the words in string one. 89 00:05:53,285 --> 00:05:57,995 So, in other words, we want to split up string one into individual words. 90 00:05:57,995 --> 00:06:04,755 So, I want to assign output equal to str1.split. 91 00:06:04,755 --> 00:06:07,410 If I call that split with no arguments, 92 00:06:07,410 --> 00:06:10,540 then it splits it along the spaces. 93 00:06:13,700 --> 00:06:18,140 This question asks us to create a variable called words and assign it 94 00:06:18,140 --> 00:06:21,830 to a list whose elements are the words in the string sent, 95 00:06:21,830 --> 00:06:24,710 and it tells us not to worry about punctuation. 96 00:06:24,710 --> 00:06:27,005 So, I'm going to do something very similar here.
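The join behaviour described earlier in this part (the list is left unchanged, and the glue can be several characters or none at all) can be sketched as follows. The names `starred` and `squashed` are hypothetical.

```python
wds = ["red", "blue", "green"]

# join returns a new string; it does not modify the list it is given.
s = ";".join(wds)
print(wds)       # ['red', 'blue', 'green']

# A multi-character glue string.
starred = "***".join(wds)
print(starred)   # red***blue***green

# An empty glue string simply concatenates the items.
squashed = "".join(wds)
print(squashed)  # redbluegreen
```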
97 00:06:27,005 --> 00:06:31,260 I'm going to say words equals sent.split. 98 00:06:31,260 --> 00:06:36,650 Again, no arguments in dot split because we want to split it along the spaces. 99 00:06:36,650 --> 00:06:41,245 Now words is set to a list of words in a sentence. 100 00:06:41,245 --> 00:06:43,910 So, split and join are functions that 101 00:06:43,910 --> 00:06:46,340 we'll keep coming back to throughout this course, 102 00:06:46,340 --> 00:06:48,725 and they'll end up being enormously useful. 103 00:06:48,725 --> 00:06:52,110 That's all for now, until next time.
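For reference, the two exercise answers might be sketched like this. The lecture does not show the contents of `str1` or `sent`, so the values below are placeholders made up for illustration; only the `.split()` calls come from the narration.

```python
# Placeholder values: the real exercise strings are not given in the lecture.
str1 = "split and join are useful"
sent = "The bicycle is locked outside."

# First exercise: a list of the words in str1.
output = str1.split()
print(output)  # ['split', 'and', 'join', 'are', 'useful']

# Second exercise: same idea, ignoring punctuation as the question allows.
words = sent.split()
print(words)   # ['The', 'bicycle', 'is', 'locked', 'outside.']
```

Because split only removes whitespace, the final period stays attached to the last word, which is why the question says not to worry about punctuation.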