Okay, we're just about at the project for this course. If you reflect on this specialization as a whole, you'll realize that you started with probably little or no understanding of Python, progressed through the basic control structures and libraries included with the language with the help of a digital textbook, moved on to more high-level representations of data and functions with objects, and have now started to explore third-party libraries that exist for Python, which allow you to manipulate and display images. This is really quite an achievement. You've also no doubt found that as you progress, the demands on you to engage in more self-discovery have also increased. While the first assignments were maybe straightforward, the ones in this week require you to struggle a bit more with planning and debugging code as you develop. But you persisted. I'd like to share with you just one more set of features before we head over to the project. The OpenCV library contains mechanisms to do face detection on images. The technique used is based on Haar cascades, which is a machine-learning approach.
Now, we're not going to go into the machine learning bits. We have another specialization on that, called the Applied Data Science with Python specialization. You can take that after this one if you're interested. But here we're going to treat OpenCV more like a black box. OpenCV comes with trained models for detecting faces, eyes, and smiles, which we'll be using. You can train models for detecting other things, like hot dogs or flutes. If you're interested in that, I'd recommend you check out the OpenCV docs on how to train a cascade classifier, and here's a URL. However, in this lecture, we just want to use the current classifiers to see if we can detect portions of an image which are interesting. So the first step is to load OpenCV and the XML-based classifiers. So import cv2 as cv, and then we'll create a face_cascade classifier. So cv.CascadeClassifier, and we'll load the haarcascade_frontalface_default.xml file from the readonly directory. With the classifier loaded, we now want to try and detect a face. Let's pull in the picture we played with last time.
So image equals cv.imread, and we'll bring in the floyd picture. We'll convert it to grayscale using the cvtColor function. So gray equals cv.cvtColor, and we convert this to grayscale. The next step is to use the face_cascade classifier. I'll let you go explore the docs if you'd like to, but the norm is to use the detectMultiScale function. This function returns a list of objects as rectangles. The first parameter is an ndarray of the image. So faces equals face_cascade.detectMultiScale, and we just pass in the ndarray of our image, gray. Now let's just print those faces out to the screen. We'll print them out as a list. The resulting rectangles are in the format x, y, w, h, where the x and y denote the upper left-hand corner point for the image, and the width and height represent the bounding box. We know how to handle this already in PIL. So from PIL import Image, and let's create a PIL Image object. So pil_img equals Image.fromarray; we pass in gray and we set the mode to luminosity. Now let's bring in our drawing objects. So from PIL import ImageDraw, and let's create a context.
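The fromarray step can be sketched on its own with a tiny synthetic array standing in for the grayscale output of cvtColor:

```python
import numpy as np
from PIL import Image

# A tiny stand-in for the grayscale ndarray that cv.cvtColor would produce.
gray = np.zeros((4, 4), dtype=np.uint8)

# Mode "L" tells Pillow to treat the array as single-channel luminosity data.
pil_img = Image.fromarray(gray, mode="L")
print(pil_img.size, pil_img.mode)  # (4, 4) L
```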
So drawing equals ImageDraw.Draw, and pass in the PIL image. Now, let's pull the rectangle out of the faces object. So we'll take a rectangle from the faces list. We know there's one item in there, so sub-zero, and now we're just going to draw a rectangle around the bounds. So drawing.rectangle, passing the rectangle we're interested in, set the outline to white, and display the image inline, so display pil_img. So, not quite what we're looking for; what do you think went wrong? Well, a quick double check of the docs and it's apparent that OpenCV is returning the coordinates as x, y, w, h, while PIL.ImageDraw is looking for x1, y1 and x2, y2. So it looks like this is an easy fix. So let's wipe our old image; we'll just reload our image, set up our drawing context, and draw the new box. So drawing.rectangle, and we'll take rec sub zero and rec sub one, then we'll add to those rec sub two and rec sub three as appropriate, set the outline, and display inline.
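That coordinate fix is easy to get backwards, so here it is as a small helper (to_corners is a hypothetical name for illustration, not part of either library):

```python
# OpenCV reports detections as (x, y, w, h); ImageDraw.rectangle wants
# (x1, y1, x2, y2). This helper makes the conversion explicit.
def to_corners(rect):
    x, y, w, h = rect
    return (x, y, x + w, y + h)

print(to_corners((10, 20, 30, 40)))  # (10, 20, 40, 60)
```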
We see the face detection works pretty well on this image. Note that it's apparent that this is not head detection, but that the haarcascade file that we're using is for eyes and a mouth. Let's try something a little bit more complex. Let's read in that MSI recruitment image. So image equals cv.imread, and we'll bring in the msi_recruitment.gif. Let's take a look at that image to remind ourselves what it looks like. So display Image.fromarray. Remember, we have to do this, but we don't have to pass the luminosity value because it's in full color. Well, what's that error about? Looks like there's an error on a line deep within the PIL Image.py file, and it is trying to call an internal private member, the dunder __array_interface__, on the image object. But this object is None. It turns out that the root of this error is that OpenCV can't work with gif images. This is kind of a pain and unfortunate. But we know how to fix this, right?
One way is that we can just open this in PIL, save it as a png, and then open that in OpenCV. So let's use PIL to open our image. So pil_img equals Image.open, and bring in our gif. Now let's convert it to grayscale for OpenCV. So open_cv_version is equal to pil_img.convert luminosity, and that is going to return to us an image. Now, let's just write that out to a file. So open_cv_version.save, msi_recruitment.png. Now that the conversion of format is done, let's try reading this back into OpenCV. So cv_img equals cv.imread, and bring in the png. We don't need to color convert this, because we saved it as grayscale. Let's try and detect faces in that image. So faces equals face_cascade.detectMultiScale, and pass in the ndarray cv_img. Now, we still have our PIL color version in a gif. So pil_img equals Image.open the gif. Then we'll set a drawing context. So drawing equals ImageDraw.Draw. Now, for each line in faces, let's surround it with a red box. So for x, y, w, and h in faces.
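The GIF-to-PNG workaround can be sketched end to end; here a tiny palette-mode GIF is synthesized in memory as a stand-in for msi_recruitment.gif, so the sketch runs anywhere without the course files:

```python
import io
from PIL import Image

# Synthesize a tiny palette-mode GIF in memory (stand-in for the real file).
gif_buf = io.BytesIO()
Image.new("P", (8, 8)).save(gif_buf, format="GIF")
gif_buf.seek(0)

# Open with Pillow, convert to luminosity, and re-save as PNG,
# a format OpenCV's imread can decode.
gray = Image.open(gif_buf).convert("L")
png_buf = io.BytesIO()
gray.save(png_buf, format="PNG")
png_buf.seek(0)
print(Image.open(png_buf).mode)  # L
```

In the notebook you would save to an actual .png path and hand that path to cv.imread.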
So this might actually be new syntax for you. Recall that faces is a list of rectangles in the x, y, width, and height format; that is, a list of lists. Instead of having to do an iteration and then manually pull out each item, we can use something called tuple unpacking to pull out individual items in the sub-list directly into variables. This is a really nice Python feature. All right, so now we just need to draw our box. So drawing.rectangle: x, y, and then x plus width (can't forget this) and y plus height, set the outline to white, and display pil_img. Well, what happened here? We see that we've detected faces, so there are some white boxes where we've drawn boxes around them, but the colors have gone all weird. This, it turns out, has to do with the color limitations for gif images. In short, a gif image has a very limited number of colors. This is called a color palette, after the palette artists use to mix paints. For gifs, the palette can only be 256 colors, but they can be any 256 colors.
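Tuple unpacking in a for loop looks like this with some made-up rectangles:

```python
# Tuple unpacking binds each item of a sub-list directly to a name,
# so no manual indexing (row[0], row[1], ...) is needed.
faces = [[10, 20, 30, 40], [50, 60, 70, 80]]  # made-up (x, y, w, h) values
boxes = [(x, y, x + w, y + h) for x, y, w, h in faces]
print(boxes)  # [(10, 20, 40, 60), (50, 60, 120, 140)]
```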
When a new color is introduced, it has to take the space of an old color. In this case, PIL adds white to the palette, but doesn't know which color to replace, and thus messes up the image. Who knew there was so much to learn about image formats? We can see what mode an image is in with the dot mode attribute. So if we do pil_img.mode, we can see there's a list of modes in the Pillow documentation, and they correspond with color spaces that we've been using. For the moment though, let's just change back to RGB, which represents color as a three-byte tuple instead of in a palette. So let's read in our image, bring back our gif image, nice and clean. Let's convert it to RGB mode; that's pretty easy. We just do pil_img.convert with RGB, and let's print out its mode. Okay, we're pretty convinced that we've changed it. Now let's go back to drawing rectangles. Let's get our drawing object, and let's iterate through the faces sequence again. We'll tuple unpack as we go: so x, y, width, and height in faces.
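The mode difference is easy to demonstrate in isolation:

```python
from PIL import Image

# "P" mode stores one byte per pixel indexing a palette of at most 256
# colors; "RGB" stores a three-byte tuple per pixel, so drawing a new
# color can't displace an existing palette entry.
pal = Image.new("P", (4, 4))
rgb = pal.convert("RGB")
print(pal.mode, rgb.mode)  # P RGB
```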
Remember again, it's width and height, so we have to add these appropriately. So drawing.rectangle: x, y, x plus width, y plus height, set the outline to white. Finally, let's display that image to the screen. Awesome. We managed to detect a bunch of faces in that image. It looks like we've missed four faces. In the machine learning world, we would call these false negatives: something which the machine thought was not a face (so a negative), but which it was incorrect about. Correspondingly, we would call the actual faces that were detected true positives: something that the machine thought was a face, and it was correct about. This leaves us with false positives: something the machine thought was a face but which wasn't. We can see that there are two of these in the image, picking up the shadow patterns or textures in the shirts and matching them with the Haar cascades.
Finally, we have a class of true negatives: the set of all possible rectangles that the machine learning classifier could consider where it correctly indicated that the result was not a face. In this case, there are many, many, many true negatives. There are a few ways we could try and improve this, and really it requires a lot of experimentation to find good values for a given image. First, let's create a function which will plot out the rectangles for us on the image. So def show_rects, and we'll have faces passed in. Let's read in our gif and convert it; we'll read it in and convert it to RGB in the same line. We'll set our drawing context, and we'll plot all of the rectangles in faces. So for x, y, width, and height in faces, and we've seen this before, with drawing.rectangle, and we'll set the outline to white, and finally we'll display this. All right, so now we have a function, show_rects.
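A sketch of that helper, returning the annotated copy instead of displaying it (a blank in-memory image stands in for the recruitment GIF, so this runs without the course files):

```python
from PIL import Image, ImageDraw

def show_rects(img, faces):
    out = img.convert("RGB")                     # clean RGB copy to draw on
    drawing = ImageDraw.Draw(out)
    for x, y, w, h in faces:                     # tuple unpacking again
        drawing.rectangle((x, y, x + w, y + h), outline="white")
    return out

demo = show_rects(Image.new("RGB", (100, 100)), [(10, 10, 20, 20)])
print(demo.getpixel((10, 10)))  # (255, 255, 255): the white outline
```

In the notebook, the function re-opens the GIF itself each call so stale rectangles from earlier experiments never accumulate.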
So first step, we can try and binarize this image, and it turns out that OpenCV has a built-in binarization function called threshold. You simply pass in the image, the midpoint, and the maximum value, as well as a flag which indicates whether the threshold should be binary or something else. So let's try this. So we'll do cv_img_bin equals cv.threshold. We pass in the image we're interested in, I'll choose 120 for our threshold, 255 for our top end, and then cv.THRESH_BINARY. We're just going to pull sub one from this, because the function returns a tuple and we want the second value. Now, let's do the actual face detection on this. So faces equals face_cascade.detectMultiScale; we'll just pass in this new cv image after it's been binarized, and then let's call our show_rects on faces to see the results. So that's interesting. Not better, but we do see that there is one false positive towards the bottom, where the classifier detected the sunglasses as eyes and the dark shadow line as a mouth.
If you're following along in the notebook with this video, why don't you pause things and try a few different parameters for the threshold value? The detectMultiScale function from OpenCV also has a couple of parameters. The first of these is the scale factor. The scale factor changes the size of the rectangles which are considered against the model, that is, the Haar cascades XML file. You can think of it as if it were changing the size of the rectangles which are on the screen. Let's experiment with the scale factor. Usually it's a small value, so let's try 1.05. So faces equals face_cascade.detectMultiScale, and we'll pass in our image and 1.05. Now, let's render those results to the screen through show_rects. Now, let's also try this with 1.15, so I'll just put that in there quickly. Then finally, let's try and do this with 1.25 as well. So we'll put that in there, and let's give it a run. We can see that as we change the scale factor, we change the number of true and false positives and negatives.
With the scale set to 1.05, we have seven true positives, which are correctly identified faces, and three false negatives, which are faces that are there but not detected. We have three false positives, which are non-faces that OpenCV thinks are faces. When we change this to 1.15, we lose the false positives, but also lose one of the true positives: the person to the right wearing a hat. When we change this to 1.25, we lose more true positives as well. This is actually a really interesting phenomenon in machine learning and artificial intelligence. There's a trade-off between not only how accurate the model is, but how the inaccuracy actually happens. So which of these three models do you think is best? Well, the answer to that question is really, it depends. It depends why you're trying to detect faces and what you're going to do with them. If you think these issues are interesting, you might want to check out the Applied Data Science with Python Specialization Michigan offers here on Coursera.
Beyond an opportunity to advertise, did you notice anything else that happened when we changed the scale factor? It's subtle, but the processing took longer at smaller scale factors. This is because more sub-images are being considered at those scales. This could also affect which method we might use. Jupyter has nice support for timing commands; you might have seen this before. A line that starts with a percentage sign in Jupyter is called a magic function. This isn't normal Python; it's actually a shorthand way of invoking a function which Jupyter has pre-defined. It looks a lot like the decorators we've talked about in a previous lecture, but magic functions were around long before decorators were part of the Python language. One of the built-in magic functions in Jupyter is called timeit. This repeats a piece of Python a number of times and tells you the average speed it took to complete. Let's find the speed of detectMultiScale when using a scale of 1.05.
So percent timeit, calling the magic function, and then we just write our normal Python code: face_cascade.detectMultiScale(cv_img, 1.05). Now, let's compare that to the speed at scale 1.15. So, the same thing: %timeit face_cascade.detectMultiScale(cv_img, 1.15). You can see that this is a dramatic difference, roughly 2.5 times slower when using the smaller scale. This wraps up our discussion of detecting faces in OpenCV. You'll see that, like OCR, this is not a foolproof process, but we can build on the work others have done in machine learning and leverage powerful libraries to bring us closer to building a turnkey Python-based solution. Remember that the detection mechanism isn't specific to faces; that's just the Haar cascade's training data we used. On the web, you'll be able to find other training data to detect other objects, including eyes, animals, and so forth.
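As a closing aside, the %timeit magic only works inside Jupyter; outside a notebook it corresponds roughly to Python's stdlib timeit module. A minimal sketch, timing a trivial stand-in callable rather than the face detector:

```python
import timeit

# timeit.timeit runs the callable `number` times and returns total seconds.
elapsed = timeit.timeit(lambda: sum(range(100)), number=1000)
print(type(elapsed).__name__)  # float
```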