Okay, we're just about at the project for this course. If you reflect on this specialization as a whole, you'll realize that you started with probably little or no understanding of Python, progressed through the basic control structures and libraries included with the language with the help of a digital textbook, moved on to more high-level representations of data and functions with objects, and have now started to explore third-party libraries that exist for Python, which allow you to manipulate and display images. This is really quite an achievement. You've also no doubt found that as you progress, the demands on you to engage in more self-discovery have also increased. While the first assignments were maybe straightforward, the ones in this week require you to struggle a bit more with planning and debugging code as you develop. But you persisted. I'd like to share with you just one more set of features before we head over to the project. The OpenCV library contains mechanisms to do face detection on images. The technique used is based on Haar cascades, which is a machine-learning approach.
Now, we're not going to go into the machine learning bits. We have another specialization on that, called the Applied Data Science with Python specialization. You can take that after this one if you're interested. But here we're going to treat OpenCV more like a black box. OpenCV comes with trained models for detecting faces, eyes, and smiles, which we'll be using. You can train models for detecting other things, like hot dogs or flutes. If you're interested in that, I'd recommend you check out the OpenCV docs on how to train a cascade classifier, and here's a URL. However, in this lecture, we just want to use the current classifiers to see if we can detect portions of an image which are interesting. So the first step is to load OpenCV and the XML-based classifiers. So import cv2 as cv, and then we'll create a face_cascade classifier. So cv.CascadeClassifier, and we'll load the haarcascade_frontalface_default.xml file from the readonly directory. With the classifier loaded, we now want to try and detect a face. Let's pull in the picture we played with last time.
So image equals cv.imread, and we'll bring in the floyd picture. We'll convert it to grayscale using the cvtColor function. So gray equals cv.cvtColor, and we convert this to grayscale. The next step is to use the face_cascade classifier. I'll let you go explore the docs if you'd like to, but the norm is to use the detectMultiScale function. This function returns a list of objects as rectangles. The first parameter is an ndarray of the image. So faces equals face_cascade.detectMultiScale, and we just pass in the ndarray of our image, gray. Now let's just print those faces out to the screen. We'll print them out as a list. The resulting rectangles are in the format x, y, w, h, where the x and y denote the upper left-hand corner point for the image, and the width and height represent the bounding box. We know how to handle this already in PIL. So from PIL import Image, and let's create a PIL Image object. So pil_img equals Image.fromarray; we pass in gray and we set the mode to luminosity. Now let's bring in our drawing objects. So from PIL import ImageDraw, and let's create a context.
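The fromarray step can be sketched on its own with a tiny synthetic array standing in for the grayscale output of cvtColor:

```python
import numpy as np
from PIL import Image

# A tiny stand-in for the grayscale ndarray that cv.cvtColor would produce.
gray = np.zeros((4, 4), dtype=np.uint8)

# Mode "L" tells Pillow to treat the array as single-channel luminosity data.
pil_img = Image.fromarray(gray, mode="L")
print(pil_img.size, pil_img.mode)  # (4, 4) L
```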
So drawing equals ImageDraw.Draw, and pass in the PIL image. Now, let's pull the rectangle out of the faces object. So we'll take a rectangle from the faces list. We know there's one item in there, so sub-zero, and now we're just going to draw a rectangle around the bounds. So drawing.rectangle, passing the rectangle we're interested in, set the outline to white, and display the image inline, so display pil_img. So, not quite what we're looking for; what do you think went wrong? Well, a quick double check of the docs and it's apparent that OpenCV is returning the coordinates as x, y, w, h, while PIL.ImageDraw is looking for x1, y1 and x2, y2. So it looks like this is an easy fix. So let's wipe our old image; we'll just reload our image, set up our drawing context, and draw the new box. So drawing.rectangle, and we'll take rec sub zero and rec sub one, then we'll add to those rec sub two and rec sub three as appropriate, set the outline, and display inline.
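That coordinate fix is easy to get backwards, so here it is as a small helper (to_corners is a hypothetical name for illustration, not part of either library):

```python
# OpenCV reports detections as (x, y, w, h); ImageDraw.rectangle wants
# (x1, y1, x2, y2). This helper makes the conversion explicit.
def to_corners(rect):
    x, y, w, h = rect
    return (x, y, x + w, y + h)

print(to_corners((10, 20, 30, 40)))  # (10, 20, 40, 60)
```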
We see the face detection works pretty well on this image. Note that it's apparent that this is not head detection, but that the haarcascade file that we're using is for eyes and a mouth. Let's try something a little bit more complex. Let's read in that MSI recruitment image. So image equals cv.imread, and we'll bring in the msi_recruitment.gif. Let's take a look at that image to remind ourselves what it looks like. So display Image.fromarray. Remember, we have to do this, but we don't have to pass the luminosity value because it's in full color. Well, what's that error about? Looks like there's an error on a line deep within the PIL Image.py file, and it is trying to call an internal private member, the dunder __array_interface__, on the image object. But this object is None. It turns out that the root of this error is that OpenCV can't work with gif images. This is kind of a pain and unfortunate. But we know how to fix this, right?
One way is that we can just open this in PIL, save it as a png, and then open that in OpenCV. So let's use PIL to open our image. So pil_img equals Image.open, and bring in our gif. Now let's convert it to grayscale for OpenCV. So open_cv_version is equal to pil_img.convert luminosity, and that is going to return to us an image. Now, let's just write that out to a file. So open_cv_version.save, msi_recruitment.png. Now that the conversion of format is done, let's try reading this back into OpenCV. So cv_img equals cv.imread, and bring in the png. We don't need to color convert this, because we saved it as grayscale. Let's try and detect faces in that image. So faces equals face_cascade.detectMultiScale, and pass in the ndarray cv_img. Now, we still have our PIL color version in a gif. So pil_img equals Image.open the gif. Then we'll set a drawing context. So drawing equals ImageDraw.Draw. Now, for each line in faces, let's surround it with a red box. So for x, y, w, and h in faces.
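The GIF-to-PNG workaround can be sketched end to end; here a tiny palette-mode GIF is synthesized in memory as a stand-in for msi_recruitment.gif, so the sketch runs anywhere without the course files:

```python
import io
from PIL import Image

# Synthesize a tiny palette-mode GIF in memory (stand-in for the real file).
gif_buf = io.BytesIO()
Image.new("P", (8, 8)).save(gif_buf, format="GIF")
gif_buf.seek(0)

# Open with Pillow, convert to luminosity, and re-save as PNG,
# a format OpenCV's imread can decode.
gray = Image.open(gif_buf).convert("L")
png_buf = io.BytesIO()
gray.save(png_buf, format="PNG")
png_buf.seek(0)
print(Image.open(png_buf).mode)  # L
```

In the notebook you would save to an actual .png path and hand that path to cv.imread.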
So this might actually be new syntax for you. Recall that faces is a list of rectangles in the x, y, width, and height format; that is, a list of lists. Instead of having to do an iteration and then manually pull out each item, we can use something called tuple unpacking to pull out individual items in the sub-list directly into variables. This is a really nice Python feature. All right, so now we just need to draw our box. So drawing.rectangle: x, y, and then x plus width (can't forget this) and y plus height, set the outline to white, and display pil_img. Well, what happened here? We see that we've detected faces, so there are some white boxes where we've drawn boxes around them, but the colors have gone all weird. This, it turns out, has to do with the color limitations for gif images. In short, a gif image has a very limited number of colors. This is called a color palette, after the palette artists use to mix paints. For gifs, the palette can only be 256 colors, but they can be any 256 colors.
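Tuple unpacking in a for loop looks like this with some made-up rectangles:

```python
# Tuple unpacking binds each item of a sub-list directly to a name,
# so no manual indexing (row[0], row[1], ...) is needed.
faces = [[10, 20, 30, 40], [50, 60, 70, 80]]  # made-up (x, y, w, h) values
boxes = [(x, y, x + w, y + h) for x, y, w, h in faces]
print(boxes)  # [(10, 20, 40, 60), (50, 60, 120, 140)]
```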
When a new color is introduced, it has to take the space of an old color. In this case, PIL adds white to the palette, but doesn't know which color to replace, and thus messes up the image. Who knew there was so much to learn about image formats? We can see what mode an image is in with the dot mode attribute. So if we do pil_img.mode, we can see there's a list of modes in the Pillow documentation, and they correspond with color spaces that we've been using. For the moment though, let's just change back to RGB, which represents color as a three-byte tuple instead of in a palette. So let's read in our image, bring back our gif image, nice and clean. Let's convert it to RGB mode; that's pretty easy. We just do pil_img.convert with RGB, and let's print out its mode. Okay, we're pretty convinced that we've changed it. Now let's go back to drawing rectangles. Let's get our drawing object, and let's iterate through the faces sequence again. We'll tuple unpack as we go: so x, y, width, and height in faces.
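The mode difference is easy to demonstrate in isolation:

```python
from PIL import Image

# "P" mode stores one byte per pixel indexing a palette of at most 256
# colors; "RGB" stores a three-byte tuple per pixel, so drawing a new
# color can't displace an existing palette entry.
pal = Image.new("P", (4, 4))
rgb = pal.convert("RGB")
print(pal.mode, rgb.mode)  # P RGB
```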
Remember again, it's width and height, so we have to add these appropriately. So drawing.rectangle: x, y, x plus width, y plus height, set the outline to white. Finally, let's display that image to the screen. Awesome. We managed to detect a bunch of faces in that image. It looks like we've missed four faces. In the machine learning world, we would call these false negatives: something which the machine thought was not a face (so a negative), but which it was incorrect about. Correspondingly, we would call the actual faces that were detected true positives: something that the machine thought was a face, and it was correct about. This leaves us with false positives: something the machine thought was a face but which wasn't. We can see that there are two of these in the image, picking up the shadow patterns or textures in the shirts and matching them with the Haar cascades.
Finally, we have a class of true negatives: the set of all possible rectangles that the machine learning classifier could consider where it correctly indicated that the result was not a face. In this case, there are many, many, many true negatives. There are a few ways we could try and improve this, and really it requires a lot of experimentation to find good values for a given image. First, let's create a function which will plot out the rectangles for us on the image. So def show_rects, and we'll have faces passed in. Let's read in our gif and convert it; we'll read it in and convert it to RGB in the same line. We'll set our drawing context, and we'll plot all of the rectangles in faces. So for x, y, width, and height in faces, and we've seen this before, with drawing.rectangle, and we'll set the outline to white, and finally we'll display this. All right, so now we have a function, show_rects.
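A sketch of that helper, returning the annotated copy instead of displaying it (a blank in-memory image stands in for the recruitment GIF, so this runs without the course files):

```python
from PIL import Image, ImageDraw

def show_rects(img, faces):
    out = img.convert("RGB")                     # clean RGB copy to draw on
    drawing = ImageDraw.Draw(out)
    for x, y, w, h in faces:                     # tuple unpacking again
        drawing.rectangle((x, y, x + w, y + h), outline="white")
    return out

demo = show_rects(Image.new("RGB", (100, 100)), [(10, 10, 20, 20)])
print(demo.getpixel((10, 10)))  # (255, 255, 255): the white outline
```

In the notebook, the function re-opens the GIF itself each call so stale rectangles from earlier experiments never accumulate.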
So first step, we can try and binarize this image, and it turns out that OpenCV has a built-in binarization function called threshold. You simply pass in the image, the midpoint, and the maximum value, as well as a flag which indicates whether the threshold should be binary or something else. So let's try this. So we'll do cv_img_bin equals cv.threshold. We pass in the image we're interested in, I'll choose 120 for our threshold, 255 for our top end, and then cv.THRESH_BINARY. We're just going to pull sub one from this, because the function returns a tuple and we want the second value. Now, let's do the actual face detection on this. So faces equals face_cascade.detectMultiScale; we'll just pass in this new cv image after it's been binarized, and then let's call our show_rects on faces to see the results. So that's interesting. Not better, but we do see that there is one false positive towards the bottom, where the classifier detected the sunglasses as eyes and the dark shadow line as a mouth.
If you're following along in the notebook with this video, why don't you pause things and try a few different parameters for the threshold value? The detectMultiScale function from OpenCV also has a couple of parameters. The first of these is the scale factor. The scale factor changes the size of the rectangles which are considered against the model, that is, the Haar cascades XML file. You can think of it as if it were changing the size of the rectangles which are on the screen. Let's experiment with the scale factor. Usually it's a small value, so let's try 1.05. So faces equals face_cascade.detectMultiScale, and we'll pass in our image and 1.05. Now, let's render those results to the screen through show_rects. Now, let's also try this with 1.15, so I'll just put that in there quickly. Then finally, let's try and do this with 1.25 as well. So we'll put that in there, and let's give it a run. We can see that as we change the scale factor, we change the number of true and false positives and negatives.
With the scale set to 1.05, we have seven true positives, which are correctly identified faces, and three false negatives, which are faces that are there but not detected. We have three false positives, which are non-faces that OpenCV thinks are faces. When we change this to 1.15, we lose the false positives, but also lose one of the true positives: the person to the right wearing a hat. When we change this to 1.25, we lose more true positives as well. This is actually a really interesting phenomenon in machine learning and artificial intelligence. There's a trade-off between not only how accurate the model is, but how the inaccuracy actually happens. So which of these three models do you think is best? Well, the answer to that question is really, it depends. It depends why you're trying to detect faces and what you're going to do with them. If you think these issues are interesting, you might want to check out the Applied Data Science with Python Specialization Michigan offers here on Coursera.
Beyond an opportunity to advertise, did you notice anything else that happened when we changed the scale factor? It's subtle, but the processing took longer at smaller scale factors. This is because more sub-images are being considered at those scales. This could also affect which method we might use. Jupyter has nice support for timing commands; you might have seen this before. A line that starts with a percentage sign in Jupyter is called a magic function. This isn't normal Python; it's actually a shorthand way of invoking a function which Jupyter has pre-defined. It looks a lot like the decorators we've talked about in a previous lecture, but magic functions were around long before decorators were part of the Python language. One of the built-in magic functions in Jupyter is called timeit. This repeats a piece of Python a number of times and tells you the average speed it took to complete. Let's find the speed of detectMultiScale when using a scale of 1.05.
So percent timeit, calling the magic function, and then we just write our normal Python code: face_cascade.detectMultiScale(cv_img, 1.05). Now, let's compare that to the speed at scale 1.15. So, the same thing: %timeit face_cascade.detectMultiScale(cv_img, 1.15). You can see that this is a dramatic difference, roughly 2.5 times slower when using the smaller scale. This wraps up our discussion of detecting faces in OpenCV. You'll see that, like OCR, this is not a foolproof process, but we can build on the work others have done in machine learning and leverage powerful libraries to bring us closer to building a turnkey Python-based solution. Remember that the detection mechanism isn't specific to faces; that's just the Haar cascade's training data we used. On the web, you'll be able to find other training data to detect other objects, including eyes, animals, and so forth.
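As a closing aside, the %timeit magic only works inside Jupyter; outside a notebook it corresponds roughly to Python's stdlib timeit module. A minimal sketch, timing a trivial stand-in callable rather than the face detector:

```python
import timeit

# timeit.timeit runs the callable `number` times and returns total seconds.
elapsed = timeit.timeit(lambda: sum(range(100)), number=1000)
print(type(elapsed).__name__)  # float
```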