1
00:00:08,713 --> 00:00:10,331
Welcome back.

2
00:00:10,331 --> 00:00:13,660
I'm going to introduce you to
a programming pattern called caching.

3
00:00:14,770 --> 00:00:18,540
We have some expensive or
maybe unreliable operation.

4
00:00:20,340 --> 00:00:24,222
We'll have a little function,
it takes an input,

5
00:00:24,222 --> 00:00:28,755
produces an output, but
it might take a long time to run it.

6
00:00:28,755 --> 00:00:33,961
Each time we run it,
we're going to take the results and

7
00:00:33,961 --> 00:00:37,094
we'll stick them into a cache.

8
00:00:37,094 --> 00:00:42,950
And that cache is going to
save our previous results.

9
00:00:44,370 --> 00:00:47,855
You can think of it sort of like
a squirrel's cache of nuts,

10
00:00:47,855 --> 00:00:52,960
it's spelled C-A-C-H-E, pronounced "cash."

11
00:00:52,960 --> 00:00:57,270
And it just associates
some inputs with outputs.

12
00:00:57,270 --> 00:01:00,511
I'll call them keys and results.

13
00:01:03,023 --> 00:01:07,868
And now, the next time I was going to
ask this expensive operation to do

14
00:01:07,868 --> 00:01:13,138
the very same operation that's been
done before, I would first check and

15
00:01:13,138 --> 00:01:17,760
see, hey, is this result already
available to me in the cache?

16
00:01:17,760 --> 00:01:21,800
If it is, just send back the same result
I would have gotten if I had run that

17
00:01:21,800 --> 00:01:26,005
expensive operation, but don't bother,
just take the result from the cache.
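As a minimal sketch of that pattern in Python: the names slow_square, cached_square, and call_count are made up for illustration, and the sleep just stands in for an operation that takes a long time. The cache is a dictionary that associates keys (inputs) with results (outputs).

```python
import time

cache = {}      # maps keys (inputs) to results (outputs)
call_count = 0  # counts how often the expensive operation really runs


def slow_square(n):
    """Hypothetical stand-in for an expensive operation."""
    global call_count
    call_count += 1
    time.sleep(0.1)  # pretend this takes a long time
    return n * n


def cached_square(n):
    """Check the cache first; run the expensive operation only on a miss."""
    if n in cache:
        return cache[n]
    result = slow_square(n)
    cache[n] = result  # save the result for next time
    return result


first = cached_square(12)   # runs slow_square and saves the result
second = cached_square(12)  # answered from the cache, no second slow call
print(first, second, call_count)
```

The second call returns the same result as the first, but the expensive function has only run once.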

18
00:01:28,050 --> 00:01:33,678
Now, in our case,
the expensive operation is requests.get.

19
00:01:38,684 --> 00:01:42,877
It's expensive because it takes a little
bit of time to go out over the Internet,

20
00:01:42,877 --> 00:01:45,530
to make a connection with another server.

21
00:01:45,530 --> 00:01:47,610
It's also a little unreliable.

22
00:01:47,610 --> 00:01:50,550
Sometimes you don't have a good Internet
connection, sometimes the server

23
00:01:50,550 --> 00:01:53,550
that you're connecting to doesn't respond
in the way that it did yesterday.

24
00:01:54,670 --> 00:02:00,760
In our textbook, there are some sites that
have restrictions on cross-site scripting,

25
00:02:00,760 --> 00:02:03,505
and some days it works to connect
to them, some days it doesn't.

26
00:02:03,505 --> 00:02:06,760
There's another problem
that sometimes when

27
00:02:06,760 --> 00:02:11,560
we call requests.get, we're doing some debugging
and we run our code a bunch of times.

28
00:02:11,560 --> 00:02:16,430
And the site that we're talking to has
a rate limit, it says you can only make 15

29
00:02:16,430 --> 00:02:21,480
calls every 15 minutes and you've run this
more than 15 times in the last 15 minutes.

30
00:02:21,480 --> 00:02:25,630
So these are all reasons why
it's going to be a good idea to

31
00:02:25,630 --> 00:02:30,120
save our results when we run
requests.get, put them in a cache, and get

32
00:02:30,120 --> 00:02:34,400
the results from the cache the next time
rather than calling requests.get again.

33
00:02:36,080 --> 00:02:41,270
So we've implemented this caching
pattern in a module, it's available only

34
00:02:41,270 --> 00:02:46,473
in the textbook and it's called,
as you might guess, requests_with_caching.

35
00:02:49,707 --> 00:02:54,437
So in the code window here,
I've got an import statement,

36
00:02:54,437 --> 00:02:57,377
import requests_with_caching.

37
00:02:57,377 --> 00:03:01,315
And then requests_with_caching
is available to us and

38
00:03:01,315 --> 00:03:03,780
we are going to call the method get.

39
00:03:06,390 --> 00:03:11,320
This get method in requests_with_caching
is going to return exactly the same

40
00:03:11,320 --> 00:03:16,149
result, a response object,
just like if I were to call requests.get.

41
00:03:17,280 --> 00:03:21,460
But the way it works is that,
it's going to first look in the cache and

42
00:03:21,460 --> 00:03:23,340
see if it can find the result there.

43
00:03:23,340 --> 00:03:26,230
If so,
it'll give us the results from the cache.

44
00:03:26,230 --> 00:03:29,961
If it can't find it in the cache,
it calls the real requests.get and

45
00:03:29,961 --> 00:03:34,215
it returns that but it also saves the
result in the cache so that the next time,

46
00:03:34,215 --> 00:03:36,320
we'll get the result from the cache.

47
00:03:39,147 --> 00:03:42,726
And the one little
twist is that,

48
00:03:42,726 --> 00:03:46,975
in this requests_with_caching module,
we've implemented two caches.

49
00:03:46,975 --> 00:03:50,365
There's stuff that we provided
as part of the textbook and

50
00:03:50,365 --> 00:03:52,660
that goes into a permanent cache file.

51
00:03:53,850 --> 00:03:56,032
And then there's a temporary cache.

52
00:03:56,032 --> 00:04:03,450
We can sort of think of us as having two caches,
the permanent cache, and

53
00:04:03,450 --> 00:04:08,690
then there's a temporary cache and you can
think of it as a second little database.

54
00:04:09,850 --> 00:04:14,920
And that's stuff that's saved between code
runs while you're on the current page, but

55
00:04:14,920 --> 00:04:16,680
it disappears when you reload the page.

56
00:04:18,110 --> 00:04:20,550
So when you call
requests_with_caching.get,

57
00:04:20,550 --> 00:04:22,680
it checks in both of the caches.

58
00:04:22,680 --> 00:04:25,160
If it's found in either place,
it returns that.

59
00:04:25,160 --> 00:04:29,160
If it doesn't find it in either place,
then it does call requests.get.

60
00:04:29,160 --> 00:04:32,600
And it saves the result in this temporary,
page-specific cache.
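Here is a rough sketch of that two-cache lookup in Python. This is not the textbook module's actual code. The function get_with_caching, the fake_fetch stand-in, the status messages, the order in which the two caches are checked, the Datamuse URLs, and the sample words are all assumptions made for illustration. The caches are plain dictionaries, and the fetch step is passed in so the sketch runs without a network connection.

```python
import json


def get_with_caching(url, permanent_cache, temp_cache, fetch):
    """Look in both caches before calling the expensive fetch function.

    Returns (text, source), where source says where the text came from.
    """
    if url in temp_cache:
        return temp_cache[url], "found in page-specific cache"
    if url in permanent_cache:
        return permanent_cache[url], "found in permanent cache"
    text = fetch(url)       # the expensive, unreliable step
    temp_cache[url] = text  # new results go only into the temporary cache
    return text, "new; adding to cache"


def fake_fetch(url):
    """Hypothetical stand-in for requests.get(url).text (made-up data)."""
    return json.dumps([{"word": "snappy"}, {"word": "nappy"}])


# Assumed URLs and made-up cache contents, for illustration only.
happy = "https://api.datamuse.com/words?rel_rhy=happy"
funny = "https://api.datamuse.com/words?rel_rhy=funny"
permanent = {funny: json.dumps([{"word": "money"}])}
temp = {}

_, first_source = get_with_caching(happy, permanent, temp, fake_fetch)
_, second_source = get_with_caching(happy, permanent, temp, fake_fetch)
_, funny_source = get_with_caching(funny, permanent, temp, fake_fetch)
print(first_source, second_source, funny_source, sep="\n")
```

Emptying the temp dictionary plays the role of reloading the page: the permanent entries survive, but the next request for the first URL has to be fetched again.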

61
00:04:34,590 --> 00:04:37,223
So let's see what happens
when we run this code.

62
00:04:46,222 --> 00:04:52,820
So we have a cache file, a permanent
cache file called datamuse_cache.txt.

63
00:04:52,820 --> 00:04:56,001
And our first call,

64
00:04:56,001 --> 00:05:01,432
to requests_with_caching.get

65
00:05:01,432 --> 00:05:06,306
asks for the datamuse API.

66
00:05:06,306 --> 00:05:09,500
And it asks for
words that rhyme with happy.

67
00:05:11,160 --> 00:05:15,741
And then we can see the results.

68
00:05:15,741 --> 00:05:20,733
The requests_with_caching.get tells us
whether it found the result in the caches

69
00:05:20,733 --> 00:05:22,127
or whether it didn't.

70
00:05:22,127 --> 00:05:24,695
In this case,
it didn't find it in the cache, and so

71
00:05:24,695 --> 00:05:26,604
it says it's adding it to the cache.

72
00:05:34,207 --> 00:05:37,765
Then on line four,
I've printed the first 100 characters.

73
00:05:37,765 --> 00:05:42,656
And you can see the things
that are rhyming

74
00:05:42,656 --> 00:05:46,864
are snappy, and nappy, and so on.

75
00:05:49,084 --> 00:05:54,730
On line six,
I'm making exactly the same request.

76
00:05:54,730 --> 00:06:00,410
I'm asking for the same word, for
words that rhyme with happy again.

77
00:06:00,410 --> 00:06:07,200
And this time we're told that it
found it in the page-specific cache.

78
00:06:07,200 --> 00:06:10,040
Because it saved it from the first
time that we made the call.

79
00:06:11,560 --> 00:06:14,876
The last call that we make,
we're asking for

80
00:06:14,876 --> 00:06:19,770
different words, we're asking for
words that rhyme with funny.

81
00:06:19,770 --> 00:06:24,679
And we have a cached result in
the permanent cache file for

82
00:06:24,679 --> 00:06:30,730
words that rhyme with funny, so
it found that in the permanent cache.

83
00:06:30,730 --> 00:06:33,754
Now if I were to run
this whole thing again,

84
00:06:33,754 --> 00:06:38,546
now it's going to run where it
already has the page-specific cache.

85
00:06:38,546 --> 00:06:42,620
And by the way, you can see the results
of that page-specific cache.

86
00:06:42,620 --> 00:06:46,930
It's telling us it's stored in this data
file called this_page_cache.txt and

87
00:06:46,930 --> 00:06:50,320
it's the things that rhyme with happy,
like snappy, nappy, and scrappy.

88
00:06:52,000 --> 00:06:55,908
If I run this again,
instead of showing new here,

89
00:06:55,908 --> 00:07:01,222
it's going to tell us that it found
it in the cache, so let's do that.

90
00:07:04,206 --> 00:07:05,742
Takes it a little while to run.

91
00:07:08,259 --> 00:07:12,739
Some inefficiencies, but we're going to
see that new adding to cache is going to

92
00:07:12,739 --> 00:07:15,611
change because it's going to
find it in the cache.

93
00:07:23,133 --> 00:07:26,000
So now it says it found it
in the page-specific cache.

94
00:07:27,140 --> 00:07:32,483
If I were to reload this whole page,
which I'll do now,

95
00:07:32,483 --> 00:07:38,520
I'm going to clear my markings,
and I'm going to reload the page.

96
00:07:42,739 --> 00:07:47,382
Now I've reloaded the page, and
that got rid of the page-specific cache.

97
00:07:47,382 --> 00:07:52,888
When I run this again, the first time,

98
00:07:52,888 --> 00:07:57,881
it's going to have to do a call to
requests.get and add it to the cache.

99
00:08:02,036 --> 00:08:05,737
That's the requests_with_caching
module that we've provided.

100
00:08:05,737 --> 00:08:07,470
It's really easy to use.

101
00:08:07,470 --> 00:08:12,021
Just import requests_with_caching and
then you call requests_with_caching.get

102
00:08:12,021 --> 00:08:14,334
the same way as you
would call requests.get.

103
00:08:15,550 --> 00:08:19,580
And it just makes it so that when you
make the same request multiple times,

104
00:08:19,580 --> 00:08:23,545
the additional times that you make
the same request, you'll get data from

105
00:08:23,545 --> 00:08:24,200
the cache.

106
00:08:25,680 --> 00:08:26,730
See you next time.