1 00:00:08,713 --> 00:00:10,331 Welcome back. 2 00:00:10,331 --> 00:00:13,660 I'm going to introduce you to a programming pattern called caching. 3 00:00:14,770 --> 00:00:18,540 We have some expensive or maybe unreliable operation. 4 00:00:20,340 --> 00:00:24,222 We'll have a little function: it takes an input, 5 00:00:24,222 --> 00:00:28,755 produces an output, but it might take a long time to run. 6 00:00:28,755 --> 00:00:33,961 Each time we run it, we're going to take the results and 7 00:00:33,961 --> 00:00:37,094 we'll stick them into a cache. 8 00:00:37,094 --> 00:00:42,950 And that cache is going to save our previous results. 9 00:00:44,370 --> 00:00:47,855 You can think of it sort of like a squirrel's cache of nuts, 10 00:00:47,855 --> 00:00:52,960 it's spelled C-A-C-H-E, pronounced "cash". 11 00:00:52,960 --> 00:00:57,270 And it just associates some inputs with outputs. 12 00:00:57,270 --> 00:01:00,511 I'll call them keys and results. 13 00:01:03,023 --> 00:01:07,868 And now, the next time I'm about to ask this expensive operation to do 14 00:01:07,868 --> 00:01:13,138 the very same operation that's been done before, I would first check and 15 00:01:13,138 --> 00:01:17,760 see, hey, is this result already available to me in the cache? 16 00:01:17,760 --> 00:01:21,800 If it is, just send back the same result I would have gotten if I had run that 17 00:01:21,800 --> 00:01:26,005 expensive operation, but don't bother running it; just take the result from the cache. 18 00:01:28,050 --> 00:01:33,678 Now, in our case, the expensive operation is requests.get. 19 00:01:38,684 --> 00:01:42,877 It's expensive because it takes a little bit of time to go out over the Internet, 20 00:01:42,877 --> 00:01:45,530 to make a connection with another server. 21 00:01:45,530 --> 00:01:47,610 It's also a little unreliable. 
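The check-the-cache-first pattern described above can be sketched in a few lines of Python. This is only a minimal illustration, not the lecture's code: `slow_square` is a hypothetical stand-in for any expensive operation, and a plain dictionary plays the role of the cache that maps keys to results.

```python
import time

# Hypothetical stand-in for an expensive operation (not from the lecture).
def slow_square(n):
    time.sleep(0.1)  # simulate a slow call
    return n * n

cache = {}  # maps keys (inputs) to results (outputs)

def cached_square(n):
    # Check the cache first; run the expensive operation only on a miss.
    if n in cache:
        return cache[n]
    result = slow_square(n)
    cache[n] = result  # save the result for next time
    return result

first = cached_square(12)   # miss: runs slow_square and stores the result
second = cached_square(12)  # hit: served from the cache, no slow call
```

The second call returns the same value as the first without running `slow_square` again, which is the whole point of the pattern.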
22 00:01:47,610 --> 00:01:50,550 Sometimes you don't have a good Internet connection, sometimes the server 23 00:01:50,550 --> 00:01:53,550 that you're connecting to doesn't respond in the way that it did yesterday. 24 00:01:54,670 --> 00:02:00,760 In our textbook there are some sites that have restrictions on cross-site scripting, 25 00:02:00,760 --> 00:02:03,505 and some days it works to connect to them, some days it doesn't. 26 00:02:03,505 --> 00:02:06,760 There's another problem: sometimes when 27 00:02:06,760 --> 00:02:11,560 we call requests.get, we're doing some debugging and we run our code a bunch of times. 28 00:02:11,560 --> 00:02:16,430 And the site that we're talking to has a rate limit, it says you can only make 15 29 00:02:16,430 --> 00:02:21,480 calls every 15 minutes and you've run this more than 15 times in the last 15 minutes. 30 00:02:21,480 --> 00:02:25,630 So these are all reasons why it's going to be a good idea to 31 00:02:25,630 --> 00:02:30,120 save our results when we run a requests.get, put them in a cache, and get 32 00:02:30,120 --> 00:02:34,400 the results from the cache the next time rather than calling requests.get again. 33 00:02:36,080 --> 00:02:41,270 So we've implemented this caching pattern in a module, it's available only 34 00:02:41,270 --> 00:02:46,473 in the textbook and it's called, as you might guess, requests_with_caching. 35 00:02:49,707 --> 00:02:54,437 So in the code window here, I've got an import statement, 36 00:02:54,437 --> 00:02:57,377 import requests_with_caching. 37 00:02:57,377 --> 00:03:01,315 And then requests_with_caching is available to us and 38 00:03:01,315 --> 00:03:03,780 we are going to call the method get. 39 00:03:06,390 --> 00:03:11,320 This get method in requests_with_caching is going to return exactly the same 40 00:03:11,320 --> 00:03:16,149 result, a response object, just like if I were to call requests.get. 
41 00:03:17,280 --> 00:03:21,460 But the way it works is that it's going to first look in the cache and 42 00:03:21,460 --> 00:03:23,340 see if it can find the result there. 43 00:03:23,340 --> 00:03:26,230 If so, it'll give us the results from the cache. 44 00:03:26,230 --> 00:03:29,961 If it can't find it in the cache, it calls the real requests.get and 45 00:03:29,961 --> 00:03:34,215 it returns that, but it also saves the result in the cache so that the next time, 46 00:03:34,215 --> 00:03:36,320 we'll get the result from the cache. 47 00:03:39,147 --> 00:03:42,726 And there's one little twist, which is that, 48 00:03:42,726 --> 00:03:46,975 in this requests_with_caching module, we've implemented two caches. 49 00:03:46,975 --> 00:03:50,365 There's stuff that we provided as part of the textbook and 50 00:03:50,365 --> 00:03:52,660 that goes into a permanent cache file. 51 00:03:53,850 --> 00:03:56,032 And then there's a temporary cache. 52 00:03:56,032 --> 00:04:03,450 We can sort of think of us as having two caches: the permanent cache, and 53 00:04:03,450 --> 00:04:08,690 then there's a temporary cache, and you can think of it as a second little database. 54 00:04:09,850 --> 00:04:14,920 And that's stuff that's saved between code runs while you're on the current page, but 55 00:04:14,920 --> 00:04:16,680 it disappears when you reload the page. 56 00:04:18,110 --> 00:04:20,550 So when you call requests_with_caching.get, 57 00:04:20,550 --> 00:04:22,680 it checks in both of the caches. 58 00:04:22,680 --> 00:04:25,160 If it's found in either place, it returns that. 59 00:04:25,160 --> 00:04:29,160 If it doesn't find it in either place, then it does call requests.get. 60 00:04:29,160 --> 00:04:32,600 And it saves the result in this temporary, page-specific cache. 61 00:04:34,590 --> 00:04:37,223 So let's see what happens when we run this code. 62 00:04:46,222 --> 00:04:52,820 So we have a cache file, a permanent cache file called datamusecache.text. 
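The two-cache lookup described above could be sketched roughly as follows. This is not the textbook module's actual code: every name, URL, printed message, and cached value here is made up for illustration, `fake_fetch` stands in for the real network call so the sketch runs offline, and the order in which the two caches are checked is an assumption.

```python
# Sketch of a two-cache lookup. All names and values are hypothetical.
permanent_cache = {"https://example.com/funny": "placeholder saved response"}
temp_cache = {}  # page-specific: starts empty and is lost on reload

def fake_fetch(url):
    # Stand-in for the real network request.
    return "response for " + url

def get(url, fetch=fake_fetch):
    # Assumed order: temporary cache first, then the permanent one.
    if url in temp_cache:
        print("found in page-specific cache")
        return temp_cache[url]
    if url in permanent_cache:
        print("found in permanent cache")
        return permanent_cache[url]
    print("new; adding to cache")
    result = fetch(url)
    temp_cache[url] = result  # only the temporary cache is written to
    return result

a = get("https://example.com/happy")  # miss: fetched, stored in temp_cache
b = get("https://example.com/happy")  # hit in temp_cache
c = get("https://example.com/funny")  # hit in permanent_cache
```

Clearing `temp_cache` would mimic reloading the page: the permanent entries survive, and everything else has to be fetched again.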
63 00:04:52,820 --> 00:04:56,001 And our first call, 64 00:04:56,001 --> 00:05:01,432 to requests_with_caching.get, 65 00:05:01,432 --> 00:05:06,306 asks for the Datamuse API. 66 00:05:06,306 --> 00:05:09,500 And it asks for words that rhyme with happy. 67 00:05:11,160 --> 00:05:15,741 And then we can see the results. 68 00:05:15,741 --> 00:05:20,733 requests_with_caching.get tells us whether it found the result in the caches 69 00:05:20,733 --> 00:05:22,127 or whether it didn't. 70 00:05:22,127 --> 00:05:24,695 In this case, it didn't find it in the cache, and so 71 00:05:24,695 --> 00:05:26,604 it says it's adding it to the cache. 72 00:05:34,207 --> 00:05:37,765 Then on line four, I've printed the first 100 characters. 73 00:05:37,765 --> 00:05:42,656 And you can see the things that are rhyming 74 00:05:42,656 --> 00:05:46,864 are snappy, and nappy, and so on. 75 00:05:49,084 --> 00:05:54,730 On line six, I'm making exactly the same request. 76 00:05:54,730 --> 00:06:00,410 I'm asking for the same thing, for words that rhyme with happy again. 77 00:06:00,410 --> 00:06:07,200 And this time we're told that it found it in the page-specific cache, 78 00:06:07,200 --> 00:06:10,040 because it saved it from the first time that we made the call. 79 00:06:11,560 --> 00:06:14,876 The last call that we make, we're asking for 80 00:06:14,876 --> 00:06:19,770 different words, we're asking for words that rhyme with funny. 81 00:06:19,770 --> 00:06:24,679 And we have a cached result in the permanent cache file for 82 00:06:24,679 --> 00:06:30,730 words that rhyme with funny, so it found that in the permanent cache. 83 00:06:30,730 --> 00:06:33,754 Now if I were to run this whole thing again, 84 00:06:33,754 --> 00:06:38,546 now it's going to run where it already has the page-specific cache. 85 00:06:38,546 --> 00:06:42,620 And by the way, you can see the results of that page-specific cache. 
86 00:06:42,620 --> 00:06:46,930 It's telling us it's stored in this data file called this page cache.text and 87 00:06:46,930 --> 00:06:50,320 it's the things that rhyme with happy, like snappy, nappy, and scrappy. 88 00:06:52,000 --> 00:06:55,908 If I run this again, instead of showing "new" here, 89 00:06:55,908 --> 00:07:01,222 it's going to tell us that it found it in the cache, so let's do that. 90 00:07:04,206 --> 00:07:05,742 Takes it a little while to run. 91 00:07:08,259 --> 00:07:12,739 Some inefficiencies, but we're going to see that "new, adding to cache" is going to 92 00:07:12,739 --> 00:07:15,611 change because it's going to find it in the cache. 93 00:07:23,133 --> 00:07:26,000 So now it says it found it in the page-specific cache. 94 00:07:27,140 --> 00:07:32,483 If I were to reload this whole page, which I'll do now, 95 00:07:32,483 --> 00:07:38,520 I'm going to clear my markings, and I'm going to reload the page. 96 00:07:42,739 --> 00:07:47,382 Now I've reloaded the page, and that got rid of the page-specific cache. 97 00:07:47,382 --> 00:07:52,888 When I run this again, the first time, 98 00:07:52,888 --> 00:07:57,881 it's going to have to do a call to requests.get and add it to the cache. 99 00:08:02,036 --> 00:08:05,737 That's the requests_with_caching module that we've provided. 100 00:08:05,737 --> 00:08:07,470 It's really easy to use. 101 00:08:07,470 --> 00:08:12,021 Just import requests_with_caching and then you call requests_with_caching.get 102 00:08:12,021 --> 00:08:14,334 the same way as you would call requests.get. 103 00:08:15,550 --> 00:08:19,580 And it just makes it so that when you make the same request multiple times, 104 00:08:19,580 --> 00:08:23,545 the additional times that you make the same request you'll get data from 105 00:08:23,545 --> 00:08:24,200 the cache. 106 00:08:25,680 --> 00:08:26,730 See you next time.
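Since the module in this lecture is available only in the textbook, here is a rough sketch of how cached results might be kept in a data file so that they survive between code runs. It is an assumption-laden illustration, not the module's implementation: the file name, the helper functions, and the stored text are all hypothetical. The URL and the `rel_rhy` parameter follow the public Datamuse API rather than anything shown in the lecture, and no network request is made.

```python
import json
import os
import tempfile

# Hypothetical cache file location; the lecture's module is not shown.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "sketch_page_cache.json")

def load_cache(path=CACHE_PATH):
    # A missing file just means nothing has been cached yet.
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def save_cache(cache, path=CACHE_PATH):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f)

def make_key(url, params):
    # Sort the parameters so the same request always yields the same key.
    pairs = ["{}={}".format(k, params[k]) for k in sorted(params)]
    return url + "?" + "&".join(pairs)

key = make_key("https://api.datamuse.com/words", {"rel_rhy": "happy"})
cache = load_cache()
cache[key] = "placeholder response text"  # stand-in for a real response
save_cache(cache)
reloaded = load_cache()  # a later run would find the saved entry
```

Building the key from the URL plus its sorted parameters is one simple way to make "the same request" mean the same cache entry, regardless of the order in which the parameters were written.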