1 00:00:00,000 --> 00:00:02,530 In this video, we will discuss 2 00:00:02,530 --> 00:00:06,630 Application Programming Interfaces, or APIs for short. 3 00:00:06,630 --> 00:00:10,150 Specifically, we will discuss what an API is, 4 00:00:10,150 --> 00:00:13,150 API libraries, and REST APIs, 5 00:00:13,150 --> 00:00:15,550 including requests and responses. 6 00:00:15,550 --> 00:00:19,290 An API lets two pieces of software talk to each other. 7 00:00:19,290 --> 00:00:22,650 For example, you have your program, you have some data, 8 00:00:22,650 --> 00:00:24,720 you have other software components, 9 00:00:24,720 --> 00:00:26,910 and you use the API to communicate with 10 00:00:26,910 --> 00:00:29,200 those components via inputs and outputs. 11 00:00:29,200 --> 00:00:30,780 Just like a function, 12 00:00:30,780 --> 00:00:32,400 you don't have to know how the API 13 00:00:32,400 --> 00:00:35,770 works, just its inputs and outputs. 14 00:00:35,770 --> 00:00:37,910 Pandas is actually a set of 15 00:00:37,910 --> 00:00:39,590 software components, many of 16 00:00:39,590 --> 00:00:41,450 which are not even written in Python. 17 00:00:41,450 --> 00:00:43,310 You have some data. You have 18 00:00:43,310 --> 00:00:44,900 a set of software components. 19 00:00:44,900 --> 00:00:47,240 We use the pandas API to process 20 00:00:47,240 --> 00:00:48,590 the data by communicating with 21 00:00:48,590 --> 00:00:50,550 the other software components. 22 00:00:50,550 --> 00:00:52,710 Let's clean up the diagram. 23 00:00:52,710 --> 00:00:55,340 When you create a dictionary and then create 24 00:00:55,340 --> 00:00:58,160 a pandas object with a DataFrame constructor, 25 00:00:58,160 --> 00:01:01,290 in API lingo this is an instance. 26 00:01:01,290 --> 00:01:03,350 The data in the dictionary is passed 27 00:01:03,350 --> 00:01:05,430 along to the pandas API. 28 00:01:05,430 --> 00:01:09,200 You then use the DataFrame to communicate with the API.
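The dictionary-to-DataFrame step can be sketched roughly as follows. The dictionary contents and the variable names are made up for illustration; only `pd.DataFrame`, `head`, and `mean` are the pandas calls themselves.

```python
import pandas as pd

# Illustrative data: the keys become column names, the lists become column values.
data = {"a": [11, 21, 31], "b": [12, 22, 32]}

# Passing the dictionary to the DataFrame constructor creates an instance.
df = pd.DataFrame(data)

# Methods on the instance are how we talk to the pandas API.
first_rows = df.head()    # first few rows (here, all three)
column_means = df.mean()  # mean of each numeric column

print(first_rows)
print(column_means)
```

Nothing here requires knowing how pandas computes the result internally; the instance and its methods are the whole interface.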
29 00:01:09,200 --> 00:01:11,570 When you call the method head, 30 00:01:11,570 --> 00:01:13,010 the DataFrame communicates with 31 00:01:13,010 --> 00:01:16,610 the API, displaying the first few rows of the DataFrame. 32 00:01:16,610 --> 00:01:18,390 When you call the method mean, 33 00:01:18,390 --> 00:01:21,800 the API will calculate the mean and return the values. 34 00:01:21,800 --> 00:01:25,620 REST APIs are another popular type of API. 35 00:01:25,620 --> 00:01:27,050 They allow you to communicate 36 00:01:27,050 --> 00:01:28,460 over the Internet, letting you 37 00:01:28,460 --> 00:01:31,280 take advantage of resources like storage, 38 00:01:31,280 --> 00:01:32,840 more data, 39 00:01:32,840 --> 00:01:36,260 artificial intelligence algorithms, and much more. 40 00:01:36,260 --> 00:01:39,050 The RE stands for representational. 41 00:01:39,050 --> 00:01:40,980 The S stands for state. 42 00:01:40,980 --> 00:01:43,420 The T stands for transfer. 43 00:01:43,420 --> 00:01:47,210 In REST APIs, your program is called the client. 44 00:01:47,210 --> 00:01:49,070 The API communicates with 45 00:01:49,070 --> 00:01:51,420 a web service you call through the Internet. 46 00:01:51,420 --> 00:01:54,140 There is a set of rules regarding communication, 47 00:01:54,140 --> 00:01:57,700 input or request, and output or response. 48 00:01:57,700 --> 00:01:59,970 Here are some common terms. 49 00:01:59,970 --> 00:02:03,050 You or your code can be thought of as a client. 50 00:02:03,050 --> 00:02:05,780 The web service is referred to as a resource. 51 00:02:05,780 --> 00:02:08,700 The client finds the service via an endpoint. 52 00:02:08,700 --> 00:02:11,600 We will review this more in the next section. 53 00:02:11,600 --> 00:02:13,400 The client sends requests to 54 00:02:13,400 --> 00:02:15,930 the resource, and the resource sends a response to the client. 55 00:02:15,930 --> 00:02:18,260 HTTP methods are a way of 56 00:02:18,260 --> 00:02:20,520 transmitting data over the Internet.
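To make the client, resource, endpoint, request, and response terms concrete, here is a minimal sketch that runs entirely on the local machine using only the Python standard library. The `MeanHandler` service, the `/mean` path, and the `call_service` helper are hypothetical stand-ins invented for this example; a real REST API would live at a remote endpoint and define its own rules.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class MeanHandler(BaseHTTPRequestHandler):
    """Hypothetical web service (the resource): returns the mean of the numbers sent to it."""

    def do_POST(self):
        # Read the JSON carried in the body of the HTTP request.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        values = payload["values"]
        # Build the JSON carried in the body of the HTTP response.
        body = json.dumps({"mean": sum(values) / len(values)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def call_service(values):
    """Act as the client: send a request to the endpoint and return (status, decoded JSON)."""
    server = HTTPServer(("127.0.0.1", 0), MeanHandler)  # port 0 lets the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request(
            "POST",
            "/mean",  # hypothetical endpoint path
            body=json.dumps({"values": values}),
            headers={"Content-Type": "application/json"},
        )
        response = conn.getresponse()
        status, result = response.status, json.loads(response.read())
        conn.close()
        return status, result
    finally:
        server.shutdown()
        server.server_close()


status, result = call_service([1, 2, 3, 4])
print(status, result)
```

The client only needs the endpoint and the agreed request and response shapes; how the service computes its answer stays hidden, just as with a library API.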
57 00:02:20,520 --> 00:02:24,510 We tell the REST API what to do by sending a request. 58 00:02:24,510 --> 00:02:28,890 The request is usually communicated via an HTTP message. 59 00:02:28,890 --> 00:02:33,230 The HTTP message usually contains a JSON file. 60 00:02:33,230 --> 00:02:35,150 This contains instructions for 61 00:02:35,150 --> 00:02:37,850 what operation we would like the service to perform. 62 00:02:37,850 --> 00:02:39,770 This operation is transmitted to 63 00:02:39,770 --> 00:02:41,750 the web service via the Internet. 64 00:02:41,750 --> 00:02:44,100 The service performs the operation. 65 00:02:44,100 --> 00:02:46,580 In a similar manner, the web service returns 66 00:02:46,580 --> 00:02:49,160 a response via an HTTP message, 67 00:02:49,160 --> 00:02:50,570 where the information is usually 68 00:02:50,570 --> 00:02:52,920 returned via a JSON file. 69 00:02:52,920 --> 00:02:56,310 This information is transmitted back to the client. 70 00:02:56,310 --> 00:02:59,000 Sports data is always changing. 71 00:02:59,000 --> 00:03:00,980 This is an excellent application of 72 00:03:00,980 --> 00:03:03,840 an API as it can be constantly updated. 73 00:03:03,840 --> 00:03:07,910 We will use the nba_api by Swar Patel. 74 00:03:07,910 --> 00:03:10,220 The API is always being updated 75 00:03:10,220 --> 00:03:12,560 from endpoints at nba.com. 76 00:03:12,560 --> 00:03:14,180 It is simple to use, so you can 77 00:03:14,180 --> 00:03:16,740 focus on the task of collecting data. 78 00:03:16,740 --> 00:03:19,190 In the nba_api, making 79 00:03:19,190 --> 00:03:21,860 a request for a specific team is quite simple. 80 00:03:21,860 --> 00:03:24,180 We don't require a JSON file. 81 00:03:24,180 --> 00:03:26,380 All we require is an ID. 82 00:03:26,380 --> 00:03:29,550 This information is stored locally in the API. 83 00:03:29,550 --> 00:03:31,740 We import the module teams.
84 00:03:31,740 --> 00:03:35,020 The function get_teams returns a list of dictionaries, 85 00:03:35,020 --> 00:03:36,500 which have the same keys but 86 00:03:36,500 --> 00:03:38,390 the values depend on the team. 87 00:03:38,390 --> 00:03:40,490 The dictionary key id has 88 00:03:40,490 --> 00:03:43,870 a unique identifier for each team as a value. 89 00:03:43,870 --> 00:03:45,740 To make things easier, 90 00:03:45,740 --> 00:03:48,140 we can convert the list of dictionaries to a table. 91 00:03:48,140 --> 00:03:51,440 First, we create the function one_dict. 92 00:03:51,440 --> 00:03:53,100 To create a dictionary, 93 00:03:53,100 --> 00:03:56,210 we use the common keys for each team as the keys. 94 00:03:56,210 --> 00:03:57,830 The value is a list. 95 00:03:57,830 --> 00:03:59,240 Each element of the list 96 00:03:59,240 --> 00:04:01,740 corresponds to the values for each team. 97 00:04:01,740 --> 00:04:04,700 We then convert the dictionary to a DataFrame. 98 00:04:04,700 --> 00:04:08,150 Each row contains the information for a different team. 99 00:04:08,150 --> 00:04:12,020 We'll use the team's nickname to find the unique ID. 100 00:04:12,020 --> 00:04:13,650 We can find the row that 101 00:04:13,650 --> 00:04:15,750 contains the Warriors as follows. 102 00:04:15,750 --> 00:04:17,940 The ID is the first column. 103 00:04:17,940 --> 00:04:19,940 We can use the following line of code to 104 00:04:19,940 --> 00:04:22,440 access the first column of the DataFrame. 105 00:04:22,440 --> 00:04:24,650 We now have an integer that can be used to 106 00:04:24,650 --> 00:04:26,980 request the Warriors' information. 107 00:04:26,980 --> 00:04:30,770 Calling LeagueGameFinder will make an API call. 108 00:04:30,770 --> 00:04:33,590 The parameter team_id_nullable is set to 109 00:04:33,590 --> 00:04:35,580 the unique ID for the Warriors. 110 00:04:35,580 --> 00:04:40,530 Under the hood, the nba_api is making an HTTP request. 111 00:04:40,530 --> 00:04:43,440 This is transmitted to nba.com.
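The steps for turning the list of team dictionaries into a table and looking up a team's ID can be sketched as below. To keep the example runnable without the library or a network connection, `sample_teams` is a hand-written stand-in for the list of dictionaries; its keys and values are illustrative assumptions, and the body of `one_dict` is one possible implementation of the function described, not necessarily the one shown on screen.

```python
import pandas as pd

# Hand-written stand-in for the list of team dictionaries (illustrative values only).
sample_teams = [
    {"id": 1610612744, "full_name": "Golden State Warriors",
     "abbreviation": "GSW", "nickname": "Warriors"},
    {"id": 1610612761, "full_name": "Toronto Raptors",
     "abbreviation": "TOR", "nickname": "Raptors"},
]


def one_dict(list_dict):
    """Turn a list of dictionaries sharing the same keys into one dictionary of lists."""
    keys = list_dict[0].keys()
    out_dict = {key: [] for key in keys}
    for dict_ in list_dict:
        for key, value in dict_.items():
            out_dict[key].append(value)
    return out_dict


# One key per column, one list element per team.
dict_nba_team = one_dict(sample_teams)

# Each row of the DataFrame now describes a different team.
df_teams = pd.DataFrame(dict_nba_team)

# Select the row whose nickname is "Warriors", then read its ID as a plain integer.
df_warriors = df_teams[df_teams["nickname"] == "Warriors"]
id_warriors = int(df_warriors["id"].iloc[0])

print(id_warriors)
```

The resulting integer is the kind of value that would then be passed along when requesting a team's games.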
112 00:04:43,440 --> 00:04:46,610 The information requested is provided and 113 00:04:46,610 --> 00:04:49,470 is transmitted via an HTTP response. 114 00:04:49,470 --> 00:04:52,470 This is assigned to the object gamefinder. 115 00:04:52,470 --> 00:04:54,350 The gamefinder object has 116 00:04:54,350 --> 00:04:57,620 a method get_data_frames that returns a list of DataFrames. 117 00:04:57,620 --> 00:04:59,150 If we view the first DataFrame, 118 00:04:59,150 --> 00:05:00,830 we can see it contains information 119 00:05:00,830 --> 00:05:03,020 about all the games the Warriors played. 120 00:05:03,020 --> 00:05:06,440 The plus_minus column contains information on the score. 121 00:05:06,440 --> 00:05:08,160 If the value is negative, 122 00:05:08,160 --> 00:05:10,610 the Warriors lost by that many points. 123 00:05:10,610 --> 00:05:12,420 If the value is positive, 124 00:05:12,420 --> 00:05:15,220 the Warriors won by that many points. 125 00:05:15,220 --> 00:05:17,040 The column matchup has 126 00:05:17,040 --> 00:05:18,970 the team the Warriors were playing. 127 00:05:18,970 --> 00:05:21,540 GSW stands for Golden State 128 00:05:21,540 --> 00:05:24,080 and TOR means Toronto Raptors. 129 00:05:24,080 --> 00:05:26,820 Versus (vs.) signifies it was a home game, 130 00:05:26,820 --> 00:05:29,540 and the @ symbol means an away game. 131 00:05:29,540 --> 00:05:31,650 We can create two DataFrames, 132 00:05:31,650 --> 00:05:33,260 one for the games where the Warriors 133 00:05:33,260 --> 00:05:35,290 faced the Raptors at home, 134 00:05:35,290 --> 00:05:37,440 and the second for away games. 135 00:05:37,440 --> 00:05:40,010 We can plot out the plus_minus column 136 00:05:40,010 --> 00:05:41,280 for both DataFrames. 137 00:05:41,280 --> 00:05:44,610 We see the Warriors played better at home. 138 00:05:44,610 --> 00:05:49,000 (Music)
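The home and away split can be sketched as follows. The rows in `games` are made up for illustration and are not real results, and the uppercase column names `MATCHUP` and `PLUS_MINUS` are an assumption about how the returned table labels its columns.

```python
import pandas as pd

# Made-up rows standing in for the games table (NOT real game results).
games = pd.DataFrame(
    {
        "MATCHUP": ["GSW vs. TOR", "GSW @ TOR", "GSW vs. TOR",
                    "GSW vs. LAL", "GSW @ TOR", "GSW vs. TOR"],
        "PLUS_MINUS": [5, -8, -3, 12, 2, 10],
    }
)

# "vs." marks a home game for the Warriors, "@" marks an away game.
games_home = games[games["MATCHUP"] == "GSW vs. TOR"]
games_away = games[games["MATCHUP"] == "GSW @ TOR"]

# A positive PLUS_MINUS is a win by that many points, a negative one a loss.
home_mean = games_home["PLUS_MINUS"].mean()
away_mean = games_away["PLUS_MINUS"].mean()

print("home:", home_mean, "away:", away_mean)
```

Comparing the two means is a numeric shortcut for what the plots show; with matplotlib installed, calling `plot` on each `PLUS_MINUS` column would draw the two series instead.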