In this video, we will discuss Application Program Interfaces that use some kind of artificial intelligence. We will transcribe an audio file using the Watson Speech to Text API. We will then translate the text to a new language using the Watson Language Translator API.

In the API call, you will send a copy of the audio file to the API. This is sometimes called a POST request. Then the API will send back the text transcription of what the individual is saying. Under the hood, the API is making a GET request. We then send the text we would like to translate into a second language to a second API. The API will translate the text and send the translation back to you. In this case, we translate English to Spanish. We then provide an overview of API keys and endpoints, Watson Speech to Text, and Watson Translate.

First, we will review API keys and endpoints. They will give you access to the API. An API key is a way to access the API. It's a unique set of characters that the API uses to identify you and authorize you. Usually, your first call to the API includes the API key.
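The idea of an API call that carries an API key can be sketched in Python. This is a minimal illustration of a POST request, not the actual Watson call; the endpoint URL and key below are placeholders, and the request is built but never sent:

```python
import urllib.request

# Hypothetical endpoint and key, for illustration only.
endpoint = "https://api.example.com/v1/recognize"
api_key = "YOUR_API_KEY"

# Build (but do not send) a POST request that carries the audio
# bytes in the body and the API key in a header.
request = urllib.request.Request(
    endpoint,
    data=b"...audio bytes...",
    headers={
        "Authorization": "Bearer " + api_key,
        "Content-Type": "audio/wav",
    },
    method="POST",
)

print(request.get_method())    # POST
print(request.get_full_url())  # the endpoint the request targets
```

Sending the request would return the service's response; keeping the key in a header rather than the URL is one common way to avoid leaking it in logs.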
This will allow you access to the API. In many APIs, you may get charged for each call. So, like your password, you should keep your API key a secret. An endpoint is simply the location of the service. It's used to find the API on the Internet, just like a web address.

Now, we will transcribe an audio file using the Watson Speech to Text API. Before you start the lab, you should sign up for an API key. We will download an audio file into your directory. First, we import SpeechToTextV1 from ibm_watson. The service endpoint is based on the location of the service instance. We store this information in the variable url_s2t. To find out which URL to use, view the service credentials. You will do the same for your API key. You create a Speech to Text adapter object; the parameters are the endpoint and the API key. You will use this object to communicate with the Watson Speech to Text service. We have the path of the wav file we would like to convert to text. We create the file object wav with the wav file using open.
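The setup can be sketched as follows, assuming a recent version of the ibm-watson Python SDK, which wraps the API key in an IAMAuthenticator; older SDK versions, as shown in the video, instead pass the key and URL directly to the constructor. The credentials and file name are placeholders:

```python
def build_speech_to_text(api_key, url_s2t):
    """Create a Speech to Text adapter object from an API key and endpoint."""
    # Imports live inside the function so the sketch can be read and
    # checked without the ibm-watson package installed.
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    authenticator = IAMAuthenticator(api_key)
    s2t = SpeechToTextV1(authenticator=authenticator)
    s2t.set_service_url(url_s2t)
    return s2t

# Usage (requires the real credentials from your service page):
# s2t = build_speech_to_text("YOUR_API_KEY", url_s2t)
# wav = open("example.wav", mode="rb")  # rb: read the file in binary format
```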
We set the mode to rb, which means read the file in binary format. The file object allows us access to the wav file that contains the audio. We use the method recognize from the Speech to Text adapter object. This sends the audio file to the Watson Speech to Text service. The parameter audio is the file object. The content type is the audio file format. The service sends a response, which is stored in the object response. The attribute result contains a Python dictionary. The key results has a value that is a list containing a dictionary. We are interested in the key transcript. We can assign it to the variable recognized_text as follows. recognized_text now contains a string with the transcribed text.

Now let's see how to translate the text using the Watson Language Translator. First, we import LanguageTranslatorV3 from ibm_watson. We assign the service endpoint to the variable url_l2. You can obtain the service endpoint from the lab instructions. You also require an API key; see the lab instructions on how to obtain it.
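Pulling the transcript out of the result dictionary can be sketched with a sample dictionary in the shape the service returns; the transcript text and confidence below are invented for illustration:

```python
# A sample dictionary in the shape of response.result after calling
#   response = s2t.recognize(audio=wav, content_type="audio/wav")
# The transcript text here is made up for illustration.
sample_result = {
    "results": [
        {
            "alternatives": [
                {"transcript": "hello world", "confidence": 0.94}
            ],
            "final": True,
        }
    ]
}

def get_transcript(result):
    """Return the transcript string from a Speech to Text result dict."""
    # results is a list of segments; each segment has a list of
    # alternatives, and each alternative carries a transcript string.
    return result["results"][0]["alternatives"][0]["transcript"]

recognized_text = get_transcript(sample_result)
print(recognized_text)  # hello world
```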
This API request requires a version date; see the documentation. We create a language translator object, LanguageTranslator. We can get a list of the languages that the service can identify as follows. The method returns the language codes. For example, English has the code en and Spanish has the code es. In the last section, we assigned the transcribed text to the variable recognized_text. We can use the method translate. This will translate the text. The result is a DetailedResponse object. The parameter text is the text to translate. model_id is the type of model we would like to use; in this case, we set it to en-es for English to Spanish. We use the method get_result to get the translated text and assign it to the variable translation. The result is a dictionary that includes the translation, word count, and character count. We can obtain the translation and assign it to the variable spanish_translation as follows. Using the variable spanish_translation, we can translate the text back to English as follows. The result is a dictionary.
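The translate call and the shape of its result can be sketched the same way; the Spanish text and counts below are a made-up sample, not real service output:

```python
# After
#   translation_response = language_translator.translate(
#       text=recognized_text, model_id="en-es")
#   translation = translation_response.get_result()
# translation is a dictionary shaped like this sample
# (the text and counts here are invented for illustration):
translation = {
    "translations": [{"translation": "hola mundo"}],
    "word_count": 2,
    "character_count": 11,
}

def get_translation(result):
    """Return the translated string from a Language Translator result dict."""
    return result["translations"][0]["translation"]

spanish_translation = get_translation(translation)
print(spanish_translation)  # hola mundo

# Translating back simply uses the reverse model id, e.g. "es-en",
# and "en-fr" would translate English to French.
```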
We can obtain the string with the text as follows. We can then translate the text to French as follows.

Thanks for watching this video.

(Music)