This video examines the features of Watson Studio that help to ensure fairness and explainability of machine learning pipelines, as well as monitor their performance after deployment. IBM Watson OpenScale is a product that includes several important features. It can test the model and its predictions for fairness and apply ways to overcome bias. It can also help to provide explanations for model predictions, which are often hard to get but are necessary for compliance in some application areas. It monitors model performance and can detect its deterioration, or model drift, over time. It can alert users when drift is detected and explain which predictors are causing it. We can specify criteria under which the model gets automatically retrained on fresh data; it also helps to measure how the model helps the business. The attributes to monitor for bias are automatically recommended based on prior experience, and they can be edited as needed. OpenScale then keeps track of model predictions for the specified groups and checks for bias in the predictions. Users need to know that their AI models are fair, but the data their models were trained on can include unwanted biases, which may unintentionally be carried into the resulting models. IBM Watson OpenScale can detect bias when a model is in production, not just when it's being built.
In this demo of Watson OpenScale, we'll monitor a credit risk model which has been trained to determine whether or not someone is eligible for a loan, based on a variety of different features, such as their credit history, age, and number of dependents. After launching OpenScale, we can see a few highlighted metrics for the monitored model, such as its quality and a fairness score. OpenScale measures a model's fairness by calculating the difference between the rates at which different groups, for example women versus men, receive the same outcome. A fairness value below 100% means that the monitored group receives an unfavorable outcome more often than the reference group. In this case, we see that women are receiving the no-risk outcome, or getting approved for loans, at a lower rate than men. OpenScale enables the inspection of each model's training data, and this reveals that there was more training data for men than women. This can give some insight as to why the model exhibits bias against women who apply for loans. Data scientists can use this information to improve the model. Now, detecting bias is one thing; OpenScale can also mitigate it by creating a debiased model that runs alongside the monitored one. In this case, the debiased model is 12% more fair than the production model.
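The fairness score described in the narration can be sketched as a ratio of favorable-outcome rates between the monitored group and the reference group, scaled to a percentage. This is an illustrative sketch, not OpenScale's actual implementation; the group labels, outcome labels, and function name are assumptions chosen to match the loan example.

```python
def fairness_score(outcomes, groups, favorable="No Risk",
                   monitored="Female", reference="Male"):
    """Illustrative fairness metric: ratio (x100) of the rate at
    which the monitored group receives the favorable outcome to
    the rate at which the reference group receives it.

    A value below 100% means the monitored group gets the
    favorable outcome less often than the reference group.
    """
    def favorable_rate(group):
        group_outcomes = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(o == favorable for o in group_outcomes) / len(group_outcomes)

    return 100.0 * favorable_rate(monitored) / favorable_rate(reference)
```

For example, if women are approved half as often as men, the score is 50%, matching the video's point that a score below 100% signals an unfavorable skew against the monitored group.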
The debiased model has been trained to detect when your production model will make a biased prediction, so that you can isolate the specific transactions that result in the bias. For each of these transactions, Watson OpenScale will flip the monitored value in a record to the reference value, in this case from female to male, and leave all other data points in that record the same. If this changes the prediction from risk to no-risk, then the debiased model will surface the no-risk outcome as the debiased result. This is just one of the ways that Watson OpenScale helps you ensure that your models are fair, explainable, and compliant, wherever your model was built or is running. Insurance underwriters can use machine learning and OpenScale to more consistently and accurately assess claims risk, ensure fair outcomes for customers, and explain AI recommendations for regulatory and business intelligence purposes. Why does an AI model arrive at a given recommendation or prediction? Users and customers want an explanation, and with most models, providing this information is not an easy task. IBM Watson OpenScale explains predictions in business-friendly language. This credit application, for instance, was predicted to be a risk. OpenScale determines the features which contributed positively or negatively to that prediction and spells them out. The explanation is presented visually, as well as in a sentence-based text summary, in order to ensure maximum clarity.
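The flip-the-monitored-value step the narration walks through can be sketched as a simple counterfactual test: re-score the same record with only the monitored attribute changed to the reference value, and surface the favorable outcome if the prediction flips. This is a minimal sketch of the idea, assuming a model callable that maps a record to "Risk" or "No Risk"; the field and value names are illustrative, not OpenScale's API.

```python
def debiased_prediction(model, record, monitored_field="Sex",
                        reference_value="male"):
    """Counterfactual flip sketch: score the record again with the
    monitored attribute set to the reference value, leaving every
    other field unchanged. If the prediction flips from "Risk" to
    "No Risk", surface "No Risk" as the debiased result; otherwise
    keep the original prediction.
    """
    original = model(record)
    flipped = dict(record, **{monitored_field: reference_value})
    counterfactual = model(flipped)
    if original == "Risk" and counterfactual == "No Risk":
        return "No Risk"
    return original
```

A toy biased model that approves only male applicants would have its rejection of an otherwise identical female applicant overridden by this test, which is exactly the behavior the demo describes.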
Using proprietary IBM Research technology, OpenScale also generates contrastive explanations. Here we see the minimum changes to this input record which would produce a different output, changing the prediction from risk to no-risk. The explanations provided by Watson OpenScale can help organizations comply with regulations such as the Fair Credit Reporting Act and GDPR, which give customers the right to ask for the reasons why their applications were denied. Before an AI model is put into production, it must prove it can make accurate predictions on test data, a subset of its training data; however, over time, production data can begin to look different from training data, causing the model to start making less accurate predictions. This is called drift. IBM Watson OpenScale monitors a model's accuracy on production data and compares it to accuracy on its training data. When the difference in accuracy exceeds a chosen threshold, OpenScale generates an alert. Watson OpenScale reveals which transactions caused drift and identifies the top transaction features responsible. For instance, 25% of the transactions causing drift in this loan approval model were problematic because of these features, which contained data crucially different from the training data. The transactions causing drift can be sent for manual labeling and used to retrain the model so that its predictive accuracy does not drop at run time.
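The drift check described here, comparing accuracy on production data against accuracy on training data and alerting past a threshold, can be sketched in a few lines. This is an illustrative sketch of the monitoring rule only, with an assumed default threshold; it is not OpenScale's implementation.

```python
def drift_alert(train_accuracy, production_accuracy, threshold=0.05):
    """Drift-monitoring sketch: flag drift when accuracy on
    production data falls more than `threshold` below the accuracy
    measured on training data. Returns (alert, drop) so callers can
    report the magnitude of the accuracy drop alongside the alert.
    """
    drop = train_accuracy - production_accuracy
    return drop > threshold, drop
```

For example, a model that scored 90% on training data but only 80% on recent production data would trigger an alert with a 10-point drop, prompting the retraining workflow the video goes on to describe.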
Watson OpenScale not only helps identify drift but also highlights its root cause and provides transactions which can be turned into training data useful for fixing drift. It gives you the insight you need to ensure that your models will consistently deliver the results you want over time. For instance, the retrained version of the model, based on the recommendations made by Watson OpenScale, started making accurate recommendations, alleviating the drift. This is just one of the ways that Watson OpenScale helps you ensure your models are fair, explainable, and compliant, wherever your model was built or is running. In this video, you have learned how OpenScale ensures fairness and explainability of models and monitors for model drift in production. This completes the module on IBM products for data scientists. Good luck on the quizzes!