Explanatory Models vs. Predictive Models: The Ultimate Guide to Model Explainability with Anchors

All ado about Model Explainability (XAI):

There is now a laundry list of Machine Learning and Deep Learning algorithms for almost every AI problem. In general, the more complex a model, the more accurate it tends to be (provided, of course, that it has not been over-fit, that the data pipeline is sound, and so on).

Although higher prediction accuracy is desirable, it is becoming increasingly important to be able to explain the behaviour of the model. This is especially true in light of the GDPR, which is widely read as offering a "Right to Explanation." This means that anyone who uses an AI model to make predictions about a person may be obliged to explain why the model predicted what it did. This matters most in classification problems, where misclassification can have high costs.

A common use-case is that of loan approvals. If a bank uses a Machine Learning based classification algorithm (let's say a black-box model such as XGBoost, CatBoost or LightGBM), and the model classifies a particular applicant as a "Defaulter", then the candidate's loan application will be rejected.

If the candidate challenges the decision, the bank may be required to explain why the model has classified the candidate as a Defaulter. The candidate might ask on what basis the model has classified her as a Defaulter, and the bank then needs to figure out what the model is doing to arrive at its classification. If it were a simple decision tree, it would be easy to show the customer which pieces of information about her put her in the Defaulter category. With a black-box model, however, the bank has to rely on explainers.

Note: This post is restricted to classification problems, since Anchors work for classification problems.

Photo by Author

LIME and SHAP:

There are oh so many explainers being formulated to explain a whole spectrum of machine learning algorithms. Each of them has its own advantages and disadvantages.

Among these, the most popular are LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), which is based on Shapley values.

The purposes they serve are similar yet different. They are similar in that both try to give the user an intuition of why the model has classified a particular observation under a particular category. They differ in how they approach explaining the model prediction.

LIME fits a simpler, interpretable model (typically linear) to the behaviour of the complex model in the neighbourhood of one observation, whereas SHAP attributes to each feature its relative contribution to the deviation of the prediction from the mean model prediction.
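
To make the LIME idea concrete, here is a minimal sketch of a local linear surrogate. It is not the lime library and not the author's code: the one-feature `black_box` function, the symmetric perturbation grid and the Gaussian kernel width are all assumptions chosen for illustration.

```python
import math

def black_box(x):
    """Stand-in for an opaque model: a non-linear score of one feature."""
    return x * x

def local_linear_surrogate(model, x0, half_width=0.5, steps=101, kernel_width=0.25):
    """Fit a proximity-weighted straight line to `model` around x0.

    Returns (slope, intercept) of the weighted least-squares line.
    """
    # Perturb the instance on a symmetric grid around x0.
    offsets = [-half_width + 2 * half_width * i / (steps - 1) for i in range(steps)]
    xs = [x0 + d for d in offsets]
    ys = [model(x) for x in xs]
    # Perturbations closer to x0 get larger weights (Gaussian kernel).
    ws = [math.exp(-(d / kernel_width) ** 2) for d in offsets]

    total = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / total
    y_mean = sum(w * y for w, y in zip(ws, ys)) / total
    cov_xy = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var_x = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov_xy / var_x
    return slope, y_mean - slope * x_mean

slope_at_2, _ = local_linear_surrogate(black_box, 2.0)
slope_at_minus_1, _ = local_linear_surrogate(black_box, -1.0)
print(round(slope_at_2, 3), round(slope_at_minus_1, 3))
```

The fitted slope comes out close to 4 around x = 2 and close to -2 around x = -1, so the same feature pushes the score in opposite directions depending on where you look. That is what it means for a linear explanation to be valid only locally.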

Local, Global and Glocal?

Explainers are of two types: local explainers and global explainers. Local explainers explain only the prediction of interest, which means their explanations are valid only for that one observation. LIME is a good example of a local explainer. However, because it relies on a local linear approximation, LIME struggles to faithfully explain predictions where the model's decision boundary is strongly non-linear. Moreover, its explanation does not carry over to surrounding observations.

Global explainers, on the other hand, provide an explanation for the entire dataset, and these explanations are valid across observations. Some explainers can do both. For example, SHAP can provide a local explanation as well as a global explanation.

Global explanation by Author

This is possible because SHAP obeys the rules of fair attribution of contribution for each prediction, so its local explanations can be aggregated into a global one. For details on what the rules are, please refer to the chapter on Shapley values by Christoph Molnar, listed in the references.

Force Plot: A single local explanation by Author
Decision Plot: A collection of local explanations by Author

However, SHAP values come with a huge computational cost. Apart from that, SHAP values are easy to misinterpret: they explain the deviation of a prediction from the mean prediction, not the prediction itself.
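
The point about deviation from the mean can be checked on a toy example. The sketch below computes exact Shapley values by brute force. It is not the shap library: the `model` function, the four-row `BACKGROUND` table and the choice to fill in "missing" features from background rows are assumptions made for illustration. Enumerating every feature subset like this is also why exact computation becomes expensive as the number of features grows.

```python
from itertools import combinations
from math import factorial

def model(age, loan, bad_credit):
    """Toy risk score with an interaction term (hypothetical)."""
    return 0.01 * age + 0.02 * loan + 0.3 * bad_credit + 0.01 * loan * bad_credit

# Hypothetical background data: (age, loan amount, bad-credit flag).
BACKGROUND = [(30, 2, 0), (45, 4, 0), (60, 6, 1), (65, 8, 1)]

def value(subset, x):
    """Mean model output when the features in `subset` are fixed to x's
    values and the remaining features come from each background row."""
    total = 0.0
    for row in BACKGROUND:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(*mixed)
    return total / len(BACKGROUND)

def shapley_values(x):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(x)
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = value(set(subset) | {i}, x)
                without_i = value(set(subset), x)
                phi += weight * (with_i - without_i)
        phis.append(phi)
    return phis

x = (60, 6, 1)
phis = shapley_values(x)
base_value = value(set(), x)   # mean prediction over the background data
prediction = model(*x)
print(phis, base_value, prediction)
```

The attributions add up to `prediction - base_value`, not to `prediction`. Reading a single SHAP value as "this feature's share of the score" is therefore a misreading; it is the feature's share of the gap between this prediction and the average one.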

These disadvantages are addressed by the new kid on the block: Anchors!

Another Explainer! But Why??

Anchors are high-precision explainers that use reinforcement learning techniques (a multi-armed bandit formulation in the original paper) to come up with a set of feature conditions, called anchors. These conditions explain the observation of interest, and also a set of surrounding observations, with high precision (the user is free to choose the minimum precision cut-off).
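
For readers who prefer symbols, the two quantities that matter can be sketched as follows, loosely following the notation of the Anchors paper listed in the references. Here f is the black-box model, x the observation being explained, A a rule that returns 1 when all of its feature conditions hold, D the distribution of perturbed observations, τ the user's precision cut-off and δ a small tolerance. Treat this as a summary under those assumptions rather than a full statement of the method.

```latex
\mathrm{prec}(A) = \mathbb{E}_{\mathcal{D}(z \mid A)}\left[\mathbb{1}_{f(x) = f(z)}\right],
\qquad
\mathrm{cov}(A) = \mathbb{E}_{\mathcal{D}(z)}\left[A(z)\right]

% A is an anchor for x if A(x) = 1 and
P\left(\mathrm{prec}(A) \ge \tau\right) \ge 1 - \delta

% Among all such rules, the one with the largest coverage is preferred:
\max_{A \;:\; P(\mathrm{prec}(A) \ge \tau) \ge 1 - \delta} \mathrm{cov}(A)
```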

Anchors was developed by Marco Tulio Ribeiro and his co-authors. Ribeiro happens to be the one who came up with LIME as well. Who better to propose an alternative?

How do Anchors work?

An intuitive way to think of Anchors is to start from individual features (a bottom-up approach). Let's say you want to figure out some rules based on the feature values of observations, such that if an observation ticks all the boxes, there is an extremely high chance that your prediction will match the model's prediction.

Let's say a person's age, loan amount and credit history impact their classification as Defaulter or Repayer. We have built our model, and it has classified 100 candidates as Defaulters. We would like to explain these model predictions.

We start with age for the sake of example. Based on the predictions, we conclude that if age is more than 55 years, then the candidate will be classified as a Defaulter with a precision of 0.6. This rule is satisfied by 80% of the candidates (this is called Coverage in the paper).

But 0.6 isn't good enough; we want to improve the precision. So we additionally examine the loan amount. We see that if age is above 55 years and the loan amount is greater than 5 lakh rupees (500,000 rupees), then the candidate is classified as a Defaulter with a precision of 0.85. However, this pair of feature cut-offs (called feature predicates in the paper) can explain only 50% of the predictions. This means only 50% of the candidates predicted to default have both an age above 55 and a loan amount greater than 5 lakh rupees.

A precision of 0.85 still seems low, so we move on to credit history. Let's say credit history is categorical: it is either "good" or "bad". Suppose that if a candidate is older than 55 years, has a loan amount greater than 5 lakh rupees and has a bad credit history, then the candidate will be classified as a Defaulter with a precision of 1.0. This means that every time you see a candidate older than 55 years, with a loan of more than 5 lakh rupees and a bad credit history, you can, without doubt, expect the model to classify the candidate as a Defaulter. But this combination of predicates holds true for only 20% of the candidates.
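
The walk-through above can be reproduced on synthetic data. In this sketch the `black_box_predict` scoring rule, the randomly generated applicants and the three predicates are all hypothetical, so the printed numbers will not match the 0.6 / 0.85 / 1.0 figures in the text. Coverage is taken here as the share of all applicants that a rule applies to, and precision as the share of those applicants that the model labels a Defaulter.

```python
import random

def black_box_predict(applicant):
    """Hypothetical opaque classifier standing in for the bank's model."""
    points = (4 * (applicant["age"] > 55)
              + 2 * (applicant["age"] > 70)
              + 3 * (applicant["loan_lakh"] > 5)
              + 3 * (applicant["credit"] == "bad"))
    return "Defaulter" if points >= 9 else "Repayer"

# Synthetic applicants; the seed only makes the run repeatable.
rng = random.Random(0)
applicants = [
    {"age": rng.randint(21, 80),
     "loan_lakh": rng.randint(1, 10),
     "credit": rng.choice(["good", "good", "bad"])}
    for _ in range(2000)
]

predicates = [
    ("age > 55", lambda a: a["age"] > 55),
    ("loan > 5 lakh", lambda a: a["loan_lakh"] > 5),
    ("credit is bad", lambda a: a["credit"] == "bad"),
]

def precision_and_coverage(rule, data, target="Defaulter"):
    """rule: list of predicate functions that must all hold."""
    matched = [a for a in data if all(p(a) for p in rule)]
    coverage = len(matched) / len(data)
    if not matched:
        return 0.0, coverage
    hits = sum(black_box_predict(a) == target for a in matched)
    return hits / len(matched), coverage

# Evaluate the rule as predicates are added one at a time.
results = []
for k in range(1, len(predicates) + 1):
    rule = [p for _, p in predicates[:k]]
    name = " AND ".join(n for n, _ in predicates[:k])
    precision, coverage = precision_and_coverage(rule, applicants)
    results.append((name, precision, coverage))
    print(f"{name}: precision={precision:.2f}, coverage={coverage:.2f}")
```

With this toy model, every applicant who satisfies all three predicates is labelled a Defaulter, so the final rule reaches a precision of 1.0, while the coverage can only shrink as predicates are added.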

It is important to observe that as the number of feature predicates in the anchor increases, the coverage (the share of observations that the anchor applies to) goes down.
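
The bottom-up idea can be sketched as a greedy search. This is a deliberate simplification and an assumption on my part, not the algorithm from the paper: the paper estimates precision from perturbed samples and searches with a bandit-based procedure, whereas this version simply measures precision on a fixed synthetic dataset. The model, data and candidate predicates are hypothetical.

```python
import random

def black_box_predict(a):
    """Hypothetical opaque classifier; the explainer only sees its outputs."""
    points = (4 * (a["age"] > 55) + 2 * (a["age"] > 70)
              + 3 * (a["loan_lakh"] > 5) + 3 * (a["credit"] == "bad"))
    return "Defaulter" if points >= 9 else "Repayer"

rng = random.Random(1)
data = [{"age": rng.randint(21, 80), "loan_lakh": rng.randint(1, 10),
         "credit": rng.choice(["good", "good", "bad"])} for _ in range(2000)]

# Candidate predicates; in practice these would come from discretising
# the features of the instance being explained.
CANDIDATES = {
    "age > 55": lambda a: a["age"] > 55,
    "loan > 5 lakh": lambda a: a["loan_lakh"] > 5,
    "credit is bad": lambda a: a["credit"] == "bad",
}

def evaluate(names, target):
    """Return (precision, coverage) of the rule made of the named predicates."""
    matched = [a for a in data if all(CANDIDATES[n](a) for n in names)]
    if not matched:
        return 0.0, 0.0
    hits = sum(black_box_predict(a) == target for a in matched)
    return hits / len(matched), len(matched) / len(data)

def greedy_anchor(instance, min_precision=0.95):
    target = black_box_predict(instance)
    # Only predicates that the instance itself satisfies may enter its anchor.
    available = [n for n, p in CANDIDATES.items() if p(instance)]
    anchor = []
    precision, coverage = evaluate(anchor, target)
    while precision < min_precision and available:
        # Add the predicate giving the highest precision; ties go to coverage.
        best = max(available, key=lambda n: evaluate(anchor + [n], target))
        anchor.append(best)
        available.remove(best)
        precision, coverage = evaluate(anchor, target)
    return anchor, precision, coverage

instance = {"age": 62, "loan_lakh": 8, "credit": "bad"}
anchor, precision, coverage = greedy_anchor(instance)
print(anchor, round(precision, 2), round(coverage, 2))
```

Calling `greedy_anchor(instance, min_precision=0.6)` stops the same search earlier, so the rule is never longer and its coverage is never lower. That is the precision versus coverage trade-off in miniature.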

What's special about Anchors?

Firstly, anchors are model-agnostic, yay! They are also not as computationally taxing as SHAP, and they generalise better than LIME. Moreover, they are capable of explaining non-linear decision boundaries, since they work on feature predicates and do not try to fit a linear model to the data.

They also help in pointing out to business analysts or data analysts exactly which features are influencing the model output. This is a valuable way to explain a model, and it is global in the sense that all observations fulfilling the anchor conditions are classified the same way with high precision. By varying the precision cut-off, we can see which features add the most to the precision.

However, they come with one glaring drawback: as the anchor gets overly specific (the number of feature predicates increases), the coverage (the ability to explain more observations) goes down drastically.

Conclusion:

Model explainability is a prolific research area. In the past few years (3 years or so), the number of publications on model explainability has exploded. Industry, too, is pushing for explainable AI in production.

Many practitioners use LIME and SHAP for model explanation, but the two serve different purposes and are not entirely straightforward to interpret. They provide a proxy for feature importance. Anchors, by contrast, provide a set of rules that are easy to interpret and not computationally burdensome. Additionally, they are capable of explaining surrounding observations as well.

This makes them a great choice for models that need customer-friendly explanations.

Reference:

https://christophm.github.io/interpretable-ml-book/shapley.html
https://christophm.github.io/interpretable-ml-book/lime.html
https://cran.r-project.org/web/packages/lime/vignettes/Understanding_lime.html
https://nbviewer.jupyter.org/urls/arteagac.github.io/blog/lime.ipynb
https://arxiv.org/pdf/1705.07874.pdf
https://github.com/slundberg/shap/tree/fc30c661339e89e0132f5f89e5385e3681090e1f#citations
https://homes.cs.washington.edu/~marcotcr/aaai18.pdf

Translated from: https://medium.com/swlh/ultimate-guide-to-model-explainability-anchors-2deab8239f57
