A peek into Explainable AI
Welcome to the first Trustwai newsletter! Right now, AI is at the peak of its hype cycle. As with technology trends before it, it can feel as though this cycle will never end, and we find ourselves at the intersection of promise and caution. We want to embrace the power of AI, but we need to demonstrate that we have learned from our past mistakes with other technologies and that we can navigate the complexities this technology brings, particularly in the realm of digital trust. In the context of cybersecurity, this presents a unique conundrum: how can we ensure the security of AI systems if we can't fully understand or explain their decision-making processes? This is a question we must grapple with as we continue to integrate AI into our digital infrastructure.
In that spirit, today we are going to talk about one particular technology on the trust side of AI: explainable AI (xAI). xAI is a fascinating concept that aims to make an AI system's decision-making process transparent and understandable. There are many open questions in the field, and the topic crosses several domains of thought, including psychology. After all, the goal is to make an AI system understandable, but understandable to whom? As soon as you bring the human into the loop, a whole host of other problems arise that need to be dealt with.
News
xAI is an area of active research, and views are changing quickly. Many recent papers in the field are making the transition from feature-based explanation methods to human-centric approaches, a shift that is advancing AI integration into fields such as finance and healthcare. There are many areas to explore within xAI, and we will likely revisit the topic in future newsletters. For example, what are the ethical implications of running these types of xAI systems? Who is liable for a decision, and how does that change depending on the "involvement" of the human? How can xAI assist in the quest for human-like intelligence?
Even though xAI is in its early stages, we are already seeing it integrated as "the selling feature" of products. Rich Data Co, for example, builds AI solutions in the lending space. Because lending is one of the few spaces where regulation mandates explanations to end users, xAI becomes a key feature of product development. xAI, you could say, becomes the interface between the AI model and the human user.
xAI isn't only a trust solution, but a set of concepts that can lead to unique discoveries. Late last year, researchers announced a groundbreaking result, reported in Scientific American, in which AI was used to discover a new class of antibiotics. Using deep learning, they screened millions of compounds in search of the prize. What was particularly interesting is that the model wasn't a traditional "black-box" model but rather a glass-box model that explained the biochemistry behind each of its decisions. Using these explanations, researchers were able to identify 283 candidate compounds, many of which showed promising results.
Paper Overview
The central source for this newsletter is the article titled "False Sense of Security in Explainable Artificial Intelligence". The article discusses the complexities and challenges associated with explainable AI (xAI). The authors argue that while AI regulations and policies in the EU and the USA place explainability as a central deliverable of compliant AI systems, achieving true explainability remains a complex and elusive target.
The paper highlights that even state-of-the-art methods often reach erroneous, misleading, and incomplete explanations. The term “explainability” has multiple meanings which are often used interchangeably, and there are numerous xAI methods, none of which presents a clear advantage.
The authors analyze legislative and policy developments in the United States and the European Union, such as the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, the AI Act, the AI Liability Directive, and the General Data Protection Regulation (GDPR) from a right to explanation perspective.
They argue that these AI regulations and current market conditions threaten effective AI governance and safety because the objective of trustworthy, accountable, and transparent AI is intrinsically linked to the questionable ability of AI operators to provide meaningful explanations.
The authors caution that unless governments explicitly tackle the issue of explainability through clear legislative and policy statements that actively consider technical realities, AI governance risks becoming a vacuous “box-ticking” exercise where scientific standards are replaced with legalistic thresholds, providing only a false sense of security in xAI.
What is an Explanation?
You would expect that in a field with "explainable" in the name, there would be consensus on, at the very least, the answer to the question "what is an explanation?". But that is not the case.
From a technical perspective, an explanation is additional meta-information generated to describe the feature importance or relevance of an input instance towards a particular output classification or decision. As the paper describes, the polysemy of explainability actually touches on 5 different areas.
The first two areas deal with the overall scope of the explanation. Explanations can be global in nature, meaning they describe the overall process by which the AI system arrives at decisions. Global explanations look at the overall relationships between variables and are generally derived ex ante. Explanations can also be local in nature, meaning they describe, for a given outcome, the factors that went into that specific decision; another way to put it is that they are post hoc. The two scopes serve different goals depending on who is trying to understand the explanation and for what purpose. Generally speaking, knowing what "types of things" a model may consider gives some indication of the usefulness of a model, but understanding a specific answer's explanation, and the features that were or were not used in that decision, gives a different granularity of understanding.
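To make the contrast concrete, here is a minimal sketch of a global versus a local explanation on a toy linear model. The loan-style feature names, the synthetic data, and the choice of logistic regression are all assumptions for illustration; they are not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feature_names = ["income", "debt_ratio", "years_employed"]

# Synthetic applicants: approval loosely driven by income and debt ratio.
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 1] + 0.2 * rng.normal(size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Global (ex ante) view: which features the model weighs overall.
for name, coef in zip(feature_names, model.coef_[0]):
    print(f"global weight      {name:>15}: {coef:+.2f}")

# Local (post hoc) view: how one applicant's values pushed this decision.
applicant = X[0]
for name, coef, value in zip(feature_names, model.coef_[0], applicant):
    print(f"local contribution {name:>15}: {coef * value:+.2f}")
print("predicted approval:", int(model.predict(applicant.reshape(1, -1))[0]))
```

For a linear model the local attribution is simply coefficient times feature value; for more complex models you would reach for tools like SHAP or LIME, which approximate the same idea.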
The third area concerns the need to provide a comprehensive yet human-understandable explanation. The typical way to think about this is that the explanation needs to answer, for the human, the question "Why?". If you took a purely logical approach to answering it, you might come up with the following sequence (drawn from this video); a toy version of such a deduction chain is sketched after the list.
- Start with the input data
- Give a sequence of logical deductions, where
- Each deduction conforms with the rules of logic, and
- The sequence terminates with the conclusion
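Here is that deduction-chain idea in miniature: a hand-written rule set that records every step it takes, so the "explanation" is simply the chain of deductions. The rules, thresholds, and feature names are invented purely for illustration.

```python
def approve_loan(applicant):
    """Return a decision plus the chain of deductions that produced it."""
    steps = [f"input: {applicant}"]
    if applicant["income"] < 30_000:
        steps.append("rule 1: income < 30,000 -> reject")
        return False, steps
    steps.append("rule 1: income >= 30,000 -> continue")
    if applicant["debt_ratio"] > 0.4:
        steps.append("rule 2: debt_ratio > 0.4 -> reject")
        return False, steps
    steps.append("rule 2: debt_ratio <= 0.4 -> continue")
    steps.append("conclusion: approve")
    return True, steps

decision, explanation = approve_loan({"income": 52_000, "debt_ratio": 0.31})
print("approved:", decision)
print("\n".join(explanation))
```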
As the speaker in the video points out, this kind of deduction chain really doesn't scale once we are dealing with billions of arithmetic operations or more. At that point, you need to judge explanations in a different, much more subjective way. You may also look at the following attributes of an explanation:
- Transparency: The explanation should be able to provide its reasoning in a detailed way, as and when required
- Interpretability: The explanation should actually be effective in helping users understand and interpret the model's outputs
- Believability: The user must be able to believe the explanation (along the lines of plausibility)
- Distinguishability: The model should produce distinguishable explanations for different sets of inputs
If you think about it hard enough, you'll note that even the above attributes don't tell the whole story. This ties into the fourth way to understand explainability: in terms of the user for whom the explanation is intended. What to include in an explanation varies widely depending on the knowledge of the user trying to read it.
The last area of explainability is that being able to describe how a model arrived at its output doesn't necessarily mean knowing enough to change that result. That is to say, the explanation may not provide enough information to actually control and interact with the AI model in a meaningful way. A finance example would be answering the question "What should I do differently?" when trying to change the outcome of a loan application. So maybe our explanations need to be detailed enough to allow for "next steps".
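A crude way to generate such "next steps" is a counterfactual search: find the smallest change to the input that flips the decision. The sketch below does this by brute force over a toy loan model; the feature names, synthetic data, and search grid are assumptions for illustration, not a production recourse method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
names = ["income", "debt_ratio", "years_employed"]

# Synthetic applicants; approval driven by income minus debt ratio.
X = rng.normal(size=(400, 3))
y = (X[:, 0] - X[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X, y)

applicant = np.array([-0.3, 0.8, 0.1])
print("currently approved:", bool(model.predict([applicant])[0]))

# Search each feature for the smallest adjustment that flips the decision.
best = None
for j, name in enumerate(names):
    for delta in sorted(np.linspace(-2, 2, 81), key=abs):
        candidate = applicant.copy()
        candidate[j] += delta
        if model.predict([candidate])[0] == 1:
            if best is None or abs(delta) < abs(best[1]):
                best = (name, delta)
            break  # smallest change for this feature found

if best is not None:
    print(f"suggested next step: change {best[0]} by {best[1]:+.2f}")
```

Real counterfactual methods also have to respect which features a person can actually change (you cannot ask someone to lower their age), which is part of why "actionable" explanations are hard.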
Failure modes of xAI
The paper focuses on five critical failure modes for xAI in relation to black-box models. As a quick recap, AI models are typically broken into two categories from an xAI perspective. The first is directly explainable, or glass-box, models, where the model's internal logic can be inspected directly, so users can see how a prediction was produced and judge when it is reliable. The second category relies on post-hoc explainability and covers what are generally referred to as black-box models; these make up the majority of models, particularly ones with many layers or ones that consider many features. Not all problems are "best solved" by one approach or the other, and choosing between them generally involves a trade-off between accuracy and explainability.
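The sketch below shows the two camps side by side, under the assumption that a shallow decision tree stands in for a glass-box model and a random forest with permutation importance stands in for a black-box model plus a post-hoc explainer. The dataset is just a convenient scikit-learn example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X, y, feature_names = data.data, data.target, list(data.feature_names)

# Glass-box: the fitted model *is* its own explanation -- its rules can be read.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=feature_names))

# Black-box: the explanation is a separate artifact, generated post hoc.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(forest, X, y, n_repeats=5, random_state=0)
for i in result.importances_mean.argsort()[::-1][:3]:
    print(f"post-hoc importance of {feature_names[i]}: {result.importances_mean[i]:.3f}")
```

Note the asymmetry: the tree's printed rules are the model, while the permutation importances are a separate artifact computed after the fact, and the paper's concerns about fragile explanations apply to that second artifact.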
The first critical failure mode described in the paper is a lack of robustness. Basically, adding a little random noise to an input, in a way that a human would still interpret the output the same way, may produce a vastly different explanation from the xAI system. To put it another way, a small amount of noise may result in the same ultimate classification by the AI system, but the explanation of how that classification was derived can vary widely.
This lack of robustness leads to the next critical failure mode: because of this vulnerability, xAI is subject to adversarial attacks. This ties back to the video mentioned above. Where AI systems are used for decision support, the "believability" of the explanation may become paramount from a human perspective. If an attacker can get the AI system to arrive at the same conclusion while drastically changing the explanation of that decision, they can keep submitting effectively the same input, lightly tweaked each time, until the explanation looks believable enough to secure a positive outcome.
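Building on the same toy occlusion attribution as the previous sketch (repeated here so the snippet stands alone), this is a naive random search for a perturbation that keeps the model's decision fixed while pushing the explanation as far as possible from the original. It is a crude illustration of the idea, not a reproduction of the attacks described in the literature.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)
model = GradientBoostingClassifier(random_state=0).fit(X, y)
means, stds = X.mean(axis=0), X.std(axis=0)

def occlusion_attribution(x):
    """Probability change when each feature is replaced with its dataset mean."""
    base = model.predict_proba([x])[0, 1]
    scores = []
    for j in range(len(x)):
        x_occluded = x.copy()
        x_occluded[j] = means[j]
        scores.append(base - model.predict_proba([x_occluded])[0, 1])
    return np.array(scores)

rng = np.random.default_rng(1)
x = X[0].copy()
original_label = model.predict([x])[0]
original_attribution = occlusion_attribution(x)

best_shift = 0.0
for _ in range(100):                                   # cheap random search
    candidate = x + rng.normal(scale=0.02 * stds)      # small perturbation
    if model.predict([candidate])[0] != original_label:
        continue                                       # the decision must not change
    shift = np.linalg.norm(occlusion_attribution(candidate) - original_attribution)
    best_shift = max(best_shift, shift)

print("decision unchanged; largest attribution shift found:", round(best_shift, 4))
```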
Another critical failure mode of xAI outputs is partial explanations. Again, let's draw an example from the field of decision support. Generally speaking, predictions from an AI model will either be correct or incorrect. In many cases, it will be obvious whether the model has succeeded or failed in predicting a given outcome. However, there will be a boundary region where the model's prediction is believable (looks correct, let's say) but is actually incorrect, and vice versa. In this boundary area, the full explanation of how the model arrived at its decision is necessary for the human to make a firm decision to trust or not trust the model output. Partial explanations may not provide sufficient information for a human to catch the error in these scenarios.
The fourth critical failure mode is unique to black-box models, in particular those that learn and adapt dynamically over time. The point here is that xAI techniques need to keep up with how the data, and its relationship to the underlying concepts, changes over time. An obvious case would be where AI systems trained on certain demographic data are then used to make predictions for different demographics. This can also occur as new "insights" emerge within the data that are not reflected in the xAI system or model.
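A toy version of that drift problem is sketched below: a model and its global importances are fit on one synthetic population, then applied to a shifted population where a different feature drives the outcome. All of the data, feature names, and the choice of a random forest are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
names = ["income", "debt_ratio", "age"]

# Population A: outcome driven almost entirely by income.
X_a = rng.normal(size=(1000, 3))
y_a = (X_a[:, 0] > 0).astype(int)

# Population B (a later period or a different demographic): driven by debt_ratio.
X_b = rng.normal(size=(1000, 3))
y_b = (X_b[:, 1] < 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_a, y_a)

print("global importances learned on population A:")
for name, importance in zip(names, model.feature_importances_):
    print(f"  {name}: {importance:.2f}")

print("accuracy on population A:", model.score(X_a, y_a))
print("accuracy on population B:", model.score(X_b, y_b))  # both model and explanation are stale
```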
The last critical failure mode identified is an interesting one: the anthropomorphizing of the xAI system, or of the explanations it generates. Effectively, the human viewing the output makes their own inferences about what the explanation means, based on how a human would look at the data and the explanation. This is dangerous because current theories of mind treat human capabilities, such as intuition, as non-algorithmic. Humans may therefore assume that the AI system works the same way they do, when it doesn't.
Current AI Legislation
The current state of regulation and law as it relates to xAI is still in its early stages, with various countries and organizations working on developing policies and guidelines to address the challenges posed by AI technologies. There is no globally unified regulatory framework for AI, and approaches to AI regulation vary across different regions and jurisdictions.
In the United States, AI regulation is active but still in the early stages. Many US states are beginning to form councils and task forces to look into AI. A lot of the AI focus has been around the privacy rights of individuals, particularly around consumer protection and the right to opt out of AI systems. On a federal level, President Biden’s 2023 executive order on AI remains the most relevant guidance. The US AI Safety Institute will be responsible for executing most of the policies called for in the order.
In Europe, the first sweeping AI law was agreed upon in the European Union. The EU’s AI Act grades types and uses of AI by how much risk they pose.
Specific laws that address xAI well include California's Algorithmic Accountability Act, which requires businesses to disclose how they use algorithms that impact consumers, and France's Explainable AI Law, which requires companies to make AI systems used in administrative decisions understandable to those affected.
There is also a good deal of work happening at the organizational level, with several groups and consortiums publishing guidelines and attempting to establish best practices in this space. These works span various areas of AI systems, including technology infrastructure concerns, data quality and sharing issues, project lifecycle rules and processes, and monitoring requirements.
Our Thoughts
The main issue discussed in the paper "False Sense of Security in Explainable Artificial Intelligence (xAI)" is the gap between the technical reality of xAI and its interpretation in AI regulations and policies. The authors argue that the current AI regulations and market conditions could threaten effective AI governance and safety because the objective of trustworthy, accountable, and transparent AI is intrinsically linked to the questionable ability of AI operators to provide meaningful explanations.
For us, this manifests as the "locking" of AI regulation and market trends together. In complexity terms, we would call these governing constraints: as systems form, governing constraints also form to hold the system in place. You can think of the cycle like this. Industry is working to push AI into all areas, ultimately hoping to replace human talent but settling for "decision support" when required. In order to do this, companies need to prove, to a certain degree, that they are taking a responsible approach to the use of this technology. Within the realm of AI, responsibility is ill-defined and, based on the laws currently in place, is basically in the eye of the beholder. Legislators agree that responsibility is a laudable goal, but they have not defined it either. This vacuum gives rise to a set of technologies (xAI companies, in this case) that push, via marketing, the idea that AI is explainable and that they have the technology to make it so. Since explainability can't currently be measured, there is no challenge to the status quo.
Over time, this system locks into place. Legislation mandates that we "need explainable AI". Explainable AI is "bought" by companies trying to use AI (regardless of its effectiveness). Explainable AI companies continue to market that they have the solution, even though research would suggest that explainability remains an unsolved problem. This cycle creates forces that can lead to negative outcomes. For example, recent research shows that humans tend to believe models with incorrect predictions over models with correct predictions if the incorrect model can believably explain its output. The mere act of providing an explanation lends credibility to the output.
The “locking” of regulation and industry might not be a good outcome given the elusiveness of xAI because it could lead to a false sense of security in xAI. If regulations are based on the assumption that AI systems can provide clear and accurate explanations, but in reality, these explanations are often misleading or incomplete, then the regulations may fail to achieve their intended purpose of ensuring the transparency, accountability, and safety of AI systems. This could potentially lead to misuse or misunderstanding of AI systems.
Conclusion
In conclusion, the paper “False Sense of Security in Explainable Artificial Intelligence (xAI)” underscores a pivotal issue that extends beyond the realm of AI specialists and impacts society at large. The complexities and nuances of xAI have profound implications for everyone, as AI systems increasingly permeate our daily lives.
The issue of explainability in AI is not just a technical challenge, but a societal one. It’s about ensuring transparency, accountability, and fairness in systems that have the potential to affect us all. Therefore, it’s crucial that we all participate in this conversation.
For those interested in going deeper into this topic, here are some suggested next steps:
- Read the Full Paper: Gain a comprehensive understanding of the issues discussed by reading the full paper.
- Explore Related Literature: Investigate other scholarly articles and papers on AI governance, the elusiveness of xAI, and the concept of complex adaptive systems.
- Stay Updated with AI Regulations: Keep up with updates on AI regulations from reliable sources to understand how they are evolving to address the challenges posed by xAI.
- Participate in Discussions and Forums: Join AI-related forums and discussions. Engaging with experts and enthusiasts in the field can provide valuable insights and different perspectives on the topic.
- Attend Webinars and Conferences: Participate in webinars and conferences on AI and xAI. These platforms often host experts who share their knowledge and latest research findings.
- Spread Awareness: Share your learnings about xAI with your community. Encourage others to join the conversation and contribute their perspectives.
Remember, the field of AI is rapidly evolving, and staying informed is key to understanding its implications and challenges. As we navigate this journey, your voice matters. Let’s all be part of shaping a future where AI is transparent, accountable, and beneficial for all.