What is QII?

Quantitative Input Influence (QII) is a model-agnostic technique for measuring how individual input features—or subsets of features—affect the predictions of complex, potentially opaque machine learning models. Its defining characteristic is that it treats input influence not just as a matter of feature importance scores or linear approximations, but as a question of how an input variable’s presence or value distribution changes the probability distribution of the model’s predictions. In other words, QII tries to answer, “How different would the model’s output distribution be if we altered or ‘randomized’ a particular input (or set of inputs)?”

This approach to explanation is motivated by the need for a robust, theoretically grounded framework that can handle correlated features and provide meaningful attributions of influence in a way that’s consistent and fair across different subsets of variables.


Core Concept

Most standard explanation methods attempt to measure feature influence by observing how predictions change when a feature is varied. QII generalizes this idea by considering the entire distribution of outcomes rather than just point estimates or local linear approximations.

Key Idea:

  • Take a model M and a dataset D representing the joint distribution of input features.
  • Identify a feature (or feature set) X whose influence you want to measure.
  • Consider two scenarios:
    1. Original Scenario: The model's predictions given the true joint distribution of all features, including X, as observed in D.
    2. Perturbed Scenario: A hypothetical scenario where the distribution of X is replaced with a different, “intervened” distribution—often uniform noise, marginal distributions, or some baseline distribution—while holding other variables’ distributions fixed.

By comparing the model’s output distributions between these two scenarios, QII quantifies how much influence the presence and distribution of X has on the model’s behavior. This comparison can be formalized with metrics such as the Kullback-Leibler divergence, Jensen-Shannon divergence, or total variation distance, which measure the difference between the two output distributions.
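To make this concrete, here is a minimal sketch of how such a comparison might be implemented, assuming a scikit-learn-style classifier that exposes predict_proba and a 2-D NumPy feature matrix. The names (qii_influence, shuffle_intervention, and so on) are illustrative, not part of any standard QII library, and the quantity of interest here—the average predicted class distribution over the dataset—is just one simple choice.

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two discrete probability vectors."""
    return 0.5 * float(np.abs(np.asarray(p) - np.asarray(q)).sum())

def shuffle_intervention(column, rng):
    """Default intervention: permute the column so it keeps its marginal
    distribution but loses its association with the other features."""
    return rng.permutation(column)

def output_distribution(model, X):
    """One simple quantity of interest: the model's average predicted
    class distribution over the whole dataset."""
    return model.predict_proba(X).mean(axis=0)

def qii_influence(model, X, feature_idx, intervene=shuffle_intervention,
                  distance=total_variation, n_repeats=10, seed=None):
    """Estimate a feature's influence: intervene on its column, then
    measure how far the model's output distribution moves."""
    rng = np.random.default_rng(seed)
    p_original = output_distribution(model, X)
    shifts = []
    for _ in range(n_repeats):
        X_perturbed = X.copy()
        # Replace the feature's column with an intervened version while
        # leaving every other column untouched.
        X_perturbed[:, feature_idx] = intervene(X[:, feature_idx], rng)
        shifts.append(distance(p_original, output_distribution(model, X_perturbed)))
    return float(np.mean(shifts))
```

Ranking every feature of a fitted classifier clf would then look like scores = {j: qii_influence(clf, X, j) for j in range(X.shape[1])}. Shuffling the column is only one possible intervention; alternatives are discussed further below.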


Key Properties

  1. Model-Agnostic:
    QII does not rely on internal model parameters. It only needs the ability to query the model’s predictions under various input distributions. Therefore, it can be applied to neural networks, random forests, gradient boosting machines, support vector machines, and even proprietary “black-box” models with no transparency.
  2. Distribution-Based Explanation:
    Unlike local linear approximation methods (e.g., LIME) or additive feature-attribution methods (e.g., SHAP), QII emphasizes how changes in the input distribution alter the overall distribution of outputs. This inherently captures global aspects of the model’s behavior and how the model depends on feature relationships.
  3. Handling Correlated Features:
    One of the toughest challenges in explainability is dealing with correlated features. Traditional variable-importance methods can misattribute importance when multiple features carry similar information. QII, by examining distributions and allowing for interventions, can be structured to handle conditional distributions that help disentangle these correlations. Through a careful selection of baseline distributions and subset conditioning, QII can tease apart which features truly drive outcome changes.
  4. Local and Global Interpretations:
    QII can be adapted to provide both:
    • Global Explanations: Measuring influence of features over the entire data distribution, offering insight into which inputs matter most across all predictions.
    • Local Explanations: Applying the QII framework to a single instance or small subset of data points. In such a case, the “distribution” might refer to a conditional distribution around that instance. This reveals how changing a feature for that specific case affects the model’s output probability distribution.
  5. Flexibility in Metrics:
    QII does not prescribe a single metric for comparing distributions. Researchers or practitioners can choose divergence measures (KL, JS), variation distances, or other statistical distances. This flexibility allows QII to adapt to different interpretability needs or domain-specific definitions of what constitutes a “significant difference” in predictions. A short sketch of such distance functions follows this list.
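As an illustration of the last point, below are minimal implementations of two common divergence measures, assuming both arguments are discrete probability vectors of equal length; the small epsilon smoothing in the KL divergence is an implementation convenience, not part of the metric's definition.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) with light smoothing so zero entries stay finite."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence: a symmetric, bounded alternative to KL."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Either function could be passed as the distance argument of the qii_influence sketch above in place of total variation, depending on which notion of “difference in output distributions” suits the application.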

Comparing QII to Other Methods

  • Versus Feature Importance (e.g., Gini Importance in Random Forests):
    Traditional feature importance metrics often rely on internal model heuristics. QII, in contrast, focuses on output distribution changes. This can lead to more robust explanations, especially when dealing with non-linear, highly complex models.
  • Versus LIME (Local Interpretable Model-agnostic Explanations):
    LIME fits a simple, local model around a prediction to explain it. QII, on the other hand, can assess how changing the input’s distribution affects the output, providing a more holistic view of influence that is not constrained to linear approximations.
  • Versus SHAP (Shapley Values):
    SHAP offers a theoretically principled way to distribute credit among features. QII shares some conceptual similarities, especially when analyzing subsets of features and their combined effects. However, SHAP values are computed based on conditional expectations and marginal contributions, whereas QII focuses on altering distributions and measuring resulting changes in the output distribution. QII can be considered more general but may be more computationally demanding.

Technical Considerations and Complexity

  • Computational Overhead:
    Implementing QII typically involves repeatedly sampling from distributions and re-querying the model under different input conditions. For many features or large datasets, this can become computationally expensive. Approximation methods, sampling strategies, or focusing on key subsets of features can help mitigate the computational cost.
  • Choice of Intervention Distribution:
    Selecting how to perturb the input distributions is crucial. Should you replace a feature’s distribution with uniform noise, its marginal distribution, or some domain-specific baseline? The chosen baseline greatly influences the resulting QII scores. Domain knowledge can guide these choices for more meaningful explanations. A sketch of a few interchangeable intervention strategies appears after this list.
  • Handling High-Dimensionality:
    As the number of features grows, exploring all combinations of variables (to assess their joint influence) can become impractical. Techniques to reduce complexity include focusing on a small set of candidate features, hierarchical grouping of features, or using dimension-reduction techniques before applying QII.
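To illustrate how different intervention choices can be swapped in and out, here is a small sketch of three interchangeable strategies, written to match the intervene argument of the qii_influence sketch above. The function names and the default baseline value are illustrative assumptions; as noted, the right choice is domain-specific.

```python
import numpy as np

# Each intervention takes the original feature column and a NumPy random
# generator, and returns a replacement column of the same length.

def marginal_intervention(column, rng):
    """Resample values from the feature's own empirical marginal."""
    return rng.choice(column, size=len(column), replace=True)

def uniform_intervention(column, rng):
    """Replace the feature with uniform noise over its observed range."""
    return rng.uniform(column.min(), column.max(), size=len(column))

def baseline_intervention(column, rng, baseline=0.0):
    """Clamp the feature to a fixed, domain-specific baseline value.
    (The rng argument is unused but kept for a uniform interface.)"""
    return np.full(len(column), baseline)
```

Any of these can be passed to the earlier sketch, for instance qii_influence(clf, X, j, intervene=uniform_intervention), which makes it easy to compare how sensitive the resulting influence scores are to the intervention choice.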

Use Cases

  1. Compliance and Regulatory Settings:
    In finance or healthcare, regulatory bodies often require that decisions made by AI systems be explainable. QII can provide a robust, statistically rigorous explanation by showing how changing a protected attribute (e.g., race, gender) would shift the decision distribution, thus highlighting potential bias.
  2. Model Validation in Critical Domains:
    When deploying AI in safety-critical industries (aviation, autonomous driving), QII can help engineers and stakeholders understand how sensitive the model is to variations in sensor inputs or environmental conditions.
  3. Feature Engineering and Model Improvement:
    By identifying which features drastically change the output distribution, data scientists can refine their feature engineering efforts. If a feature that should logically have minimal influence significantly alters predictions, it might indicate data leakage or a need to re-check data quality and model assumptions.

Conclusion

QII (Quantitative Input Influence) provides a powerful and flexible framework for explaining the impact of input variables on a model’s predictions. By treating explanation as a problem of comparing output distributions under different input scenarios, QII goes beyond simple feature importance metrics. Although it can be computationally more intensive and requires careful consideration of intervention distributions, QII’s rigorous approach to handling correlated features and providing both global and local insights makes it a valuable tool in the arsenal of explainable AI methodologies.