Core Concepts
The XNB classifier combines Kernel Density Estimation with class-specific feature selection to match the classification performance of traditional Naïve Bayes while substantially improving model interpretability, particularly on high-dimensional datasets such as genomic data.
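The core idea, replacing the Gaussian class-conditional densities of classic Naïve Bayes with per-feature KDEs, can be sketched as below. This is a minimal illustration, not the authors' XNB implementation; the class name `KDENaiveBayes` and all internals are assumptions, and the class-specific feature selection step of XNB is omitted here.

```python
import numpy as np
from scipy.stats import gaussian_kde

class KDENaiveBayes:
    """Sketch of a Naïve Bayes variant that estimates each class-conditional
    feature density with a univariate Gaussian KDE (hypothetical, simplified)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = {}
        self.kdes_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            self.priors_[c] = len(Xc) / len(X)
            # One univariate KDE per feature per class: the "naive"
            # conditional-independence assumption is kept.
            self.kdes_[c] = [gaussian_kde(Xc[:, j]) for j in range(X.shape[1])]
        return self

    def predict(self, X):
        scores = []
        for c in self.classes_:
            # Sum of per-feature log-likelihoods plus log prior.
            log_lik = np.sum(
                [np.log(kde(X[:, j]) + 1e-12) for j, kde in enumerate(self.kdes_[c])],
                axis=0,
            )
            scores.append(np.log(self.priors_[c]) + log_lik)
        return self.classes_[np.argmax(np.vstack(scores), axis=0)]

# Toy usage on two well-separated Gaussian clusters (synthetic data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = KDENaiveBayes().fit(X, y)
acc = (model.predict(X) == y).mean()
```

Because each feature's contribution is a one-dimensional density, the model's evidence for a prediction can be inspected feature by feature, which is the interpretability angle the paper exploits.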
Stats
The Shapiro–Wilk normality test on eighteen datasets with a mean of 42,109 variables showed that, on average, 57% of the variables rejected the null hypothesis of normality, i.e., more than half of the variables do not follow a normal distribution.
A conditional independence test on the same datasets showed that the percentages of variables conditionally dependent on another variable given the class range from 28% to 95%, with an average of 58%, revealing a strong conditional dependency among variables.
XNB retained an average of only 8.3 variables in the classification model, a remarkable average reduction of the feature-space dimensionality of about 99.98%.
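A per-variable normality screen like the one reported above can be reproduced with `scipy.stats.shapiro`; this is a small sketch on synthetic data (the variables and the 0.05 threshold are illustrative assumptions, not the paper's exact protocol).

```python
import numpy as np
from scipy.stats import shapiro

# Synthetic data: rows = samples, columns = variables.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(size=200),       # variable drawn from a normal distribution
    rng.exponential(size=200),  # clearly non-normal variable
])

# Fraction of variables rejecting the null hypothesis of normality.
alpha = 0.05
rejected = [shapiro(X[:, j]).pvalue < alpha for j in range(X.shape[1])]
frac_rejected = sum(rejected) / len(rejected)
```

Running the same loop over each dataset's variables and averaging `frac_rejected` yields the kind of aggregate percentage (57%) reported in the study.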
Quotes
"Explainable means providing details about how the model works in order to better understand why a particular decision was made."
"In the broad field of Artificial Intelligence (AI) there has recently emerged a sudden and remarkable interest in understanding how the model makes decisions in the sense that humans can interpret the knowledge contained in the model [5], named Explainable AI (XAI)."
"This research addresses two important issues: a) How relevant is each variable for each class?; b) How to improve the posterior probability estimate, given the class?"