Exploring a new paper that aims to explain DNN behaviors
Recently, a great researcher from AAC Technologies, Caglar Aytekin, published a paper titled “Neural Networks are Decision Trees.” I read it carefully and tried to understand exactly what the big discovery in this paper is. As many data scientists will probably agree, many algorithms can be transformed into one another. However, (deep) neural networks (DNNs) are notoriously hard to interpret. So, did Aytekin discover something new that brings us one step closer to the explainable AI era?
In this post, let’s explore the paper and try to understand whether it is actually a new discovery, or simply an important spotlight that any data scientist should know and keep in mind when tackling the DNN interpretability challenge.
Aytekin demonstrated that any classical feedforward DNN with piece-wise linear activation functions (like ReLU) can be represented by a decision tree model. Let’s review the main difference between the two:
A DNN fits parameters to transform the input and only indirectly directs the activations of its neurons.
Decision trees explicitly fit parameters to direct the data flow.
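To make the equivalence concrete, here is a minimal sketch (my own toy example, not code from the paper, with made-up random weights): for a fixed ReLU on/off pattern, a small two-layer network collapses into a single linear map, which is exactly the fact that lets each pattern act as a tree branch.

```python
import numpy as np

# Toy two-layer ReLU network with arbitrary random weights (illustrative only).
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 2))   # first layer: 2 inputs -> 3 hidden units
W2 = rng.standard_normal((1, 3))   # second layer: 3 hidden units -> 1 output

def forward(x):
    h = W1 @ x
    a = (h > 0).astype(float)      # ReLU on/off pattern: the "decisions"
    return W2 @ (a * h), a         # a * h == ReLU(h) given this pattern

x = np.array([0.5, -1.2])
y, pattern = forward(x)

# For any input with the same activation pattern, the whole network
# reduces to one effective linear map -- the piece-wise linearity at
# the heart of the decision-tree view.
W_eff = W2 @ (np.diag(pattern) @ W1)
assert np.allclose(W_eff @ x, y)
```

Each distinct pattern of ones and zeros corresponds to one region of the input space, and hence one path through the equivalent tree.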
The motivation for this paper is to tackle the black-box nature of DNN models and offer another way to explain DNN behaviors. The work handles fully connected and convolutional networks and presents a directly equivalent decision tree representation. So, in essence, it examines the transformation from a DNN to a decision tree model: taking a sequence of weight matrices with non-linearities between them and transforming it into a new weight structure. One additional result that Aytekin discusses is the computational complexity of the equivalent decision tree, which can be advantageous in computation but costly in storage memory.
Frosst and Hinton presented in their work “Distilling a Neural Network into a Soft Decision Tree” a great approach to explaining DNNs using decision trees. However, their work differs from Aytekin’s paper: rather than deriving an exact equivalence, they combined the advantages of both DNNs and decision trees.
Building the spanning tree by computing the new weights: the suggested algorithm takes the signals that flow through the network and identifies where the ReLUs are activated and where they are not. Eventually, the algorithm (transformation) replaces each activation with a vector of ones (or the slope values) and zeros.
The algorithm runs over all the layers. For each layer, it examines the inputs arriving from the previous layer and calculates the dependency of each input. In effect, in each layer a new effective filter is selected and applied to the network input (based on the previous decisions). By doing so, a fully connected DNN can be represented as a single decision tree, where the effective matrix found by the transformations acts as the categorization rules.
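The layer-by-layer procedure above can be sketched as follows. This is my own minimal reading of the transformation with made-up weights, not the paper’s reference code: walk the layers, record each ReLU pattern as a decision, and fold the pattern into one effective matrix applied directly to the raw input.

```python
import numpy as np

# Toy fully connected ReLU network (illustrative random weights).
rng = np.random.default_rng(1)
weights = [rng.standard_normal((4, 3)),   # layer 1: 3 -> 4, followed by ReLU
           rng.standard_normal((2, 4))]   # layer 2: 4 -> 2, linear output

def effective_matrix(x, weights):
    """Return the decision path (ReLU patterns) and the effective matrix."""
    W_eff = np.eye(len(x))
    path = []
    for W in weights[:-1]:
        pre = W @ (W_eff @ x)                 # pre-activations of this layer
        pattern = (pre > 0).astype(float)     # this layer's decision / rule
        path.append(tuple(pattern))
        W_eff = np.diag(pattern) @ W @ W_eff  # fold the ReLU gating into weights
    return path, weights[-1] @ W_eff          # final linear layer on top

x = rng.standard_normal(3)
path, W_eff = effective_matrix(x, weights)

# Sanity check: the effective matrix reproduces the plain forward pass.
h = np.maximum(weights[0] @ x, 0)
assert np.allclose(W_eff @ x, weights[1] @ h)
```

The tuple of recorded patterns is the path from root to leaf; the effective matrix attached to that leaf is the categorization rule applied to the input.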
The same idea can be implemented for a convolutional layer. The main difference is that many decisions are made on partial input regions rather than on the entire input to the layer.
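A toy 1-D convolution makes the “partial regions” point concrete. This is my own construction, not from the paper: each ReLU decision is taken on a small receptive field, so one layer yields one decision per output position.

```python
import numpy as np

# Illustrative 1-D input and kernel (arbitrary random values).
rng = np.random.default_rng(2)
x = rng.standard_normal(8)        # 1-D input signal
k = rng.standard_normal(3)        # kernel of width 3

decisions = []
for i in range(len(x) - len(k) + 1):
    window = x[i:i + len(k)]                 # partial input region
    decisions.append(float(window @ k > 0))  # one ReLU decision per position

print(decisions)  # one on/off decision per receptive field, not per whole input
```

Contrast this with the fully connected case, where each decision gates the entire input at once.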
About dimensionality and computational complexity: the number of categories in the obtained decision tree appears to be huge. In a fully balanced tree, we get 2 to the power of the tree’s depth leaves, which is intractable. However, we should also remember the violating and redundant rules, which allow lossless pruning.
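A quick back-of-the-envelope calculation (my own numbers, chosen only for illustration) shows how fast this blows up before pruning:

```python
def leaf_count(depth: int) -> int:
    # Fully balanced binary tree: one leaf per ReLU on/off combination.
    return 2 ** depth

# A tiny network with 10 ReLU decisions is still manageable...
print(leaf_count(10))    # 1024

# ...but a modest MLP with, say, 192 ReLU units in total is not:
print(leaf_count(192))   # ~6.3e57 leaves -- intractable to enumerate
```

This is why lossless pruning of violating and redundant rules matters: most of those nominal leaves correspond to regions no input can ever reach.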
- The idea holds for DNNs with piece-wise linear activation functions
- The basis of this idea that neural networks are decision trees is not new
- Personally, I found the explanation and mathematical description very straightforward, and I am motivated to use it to boost the Explainable AI domain
- Someone needs to test this idea on ResNet 😊
The original paper can be found at: https://arxiv.org/pdf/2210.05189.pdf
 Aytekin, Caglar. “Neural Networks are Decision Trees.” arXiv preprint arXiv:2210.05189 (2022).
If you want to watch a 30-minute interview about the paper, look here:
 The great Yannic Kilcher interviews Alexander Mattick about this paper, on YouTube: https://www.youtube.com/watch?v=_okxGdHM5b8&ab_channel=YannicKilcher
A great paper on applying approximation theory to deep learning to study how the DNN model organizes the signals in a hierarchical fashion:
 Balestriero, Randall. “A spline theory of deep learning.” International Conference on Machine Learning. PMLR, 2018.
A great work that combines the power of both decision trees and DNNs:
 Frosst, Nicholas, and Geoffrey Hinton. “Distilling a neural network into a soft decision tree.” arXiv preprint arXiv:1711.09784 (2017).
You can read a post on Medium summarizing this work:
 Distilling a Neural Network into a soft decision tree by Razorthink Inc, Medium, 2019.
Barak Or is an Entrepreneur and AI & navigation expert; Ex-Qualcomm. Barak holds M.Sc. and B.Sc. in Engineering and B.A. in Economics from the Technion. Winner of Gemunder prize. Barak finished his Ph.D. in the fields of AI and Sensor Fusion. Author of several papers and patents. He is the founder and CEO of ALMA Tech. LTD, an AI & advanced navigation company.
Pushing Towards the Explainable AI Era: Neural Networks are Decision Trees. Republished from https://towardsdatascience.com/pushing-towards-the-explainable-ai-era-neural-networks-are-decision-trees-1603ab97eb1b