Neural Networks and Denotation

Eric E. Allen

We introduce a framework for reasoning about what meaning is captured by the neurons in a trained neural network. We provide a strategy for discovering meaning by training a second model (an observer model) to classify the state of the model it observes (an object model) in relation to attributes of the underlying dataset. We implement and evaluate observer models on a specific set of classification problems, use heat maps to visualize the relevance of each component of an object model to a linear observer model, and use these visualizations to extract insights about how neural networks identify salient characteristics of their inputs. We identify important properties captured decisively in trained neural networks; some of these properties are denoted by individual neurons. Finally, we observe that the label proportion of a property denoted by a neuron depends on the depth of that neuron within the network; we analyze these dependencies and provide an interpretation of them.
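
The observer/object relationship described above can be illustrated with a minimal sketch. This is not the paper's implementation: the toy dataset, the attribute (sign of the first input coordinate), the hand-fixed weights standing in for a trained object model, and all variable names are assumptions made for illustration. The observer here is a plain logistic regression fitted to the object model's hidden activations, and the absolute values of its weights serve as a rough per-neuron relevance score of the kind a heat map might display.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset (assumption): 400 two-dimensional inputs. The attribute to
# recover is whether the first coordinate is positive.
X = rng.normal(size=(400, 2))
attribute = (X[:, 0] > 0).astype(float)

# Stand-in object model (assumption): a single ReLU layer with hand-fixed
# weights instead of a trained network. Neurons 0 and 1 respond to the
# first coordinate, neuron 2 only to the second.
W_obj = np.array([[1.0, -1.0, 0.0],
                  [0.0,  0.0, 1.0]])
hidden = np.maximum(X @ W_obj, 0.0)  # object-model state, shape (400, 3)

# Linear observer: logistic regression on the hidden activations,
# fitted with plain batch gradient descent.
w = np.zeros(hidden.shape[1])
b = 0.0
lr = 0.5
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(hidden @ w + b)))  # predicted probability
    err = p - attribute                          # gradient of log loss w.r.t. logits
    w -= lr * (hidden.T @ err) / len(attribute)
    b -= lr * err.mean()

# Accuracy is measured on the same data the observer was fitted to.
predictions = (hidden @ w + b) > 0
accuracy = (predictions == attribute.astype(bool)).mean()

# Per-neuron relevance: magnitude of each observer weight.
relevance = np.abs(w)

print("observer accuracy:", accuracy)
print("per-neuron relevance:", relevance)
```

In this sketch the observer should lean on neurons 0 and 1, which carry the attribute, and assign comparatively little weight to neuron 2. With a real trained object model, the relevance vector would be computed per layer and rendered as a heat map rather than printed.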
