Deep neural networks contain individual neurons that specialize in recognizing particular semantic, structural, or perceptual aspects of their inputs. In computer vision, techniques exist for finding neurons that respond to individual concept categories such as object classes, textures, and colors. However, the reach of these techniques is limited: they label only a tiny subset of the neurons and behaviors in any network.
A new study proposes labeling neurons with open-ended, compositional, and expressive natural language descriptions. The method can serve practical purposes, including auditing models for sensitivity to demographic attributes and finding and mitigating the effects of spurious text-feature correlations. The experiments could prove significant for the field, as they show that fine-grained, automatic annotation of deep network models is both achievable and useful.
To address this problem, the researchers seek descriptions that capture a neuron's pattern of activity on input images in the visual domain. They gathered a new dataset of fine-grained image annotations, then used these annotations to build learned approximations of the underlying image and description distributions.
Mutual-Information-guided Linguistic Annotation of Neurons (MILAN) is a procedure that automatically labels neurons with open-ended, compositional natural language descriptions.
MILAN describes a neuron by finding the natural language string that maximizes pointwise mutual information with the image regions in which the neuron is active. The resulting descriptions are fine-grained characterizations of learned features that capture categorical, relational, and logical structure.
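The selection step above can be sketched as an argmax over candidate strings of the pointwise mutual information, log p(description | active regions) − log p(description). This is a minimal sketch, not the paper's implementation: the two probability models are passed in as hypothetical callables standing in for MILAN's learned approximations.

```python
import math

def pmi_score(description, regions, p_desc_given_regions, p_desc):
    """Pointwise mutual information between a candidate description and the
    image regions where the neuron is active.

    p_desc_given_regions and p_desc are assumed to be learned models
    supplied by the caller (hypothetical stand-ins, not the paper's API).
    """
    return (math.log(p_desc_given_regions(description, regions))
            - math.log(p_desc(description)))

def milan_label(candidates, regions, p_desc_given_regions, p_desc):
    # Pick the candidate string that maximizes PMI with the active regions.
    return max(candidates,
               key=lambda d: pmi_score(d, regions,
                                       p_desc_given_regions, p_desc))
```

Note how the −log p(description) term penalizes generic strings: a description that is likely regardless of the regions scores lower than one that is specifically likely given them.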
These descriptions align with human-generated feature descriptions across many model architectures and tasks, and they help in understanding and controlling trained models. Three applications of natural language neuron descriptions are highlighted.
First, the researchers used MILAN to characterize the distribution and importance of neurons in vision models that are selective for attribute, category, and relational information. Second, they employed MILAN for auditing, uncovering neurons sensitive to protected attributes such as race and gender, even in models trained on datasets designed to obscure these characteristics.
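Once every neuron carries a text description, this kind of audit can be reduced to scanning those descriptions. The sketch below is an illustrative keyword scan, not the paper's procedure; the term list and the `audit_descriptions` helper are assumptions for the example.

```python
# Illustrative keyword list for the sketch; a real audit would use a
# richer notion of protected-attribute mentions.
PROTECTED_TERMS = {"man", "men", "woman", "women", "male", "female",
                   "race", "skin"}

def audit_descriptions(neuron_descriptions):
    """Return units whose description mentions a protected-attribute term.

    neuron_descriptions: {unit_index: natural language description}.
    """
    flagged = {}
    for unit, desc in neuron_descriptions.items():
        words = set(desc.lower().split())
        if words & PROTECTED_TERMS:
            flagged[unit] = desc
    return flagged
```

The point of the example is the workflow: natural language labels turn a model-internals question ("does any unit encode gender?") into a text-search question.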
Finally, the team used MILAN to improve the robustness of an image classifier by removing neurons that are sensitive to text features unrelated to the class labels.
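Mechanically, "removing" such neurons amounts to ablating them, i.e. zeroing their activations so downstream layers no longer see the spurious text signal. A minimal, framework-free sketch (the function name and the flagged indices are illustrative, not from the paper):

```python
def ablate_units(activations, flagged_units):
    """Zero out flagged units in a layer's activations.

    activations: batch of per-example activation lists (batch x units).
    flagged_units: indices identified (e.g. via their MILAN descriptions)
    as text-selective; the indices used here are purely illustrative.
    """
    flagged = set(flagged_units)
    return [[0.0 if j in flagged else a for j, a in enumerate(row)]
            for row in activations]
```

In a real network the same effect is typically achieved with a forward hook or a binary mask on the layer's output, so the rest of the model runs unchanged.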