
Computation over compositionally structured representations is central to linguistic processing and cognition. Classical models readily account for this capacity through discrete symbolic representations that can be combined into complex representations with constituent structure. By contrast, it has long been argued that connectionist models face a dilemma: either they merely implement a classical symbolic architecture and thus lack independent explanatory value, or they rely on fundamentally unstructured representations and thus cannot exhibit the systematic, structure-sensitive properties of language and thought.
Here, Raphaël argues that recent evidence from deep learning exposes this dilemma as a false dichotomy. Behavioral evidence shows that modern deep neural networks can exhibit human-like compositional generalization without relying on discrete constituents or explicit symbolic rules built into their architecture. In addition, mechanistic evidence suggests that this behavior is enabled by content-independent computations over representations with constituent structure. Importantly, however, Raphaël argues that modern deep neural networks do not merely implement a classical architecture. Instead of concatenating discrete symbolic representations through classical variable binding, they can compose distributed (vector-based) representations through a form of fuzzy variable binding enabled by attention mechanisms in the Transformer architecture. Raphaël will offer both theoretical and empirical support for this hypothesis, and suggest that it goes a long way towards explaining the remarkable performance of language models.
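To make the attention mechanism referenced above concrete, here is a minimal sketch of scaled dot-product attention, the core operation of the Transformer. It is purely illustrative (variable names and dimensions are arbitrary), but it shows the relevant property: each query retrieves a soft, graded mixture of value vectors rather than a single discrete symbol, which is one way to picture "fuzzy" variable binding over distributed representations.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: similarity scores between queries
    # and keys are turned into a probability distribution, and the
    # output is a weighted blend of values -- a graded (not all-or-
    # nothing) retrieval over distributed representations.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))  # 2 query tokens, dimension 4
K = rng.standard_normal((3, 4))  # 3 key tokens
V = rng.standard_normal((3, 4))  # 3 value vectors
out, w = attention(Q, K, V)      # out: (2, 4); w rows sum to 1
```

Note that the attention weights are continuous: a query typically binds partially to several keys at once, in contrast with the discrete, exclusive bindings of classical symbolic architectures.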
The implications are threefold. First, deep neural networks need not be uninterpretable black boxes; their behavior can be explained by compositional operations over structured representations. Second, classical models are not the only route to systematicity; neural networks can learn to behave systematically from data. Third, this non-classical approach to compositionality has a number of characteristics that make it increasingly attractive not just as an engineering project, but also as an empirically plausible model of linguistic processing and cognition in humans.
Speakers
- Raphaël Millière
Contact
- Michael Barnes