
Computation over compositionally structured representations is central to linguistic processing and cognition. Classical models readily account for this capacity through discrete symbolic representations that can be combined into complex representations with constituent structure. By contrast, it has long been argued that connectionist models face a dilemma: either they merely implement a classical symbolic architecture and thus lack independent explanatory value, or they rely on fundamentally unstructured representations and thus cannot exhibit the systematic, structure-sensitive properties of language and thought.
Here, Raphaël argues that recent evidence from deep learning exposes this dilemma as a false dichotomy. Behavioral evidence shows that modern deep neural networks can exhibit human-like compositional generalization without relying on discrete constituents or explicit symbolic rules built into their architecture. In addition, mechanistic evidence suggests that this behavior is enabled by content-independent computations over representations with constituent structure. Importantly, however, Raphaël argues that modern deep neural networks do not merely implement a classical architecture. Instead of concatenating discrete symbolic representations through classical variable binding, they can compose distributed (vector-based) representations through a form of fuzzy variable binding enabled by attention mechanisms in the Transformer architecture. Raphaël will offer both theoretical and empirical support for this hypothesis, and suggest that it goes a long way towards explaining the remarkable performance of language models.
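To make the attention mechanism referenced above concrete, here is a minimal sketch of scaled dot-product attention, the core operation of the Transformer. It is purely illustrative (variable names and dimensions are arbitrary), but it shows the relevant property: each query retrieves a soft, graded mixture of value vectors rather than a single discrete symbol, which is one way to picture "fuzzy" variable binding over distributed representations.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: similarity scores between queries
    # and keys are turned into a probability distribution, and the
    # output is a weighted blend of values -- a graded (not all-or-
    # nothing) retrieval over distributed representations.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))  # 2 query tokens, dimension 4
K = rng.standard_normal((3, 4))  # 3 key tokens
V = rng.standard_normal((3, 4))  # 3 value vectors
out, w = attention(Q, K, V)      # out: (2, 4); w rows sum to 1
```

Note that the attention weights are continuous: a query typically binds partially to several keys at once, in contrast with the discrete, exclusive bindings of classical symbolic architectures.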
The implications are threefold. First, deep neural networks need not be uninterpretable black boxes; their behavior can be explained by compositional operations over structured representations. Second, classical models are not the only route to systematicity; neural networks can learn to behave systematically from data. Third, this non-classical approach to compositionality has a number of characteristics that make it increasingly attractive not just as an engineering project, but also as an empirically plausible model of linguistic processing and cognition in humans.
Speakers
- Raphaël Millière
Contact
- Michael Barnes