Cognitive and Behavioral science
Human conceptual representations have complex and abstract structure allowing us to
deal efficiently with extremely diverse situations. Concepts are often named by
single lexical items, providing a convenient window on mental life. For instance,
take the concepts of 'tree', 'forest', and 'friend'. These are everyday concepts,
yet they are hard to capture as (even deep) classifiers or other standard machine
learning approaches. Instead, they are fuzzy, inter-related systems grounded in our
knowledge of physical things, social agents, and so on.
Modeling meaning through programming languages.
Functions in a programming language are a good model of human concepts, because
they admit complex abstraction and model systems of related meanings.
In particular, programs provide a natural means to capture three key aspects of human concepts: compositional abstraction, graded reasoning under uncertainty, and causal relations. Human concept learning can then be viewed as (probabilistic) program induction. Existing demonstrations of this approach have been hampered by lack of efficient and scalable techniques for program induction. We propose that this can be addressed by building high-level languages for common sense domains, and using new program learning tools to induce concepts within these domains. The first part of this task amounts to constructing a "standard library" for core knowledge and the second part to concept learning by probabilistic program induction.
Understanding behavior.
Behavior is arguably the most complex phenotype one could analyze in life and
cognitive sciences. For instance, what is a concise theory that explains courtship in fruit
flies? Such questions are pervasive in the life sciences where fine-grained behavior
of model organisms (e.g., fruit flies or macaques) are being collected at an
unprecedented scale. The study of dynamic behavior such as courtship is an ideal
testbed for our research agenda. On the one hand, scientists want to discover
short programs that provide an interpretable explanation of courtship.
On the other hand, the raw data is high dimensional in both time and space, and it
also contains significant variability in the behavior of interest, thus requiring deep
learning to process raw data and instantiate modules within the program.