Neurosymbolic Reading Group

We are organizing a reading group to discuss papers related to the project. The goal of the reading group is to help us understand the state of the art in the field. We currently plan to meet on Wednesdays from 5:00 to 6:00 PM EST, but we prioritize meeting with authors, so the meeting time sometimes changes. We are still planning the schedule for the rest of the semester, but the next few meetings are:

2023-11-01: 5:00-6:00 PM EST. TBD
2023-11-08: 5:00-6:00 PM EST. Daniel Fried will present TBD
2023-11-15: 5:00-6:00 PM EST. Ruocheng and Linlu will present Hypothesis Search: Inductive Reasoning with Language Models and Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement

Please contact Alex Gu (gua+nrg@mit.edu) for information on how to join the reading group. If you would like to present, please fill out this form.

Previously Covered Papers and Notes.

Neurosymbolic Programming in Mental Health

Read on November 1, 2023

Speakers: Morgan Talbot (MIT) and Omar Costilla-Reyes (MIT)
Abstract: Our overburdened mental healthcare system has a compelling need for new evidence-based approaches to diagnosis, prognosis, and treatment. Machine learning in mental health research offers vast potential for future clinical applications, but comes with significant challenges. Promising research directions include digital phenotyping, a process of leveraging data from personal digital devices to predict mental states, network analysis to quantify relations among symptoms or other factors, and counterfactual analysis to elucidate causal relationships that drive mental health and to identify appropriate treatments for individual patients. Key obstacles in the field include limited availability of large-scale datasets, noise and missingness in longitudinal datasets from patients' smartphones and symptom self-reports, and the extraordinary complexity of the many inter-related processes in patients' lives that affect their mental health. We will explore a range of research questions and the challenges we have encountered in addressing them, with a goal of advancing towards the application of advanced techniques such as program synthesis and neurosymbolic programming in mental health research. Slides can be found here.

Code Llama: Open foundation models for code

Read on October 25, 2023

We discussed the paper Code Llama: Open foundation models for code roziere2023code with author Baptiste Rozière. The paper presents a set of open source foundation models for code. Slides can be found here.

Differentiable Tree Operations Promote Compositional Generalization

Read on October 18, 2023

We discussed the paper Differentiable Tree Operations Promote Compositional Generalization soulos2023differentiable with author Paul Soulos. The paper describes a technique for representing trees as vectors and performing differentiable operations on them, rather than flattening them into sequences. Slides can be found here.

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

Read on October 11, 2023

We discussed the paper WizardCoder: Empowering Code Large Language Models with Evol-Instruct luo2023wizardcoder with the author Ziyang Luo. The paper describes a technique that empowers Code LLMs with complex instruction fine-tuning, by adapting the Evol-Instruct method to the domain of code. Slides can be found here.

Evidence of Meaning in Language Models Trained on Programs

Read on October 4, 2023

We discussed the paper Evidence of Meaning in Language Models Trained on Programs jin2023evidence with the author Charles Jin. The paper describes a causal framework for understanding the intermediate states of a language model as it is executed on data, and how this can be correlated with semantic features of the program. Slides can be found here.

Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks

Read on September 13, 2023

We discussed the paper Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks wu2023reasoning with the author Zhaofeng Wu. The paper describes an analysis where "counterfactual" tasks are used instead of familiar ones, to determine how much the model is generalizing. Slides can be found here.
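
As a rough illustration of what a "counterfactual" task can look like (this is our own sketch, not code from the paper), consider addition in a non-default base. The helper names `to_digits` and `add_in_base` are hypothetical. The same digit strings have different correct answers depending on the base, so a model that has only memorized base-10 addition will fail the variant.

```python
def to_digits(n: int, base: int) -> str:
    """Render a non-negative integer as a digit string in the given base (2 to 10)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % base))
        n //= base
    return "".join(reversed(digits))


def add_in_base(a: str, b: str, base: int) -> str:
    """Add two digit strings, reading the inputs and writing the answer in `base`."""
    return to_digits(int(a, base) + int(b, base), base)


# Familiar task: the digit strings are read in base 10.
default_answer = add_in_base("27", "62", 10)

# Counterfactual variant: the same digit strings, read in base 9.
counterfactual_answer = add_in_base("27", "62", 9)

print(default_answer, counterfactual_answer)
```

Comparing a model's accuracy on the two settings gives a simple measure of how much it relies on the familiar condition.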

Neurosym Introduction

Read on September 6, 2023

Omar Costilla-Reyes gave a talk kicking off the reading group for the fall. Slides can be found here.

Language Model Query Language

Read on May 15, 2023

We discussed the paper Prompting Is Programming: A Query Language for Large Language Models beurer2022prompting with the authors Luca Beurer-Kellner and Marc Fischer. The paper describes a new programming language that is used to perform structured inference with LLMs. Slides can be found here.

Interpretability Discussion

Read on May 8, 2023

Omar Costilla-Reyes and Atharva Seghal led a discussion on interpretability. Slides can be found here.

Inducing causal structure for interpretable neural networks

Read on April 24, 2023

We discussed the paper Inducing causal structure for interpretable neural networks geiger2022inducing with the authors Atticus Geiger and Zhengxuan Wu. The paper describes a process by which do-notation-style interchange interventions are performed on a neural network to align certain neurons with certain states in a known causal model of the process being modeled.
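
A minimal sketch of a single interchange intervention, assuming a toy hand-written "network" rather than a trained one (the function `run` and its `patch` argument are hypothetical names for illustration). The hidden value from a run on a source input is swapped into a run on a base input, and the result is compared with what the causal model predicts under the same swap. The paper uses this kind of comparison as a training signal; the sketch only shows the intervention itself.

```python
def run(inputs, patch=None):
    """Toy 'network' computing (a + b) + c with one hidden value h = a + b.

    If `patch` is given, it overrides h, mimicking an intervention
    on an intermediate activation.
    """
    a, b, c = inputs
    h = a + b
    if patch is not None:
        h = patch
    return h, h + c


base = (1, 2, 3)
source = (10, 20, 30)

# Step 1: record the hidden value on the source input.
source_hidden, _ = run(source)

# Step 2: rerun on the base input with the hidden value swapped in.
_, patched_output = run(base, patch=source_hidden)

# Causal model: S = a + b, output = S + c.
# Under the same intervention, S comes from the source and c from the base.
expected = (source[0] + source[1]) + base[2]

print(patched_output, expected)
```

Here the two values agree because the hidden value was written to play the role of S; in a trained network that agreement is what the alignment procedure tries to enforce.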

Programmatically Grounded, Compositionally Generalizable Robotic Manipulation

Read on April 17, 2023

We discussed the paper Programmatically Grounded, Compositionally Generalizable Robotic Manipulation wangprogrammatically with the authors Jiayuan Mao and Ren Wang. The paper describes a modular approach to vision-to-actuation robotics pipelines that uses the syntactic and semantic structure of language instructions.

Scallop: A Language for Neurosymbolic Programming

Read on April 10, 2023

We discussed the Scallop paper li2023scallop, which is upcoming at PLDI 2023, with authors Ziyang Li and Jiani Huang. The paper describes a Prolog-like language for neurosymbolic programming that allows a user to write a logical program as the second half of a pipeline that takes as input the outputs of a neural network. The logical program is then treated as manipulating the distributions over the outputs of the neural network, and thus is differentiable, allowing for end-to-end training of the pipeline. Notes can be found here.
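
The idea of a logical rule acting on distributions can be sketched in plain Python. This is not Scallop syntax and does not reproduce its semantics; it is a hand-rolled illustration that assumes two independent digit classifiers and the single rule "sum = a + b". The function name `sum_distribution` and the example probabilities are made up.

```python
from collections import defaultdict


def sum_distribution(p_a, p_b):
    """Push two independent digit distributions through the rule sum = a + b.

    p_a[i] and p_b[j] are the (hypothetical) classifier probabilities that
    the first digit is i and the second digit is j.
    """
    out = defaultdict(float)
    for a, prob_a in enumerate(p_a):
        for b, prob_b in enumerate(p_b):
            # Each way of deriving a given sum contributes prob_a * prob_b.
            out[a + b] += prob_a * prob_b
    return dict(out)


# Made-up classifier outputs over the digits {0, 1, 2}.
p_a = [0.0, 0.9, 0.1]
p_b = [0.2, 0.0, 0.8]

dist = sum_distribution(p_a, p_b)
print(dist)
```

Every output probability is a sum of products of input probabilities, so it is a polynomial in the network outputs. That is why a loss on the predicted sum can, in principle, be backpropagated to the classifiers when the same computation is written in an autodiff framework.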

From word models to world models: Translating from natural language to the probabilistic language of thought

Read on April 3, 2023

Gabe Grand gave a presentation on an upcoming position paper on the topic of solving reasoning tasks by having a model learn to translate from natural language to the probabilistic language of thought. Specifically, an LLM is prompted with a world model and then used to translate conditions, queries, and even new definitions into Church statements, allowing the use of sampling procedures to infer the answers to queries. This approach outperforms LLMs trying to directly reason without the use of the symbolic component.
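
To make the "condition, then query" pattern concrete, here is a small sketch in Python rather than Church, using exact enumeration in place of sampling. The world model (player strengths and the win probability rule) is invented for illustration and is not taken from the position paper. The condition is "Alice beat Bob" and the query is "is Alice stronger than Bob?".

```python
from fractions import Fraction
from itertools import product

STRENGTHS = (1, 2, 3)  # assumed prior: each player's strength is uniform over these


def posterior_alice_stronger():
    """Return P(strength_alice > strength_bob | Alice beat Bob) by enumeration."""
    evidence = Fraction(0)  # P(Alice beat Bob)
    joint = Fraction(0)     # P(Alice stronger and Alice beat Bob)
    for s_alice, s_bob in product(STRENGTHS, repeat=2):
        prior = Fraction(1, len(STRENGTHS) ** 2)
        # Assumed likelihood: a player wins in proportion to their strength.
        p_alice_wins = Fraction(s_alice, s_alice + s_bob)
        weight = prior * p_alice_wins
        evidence += weight
        if s_alice > s_bob:
            joint += weight
    return joint / evidence


prior_alice_stronger = Fraction(1, 3)
posterior = posterior_alice_stronger()
print(prior_alice_stronger, posterior)
```

In the approach described above, the LLM's job would be to produce the symbolic condition and query from natural language sentences; the inference step is then left to the probabilistic program.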

Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions and LEVER: Learning to Verify Language-to-Code Generation with Execution

Read on March 13, 2023

We discussed these papers nilearning ni2023lever with the author Ansong Ni. These papers describe an approach to learn mathematical solutions from a dataset of prefix programs and an algorithm to improve the output of code generators via a separate reranking network trained to look at their execution results. A video of the presentation can be found here.
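
The reranking step can be sketched as follows. This is a simplified illustration, not the authors' implementation: the candidate programs, the generator probabilities, and the verifier probabilities are all hard-coded placeholders, whereas in practice the verifier score would come from a trained model that sees the execution result. Candidates that produce the same execution result pool their scores.

```python
from collections import defaultdict

# Hypothetical candidates: (program, generator probability, verifier probability).
candidates = [
    (lambda: 6 * 7, 0.30, 0.90),
    (lambda: 7 * 6, 0.25, 0.85),
    (lambda: 6 + 7, 0.45, 0.10),
]


def rerank(cands):
    """Score each execution result by summing generator_prob * verifier_prob
    over all candidate programs that produce it, then pick the best result."""
    scores = defaultdict(float)
    for program, lm_prob, verifier_prob in cands:
        result = program()  # execute the candidate
        scores[result] += lm_prob * verifier_prob
    best = max(scores, key=scores.get)
    return best, dict(scores)


best_result, scores = rerank(candidates)
print(best_result, scores)
```

Note that the third candidate has the highest generator probability on its own, but loses after the verifier score and the pooling over equivalent results are taken into account.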

Productivity Assessment of Neural Code Completion

Read on March 6, 2023

We discussed this paper ziegler2022productivity with the author Shawn Simister. This paper discusses aspects of the deployment of Copilot as well as metrics of success in improving practical productivity. Notes can be found here.

Looped Transformers as Programmable Computers

Read on February 27, 2023

We discussed this paper giannou2023looped with the author Dimitris Papailiopoulos. This paper demonstrates an algorithm to convert programs into transformers, highlighting the extent to which this can be accomplished with a small number of transformer layers. Notes will be posted shortly.

Planning with Large Language Models for Code Generation

Read on February 13, 2023

We discussed this paper zhang2022planning with the authors Shun Zhang and Zhenfang Chen. This paper proposes an algorithm for sampling more effectively from code transformers, in which prefixes that lead to better programs are weighted more heavily. Our notes can be found here.

Parsel: A Unified Natural Language Framework for Algorithmic Reasoning

Read on February 6, 2023

We discussed Parsel zelikman2022parsel with the authors. This paper proposes a framework enabling automatic implementation and validation of complex algorithms with code LLMs, using hierarchical function descriptions as an intermediate language. It is able to outperform Codex and AlphaCode on the APPS dataset. Our notes can be found here.

Binding language models in symbolic languages

Read on January 31, 2023

We discussed the new Binder cheng2022binding technique with the authors. This paper proposes Binder, a training-free neural-symbolic framework that maps the task input to a program, which (1) allows binding a unified API of language model (LM) functionalities to a programming language (e.g., SQL, Python) to extend its grammar coverage and thus tackle more diverse questions, (2) adopts an LM as both the program parser and the underlying model called by the API during execution, and (3) requires only a few in-context exemplar annotations. Our notes can be found here.

Learning Differentiable Programs with Admissible Neural Heuristics

Read on January 24, 2023

We discussed NEAR shah2020learning. This paper provides a method to learn differentiable functions expressed as programs in a domain-specific language by relaxing programs into differentiable neural networks. Our notes can be found here.

Understanding the World Through Code

Funded through the NSF Expeditions in Computing Program