Understanding the World Through Code

Funded through the NSF Expeditions in Computing Program


You can subscribe to our public mailing list to receive announcements about upcoming events.
3/2021: Neurosym webinar series. Jacob Andreas, X Consortium Assistant Professor at MIT in EECS and CSAIL, will be talking about symbolic representation and reasoning in DNNs.

Implicit Symbolic Representation and Reasoning in Deep Neural Networks

Standard neural network architectures can *in principle* implement symbol processing operations like logical deduction and simulation of complex automata. But do current neural models, trained on standard tasks like image recognition and language understanding, learn to perform symbol manipulation *in practice*? I'll survey two recent findings about implicit symbolic behavior in deep networks. First, I will describe a procedure for automatically labeling neurons with compositional logical descriptions of their behavior. These descriptions surface interpretable learned abstractions in models for vision and language, reveal implicit logical "definitions" of visual and linguistic categories, and enable the design of simple adversarial attacks that exploit errors in definitions. Second, I'll describe ongoing work showing that neural models for language generation perform implicit simulation of entities and relations described by text. Representations in these language models can be (linearly) translated into logical representations of world state, and can be directly edited to produce predictable changes in generated output. Together, these results suggest that highly structured representations and behaviors can emerge even in relatively unstructured models trained on natural tasks. Symbolic models of computation can play a key role in helping us understand these models.
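
To make the first finding in the abstract concrete, here is a minimal sketch of how a neuron might be labeled with a compositional logical description. It is an illustration only, not the procedure from the talk: the concept names, the toy masks, the use of intersection-over-union (IoU) as the score, and the exhaustive search over formulas with at most two atoms are all assumptions.

```python
from itertools import combinations

# Hypothetical toy data: 8 inputs, indexed 0..7.
UNIVERSE = frozenset(range(8))

# Hypothetical concept annotations: the set of inputs where each concept is present.
CONCEPTS = {
    "water": frozenset({0, 1, 2, 3}),
    "blue": frozenset({0, 1, 4, 5}),
    "river": frozenset({2, 6}),
}

# Hypothetical neuron: the set of inputs on which it fires above some threshold.
NEURON_ACTIVE = frozenset({0, 1, 7})


def iou(a, b):
    """Intersection-over-union of two sets of input indices."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_formulas(concepts, universe):
    """Yield (description, mask) for atoms, negated atoms, and pairwise AND/OR."""
    atoms = {}
    for name, mask in concepts.items():
        atoms[name] = mask
        atoms[f"NOT {name}"] = universe - mask
    yield from atoms.items()
    for (n1, m1), (n2, m2) in combinations(atoms.items(), 2):
        yield f"({n1} AND {n2})", m1 & m2
        yield f"({n1} OR {n2})", m1 | m2


def best_description(neuron_active, concepts, universe):
    """Return the formula whose mask overlaps the neuron's firing set best."""
    name, mask = max(
        candidate_formulas(concepts, universe),
        key=lambda item: iou(item[1], neuron_active),
    )
    return name, iou(mask, neuron_active)


if __name__ == "__main__":
    print(best_description(NEURON_ACTIVE, CONCEPTS, UNIVERSE))
```

On this toy data the best-scoring formula is (water AND blue), with an IoU of 2/3. The one input where the neuron fires without matching the formula (index 7) is the kind of mismatch the abstract describes as an error in the learned "definition".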

Bio: Jacob Andreas is the X Consortium Assistant Professor at MIT in EECS and CSAIL. He did his PhD work at Berkeley, where he was a member of the Berkeley NLP Group and the Berkeley AI Research Lab. He has also spent time with the Cambridge NLIP Group, and the Center for Computational Learning Systems and NLP Group at Columbia.

When: Tuesday, March 23, 2021, 4-5pm EST
Where: Zoom
2/2021: Neurosym webinar series. Mayur Naik, Professor of Computer and Information Science at the University of Pennsylvania, will be talking about differentiable reasoning.

Scallop: End-to-end Differentiable Reasoning at Scale

Approaches to systematically combine symbolic reasoning with deep learning have demonstrated remarkable promise in terms of accuracy and generalizability. However, the complexity of exact probabilistic reasoning renders these methods inefficient for real-world, data-intensive machine learning applications. I will present Scallop, a scalable differentiable probabilistic Datalog engine equipped with a top-k approximate inference algorithm. The algorithm significantly reduces the amount of computation needed for inference and learning tasks without affecting their principal outcomes. To evaluate Scallop, we have crafted a challenging dataset, VQAR, comprising 4 million Visual Question Answering (VQA) instances that necessitate reasoning about real-world images with external common-sense knowledge. Scallop not only scales to these instances but also outperforms state-of-the-art neural-based approaches by 12.44%.
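
The idea of top-k approximate inference can be sketched on a toy reachability query. This is not Scallop's implementation or API: the edge facts, their probabilities, and every function name below are hypothetical, facts are assumed independent, and the sketch enumerates all proofs before truncating, which a real engine would avoid. Each proof of path(a, c) is a set of edge facts, and the query probability is the probability that at least one retained proof holds.

```python
from itertools import combinations
from math import prod

# Hypothetical probabilistic edge facts: (src, dst) -> probability the edge exists.
EDGES = {
    ("a", "b"): 0.9,
    ("b", "c"): 0.8,
    ("a", "c"): 0.5,
    ("a", "d"): 0.4,
    ("d", "c"): 0.3,
}


def proofs_of_path(src, dst, edges):
    """Enumerate each simple path src -> dst as a frozenset of edge facts."""
    results = []

    def walk(node, used, visited):
        if node == dst:
            results.append(frozenset(used))
            return
        for (u, v) in edges:
            if u == node and v not in visited:
                walk(v, used + [(u, v)], visited | {v})

    walk(src, [], {src})
    return results


def proof_prob(proof, edges):
    """Probability that every fact in one proof holds (independent facts)."""
    return prod(edges[e] for e in proof)


def top_k_proofs(proofs, edges, k):
    """Keep only the k most probable proofs."""
    return sorted(proofs, key=lambda p: proof_prob(p, edges), reverse=True)[:k]


def prob_any_proof(proofs, edges):
    """P(at least one proof holds), by inclusion-exclusion over the proofs."""
    total = 0.0
    for r in range(1, len(proofs) + 1):
        sign = 1 if r % 2 else -1
        for subset in combinations(proofs, r):
            facts = frozenset().union(*subset)
            total += sign * prod(edges[e] for e in facts)
    return total


def query_prob(src, dst, edges, k=None):
    """Exact probability when k is None, otherwise the top-k approximation."""
    proofs = proofs_of_path(src, dst, edges)
    if k is not None:
        proofs = top_k_proofs(proofs, edges, k)
    return prob_any_proof(proofs, edges)


if __name__ == "__main__":
    for k in (1, 2, None):
        print(k, round(query_prob("a", "c", EDGES, k), 4))
```

Inclusion-exclusion costs time exponential in the number of proofs, so capping that number at k bounds the work per query. Dropping proofs can only lower the computed probability, so in this sketch the top-k value is a lower bound on the exact one.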

Bio: Mayur Naik is a Professor of Computer and Information Science at the University of Pennsylvania. His research spans the area of programming languages, with a current emphasis on developing scalable techniques to reason about programs by combining machine learning and formal methods. He is also interested in foundations and applications of neuro-symbolic approaches that synergistically combine deep learning and symbolic reasoning. He received a Ph.D. in Computer Science from Stanford University in 2008. Previously, he was a researcher at Intel Labs, Berkeley from 2008 to 2011, and an assistant professor in the College of Computing at Georgia Tech from 2011 to 2016.

When: Tuesday, February 23, 2021, 4-5pm EST
Watch: Recorded Talk
1/2021: Neurosym webinar series. Jiajun Wu, Assistant Professor of Computer Science at Stanford University, will be talking about some of his work on neurosymbolic approaches to computer vision.

Understanding the Visual World Through Code

Much of our visual world is highly regular: objects are often symmetric and have repetitive parts; indoor scenes such as corridors often consist of objects organized in a repetitive layout. How can we infer and represent such regular structures from raw visual data, and later exploit them for better scene recognition, synthesis, and editing? In this talk, I will present our recent work on developing neuro-symbolic methods for scene understanding. Here, symbolic programs and neural nets play complementary roles: symbolic programs are more data-efficient to train and generalize better to new scenarios, as they robustly capture high-level structure; deep nets effectively extract complex, low-level patterns from cluttered visual data. I will demonstrate the power of such hybrid models in three different domains: 2D image editing, 3D shape modeling, and human motion understanding.

Bio: Jiajun Wu is an Assistant Professor of Computer Science at Stanford University, working on computer vision, machine learning, and computational cognitive science. Before joining Stanford, he was a Visiting Faculty Researcher at Google Research. He received his PhD in Electrical Engineering and Computer Science at Massachusetts Institute of Technology. Wu's research has been recognized through the ACM Doctoral Dissertation Award Honorable Mention, the AAAI/ACM SIGAI Doctoral Dissertation Award, the MIT George M. Sprowls PhD Thesis Award in Artificial Intelligence and Decision-Making, the 2020 Samsung AI Researcher of the Year, the IROS Best Paper Award on Cognitive Robotics, and fellowships from Facebook, Nvidia, Samsung, and Adobe.

When: Tuesday, January 26, 2021, 4-5pm EST
Watch: Recorded Talk
12/2020: Neurosym webinar series. Justin Gottschlich, Principal Scientist and the Director & Founder of Machine Programming Research at Intel Labs, will be talking about Machine Programming.

A Glance into Machine Programming @ Intel Labs

As defined by "The Three Pillars of Machine Programming", machine programming (MP) is concerned with the automation of software development. The three pillars partition MP into the following conceptual components: (i) intention, (ii) invention, and (iii) adaptation, with data being a foundational element that is generally necessary for all pillars. While the goal of MP is complete software automation – something that is likely decades away – we believe there are many seminal research opportunities waiting to be explored today across the three pillars.
In this talk, we will provide a glance into the new Pioneering Machine Programming Research effort at Intel Labs and how it has been established around the three pillars across the entire company. We will also discuss Intel Labs’ general charter for MP, as well as a few early research systems that we have built and are using today to improve the quality and rate at which we are developing software (and hardware) in production systems.

Bio: Justin Gottschlich is a Principal Scientist and the Director & Founder of Machine Programming Research at Intel Labs. He also has an academic appointment as an Adjunct Assistant Professor at the University of Pennsylvania. Justin is the Principal Investigator of the joint Intel/NSF CAPA research center, which focuses on simplifying the software programmability challenge for heterogeneous hardware. He co-founded the ACM SIGPLAN Machine Programming Symposium (previously Machine Learning and Programming Languages) and currently serves as its Steering Committee Chair. He is currently serving on two technical advisory boards: the 2020 NSF Expeditions “Understanding the World Through Code” and a new MP startup fully funded by Intel, which is currently in stealth.
Justin has a deep desire to build bridges with thought leaders across industry and academia to research disruptive technology as a community. Recently, he has been focused on machine programming, which is principally about automating software development. Justin currently has active collaborations with Amazon, Brown University, Georgia Tech, Google AI, Hebrew University, IBM Research, Microsoft Research, MIT, Penn, Stanford, UC-Berkeley, UCLA, and University of Wisconsin. He received his PhD in Computer Engineering from the University of Colorado-Boulder in 2011. Justin has 30+ peer-reviewed publications, 35+ issued patents, with 100+ patents pending.

When: Tuesday December 1, 4-5PM EST.
Watch: Recorded Talk
10/2020: Neurosym webinar series. Abhinav Verma, PhD student at UT Austin, will talk about his recent work on reinforcement learning algorithms.

Programmatic Reinforcement Learning

We study reinforcement learning algorithms that generate policies that can be represented in expressive high-level Domain Specific Languages (DSLs). This work aims to simultaneously address four fundamental drawbacks of Deep Reinforcement Learning (Deep-RL), where the policy is represented by a neural network: interpretability, verifiability, reliability, and domain awareness. We formalize a new learning paradigm and provide empirical and theoretical evidence to show that we can generate policies in expressive DSLs that do not suffer from the above shortcomings of Deep-RL. To overcome the challenges of policy search in non-differentiable program space, we introduce a meta-algorithm that is based on mirror descent, program synthesis, and imitation learning. This approach leverages neurosymbolic learning, using synthesized symbolic programs to regularize Deep-RL and using the gradients available to Deep-RL to improve the quality of synthesized programs. Overall, this approach establishes a synergistic relationship between Deep-RL and program synthesis.
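
One ingredient named in the abstract, imitation learning into a DSL, can be illustrated with a small sketch. It covers only a projection step, in which an opaque policy is approximated by the best-agreeing program from a tiny policy language; it does not implement mirror descent or any gradient update. The stand-in "neural" policy, the two-feature observation, and the one-rule DSL are all hypothetical.

```python
import random


def neural_policy(obs):
    """Hypothetical stand-in for a trained neural policy (opaque to the learner)."""
    position, velocity = obs
    return 1 if 0.7 * position + 0.05 * velocity > 0.3 else 0


def run_program(program, obs):
    """Tiny DSL: 'if obs[feature] > threshold then action else the other action'."""
    feature, threshold, action_if_above = program
    return action_if_above if obs[feature] > threshold else 1 - action_if_above


def candidate_programs(num_features):
    """Enumerate every program in the DSL over a fixed grid of thresholds."""
    thresholds = [i / 10 for i in range(-10, 11)]
    for feature in range(num_features):
        for threshold in thresholds:
            for action in (0, 1):
                yield (feature, threshold, action)


def project_to_program(oracle, states):
    """Return the program that agrees with the oracle on the most sampled states."""
    labels = [oracle(s) for s in states]

    def agreement(program):
        hits = sum(run_program(program, s) == a for s, a in zip(states, labels))
        return hits / len(states)

    best = max(candidate_programs(len(states[0])), key=agreement)
    return best, agreement(best)


# Sampled observations (position, velocity), each drawn uniformly from [-1, 1].
_rng = random.Random(0)
STATES = [(_rng.uniform(-1, 1), _rng.uniform(-1, 1)) for _ in range(2000)]

if __name__ == "__main__":
    program, score = project_to_program(neural_policy, STATES)
    print(program, round(score, 3))
```

The selected program is a single readable rule on the position feature, which is the sense in which a DSL policy is easier to inspect and verify than the network it imitates.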

Bio: Abhinav Verma is a PhD student at UT Austin where he is supervised by Swarat Chaudhuri. His research lies at the intersection of machine learning and program synthesis, with a focus on programmatically interpretable learning. He is a recipient of the 2020 JP Morgan AI Research PhD Fellowship.

When: Tuesday October 27, 4-5PM EST.
Watch: Recorded Talk
10/2020: We are having our official kickoff meeting. Some of the talks will be streamed online; see the schedule for the recordings.
9/2020: Neurosym webinar series. In the first talk in the series, Kevin Ellis — research scientist at Common Sense Machines, and soon to be faculty member at the Computer Science Department at Cornell — will talk about his recent work on growing domain specific languages.

Growing domain-specific languages alongside neural program synthesizers via wake-sleep program learning

Two challenges in engineering program synthesis systems are: (1) crafting specialized yet expressive domain specific languages, and (2) designing search algorithms that can tractably explore the space of expressions in this domain specific language. We take a step toward the joint learning of domain specific languages, and the search algorithms that perform synthesis in that language. We propose an algorithm which starts with a relatively minimal domain specific language, and then enriches that language by compressing out common syntactic patterns into a library of reusable domain specific code. In tandem, the system trains a neural network to guide search over expressions in the growing language. From a machine learning perspective, this system implements a wake-sleep algorithm similar to the Helmholtz machine. We apply this algorithm to AI and program synthesis problems, with the goal of understanding how domain specific languages and neural program synthesizers can mutually bootstrap one another.
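
The step of "compressing out common syntactic patterns" can be sketched in a much-simplified form. The code below is not the algorithm from the talk: it only counts repeated closed subexpressions and promotes the one that saves the most nodes, whereas a real library learner would also abstract over variables. The toy programs, the savings measure, and the name `square_x` are hypothetical.

```python
from collections import Counter

# Hypothetical programs in a tiny DSL, written as nested tuples (operator, arg, ...).
PROGRAMS = [
    ("add", ("mul", "x", "x"), "1"),
    ("sub", ("mul", "x", "x"), "y"),
    ("add", ("mul", "x", "x"), ("mul", "y", "y")),
]


def subtrees(expr):
    """Yield every compound subexpression of expr, including expr itself."""
    if isinstance(expr, tuple):
        yield expr
        for child in expr[1:]:
            yield from subtrees(child)


def size(expr):
    """Number of nodes (operators and leaves) in an expression."""
    if isinstance(expr, tuple):
        return 1 + sum(size(child) for child in expr[1:])
    return 1


def most_compressive_subtree(programs):
    """Pick the repeated subexpression whose reuse would save the most nodes."""
    counts = Counter(t for p in programs for t in subtrees(p))

    def savings(tree):
        # Each occurrence after the first shrinks from size(tree) nodes to one.
        return (counts[tree] - 1) * (size(tree) - 1)

    return max(counts, key=savings)


def rewrite(expr, pattern, name):
    """Replace every occurrence of pattern in expr with the library name."""
    if expr == pattern:
        return name
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(rewrite(c, pattern, name) for c in expr[1:])
    return expr


if __name__ == "__main__":
    pattern = most_compressive_subtree(PROGRAMS)
    print(pattern)
    print([rewrite(p, pattern, "square_x") for p in PROGRAMS])
```

Because this sketch matches only identical subtrees, it promotes ("mul", "x", "x") but leaves ("mul", "y", "y") untouched. A learner that abstracts over the argument would capture both with a single library routine.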

Related paper

Bio: Kevin Ellis is a research scientist at Common Sense Machines, and recently finished a PhD at MIT under Armando Solar-Lezama and Josh Tenenbaum. He works on program synthesis and artificial intelligence. He will be moving to Cornell to start as an assistant professor in the computer science department starting fall 2021.

When: Tuesday September 29, 4-5PM EST.
Watch: Recorded Talk
7/2020: Meet us at Tapia 2020. We will be present at Tapia 2020. If you are attending the (virtual) conference, come talk to us to learn more about the project and opportunities for undergraduate summer research.