In almost every field of science, it is now possible to capture large amounts of data. This has led machine learning to play an increasingly important role in scientific discovery, for example by sifting through large amounts of data to identify interesting events. But modern machine learning techniques are less well suited to the critical tasks of devising hypotheses consistent with the data or imagining new experiments to test those hypotheses.
To drive this research, we have selected four domains where we believe these techniques can have significant impact: organic chemistry, RNA splicing, cognitive science / behavioral modeling, and computing systems. Machine learning is already demonstrating value in these domains, for example in predicting properties of organic compounds, recognizing complex social activities, and modeling CPU performance. However, our proposed techniques could have a transformative impact in all of these domains by helping scientists move from black-box predictions to a deeper understanding of the processes that give rise to the data.
This material is based upon work supported by the National Science Foundation under Grant No. 1918839. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.