One of the central problems in organic chemistry is engineering
molecules with particularly desirable properties, or at the very least
predicting the properties that a particular molecule will have.
In recent years, deep learning has made tremendous strides on both
of these problems. Modern techniques developed by the team of co-PIs
Barzilay and Jaakkola have demonstrated remarkable abilities in both property
prediction and molecular optimization. However, deep learning approaches also
have some important limitations. First, they are extremely data-intensive, limiting
their application to domains where very large amounts of data are available.
Second, they operate as black boxes; from a scientific standpoint, we would like
to extract the specific substructures or functional group descriptions that caused
a particular molecule to score highly on the desired property.
The promise of neurosymbolic models is that they will be able to
better incorporate expert knowledge, which would make them useful in
settings where data is difficult to gather.