High-throughput technologies now allow scientists to measure the DNA, RNA, and protein content of any given sample, but measuring all of the small molecules in that same sample is still essentially impossible. Remarkably, this gap is computational rather than experimental. Mass spectrometry-based metabolomics can acquire rich data for thousands of small molecules in a single sample, but the resulting data are so complex that the vast majority of these molecules currently remain unidentified. My work seeks to develop artificial intelligence technologies to decode the unidentified metabolites embedded within metabolomic datasets. I envision a future in which powerful AI approaches allow scientists to routinely enumerate the complete set of known and unknown small molecules detected in any metabolomic experiment, and ultimately to reveal, in high throughput, the as-yet-unknown small molecules driving human health and disease.
Fellow
