Scaffold-Induced Molecular Subgraphs (SIMSG): Effective Graph Sampling Methods for High-Throughput Computational Drug Discovery
Time: Friday, 13 November 2020, 12:15pm - 12:30pm EDT
Description: Historically, drugs have been discovered serendipitously, based on active chemicals known to interact with and bind to protein targets. Combinatorial chemistry has enabled an explosion of diverse chemical libraries that potentially have drug-like properties; however, the main issue still facing the drug discovery community is the need to efficiently navigate high-dimensional chemical space to identify viable molecules that can target proteins of interest. Ongoing pandemics such as the novel coronavirus disease 2019 (COVID-19) further emphasize the need for effective methods to sample these chemical spaces and quickly identify effective drugs against the virus. Similar needs are emerging for other diseases such as cancer, where the cancer type and the intrinsic heterogeneity of gene expression within tumor cells pose similar challenges. Current techniques assume that enumerating large compound libraries, coupled with GPU-accelerated machine learning (ML) models, will be an effective tool for screening these datasets. However, these techniques face an inherent limitation: inference or physics-based simulations become computational bottlenecks when spanning large chemical spaces (beyond 10^12 molecules). To overcome this limitation, we propose a graph-based structure of chemical space, as opposed to a static library of compounds. By embracing this inherent structure of chemical space for small-molecule design, we present an enhanced sampling technique that exploits random walk theory and the intrinsic relationships between chemical "scaffolds" for ultra-high-throughput docking studies.
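To make the idea of walking a scaffold graph concrete, here is a minimal sketch of a simple random walk over a toy graph, written in Python with only the standard library. It is an illustration under stated assumptions and not the SIMSG method: the graph, its scaffold labels, and the function names `random_walk` and `sample_scaffolds` are all hypothetical, and the abstract does not specify how edges between scaffolds are defined or how walks are weighted.

```python
import random
from collections import Counter

# Hypothetical toy scaffold graph (an assumption for illustration only).
# Nodes are scaffold labels; an undirected edge links two related scaffolds.
SCAFFOLD_GRAPH = {
    "benzene": ["pyridine", "naphthalene", "indole"],
    "pyridine": ["benzene", "pyrimidine", "quinoline"],
    "pyrimidine": ["pyridine"],
    "naphthalene": ["benzene", "quinoline"],
    "quinoline": ["pyridine", "naphthalene", "indole"],
    "indole": ["benzene", "quinoline"],
}


def random_walk(graph, start, n_steps, rng):
    """Return the nodes visited by a simple (unweighted) random walk."""
    path = [start]
    current = start
    for _ in range(n_steps):
        # Move to a uniformly chosen neighbor of the current scaffold.
        current = rng.choice(graph[current])
        path.append(current)
    return path


def sample_scaffolds(graph, start, n_steps, seed=0):
    """Count how often each scaffold is visited during one seeded walk."""
    rng = random.Random(seed)
    return Counter(random_walk(graph, start, n_steps, rng))


if __name__ == "__main__":
    visits = sample_scaffolds(SCAFFOLD_GRAPH, "benzene", n_steps=1000, seed=42)
    for scaffold, count in visits.most_common():
        print(f"{scaffold}: {count}")
```

In a sketch like this, the scaffolds visited by the walk would be the candidates whose member molecules are passed on to docking, so only a sampled part of the space is evaluated instead of an enumerated library. On an undirected graph, a long unweighted walk visits well-connected scaffolds more often; whether that bias is desirable is a design choice the abstract leaves open.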