PMID- 37585609
OWN - NLM
STAT- MEDLINE
DCOM- 20230829
LR  - 20230829
IS  - 1549-960X (Electronic)
IS  - 1549-9596 (Linking)
VI  - 63
IP  - 16
DP  - 2023 Aug 28
TI  - pBRICS: A Novel Fragmentation Method for Explainable Property Prediction of Drug-Like Small Molecules.
PG  - 5066-5076
LID - 10.1021/acs.jcim.3c00689 [doi]
AB  - Generative artificial intelligence algorithms have been shown to be successful in exploring large chemical spaces and designing novel and diverse molecules. There has been considerable interest in developing predictive models using artificial intelligence for drug-like properties, which can potentially reduce the late-stage attrition of drug candidates or predict the properties of novel AI-designed molecules. Concurrently, it is important to understand the contribution of functional groups toward these properties and modify them to obtain property-optimized lead compounds. As a result, there is an increasing interest in the development of explainable property prediction models. However, current explainable approaches are mostly atom-based, where, often, only a fraction of a fragment is shown to be significant. To address the above challenges, we have developed a novel domain-aware molecular fragmentation approach termed post-processing of BRICS (pBRICS), which can fragment small molecules into their functional groups. Multitask models were developed to predict various properties, including the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. The fragment importance was explained using the gradient-weighted class activation mapping (Grad-CAM) approach. The method was validated on data sets of experimentally available matched molecular pairs (MMPs). The explanations from the model can be useful for medicinal chemists to identify the fragments responsible for poor drug-like properties and optimize the molecule. The explainability approach was also used to identify the reason behind false positive and false negative MMP predictions. Based on evidence from the existing literature and our analysis, some of these mispredictions were justified. We propose that the quantity, quality, and diversity of the training data will improve the accuracy of property prediction algorithms for novel molecules.
FAU - Vangala, Sarveswara Rao
AU  - Vangala SR
AD  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad 500081, India.
FAU - Krishnan, Sowmya Ramaswamy
AU  - Krishnan SR
AUID- ORCID: 0000-0001-5404-3266
AD  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad 500081, India.
FAU - Bung, Navneet
AU  - Bung N
AUID- ORCID: 0000-0002-6376-277X
AD  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad 500081, India.
FAU - Srinivasan, Rajgopal
AU  - Srinivasan R
AD  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad 500081, India.
FAU - Roy, Arijit
AU  - Roy A
AUID- ORCID: 0000-0002-1961-2483
AD  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad 500081, India.
LA  - eng
PT  - Journal Article
DEP - 20230816
PL  - United States
TA  - J Chem Inf Model
JT  - Journal of chemical information and modeling
JID - 101230060
SB  - IM
MH  - *Artificial Intelligence
MH  - *Algorithms
EDAT- 2023/08/16 18:42
MHDA- 2023/08/29 12:42
CRDT- 2023/08/16 14:33
PHST- 2023/08/29 12:42 [medline]
PHST- 2023/08/16 18:42 [pubmed]
PHST- 2023/08/16 14:33 [entrez]
AID - 10.1021/acs.jcim.3c00689 [doi]
PST - ppublish
SO  - J Chem Inf Model. 2023 Aug 28;63(16):5066-5076. doi: 10.1021/acs.jcim.3c00689. Epub 2023 Aug 16.