PMID- 24959206
OWN - NLM
STAT- PubMed-not-MEDLINE
DCOM- 20140624
LR  - 20211021
IS  - 1758-2946 (Print)
IS  - 1758-2946 (Electronic)
IS  - 1758-2946 (Linking)
VI  - 6
DP  - 2014
TI  - Self organising hypothesis networks: a new approach for representing and
      structuring SAR knowledge.
PG  - 21
LID - 10.1186/1758-2946-6-21 [doi]
AB  - BACKGROUND: Combining different sources of knowledge to build improved
      structure activity relationship models is not easy owing to the variety of
      knowledge formats and the absence of a common framework to interoperate
      between learning techniques. Most of the current approaches address this
      problem by using consensus models that operate at the prediction level. We
      explore the possibility to directly combine these sources at the knowledge
      level, with the aim to harvest potentially increased synergy at an earlier
      stage. Our goal is to design a general methodology to facilitate knowledge
      discovery and produce accurate and interpretable models. RESULTS: To combine
      models at the knowledge level, we propose to decouple the learning phase from
      the knowledge application phase using a pivot representation (lingua franca)
      based on the concept of hypothesis. A hypothesis is a simple and
      interpretable knowledge unit. Regardless of its origin, knowledge is broken
      down into a collection of hypotheses. These hypotheses are subsequently
      organised into a hierarchical network. This unification makes it possible to
      combine different sources of knowledge into a common formalised framework.
      The approach allows us to create a synergistic system between different forms
      of knowledge and new algorithms can be applied to leverage this unified
      model. This first article focuses on the general principle of the Self
      Organising Hypothesis Network (SOHN) approach in the context of binary
      classification problems along with an illustrative application to the
      prediction of mutagenicity. CONCLUSION: It is possible to represent knowledge
      in the unified form of a hypothesis network allowing interpretable
      predictions with performances comparable to mainstream machine learning
      techniques. This new approach offers the potential to combine knowledge from
      different sources into a common framework in which high level reasoning and
      meta-learning can be applied; these latter perspectives will be explored in
      future work.
FAU - Hanser, Thierry
AU  - Hanser T
AD  - Lhasa Limited, Leeds, UK.
FAU - Barber, Chris
AU  - Barber C
AD  - Lhasa Limited, Leeds, UK.
FAU - Rosser, Edward
AU  - Rosser E
AD  - Lhasa Limited, Leeds, UK.
FAU - Vessey, Jonathan D
AU  - Vessey JD
AD  - Lhasa Limited, Leeds, UK.
FAU - Webb, Samuel J
AU  - Webb SJ
AD  - Lhasa Limited, Leeds, UK.
FAU - Werner, Stephane
AU  - Werner S
AD  - Lhasa Limited, Leeds, UK.
LA  - eng
PT  - Journal Article
DEP - 20140508
PL  - England
TA  - J Cheminform
JT  - Journal of cheminformatics
JID - 101516718
PMC - PMC4048587
OTO - NOTNLM
OT  - Confidence metric
OT  - Data mining
OT  - Hypothesis Network
OT  - Interpretable model
OT  - Knowledge discovery
OT  - Machine learning
OT  - QSAR
OT  - SAR
OT  - SOHN
EDAT- 2014/06/25 06:00
MHDA- 2014/06/25 06:01
PMCR- 2014/05/08
CRDT- 2014/06/25 06:00
PHST- 2013/11/29 00:00 [received]
PHST- 2014/03/28 00:00 [accepted]
PHST- 2014/06/25 06:00 [entrez]
PHST- 2014/06/25 06:00 [pubmed]
PHST- 2014/06/25 06:01 [medline]
PHST- 2014/05/08 00:00 [pmc-release]
AID - 1758-2946-6-21 [pii]
AID - 10.1186/1758-2946-6-21 [doi]
PST - epublish
SO  - J Cheminform. 2014 May 8;6:21. doi: 10.1186/1758-2946-6-21. eCollection 2014.