Using
Mantis-ML is a framework for the identification of novel gene-disease associations. To dive into the results of running Mantis-ML across the human phenome, simply enter a disease or gene name into the search box.
Mantis-ML models the biological context of a disease using two inputs: (i) annotation terms, such as signalling pathways, biological processes and tissues known to be implicated in the disease of interest, and (ii) an input gene list of positive examples, that is, genes confirmed to be associated with the disease. From these, Mantis-ML learns the broader biological context underlying the input gene list, identifies the features those genes share, and uncovers novel, under-reported genes that are likely to be associated with the disease in a similar way to the input genes. Additional details of the methodology are provided in the About page.
A schematic of the methodology is shown above. The user first chooses a disease from the thousands present across multiple resources. Natural language processing techniques are then used to identify features that are relevant to that disease. Using these features, Mantis-ML is trained on genes known to be associated with the disease to predict novel disease-gene associations. Phenome-wide Mantis-ML provides a fully streamlined, automated and scalable workflow for compiling and deploying Mantis-ML across hundreds of diseases/phenotypes, aggregating the results from each individual run, and calculating summary statistics.
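The training idea described above, learning from a list of known disease genes and then scoring every other gene, can be illustrated with a toy sketch. This is not the Mantis-ML implementation: the nearest-centroid scorer, the repeated balanced sampling of unlabelled genes, the function name `pu_scores`, the gene names and the feature values are all assumptions made for illustration, and scoring is done in-sample for brevity. The actual classifiers and features are described in the About page and the companion paper.

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_scores(features, positives, n_rounds=20, seed=0):
    """Score every gene by similarity to the known (positive) genes.

    features:  dict mapping gene name -> list of numeric features (hypothetical)
    positives: set of genes already known to be associated with the disease
    Each round draws a random, balanced set of unlabelled genes as provisional
    negatives, and the per-round scores are averaged.
    """
    rng = random.Random(seed)
    unlabelled = sorted(g for g in features if g not in positives)
    pos_centroid = centroid([features[g] for g in sorted(positives)])
    totals = {g: 0.0 for g in features}
    for _ in range(n_rounds):
        sample = rng.sample(unlabelled, min(len(positives), len(unlabelled)))
        neg_centroid = centroid([features[g] for g in sample])
        for gene, x in features.items():
            d_pos = sq_dist(x, pos_centroid)
            d_neg = sq_dist(x, neg_centroid)
            # Closer to the positive centroid gives a score nearer to 1.
            totals[gene] += d_neg / (d_pos + d_neg) if d_pos + d_neg else 0.5
    return {g: t / n_rounds for g, t in totals.items()}


# Toy data: two features per gene, values invented for this example.
toy_features = {
    "GENE_A": [0.9, 0.8], "GENE_B": [0.8, 0.9],   # known disease genes
    "GENE_C": [0.85, 0.85],                       # unlabelled, resembles them
    "GENE_D": [0.1, 0.2], "GENE_E": [0.2, 0.1], "GENE_F": [0.15, 0.1],
}
known_genes = {"GENE_A", "GENE_B"}

scores = pu_scores(toy_features, known_genes)
ranked_novel = sorted(
    (g for g in scores if g not in known_genes), key=scores.get, reverse=True
)
print(ranked_novel)
```

In this toy setting the unlabelled gene whose features resemble the known genes is ranked first, which mirrors the goal of surfacing under-reported genes that share features with the input gene list.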
For each disease, Mantis-ML reports the following:
- Genes: For a given disease, Mantis-ML outputs an association score between the disease and each gene in the human genome.
- Gene Ontology (GO) enrichment: For the most strongly associated genes, GO enrichments are performed to check the biological relevance of the resulting predictions.
- Overlap with PheWAS: Extensive support is provided to compare the predicted associations to the gold standard resulting from a PheWAS collapsing analysis.
- Feature importances: It is also possible to explore which of the features selected by Phenome-wide Mantis-ML contributed most to the predicted association scores.
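As an illustration of how a set of top-scoring genes might be compared against a gold-standard gene set such as PheWAS hits, the sketch below applies a one-sided hypergeometric test to their overlap. Using this particular test here is an assumption, as are the function names, the gene names and the scores; it is not taken from the Mantis-ML code.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def overlap_pvalue(scores, gold_standard, top_n):
    """Overlap between the top_n scored genes and a gold-standard gene set.

    scores:        dict mapping gene -> association score (hypothetical values)
    gold_standard: set of genes from an independent analysis, e.g. PheWAS hits
    Returns the overlap size and its one-sided hypergeometric p-value.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:top_n])
    hits = gold_standard & set(scores)  # ignore genes that were never scored
    k = len(top & hits)
    return k, hypergeom_sf(k, len(scores), len(hits), top_n)


# Toy example: ten genes with decreasing scores; three are gold-standard hits.
toy_scores = {f"G{i}": 1.0 - 0.1 * i for i in range(10)}
toy_hits = {"G0", "G1", "G2"}

k, p = overlap_pvalue(toy_scores, toy_hits, top_n=3)
print(k, p)
```

A small p-value indicates that the top-ranked genes overlap the gold-standard set more than expected by chance. The same kind of test is commonly used for gene-set enrichment in general, although the exact procedure used on this site may differ.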
This web resource is provided as a companion to the paper:
Phenome-wide identification of therapeutic genetic targets, leveraging knowledge graphs, graph neural networks, and UK Biobank data
Middleton L, Melas I, Vasavda C, Raies A, Rozemberczki B, Dhindsa RS, Dhindsa JS, Weido B, Wang Q, Harper AR, Edwards G, Petrovski S, Vitsios D*
Science Advances, 10, eadj1424 (2024). doi:10.1126/sciadv.adj1424
*Corresponding author.
Phenome-wide Mantis-ML provides a rich resource for exploring gene-phenotype associations. Scores are calculated for 18,626 protein-coding genes and over 5,000 diseases from three key resources. To collaborate on accessing or generating Mantis-ML results for other phenotypes, please get in touch.
diseases from the HPO resource
phenotypes in the UK Biobank included for enrichment
features integrated + the BIKG knowledge graph (8.7M edges)