Phenome-wide Mantis-ML

Using




Getting started


Mantis-ML is a framework for the identification of novel gene-disease associations. To dive into the results of running Mantis-ML across the human phenome, simply enter a disease or gene name into the search box.



Overview


Mantis-ML models the biological context of a disease from two inputs: (i) annotation terms, such as signalling pathways, biological processes and tissues known to be implicated in the disease of interest, and (ii) an input gene list of positive examples, i.e. genes confirmed to be associated with the disease. Mantis-ML models the broader biological context underlying the input gene list, identifies the features those genes share, and uncovers novel, under-reported genes that are likely to be associated with the disease in a similar way to the input genes. Additional details of the methodology are provided in the About page.


A schematic of the methodology is shown above. The user first chooses a disease from the thousands available across multiple resources. Natural language processing techniques are then used to identify features relevant to that disease. Using these features, Mantis-ML is trained on genes known to be associated with the disease to predict novel gene-disease associations. Phenome-wide Mantis-ML provides a fully streamlined, automated and scalable workflow for compiling and deploying Mantis-ML across hundreds of diseases/phenotypes, aggregating the results of each individual run, and calculating summary statistics.
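The idea of learning from known disease genes to rank the remaining genes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy two-column feature table, the balanced random sampling of unlabelled genes as provisional negatives, and the nearest-centroid scorer, which stands in for whatever classifiers Mantis-ML actually uses. It is not the Mantis-ML implementation.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_unlabelled_genes(features, seed_genes, n_rounds=50, rng_seed=0):
    """Rank non-seed genes by similarity to the seed (positive) genes.

    Hypothetical helper: in each round a balanced random subset of the
    unlabelled genes is treated as negatives, and only the unlabelled genes
    left out of that subset are scored. Scores are averaged over rounds.
    """
    rng = random.Random(rng_seed)
    positives = [g for g in features if g in seed_genes]
    unlabelled = [g for g in features if g not in seed_genes]
    pos_centre = centroid([features[g] for g in positives])
    totals = {g: 0.0 for g in unlabelled}
    counts = {g: 0 for g in unlabelled}
    for _ in range(n_rounds):
        negatives = set(rng.sample(unlabelled, min(len(positives), len(unlabelled))))
        neg_centre = centroid([features[g] for g in negatives])
        for gene in unlabelled:
            if gene in negatives:
                continue  # never score a gene with a model that used it for training
            d_pos = sq_dist(features[gene], pos_centre)
            d_neg = sq_dist(features[gene], neg_centre)
            # Score in [0, 1]; higher means closer to the known disease genes.
            score = 0.5 if d_pos + d_neg == 0 else d_neg / (d_pos + d_neg)
            totals[gene] += score
            counts[gene] += 1
    averaged = {g: totals[g] / counts[g] for g in unlabelled if counts[g] > 0}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Toy feature table (made-up values): A and B are the known disease genes.
toy_features = {
    "A": [1.0, 0.9], "B": [0.9, 1.0], "C": [0.95, 0.95],
    "W": [0.1, 0.0], "X": [0.0, 0.1], "Y": [0.1, 0.1], "Z": [0.0, 0.0],
}
ranking = rank_unlabelled_genes(toy_features, seed_genes={"A", "B"})
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy table, gene C shares its feature profile with the seed genes, so it is placed at the top of the ranking. Scoring only the genes left out of each round's training subset keeps the averaged scores out-of-sample.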



Summary of the results


For each disease, Mantis-ML reports the following:

- Genes: For a given disease, Mantis-ML outputs an association score between the disease and each gene in the human genome.

- Gene Ontology (GO) enrichment: For the most strongly associated genes, GO enrichments are performed to check the biological relevance of the resulting predictions.

- Overlap with PheWAS: Extensive support is provided to compare the predicted associations to the gold standard resulting from a PheWAS collapsing analysis.

- Feature importances: It is also possible to explore which of the features selected by Phenome-wide Mantis-ML contributed most to the predicted association scores.
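As an illustration of the PheWAS comparison listed above, one common way to quantify the overlap between top-ranked genes and a set of PheWAS-significant genes is an upper-tail hypergeometric test. The sketch below assumes that approach; the function names, the choice of test and the toy gene identifiers are hypothetical and are not taken from the resource itself.

```python
from math import comb


def overlap_pvalue(n_genes, n_top, n_hits, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_genes: size of the gene universe
    n_top: number of top-ranked genes selected
    n_hits: number of PheWAS-significant genes in the universe
    n_overlap: number of genes found in both sets
    """
    total = comb(n_genes, n_top)
    tail = sum(
        comb(n_hits, k) * comb(n_genes - n_hits, n_top - k)
        for k in range(n_overlap, min(n_top, n_hits) + 1)
    )
    return tail / total


def phewas_overlap(ranked_genes, phewas_hits, top_n):
    """Return the shared genes and the enrichment p-value for the top_n genes.

    Assumes ranked_genes holds unique gene identifiers, best score first.
    """
    universe = set(ranked_genes)
    top = set(ranked_genes[:top_n])
    hits = set(phewas_hits) & universe  # ignore hits outside the ranked universe
    shared = top & hits
    p_value = overlap_pvalue(len(universe), len(top), len(hits), len(shared))
    return sorted(shared), p_value


# Toy example with made-up gene identifiers.
toy_ranking = [f"G{i}" for i in range(10)]
shared, p_value = phewas_overlap(toy_ranking, ["G0", "G1", "G2"], top_n=3)
print(shared, p_value)
```

For a real run, the gene universe would be the full set of scored genes (18,626 protein-coding genes in this resource). Exact integer binomial coefficients keep the calculation stable at that size.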




Citation

This web resource is provided as a companion to the paper:

Phenome-wide identification of therapeutic genetic targets, leveraging knowledge graphs, graph neural networks, and UK Biobank data
Middleton L, Melas I, Vasavda C, Raies A, Rozemberczki B, Dhindsa RS, Dhindsa JS, Weido B, Wang Q, Harper AR, Edwards G, Petrovski S, Vitsios D*
Science Advances, 10, eadj1424 (2024). doi:10.1126/sciadv.adj1424
*Corresponding author.



Statistics

Phenome-wide Mantis-ML provides a rich resource for exploring gene-phenotype associations. Scores are calculated for 18,626 protein-coding genes and over 5,000 diseases from three key resources. To collaborate on accessing or generating Mantis-ML results for other phenotypes, please get in touch.

diseases from the HPO resource

phenotypes in the UK Biobank included for enrichment

features integrated + the BIKG knowledge graph (8.7M edges)