The research group of the Bioinformatics Institute aims to solve fundamental and applied problems in bioinformatics and systems biology.
Our research group currently focuses on using large-scale genetic datasets to solve complex basic and applied problems in systems biology and medicine. We run our own research projects and participate in research efforts of other labs.

Alexander Predeus, Ph.D. - Scientific advisor
Yury Barbitoff - Research director
Anton Changalidi - Associate bioinformatician
Rostislav Skitchenko - Associate bioinformatician

Our team is working in three main directions:
Dissection of the genetic architecture of human complex traits and diseases.
We are mainly interested in integrating genome-wide association data at the phenome scale to identify key genes and molecular pathways involved in multiple phenotypes. This work uses openly available UK Biobank and FinnGen genetic data.
Investigation of the evolutionary aspects of human genome variation.
We utilize large-scale data sources such as the Genome Aggregation Database (gnomAD) to study mechanisms of mutation and selective pressure across protein-coding genes.

Application of NGS bioinformatics to clinical genetics.
We collaborate with several research groups and private genetic laboratories to develop tools for interactive analysis of clinical exome sequencing data. We are also working to aggregate genetic data from multiple sources across Russia to create a population exome variation reference for both research and clinical needs.
If you have any questions or would like to discuss a potential collaboration, please contact us at
  1. Identification of Genetic Risk Factors of Severe COVID-19 Using Extensive Phenotypic Data: A Proof-of-Concept Study in a Cohort of Russian Patients. Genes (Basel), 2022. doi: 10.3390/genes13030534
  2. RNA sequencing of whole blood defines the signature of high intensity exercise at altitude in elite speed skaters. Genes (Basel), 2022. doi: 10.3390/genes13040574
  3. Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery. BMC Genomics, 2022. doi: 10.1186/s12864-022-08365-3
  4. Development of SNP Set for the Marker-Assisted Selection of Guar (Cyamopsis tetragonoloba (L.) Taub.) Based on a Custom Reference Genome Assembly. Plants, 2021. doi: 10.3390/plants10102063
  5. Chromosome-level genome assembly and structural variant analysis of two laboratory yeast strains from the Peterhof Genetic Collection lineage. G3, 2021. doi: 10.1093/g3journal/jkab029
  6. Identification of novel variants in the LDLR gene in Russian patients with familial hypercholesterolemia using targeted sequencing. Biomed Rep, 2021. doi: 10.3892/br.2020.1391
  7. Analysis of the Spectrum of ACE2 Variation Suggests a Possible Influence of Rare and Common Variants on Susceptibility to COVID-19 and Severity of Outcome. Front Genet, 2020. doi: 10.3389/fgene.2020.551220
  8. Harnessing population-specific protein truncating variants to improve the annotation of loss-of-function alleles. bioRxiv, 2020. doi: 10.1101/2020.08.17.254904
  9. A data-driven review of the genetic factors of pregnancy complications. Int J Mol Sci, 2020. doi: 10.3390/ijms21093384
  10. Systematic dissection of biases in whole-genome and whole-exome sequencing reveals major determinants of coding sequence coverage. Sci Rep, 2020. doi: 10.1038/s41598-020-59026-y
  11. Phenome-wide functional dissection of pleiotropic effects highlights key molecular pathways for human complex traits. Sci Rep, 2020. doi: 10.1038/s41598-020-58040-4
  12. The spectrum of pathogenic variants of the ATP7B gene in Wilson disease in the Russian Federation. J Trace Elem Med Biol, 2019. doi: 10.1016/j.jtemb.2019.126420
  13. Whole-exome sequencing in Russian children with non-type 1 diabetes mellitus reveals a wide spectrum of genetic variants in MODY-related and unrelated genes. Mol Med Rep, 2019. doi: 10.3892/mmr.2019.10751
  14. Whole-exome sequencing provides insights into monogenic disease prevalence in Northwest Russia. Mol Genet Genomic Med, 2019. doi: 10.1002/mgg3.964
  15. Recent advances and perspectives in next generation sequencing application to the genetic research of type 2 diabetes. World J Diabetes, 2019. doi: 10.4239/wjd.v10.i7.376
  16. Identification of Novel Candidate Markers of Type 2 Diabetes and Obesity in Russia by Exome Sequencing with a Limited Sample Size. Genes (Basel), 2018. doi: 10.3390/genes9080415
  17. Catching hidden variation: systematic correction of reference minor allele annotation in clinical variant calling. Genet Med, 2017. doi: 10.1038/gim.2017.168
  18. Effect of gene-lifestyle interaction on gestational diabetes risk. Oncotarget, 2017. doi: 10.18632/oncotarget.22999
All of the code pertinent to our active and past projects can be found on our GitHub page.

We develop and maintain several software products that address a variety of issues arising when working with genomic and transcriptomic data.

The following software and databases are developed or maintained by our group:

GeneQuery - a web server for transcriptome-based hypothesis generation. It was developed by Alexander Predeus under the supervision of Maxim Artyomov and later improved by Ivan Arbuzov during his M.S. thesis work under the supervision of Alexander Predeus.

RMAHunter - a web-based tool to systematically analyze and correct reference minor alleles in variant calling data. The tool was developed by Yury Barbitoff and Igor Bezdvornykh under the supervision of Alexander Predeus.

LSEA - a command-line tool for gene set enrichment analysis of GWAS summary statistics. The tool was developed by Anton Shikov and Yury Barbitoff.

SICER 2.0 - a re-implementation of the SICER algorithm for broad peak calling in appropriate ChIP-Seq experiments. It was developed at different times by Yegor Prikaziuk, Dmitrii Krasheninnikov, Igor Bezdvornykh, and Evgenii Bakin under the supervision of Alexander Predeus.

We also develop several tools for automating whole-exome sequencing data analysis and interpretation for CerbaLab Ltd.
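
To illustrate the kind of problem RMAHunter addresses: a reference minor allele is a site where the reference genome carries the allele that is less common in the population, so the alternative allele has a frequency above 0.5. The following is a minimal sketch of flagging such sites, not RMAHunter's actual code. The `Variant` record, the `find_reference_minor_alleles` function, the `threshold` parameter, and the example coordinates and frequencies are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not RMAHunter's data model)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_freq: float  # population frequency of the alternative allele


def find_reference_minor_alleles(
    variants: Iterable[Variant], threshold: float = 0.5
) -> List[Variant]:
    """Return sites where the alternative allele is the more common one.

    At such sites the reference genome carries the minor allele, so a
    sample homozygous for the reference allele has the less common
    genotype, although a single-sample call set typically lists no
    variant at that position.
    """
    return [v for v in variants if v.alt_freq > threshold]


# Made-up example data, for illustration only.
example = [
    Variant("1", 1000, "A", "G", 0.97),
    Variant("1", 2000, "C", "T", 0.02),
    Variant("2", 3000, "G", "A", 0.61),
]

rma_sites = find_reference_minor_alleles(example)
for v in rma_sites:
    print(f"{v.chrom}:{v.pos} {v.ref}>{v.alt} alt_freq={v.alt_freq}")
```

In this sketch the threshold is a plain parameter, so a stricter cutoff (for example 0.9) can be used to keep only sites where the reference allele is rare.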