Resources a bioinformatician working on cancer should be aware of
Updated: Mar 27
Here we catalog six essential resources for bioinformatics research in cancer.
cBioPortal: Pre-processed cancer omics data
Processed genotype and expression data from hundreds of studies.
Data across studies are uniformly structured.
All pertinent clinical information is available.
To our knowledge, this is by far the best resource for cancer informatics researchers.
It includes consortium-scale datasets such as TCGA.
Here is the link: https://www.cbioportal.org/.
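Note that cBioPortal presents the variant calls as-is from the respective study. If the study sequenced matched normal samples, the somatic variant calls can generally be trusted. If it did not, the calls may include germline variants, so check each study's design before relying on its calls.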
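Because studies are uniformly structured, the study catalog can also be searched programmatically. Below is a minimal sketch, assuming the public REST endpoint `https://www.cbioportal.org/api/studies` returns a JSON list of studies with `studyId` and `name` fields. The helper names `fetch_studies` and `filter_studies` are hypothetical and introduced only for illustration.

```python
import json
import urllib.request

# Assumption: the public cBioPortal REST API lists studies at this endpoint.
STUDIES_URL = "https://www.cbioportal.org/api/studies"


def fetch_studies(url=STUDIES_URL, timeout=30):
    """Download the study list as a list of dicts (requires network access)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def filter_studies(studies, keyword):
    """Return (studyId, name) pairs whose name contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        (s.get("studyId"), s.get("name"))
        for s in studies
        if needle in (s.get("name") or "").lower()
    ]
```

Calling `filter_studies(fetch_studies(), "glioma")` would then list the studies whose name mentions glioma. The filtering is kept separate from the download so it can be tried on any locally saved study list.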
gnomAD: Processed exome and genome sequence data
125,748 exomes and 15,708 genomes from various disease-specific and population genetic studies.
The data are powerful because the consortium used joint variant calling across samples.
Helpful for filtering out germline polymorphisms when looking for somatic variants.
Here is the link: https://gnomad.broadinstitute.org/about
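One way to control for polymorphisms is to drop candidate somatic calls that are common in a population resource such as gnomAD. The sketch below assumes the population allele frequency has already been looked up for each variant. The `Variant` record, the `drop_common_polymorphisms` helper, the 0.001 cutoff, and the example coordinates are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    population_af: Optional[float]  # None means absent from the population resource


def drop_common_polymorphisms(variants, max_af=0.001):
    """Keep variants that are absent from the population resource or rarer than max_af."""
    return [v for v in variants if v.population_af is None or v.population_af < max_af]


# Made-up example calls, not real annotations.
calls = [
    Variant("chr17", 7_000_001, "C", "T", None),    # not seen in the population
    Variant("chr1", 1_000_000, "A", "G", 0.12),     # common polymorphism
    Variant("chr2", 2_000_000, "G", "A", 0.00002),  # very rare
]

kept = drop_common_polymorphisms(calls)
print([v.chrom for v in kept])  # ['chr17', 'chr2']
```

The right cutoff depends on the cancer type and cohort, so treat `max_af` as a parameter to tune rather than a fixed rule.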
COSMIC: Catalogue Of Somatic Mutations In Cancer
Literature-curated, trustworthy somatic variants in cancer.
Here is the link: https://cancer.sanger.ac.uk/cosmic
GEPIA: Gene Expression Profiling Interactive Analysis
One-stop shop for cancer (from TCGA) and normal (from GTEx) gene expression data.
The server supports many ways of comparing gene expression between cancer and healthy tissues.
Here is the link: http://gepia.cancer-pku.cn/
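To make the tumor-versus-normal comparison concrete, here is a minimal sketch of one common summary: the difference in median log2(TPM + 1) between the two groups. This is an illustrative assumption, not a statement of how the server computes its results. The function name `log2_fold_change` is hypothetical and the TPM values are made up.

```python
import math
from statistics import median


def log2_fold_change(tumor_tpm, normal_tpm):
    """Median log2(TPM + 1) in tumor minus median log2(TPM + 1) in normal."""
    tumor = [math.log2(x + 1) for x in tumor_tpm]
    normal = [math.log2(x + 1) for x in normal_tpm]
    return median(tumor) - median(normal)


# Made-up TPM values for a single gene, for illustration only.
tumor_tpm = [15.0, 31.0, 63.0]
normal_tpm = [3.0, 7.0, 15.0]

lfc = log2_fold_change(tumor_tpm, normal_tpm)
print(lfc)  # 2.0
```

A positive value means higher expression in tumor samples. A real analysis would add a statistical test and multiple-testing correction across genes.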
Cancer Dependency Map
Combines genome-wide loss-of-function screens (CRISPR and shRNA), omics data from established cancer cell lines, and drug sensitivity data.
An excellent resource for AI-based precision oncology research.
Here is the link: https://depmap.org/portal/
CancerSCEM: Cancer Single-cell Expression Map
Single-cell RNA-seq data compiled from 28 projects.
20 cancer types
600,000+ cells
Here is the link: https://ngdc.cncb.ac.cn/cancerscem/