Before running the code, load the required packages:
library(TCGAbiolinks)
library(SummarizedExperiment)
TCGAbiolinks retrieves molecular subtype information for TCGA samples. The functions PanCancerAtlas_subtypes and TCGAquery_subtype each return a table of molecular subtype information.
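A molecular subtype here is a classification of tumor samples derived from one type of molecular data, such as mRNA expression, miRNA expression, or DNA methylation.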
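The PanCancerAtlas_subtypes function gives access to a curated subtype table retrieved from Synapse (probably with the most up-to-date molecular subtypes), whereas the TCGAquery_subtype function gives access to the complete subtype tables, including sample information, retrieved from the TCGA marker papers.
PanCancerAtlas_subtypes: curated molecular subtypes. The data and their description are retrieved from Synapse; see https://www.synapse.org/#!Synapse:syn8402849
Synapse has published a single file containing all available molecular subtypes described by TCGA, covering all tumor types and all molecular platforms. This information can be obtained with the PanCancerAtlas_subtypes function: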
subtypes <- PanCancerAtlas_subtypes()
DT::datatable(subtypes,
filter = 'top',
options = list(scrollX = TRUE, keys = TRUE, pageLength = 5),
rownames = FALSE)
pan.samplesID | cancer.type | Subtype_mRNA | Subtype_DNAmeth | Subtype_protein | Subtype_miRNA | Subtype_CNA | Subtype_Integrative | Subtype_other | Subtype_Selected |
---|---|---|---|---|---|---|---|---|---|
TCGA-OR-A5J1 | ACC | steroid-phenotype-high+proliferation | CIMP-high | | miRNA_1 | Quiet | COC3 | C1A | ACC.CIMP-high |
TCGA-OR-A5J2 | ACC | steroid-phenotype-high+proliferation | CIMP-low | 1 | miRNA_1 | Noisy | COC3 | C1A | ACC.CIMP-low |
TCGA-OR-A5J3 | ACC | steroid-phenotype-high | CIMP-intermediate | 3 | miRNA_6 | Chromosomal | COC2 | C1A | ACC.CIMP-intermediate |
TCGA-OR-A5J4 | ACC | | CIMP-high | | miRNA_6 | Chromosomal | | | ACC.CIMP-high |
TCGA-OR-A5J5 | ACC | steroid-phenotype-high | CIMP-intermediate | | miRNA_2 | Chromosomal | COC2 | C1A | ACC.CIMP-intermediate |
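Only part of the data is shown.
The "Subtype_Selected" column holds the subtype classification chosen as the most prominent one among the other columns. The table below lists, for each tumor type, which classification was selected: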
Tumor type | All available molecular data based-subtype | Selected subtype | Number of samples | Link to file | Reference | Link to paper |
---|---|---|---|---|---|---|
ACC | mRNA, DNAmeth, protein, miRNA, CNA, COC, C1A.C1B | DNAmeth | 91 | Link | Cancer Cell 2016 | Link |
AML | mRNA and miRNA | mRNA | 187 | Link | NEJM 2013 | Link |
BLCA | mRNA subtypes | mRNA | 129 | Link | Nature 2014 | Link |
BRCA | PAM50 (mRNA) | PAM50 | 1218 | Link | Nature 2012 | Link |
GBM/LGG* | mRNA, DNAmeth, protein, Supervised_DNAmeth | Supervised_DNAmeth | 1122 | Link | Cell 2016 | Link |
Pan-GI (preliminary) ESCA/STAD/COAD/READ | Molecular_Subtype | Molecular_Subtype | 1011 | Link | Cancer Cell 2018 | Link |
HNSC | mRNA, DNAmeth, RPPA, miRNA, CNA, Paradigm | mRNA | 279 | Link (TabS7.2) | Nature 2015 | Link |
KICH | Eosinophilic | Eosinophilic | 66 | Link | Cancer Cell 2014 | Link |
KIRC | mRNA, miRNA | mRNA | 442 | Link | Nature 2013 | Link |
KIRP | mRNA, DNAmeth, protein, miRNA, CNA, COC | COC | 161 | Link | NEJM 2015 | Link |
LIHC (preliminary) | mRNA, DNAmeth, protein, miRNA, CNA, Paradigma, iCluster | iCluster | 196 | [Link](https://wiki.nci.nih.gov/download/attachments/139067884/Supplementary Tables-1-2016.xlsx?version=1&modificationDate=1452270515000&api=v2) (Table S1A) | not published | |
LUAD | DNAmeth, iCluster | iCluster | 230 | Link (Table S7) | Nature 2014 | Link |
LUSC | mRNA | mRNA | 178 | Link (Data file S7.5) | Nature 2012 | Link |
OVCA | mRNA | mRNA | 489 | Link | Nature 2011 | Link |
PCPG | mRNA, DNAmeth, protein, miRNA, CNA | mRNA | 178 | tableS2 | Cancer Cell 2017 | Link |
PRAD | mRNA, DNAmeth, protein, miRNA, CNA, icluster, mutation/fusion | mutation/fusion | 333 | Link | Cell 2015 | Link |
SKCM | mRNA, DNAmeth, protein, miRNA, mutation | mutation | 331 | Link (Table S1D) | Cell 2015 | Link |
THCA | mRNA, DNAmeth, protein, miRNA, CNA, histology | mRNA | 496 | Link (Table S2 - Tab1) | Cell 2014 | Link |
UCEC | iCluster, MSI, CNA, mRNA | iCluster - updated according to Pan-Gyne/Pathways groups | 538 | Link (datafile S1.1) | Nature 2013 | Link |
UCS (preliminary) | mRNA | mRNA | 57 | Link | not published | |
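As a minimal sketch of how the Subtype_Selected column might be summarized, the example below counts samples per selected subtype within one tumor type. To stay self-contained it does not call PanCancerAtlas_subtypes(); instead it uses a hand-typed data frame, `subtypes_demo` (a name chosen here for illustration), holding a few columns of the five ACC rows displayed above. With the real table, the same `subset()` and `table()` calls should apply, assuming the column names match those shown.

```r
# Toy copy of the five ACC rows displayed above (subset of columns only).
# In practice this would be the data frame returned by PanCancerAtlas_subtypes().
subtypes_demo <- data.frame(
  pan.samplesID = c("TCGA-OR-A5J1", "TCGA-OR-A5J2", "TCGA-OR-A5J3",
                    "TCGA-OR-A5J4", "TCGA-OR-A5J5"),
  cancer.type = "ACC",
  Subtype_DNAmeth = c("CIMP-high", "CIMP-low", "CIMP-intermediate",
                      "CIMP-high", "CIMP-intermediate"),
  Subtype_Selected = c("ACC.CIMP-high", "ACC.CIMP-low", "ACC.CIMP-intermediate",
                       "ACC.CIMP-high", "ACC.CIMP-intermediate"),
  stringsAsFactors = FALSE
)

# Keep one tumor type, then count samples per selected subtype.
acc <- subset(subtypes_demo, cancer.type == "ACC")
selected_counts <- table(acc$Subtype_Selected, useNA = "ifany")
print(selected_counts)

# For ACC the selected subtype is the DNA methylation subtype with the
# tumor type prepended; check that relationship on the toy rows.
matches_dnameth <- acc$Subtype_Selected == paste0("ACC.", acc$Subtype_DNAmeth)
print(all(matches_dnameth))
```

The second check mirrors the "Selected subtype" entry for ACC in the table above (DNAmeth); for other tumor types the selected column would be a different one.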
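TCGAquery_subtype: working with molecular subtype data. The Cancer Genome Atlas (TCGA) Research Network has reported integrated genome-wide studies of various diseases. Some of the molecular subtypes defined in these reports have been added to the package: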
TCGA dataset | Link | Paper | Journal |
---|---|---|---|
ACC | doi:10.1016/j.ccell.2016.04.002 | Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma. | Cancer cell 2016 |
BRCA | https://www.cell.com/cancer-cell/fulltext/S1535-6108(18)30119-3 | A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers | Cancer cell 2018 |
BLCA | http://www.cell.com/cell/fulltext/S0092-8674(17)31056-5 | Comprehensive Molecular Characterization of Muscle-Invasive Bladder Cancer | Cell 2017 |
CHOL | http://www.sciencedirect.com/science/article/pii/S2211124717302140?via%3Dihub | Integrative Genomic Analysis of Cholangiocarcinoma Identifies Distinct IDH-Mutant Molecular Profiles | Cell Reports 2017 |
COAD | http://www.nature.com/nature/journal/v487/n7407/abs/nature11252.html | Comprehensive molecular characterization of human colon and rectal cancer | Nature 2012 |
ESCA | https://www.nature.com/articles/nature20805 | Integrated genomic characterization of oesophageal carcinoma | Nature 2017 |
GBM | http://dx.doi.org/10.1016/j.cell.2015.12.028 | Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma | Cell 2016 |
HNSC | http://www.nature.com/nature/journal/v517/n7536/abs/nature14129.html | Comprehensive genomic characterization of head and neck squamous cell carcinomas | Nature 2015 |
KICH | http://www.sciencedirect.com/science/article/pii/S1535610814003043 | The Somatic Genomic Landscape of Chromophobe Renal Cell Carcinoma | Cancer cell 2014 |
KIRC | http://www.nature.com/nature/journal/v499/n7456/abs/nature12222.html | Comprehensive molecular characterization of clear cell renal cell carcinoma | Nature 2013 |
KIRP | http://www.nejm.org/doi/full/10.1056/NEJMoa1505917 | Comprehensive Molecular Characterization of Papillary Renal-Cell Carcinoma | NEJM 2016 |
LIHC | http://linkinghub.elsevier.com/retrieve/pii/S0092-8674(17)30639-6 | Comprehensive and Integrative Genomic Characterization of Hepatocellular Carcinoma | Cell 2017 |
LGG | http://dx.doi.org/10.1016/j.cell.2015.12.028 | Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma | Cell 2016 |
LUAD | http://www.nature.com/nature/journal/v511/n7511/abs/nature13385.html | Comprehensive molecular profiling of lung adenocarcinoma | Nature 2014 |
LUSC | http://www.nature.com/nature/journal/v489/n7417/abs/nature11404.html | Comprehensive genomic characterization of squamous cell lung cancers | Nature 2012 |
PAAD | http://www.cell.com/cancer-cell/fulltext/S1535-6108(17)30299-4 | Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma | Cancer Cell 2017 |
PCPG | http://dx.doi.org/10.1016/j.ccell.2017.01.001 | Comprehensive Molecular Characterization of Pheochromocytoma and Paraganglioma | Cancer cell 2017 |
PRAD | http://www.sciencedirect.com/science/article/pii/S0092867415013392 | The Molecular Taxonomy of Primary Prostate Cancer | Cell 2015 |
READ | http://www.nature.com/nature/journal/v487/n7407/abs/nature11252.html | Comprehensive molecular characterization of human colon and rectal cancer | Nature 2012 |
SARC | http://www.cell.com/cell/fulltext/S0092-8674(17)31203-5 | Comprehensive and Integrated Genomic Characterization of Adult Soft Tissue Sarcomas | Cell 2017 |
SKCM | http://www.sciencedirect.com/science/article/pii/S0092867415006340 | Genomic Classification of Cutaneous Melanoma | Cell 2015 |
STAD | doi:10.1038/nature13480 | Comprehensive molecular characterization of gastric adenocarcinoma | Nature 2014 |
THCA | http://www.sciencedirect.com/science/article/pii/S0092867414012380 | Integrated Genomic Characterization of Papillary Thyroid Carcinoma | Cell 2014 |
UCEC | http://www.nature.com/nature/journal/v497/n7447/abs/nature12113.html | Integrated genomic characterization of endometrial carcinoma | Nature 2013 |
UCS | http://www.cell.com/cancer-cell/fulltext/S1535-6108(17)30053-3 | Integrated Molecular Characterization of Uterine Carcinosarcoma | Cancer Cell 2017 |
UVM | http://www.cell.com/cancer-cell/fulltext/S1535-6108(17)30295-7 | Integrative Analysis Identifies Four Molecular and Clinical Subsets in Uveal Melanoma | Cancer Cell 2017 |
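These molecular subtypes are added automatically to the SummarizedExperiment object by GDCprepare. You can also retrieve the same information directly with the TCGAquery_subtype function: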
lgg.gbm.subtype <- TCGAquery_subtype(tumor = "lgg")
A subset of the LGG subtype table is shown below:
patient | Tissue.source.site | Study | BCR | Whole.exome | Whole.genome | RNAseq | SNP6 | U133a | HM450 | HM27 | RPPA | Histology | Grade | Age..years.at.diagnosis. | Gender | Survival..months. | Vital.status..1.dead. | Karnofsky.Performance.Score | Mutation.Count | Percent.aneuploidy | IDH.status | X1p.19q.codeletion | IDH.codel.subtype | MGMT.promoter.status | Chr.7.gain.Chr.10.loss | Chr.19.20.co.gain | TERT.promoter.status | TERT.expression..log2. | TERT.expression.status | ATRX.status | DAXX.status | Telomere.Maintenance | BRAF.V600E.status | BRAF.KIAA1549.fusion | ABSOLUTE.purity | ABSOLUTE.ploidy | ESTIMATE.stromal.score | ESTIMATE.immune.score | ESTIMATE.combined.score | Original.Subtype | Transcriptome.Subtype | Pan.Glioma.RNA.Expression.Cluster | IDH.specific.RNA.Expression.Cluster | Pan.Glioma.DNA.Methylation.Cluster | IDH.specific.DNA.Methylation.Cluster | Supervised.DNA.Methylation.Cluster | Random.Forest.Sturm.Cluster | RPPA.cluster | Telomere.length.estimate.in.blood.normal..Kb. | Telomere.length.estimate.in.tumor..Kb. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TCGA-CS-4938 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC | Yes | Yes | Yes | Yes | No | Yes | No | Yes | astrocytoma | G2 | 31 | female | 4.6982507 | 0 | 90 | 15 | 0.069411737 | Mutant | non-codel | IDHmut-non-codel | Unmethylated | No combined CNA | No chr 19/20 gain | WT | 0 | Not expressed | Mutant | WT | ATRX | WT | WT | 0.79 | 2 | 422.0321 | 804.9614 | 1226.9936 | IDHmut-non-codel | | LGr3 | IDHmut-R3 | LGm2 | IDHmut-K2 | G-CIMP-high | IDH | K2 | 8.1119 | 5.1859 |
TCGA-CS-4941 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC | Yes | Yes | Yes | Yes | No | Yes | No | No | astrocytoma | G3 | 67 | male | 7.6880466 | 1 | 90 | 50 | 0.224149226 | WT | non-codel | IDHwt | Methylated | Gain chr 7 & loss chr 10 | No chr 19/20 gain | Mutant | 3.8073549220576 | Expressed | WT | WT | TERT | WT | WT | 0.61 | 2.05 | 706.8712 | 2283.2336 | 2990.1048 | IDHwt | CL | LGr4 | IDHwt-R2 | LGm5 | IDHwt-K2 | Mesenchymal-like | Mesenchymal | | 1.4191 | 1.6129 |
TCGA-CS-4942 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC | Yes | No | Yes | Yes | No | Yes | No | Yes | astrocytoma | G3 | 44 | female | 43.8612915 | 1 | 90 | 24 | 0.093693446 | Mutant | non-codel | IDHmut-non-codel | Unmethylated | No combined CNA | No chr 19/20 gain | WT | 0 | Not expressed | Mutant | WT | ATRX | WT | WT | 0.76 | 2.05 | 563.4763 | 2076.2917 | 2639.768 | IDHmut-non-codel | PN | LGr3 | IDHmut-R3 | LGm2 | IDHmut-K2 | G-CIMP-high | IDH | K2 | ||
TCGA-CS-4943 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC | Yes | No | Yes | Yes | No | Yes | No | Yes | astrocytoma | G3 | 37 | male | 18.1359048 | 0 | 50 | 30 | 0.172371054 | Mutant | non-codel | IDHmut-non-codel | Methylated | No combined CNA | No chr 19/20 gain | WT | 0 | Not expressed | Mutant | WT | ATRX | WT | WT | 0.83 | 3.92 | 460.2841 | 819.1347 | 1279.4188 | IDHmut-non-codel | PN | LGr3 | IDHmut-R3 | LGm2 | IDHmut-K2 | G-CIMP-high | IDH | K2 | ||
TCGA-CS-4944 | Thomas Jefferson University | Brain Lower Grade Glioma | IGC | Yes | Yes | Yes | Yes | No | Yes | No | Yes | astrocytoma | G2 | 50 | male | 10.6121327 | 0 | 90 | 20 | 0.060307007 | Mutant | non-codel | IDHmut-non-codel | Methylated | No combined CNA | No chr 19/20 gain | Mutant | 2.58496250072116 | Expressed | WT | WT | TERT | WT | WT | 0.74 | 1.94 | 701.1345 | 1281.992 | 1983.1265 | IDHmut-non-codel | | LGr3 | IDHmut-R3 | LGm2 | IDHmut-K2 | G-CIMP-high | IDH | K2 | 3.2645 | 1.4981 |
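Because the table returned by TCGAquery_subtype is keyed by patient barcode, one way to attach it to sample-level data is to derive the patient ID from each sample barcode and join on it. The sketch below illustrates that idea under stated assumptions: `lgg_subtype_demo` is a hand-typed three-column excerpt of the rows above, and the sample barcodes in `samples` are invented for illustration (including one patient, TCGA-XX-0000, that deliberately has no subtype entry). It relies only on the TCGA convention that the first 12 characters of a barcode identify the patient.

```r
# Subset of the LGG subtype table shown above (three columns only).
lgg_subtype_demo <- data.frame(
  patient = c("TCGA-CS-4938", "TCGA-CS-4941", "TCGA-CS-4942"),
  IDH.status = c("Mutant", "WT", "Mutant"),
  IDH.codel.subtype = c("IDHmut-non-codel", "IDHwt", "IDHmut-non-codel"),
  stringsAsFactors = FALSE
)

# Hypothetical sample-level barcodes, invented for illustration only.
samples <- data.frame(
  barcode = c("TCGA-CS-4938-01B-11R-1896-07",
              "TCGA-CS-4941-01A-01R-1470-07",
              "TCGA-XX-0000-01A-01R-0000-07"),  # patient with no subtype entry
  stringsAsFactors = FALSE
)

# The first 12 characters of a TCGA barcode identify the patient.
samples$patient <- substr(samples$barcode, 1, 12)

# Left join: keep every sample, fill NA where no subtype is available.
annotated <- merge(samples, lgg_subtype_demo, by = "patient", all.x = TRUE)
print(annotated)
```

Using `all.x = TRUE` keeps samples without a published subtype in the result rather than silently dropping them, which makes missing annotations easy to spot.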
sessionInfo()
## R version 3.5.3 (2019-03-11)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.6 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.8-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.8-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] grid parallel stats4 stats graphics grDevices utils
## [8] datasets methods base
##
## other attached packages:
## [1] TCGAbiolinks_2.10.5 maftools_1.8.0
## [3] bigmemory_4.5.33 png_0.1-7
## [5] DT_0.5 dplyr_0.8.0.1
## [7] SummarizedExperiment_1.12.0 DelayedArray_0.8.0
## [9] BiocParallel_1.16.6 matrixStats_0.54.0
## [11] Biobase_2.42.0 GenomicRanges_1.34.0
## [13] GenomeInfoDb_1.18.2 IRanges_2.16.0
## [15] S4Vectors_0.20.1 BiocGenerics_0.28.0
## [17] testthat_2.0.1
##
## loaded via a namespace (and not attached):
## [1] R.utils_2.8.0 tidyselect_0.2.5
## [3] RSQLite_2.1.1 AnnotationDbi_1.44.0
## [5] htmlwidgets_1.3 devtools_2.0.1
## [7] DESeq_1.34.1 munsell_0.5.0
## [9] codetools_0.2-16 preprocessCore_1.44.0
## [11] withr_2.1.2 colorspace_1.4-1
## [13] highr_0.7 knitr_1.22
## [15] rstudioapi_0.10 NMF_0.21.0
## [17] labeling_0.3 GenomeInfoDbData_1.2.0
## [19] hwriter_1.3.2 KMsurv_0.1-5
## [21] bit64_0.9-7 rprojroot_1.3-2
## [23] downloader_0.4 generics_0.0.2
## [25] xfun_0.5 ggthemes_4.1.0
## [27] randomForest_4.6-14 EDASeq_2.16.3
## [29] R6_2.4.0 doParallel_1.0.14
## [31] locfit_1.5-9.1 bitops_1.0-6
## [33] assertthat_0.2.0 promises_1.0.1
## [35] scales_1.0.0 gtable_0.2.0
## [37] sva_3.30.1 processx_3.3.0
## [39] wheatmap_0.1.0 sesameData_1.0.0
## [41] rlang_0.3.1 genefilter_1.64.0
## [43] cmprsk_2.2-7 GlobalOptions_0.1.0
## [45] splines_3.5.3 rtracklayer_1.42.2
## [47] lazyeval_0.2.2 wordcloud_2.6
## [49] selectr_0.4-1 broom_0.5.1
## [51] reshape2_1.4.3 BiocManager_1.30.4
## [53] yaml_2.2.0 GenomicFeatures_1.34.6
## [55] crosstalk_1.0.0 backports_1.1.3
## [57] httpuv_1.5.0 tools_3.5.3
## [59] usethis_1.4.0 gridBase_0.4-7
## [61] ggplot2_3.1.0 RColorBrewer_1.1-2
## [63] DNAcopy_1.56.0 sessioninfo_1.1.1
## [65] Rcpp_1.0.1 plyr_1.8.4
## [67] progress_1.2.0 zlibbioc_1.28.0
## [69] purrr_0.3.2 RCurl_1.95-4.12
## [71] ps_1.3.0 prettyunits_1.0.2
## [73] ggpubr_0.2 GetoptLong_0.1.7
## [75] cowplot_0.9.4 zoo_1.8-4
## [77] ggrepel_0.8.0 cluster_2.0.7-1
## [79] fs_1.2.7 magrittr_1.5
## [81] data.table_1.12.0 circlize_0.4.5
## [83] survminer_0.4.3 pkgload_1.0.2
## [85] aroma.light_3.12.0 hms_0.4.2
## [87] mime_0.6 evaluate_0.13
## [89] xtable_1.8-3 XML_3.98-1.19
## [91] mclust_5.4.3 gridExtra_2.3
## [93] shape_1.4.4 compiler_3.5.3
## [95] biomaRt_2.38.0 tibble_2.1.1
## [97] crayon_1.3.4 R.oo_1.22.0
## [99] htmltools_0.3.6 mgcv_1.8-27
## [101] later_0.8.0 tidyr_0.8.3
## [103] geneplotter_1.60.0 DBI_1.0.0
## [105] ExperimentHub_1.8.0 matlab_1.0.2
## [107] ComplexHeatmap_1.20.0 BiocStyle_2.10.0
## [109] ShortRead_1.40.0 Matrix_1.2-16
## [111] readr_1.3.1 cli_1.1.0
## [113] R.methodsS3_1.7.1 bigmemory.sri_0.1.3
## [115] pkgconfig_2.0.2 km.ci_0.5-2
## [117] sesame_1.0.0 registry_0.5-1
## [119] GenomicAlignments_1.18.1 xml2_1.2.0
## [121] foreach_1.4.4 annotate_1.60.1
## [123] rngtools_1.3.1 pkgmaker_0.27
## [125] XVector_0.22.0 bibtex_0.4.2
## [127] rvest_0.3.2 stringr_1.4.0
## [129] callr_3.2.0 digest_0.6.18
## [131] ConsensusClusterPlus_1.46.0 Biostrings_2.50.2
## [133] rmarkdown_1.12 survMisc_0.5.5
## [135] edgeR_3.24.3 curl_3.3
## [137] shiny_1.2.0 Rsamtools_1.34.1
## [139] rjson_0.2.20 nlme_3.1-137
## [141] jsonlite_1.6 BSgenome_1.50.0
## [143] desc_1.2.0 limma_3.38.3
## [145] pillar_1.3.1 lattice_0.20-38
## [147] httr_1.4.0 pkgbuild_1.0.2
## [149] survival_2.43-3 interactiveDisplayBase_1.20.0
## [151] glue_1.3.1 remotes_2.0.2
## [153] iterators_1.0.10 bit_1.1-14
## [155] stringi_1.4.3 blob_1.1.1
## [157] AnnotationHub_2.14.5 latticeExtra_0.6-28
## [159] memoise_1.1.0
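Written by 石九流.
Licensed under the Creative Commons Attribution 4.0 International License.
Unless a repost or external source is noted, articles on this site are original work or translations by this site; please credit the author when reposting.
Original link: https://blog.computsystmed.com/archives/translation-tcgabiolinks-compilation-of-tcga-molecular-subtypes
Last updated: 2019-05-25 17:37:45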