首页 > 技术文章 > 探索gff/gtf格式

leezx 2017-02-13 17:42 原文

参考:

GFF格式说明

Generic Feature Format Version 3 (GFF3)

先下载一个 gtf 文件浏览一下

1       havana  gene    11869   14409   .       +       .       gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2";
1       havana  transcript      11869   14409   .       +       .       gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; transcript_name "DDX11L1-002"; transcript_source "havana"; transcript_biotype "processed_transcript"; havana_transcript "OTTHUMT00000362751"; havana_transcript_version "1"; tag "basic"; transcript_support_level "1";
有一个 R 的版本,可以看一看:R的bioconductor包TxDb.Hsapiens.UCSC.hg19.knownGene详解

另外,看看 Bioconductor的数据包library(org.Hs.eg.db)简介,了解一些基本的常识。

推荐阅读