TCGA_DNA_seq Analysis

qiniqnyang 2017-09-06 13:57

DNA-Seq Alignment Command Line Parameters

STEP 1: CONVERTING BAMS TO FASTQS WITH BIOBAMBAM - BIOBAMBAM2 2.0.54

 

bamtofastq \
collate=1 \
exclude=QCFAIL,SECONDARY,SUPPLEMENTARY \
filename=<input.bam> \
gz=1 \
inputformat=bam \
level=5 \
outputdir=<output_path> \
outputperreadgroup=1 \
outputperreadgroupsuffixF=_1.fq.gz \
outputperreadgroupsuffixF2=_2.fq.gz \
outputperreadgroupsuffixO=_o1.fq.gz \
outputperreadgroupsuffixO2=_o2.fq.gz \
outputperreadgroupsuffixS=_s.fq.gz \
tryoq=1
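
With outputperreadgroup=1, each read group gets its own set of FASTQ files, and Step 2 needs the two mate files of each read group side by side. The helper below is a minimal sketch of that pairing. It assumes the files in the output directory are named with a per-read-group prefix followed by the _1.fq.gz and _2.fq.gz suffixes configured above; the function name list_fastq_pairs is made up for illustration.

```shell
# Print "<mate1><TAB><mate2>" for every read group that has both mate files.
# Assumption: files are named <prefix>_1.fq.gz and <prefix>_2.fq.gz.
list_fastq_pairs() {
    dir=$1
    for r1 in "$dir"/*_1.fq.gz; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        r2="${r1%_1.fq.gz}_2.fq.gz"         # swap the mate-1 suffix for mate-2
        if [ -e "$r2" ]; then
            printf '%s\t%s\n' "$r1" "$r2"
        fi
    done
}
```

Orphan and single-end files (_o1, _o2, _s) are not matched by the pattern, so they are left out of the paired list.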

 

STEP 2: BWA ALIGNMENT - BWA 0.7.15 - SAMTOOLS 1.3.1

If mean read length is greater than or equal to 70bp:

bwa mem \
-t 8 \
-T 0 \
-R <read_group> \
<reference> \
<fastq_1.fq.gz> \
<fastq_2.fq.gz> |
samtools view \
-Shb \
-o <output.bam> -
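
The <read_group> placeholder stands for a complete @RG header line in which the fields are separated by a literal backslash-t, which BWA converts to tabs in the output. The sketch below builds such a string; the function name read_group_line and the choice of ID, SM and PL fields are assumptions for illustration, and a real pipeline would normally copy the fields from the original BAM header.

```shell
# Build an @RG line with literal "\t" separators, as accepted by BWA.
# Arguments: read group ID, sample name, platform (all hypothetical values).
read_group_line() {
    printf '@RG\\tID:%s\\tSM:%s\\tPL:%s' "$1" "$2" "$3"
}
```

For example, read_group_line rg1 sampleA ILLUMINA produces a string that can be passed, quoted, as the <read_group> argument.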

If mean read length is less than 70bp:

bwa aln -t 8 <reference> <fastq_1.fq.gz> > <sai_1.sai> &&
bwa aln -t 8 <reference> <fastq_2.fq.gz> > <sai_2.sai> &&
bwa sampe -r <read_group> <reference> <sai_1.sai> <sai_2.sai> <fastq_1.fq.gz> <fastq_2.fq.gz> | samtools view -Shb -o <output.bam> -
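
The choice between the two alignment commands depends on the mean read length. One way to apply the 70bp rule is sketched below with awk. It assumes an uncompressed FASTQ stream with strict 4-line records; the function names mean_read_length and choose_bwa_algorithm are hypothetical.

```shell
# Print the mean read length of an uncompressed FASTQ stream on stdin.
# Assumption: strict 4-line records, so the sequence is line 2 of every 4.
mean_read_length() {
    awk 'NR % 4 == 2 { total += length($0); n++ }
         END { if (n > 0) printf "%.1f\n", total / n; else print "0.0" }'
}

# Print "mem" when the given length is at least 70, otherwise "aln".
choose_bwa_algorithm() {
    awk -v len="$1" 'BEGIN { if (len + 0 >= 70) print "mem"; else print "aln" }'
}
```

A typical call would be something like gzip -dc <fastq_1.fq.gz> | head -n 400000 | mean_read_length, sampling the first 100,000 reads rather than the whole file.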

 

If the quality scores are encoded as Illumina 1.3 or 1.5 (Phred+64), use BWA aln with the "-I" flag.
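
To decide whether the flag mentioned above is needed, the quality encoding can be guessed from the quality strings themselves. The sketch below uses a common heuristic, not an exact test: any quality character below ASCII 59 can only occur with the Phred+33 encoding, so if none is seen in a reasonably large sample the data are probably in the older Phred+64 encoding. The function name guess_phred_offset is hypothetical.

```shell
# Guess the Phred offset (33 or 64) of an uncompressed FASTQ stream on stdin.
# Heuristic only: a sample made of very high qualities can be misclassified.
guess_phred_offset() {
    awk 'BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i; min = 127 }
         NR % 4 == 0 {
             n = length($0)
             for (j = 1; j <= n; j++) {
                 ch = substr($0, j, 1)
                 if ((ch in ord) && ord[ch] < min) min = ord[ch]
             }
         }
         END {
             if (min == 127) print "unknown"   # no quality characters seen
             else if (min < 59) print 33
             else print 64
         }'
}
```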
STEP 3: BAM SORT - PICARD 2.6.0

java -jar picard.jar SortSam \
CREATE_INDEX=true \
INPUT=<input.bam> \
OUTPUT=<output.bam> \
SORT_ORDER=coordinate \
VALIDATION_STRINGENCY=STRICT

STEP 4: BAM MERGE - PICARD 2.6.0

java -jar picard.jar MergeSamFiles \
ASSUME_SORTED=false \
CREATE_INDEX=true \
[INPUT=<input.bam>] \
MERGE_SEQUENCE_DICTIONARIES=false \
OUTPUT=<output_path> \
SORT_ORDER=coordinate \
USE_THREADING=true \
VALIDATION_STRINGENCY=STRICT
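
The square brackets around INPUT mark an argument that is repeated once per BAM file being merged. A small helper can generate that list; this is only a sketch, and the function name picard_inputs is made up.

```shell
# Print one INPUT=<file> argument per line for every BAM path given.
picard_inputs() {
    for bam in "$@"; do
        printf 'INPUT=%s\n' "$bam"
    done
}
```

For paths without whitespace, the output can be spliced into the command with an unquoted $(picard_inputs rg1.bam rg2.bam), relying on word splitting.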

STEP 5: MARK DUPLICATES - PICARD 2.6.0

java -jar picard.jar MarkDuplicates \
CREATE_INDEX=true \
INPUT=<input.bam> \
VALIDATION_STRINGENCY=STRICT

DNA-Seq Co-Cleaning Command Line Parameters

STEP 1: REALIGNERTARGETCREATOR

java -jar GenomeAnalysisTK.jar \
-T RealignerTargetCreator \
-R <reference> \
-known <known_indels.vcf> \
[ -I <input.bam> ] \
-o <realign_target.intervals>

STEP 2: INDELREALIGNER

java -jar GenomeAnalysisTK.jar \
-T IndelRealigner \
-R <reference> \
-known <known_indels.vcf> \
-targetIntervals <realign_target.intervals> \
--noOriginalAlignmentTags \
[ -I <input.bam> ] \
-nWayOut <output.map>
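
When several BAMs are realigned together, -nWayOut writes one output BAM per input. The sketch below generates the <output.map> file under the assumption that GATK 3 expects a tab-separated file with the input BAM file name in the first column and the output BAM path in the second; that layout should be checked against the GATK documentation for the version in use. The function name write_nway_map and the _realn.bam suffix are hypothetical.

```shell
# Print "<input name><TAB><output name>" for each BAM path given.
# Assumption: the map file is keyed on the input file name without its directory.
write_nway_map() {
    for bam in "$@"; do
        name=$(basename "$bam")
        printf '%s\t%s\n' "$name" "${name%.bam}_realn.bam"
    done
}
```

Redirecting the output, for example write_nway_map tumor.bam normal.bam > output.map, would give the file passed to -nWayOut.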

STEP 3: BASERECALIBRATOR

java -jar GenomeAnalysisTK.jar \
-T BaseRecalibrator \
-R <reference> \
-I <input.bam> \
-knownSites <dbsnp.vcf> \
-o <bqsr.grp>


STEP 4: PRINTREADS

java -jar GenomeAnalysisTK.jar \
-T PrintReads \
-R <reference> \
-I <input.bam> \
--BQSR <bqsr.grp> \
-o <output.bam>

 

Variant Call Command-Line Parameters

MUSE

MuSEv1.0rc_submission_c039ffa

Step 1: MuSE call

MuSE call \
-f <reference> \
-r <region> \
<tumor.bam> \
<normal.bam> \
-O <intermediate_muse_call.txt>


Step 2: MuSE sump

MuSE sump \
-I <intermediate_muse_call.txt> \
-E \
-D <dbsnp_known_snp_sites.vcf> \
-O <muse_variants.vcf>


Note: -E is used for WXS data and -G can be used for WGS data.
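
The note above can be turned into a small lookup so that the right MuSE sump flag follows from the experimental strategy. This is a sketch; the function name muse_sump_flag and the WXS/WGS labels as input values are assumptions.

```shell
# Print the MuSE sump flag for a data type: -E for WXS, -G for WGS.
# printf is used instead of echo because some shells treat "-E" as an echo option.
muse_sump_flag() {
    case $1 in
        WXS) printf '%s\n' '-E' ;;
        WGS) printf '%s\n' '-G' ;;
        *)   printf 'unknown data type: %s\n' "$1" >&2; return 1 ;;
    esac
}
```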
MUTECT2

GATK nightly-2016-02-25-gf39d340

java -jar GenomeAnalysisTK.jar \
-T MuTect2 \
-R <reference> \
-L <region> \
-I:tumor <tumor.bam> \
-I:normal <normal.bam> \
--normal_panel <pon.vcf> \
--cosmic <cosmic.vcf> \
--dbsnp <dbsnp.vcf> \
--contamination_fraction_to_filter 0.02 \
-o <mutect_variants.vcf> \
--output_mode EMIT_VARIANTS_ONLY \
--disable_auto_index_creation_and_locking_when_reading_rods


SOMATICSNIPER

Somatic-sniper v1.0.5.0

bam-somaticsniper \
-q 0 \
-Q 15 \
-s 0.01 \
-T 0.85 \
-N 2 \
-r 0.001 \
-n NORMAL \
-t TUMOR \
-F vcf \
-f <reference> \
<tumor.bam> \
<normal.bam> \
<somaticsniper_variants.vcf>


VARSCAN

Step 1: Mpileup; Samtools 1.1

samtools mpileup \
-f <reference> \
-q 1 \
-B \
<normal.bam> \
<tumor.bam> > <intermediate_mpileup.pileup>

 

Step 2: Varscan Somatic; Varscan.v2.3.9
java -jar VarScan.jar somatic \
<intermediate_mpileup.pileup> \
<output_path> \
--mpileup 1 \
--min-coverage 8 \
--min-coverage-normal 8 \
--min-coverage-tumor 6 \
--min-var-freq 0.10 \
--min-freq-for-hom 0.75 \
--normal-purity 1.0 \
--tumor-purity 1.00 \
--p-value 0.99 \
--somatic-p-value 0.05 \
--strand-filter 0 \
--output-vcf


Step 3: Varscan ProcessSomatic; Varscan.v2.3.9

java -jar VarScan.jar processSomatic \
<intermediate_varscan_somatic.vcf> \
--min-tumor-freq 0.10 \
--max-normal-freq 0.05 \
--p-value 0.07

 
