首页 > 解决方案 > 输出文件(染色体块)在 nextflow 中合并

问题描述

我有一个 nextflow 过程,它为每个染色体生成多个块到一个通道中,比如说,imputation它看起来像,

chr1.imputed.chunk1.gen.gz chr1.imputed.chunk2.gen.gz chr1.imputed.chunk3.gen.gz 
chr1.imputed.chunk1.stats chr1.imputed.chunk2.stats chr1.imputed.chunk3.stats
chr1.imputed.chunk1.bgen chr1.imputed.chunk2.bgen chr1.imputed.chunk3.bgen
.....

每条染色体有很多块(22 条染色体)。对于要获取的每种类型的文件集,我如何有效地将它们合并到各自的染色体中,

chr1.imputed.merged.gen.gz
chr1.imputed.merged.stats
chr1.imputed.merged.bgen

获得合并输出后,我想删除所有块。有什么帮助吗?

生成这些块的实际代码是:

process imputation {
publishDir params.out, mode:'copy'
input:
tuple val(chrom),val(chunk_array),val(chunk_start),val(chunk_end),path(in_haps),path(refs),path(maps) from imp_ch
output:
tuple val("${chrom}"),path("${chrom}.*") into imputed
script:
def (haps,sample)=in_haps
def (haplotype, legend, samples)=refs
"""
impute4.1.2_r300.3 -g "${haps}" -h "${haplotype}" -l "${legend}" -m "${maps}" -o "${chrom}.step10.imputed.chunk${chunk_array}" -no_maf_align -o_gz -int "${chunk_start}" "${chunk_end}" -Ne 20000 -buffer 1000 -seed 54321

if [[ \$(gunzip -c "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" | head -c1 | wc -c) == "0" ]]
then 
 echo  "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" is empty
else
 qctool_v2.0.8_rhel -g "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" -snp-stats -osnp "${chrom}.step10.imputed.chunk${chunk_array}.snp.stats"
 qctool_v2.0.8_rhel -g "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" -og "${chrom}.step10.imputed.chunk${chunk_array}.bgen" -os "${chrom}.step10.imputed.chunk${chunk_array}.sample"
fi
 """

标签: nextflow

解决方案


您能否发布生成您显示的代码段的实际代码

不看你的代码,我建议你可以试试这个http://nextflow-io.github.io/patterns/index.html#_process_per_file_range


推荐阅读