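nextflow - Merging output files (chromosome chunks) in Nextflow
Problem description
I have a Nextflow process that emits multiple chunks per chromosome into a channel, in this case from imputation.
The output looks like this: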
chr1.imputed.chunk1.gen.gz chr1.imputed.chunk2.gen.gz chr1.imputed.chunk3.gen.gz
chr1.imputed.chunk1.stats chr1.imputed.chunk2.stats chr1.imputed.chunk3.stats
chr1.imputed.chunk1.bgen chr1.imputed.chunk2.bgen chr1.imputed.chunk3.bgen
.....
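Each chromosome has many chunks, and there are 22 chromosomes. For each file type, how can I efficiently merge the chunks into a single file per chromosome, like this: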
chr1.imputed.merged.gen.gz
chr1.imputed.merged.stats
chr1.imputed.merged.bgen
Once I have the merged output, I want to delete all the chunks. Any help?
The actual code that generates the chunks is:
process imputation {
    publishDir params.out, mode:'copy'

    input:
    tuple val(chrom),val(chunk_array),val(chunk_start),val(chunk_end),path(in_haps),path(refs),path(maps) from imp_ch

    output:
    tuple val("${chrom}"),path("${chrom}.*") into imputed

    script:
    // in_haps and refs are lists of staged files; unpack them by position
    def (haps,sample)=in_haps
    def (haplotype, legend, samples)=refs
    """
    impute4.1.2_r300.3 -g "${haps}" -h "${haplotype}" -l "${legend}" -m "${maps}" -o "${chrom}.step10.imputed.chunk${chunk_array}" -no_maf_align -o_gz -int "${chunk_start}" "${chunk_end}" -Ne 20000 -buffer 1000 -seed 54321
    # Run qctool only when the imputed chunk is not empty
    if [[ \$(gunzip -c "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" | head -c1 | wc -c) == "0" ]]
    then
        echo "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" is empty
    else
        qctool_v2.0.8_rhel -g "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" -snp-stats -osnp "${chrom}.step10.imputed.chunk${chunk_array}.snp.stats"
        qctool_v2.0.8_rhel -g "${chrom}.step10.imputed.chunk${chunk_array}.gen.gz" -og "${chrom}.step10.imputed.chunk${chunk_array}.bgen" -os "${chrom}.step10.imputed.chunk${chunk_array}.sample"
    fi
    """
}
Solution
Could you post the actual code that produces the snippet you showed?
Without seeing your code, I suggest you try this pattern: http://nextflow-io.github.io/patterns/index.html#_process_per_file_range
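
For the merge step itself, here is a minimal shell sketch of what a per-chromosome merge process could run. It is not taken from the linked pattern. The function name merge_gen_chunks is hypothetical, and it assumes the file naming shown in the question (chr1.imputed.chunkN.gen.gz), so the glob would need adjusting for the step10 names the process actually writes. It relies on two facts: concatenated gzip files form a valid multi-member gzip stream, and GEN files carry no header line, so plain concatenation in chunk order should be enough for the .gen.gz files. The .bgen and .stats files have headers, so they would probably need a format-aware tool instead of cat. sort -V is assumed to be available (GNU coreutils).

```shell
# Hypothetical helper: merge the .gen.gz chunks of one chromosome, then
# delete the chunks. Usage: merge_gen_chunks chr1 [directory]
merge_gen_chunks() {
    local chrom dir merged list tmp f
    chrom="$1"
    dir="${2:-.}"
    merged="${dir}/${chrom}.imputed.merged.gen.gz"
    tmp="${merged}.tmp"
    list="$(mktemp)"

    # List chunk files in numeric order (chunk2 before chunk10).
    # An unmatched glob stays literal and is skipped by the -e test below.
    printf '%s\n' "${dir}/${chrom}".imputed.chunk*.gen.gz | sort -V > "$list"

    # Concatenate the gzip members one after another.
    : > "$tmp"
    while IFS= read -r f; do
        if [ -e "$f" ]; then
            cat "$f" >> "$tmp"
        fi
    done < "$list"

    # Stop if nothing was merged or the result is not a valid gzip stream.
    if [ ! -s "$tmp" ] || ! gzip -t < "$tmp" 2>/dev/null; then
        echo "merge failed for ${chrom}; chunks left untouched" >&2
        rm -f "$tmp" "$list"
        return 1
    fi

    mv "$tmp" "$merged"

    # Delete the chunks only after the merged file is in place.
    while IFS= read -r f; do
        rm -f "$f"
    done < "$list"
    rm -f "$list"
}
```

Calling merge_gen_chunks chr1 in a directory holding the chr1 chunks would leave only chr1.imputed.merged.gen.gz. On the Nextflow side, the per-chromosome file lists could be gathered with the groupTuple operator on the imputed channel before a merge process runs this; inside a Nextflow script block the shell dollar signs would need escaping as \$. Note that chunks copied out by publishDir are separate copies and would not be removed by this step.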