How to give file input from a dir and produce the output in a different dir using GNU parallel?

Question

I am trying to use GNU parallel to sort and index BAM files with samtools and write the output to a given output_dir, but I am running into problems.

So far I have tried the following. It works, but I don't want the directory named "1" inside output_dir, and the result files also end up in input_dir.

parallel --results output_dir 'samtools sort -o {.}.sorted.bam {}' ::: input_dir/*.bam

This suggestion from the comments does not work:

parallel 'samtools sort -o output_dir/{.}.sorted.bam {}' ::: input_dir/*.bam

I get the error

“[E::hts_open_format] Failed to open file output_dir/input_dir/A-8_20181222_0036.sorted.bam”

Note: samtools is just one example; I will also be using other tools that write their output via an --output / -o flag.

Tags: bash, bioinformatics, gnu-parallel

Answer


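If your question is how to substitute a different directory for the input directory, just write the new directory verbatim before the {/.} token. You had {.}, which strips only the extension and keeps the directory part, which is why the output path came out as output_dir/input_dir/...; {/.} also trims the directory name.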

parallel 'samtools sort -o output_dir/{/.}.sorted.bam {}' ::: input_dir/*.bam

See the manual for more ideas; there are many transformations you can perform on the input token.
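
To make the difference between the tokens concrete, here is a rough sketch that mimics a few of GNU parallel's replacement strings with plain bash parameter expansion. The file name sample.bam is a made-up example, and the expansions are only approximate equivalents meant for illustration, not how parallel implements them.

```shell
# Approximate bash equivalents of some GNU parallel replacement strings.
# 'input_dir/sample.bam' is a hypothetical input path used for illustration.
f='input_dir/sample.bam'   # what {} would be

noext="${f%.*}"            # like {.}  : extension removed, directory kept
base="${f##*/}"            # like {/}  : directory removed, extension kept
base_noext="${base%.*}"    # like {/.} : directory and extension removed
dir="${f%/*}"              # like {//} : directory only

# Path built with {.}: the input directory leaks into the output path.
echo "output_dir/${noext}.sorted.bam"
# Path built with {/.}: only the bare file name is reused.
echo "output_dir/${base_noext}.sorted.bam"
```

The first echo prints output_dir/input_dir/sample.sorted.bam, matching the failing path in the question, and the second prints output_dir/sample.sorted.bam. To check what parallel itself would run before launching anything, adding --dry-run to the parallel command prints the generated commands without executing them.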

