Conditional statement using awk not working as expected

Problem description

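I am trying to select names from a file based on the value in column 2. I am using awk for this rather than going into R, for speed, but I am not getting the results I expect.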

EDIT:
gshuf -n 20 file.csv
targets,log2FoldChange
TRINITY_GG_37011_c0_g1_i1.mrna1,2.837606866
TRINITY_GG_8817_c1_g1_i1.mrna1,-1.895959897
TRINITY_GG_73755_c2_g1_i1.mrna1,2.23502917
TRINITY_GG_63035_c0_g1_i1.mrna1,2.185122911
TRINITY_GG_111654_c0_g1_i1.mrna1,8.101066537
TRINITY_GG_59126_c0_g1_i4.mrna1,3.482842141
TRINITY_GG_37271_c0_g1_i6.mrna1,-3.046035487
TRINITY_GG_53334_c0_g1_i3.mrna1,-3.96110701
TRINITY_GG_26406_c0_g1_i2.mrna1,9.391942576
TRINITY_GG_113831_c0_g1_i1.mrna1,3.22109874
TRINITY_GG_114771_c0_g1_i7.mrna1,7.109622418
TRINITY_GG_125067_c0_g1_i9.mrna1,23.02443794
TRINITY_GG_32340_c1_g1_i9.mrna1,5.983333292
TRINITY_GG_101900_c0_g1_i1.mrna1,-3.48623125
TRINITY_GG_3412_c0_g1_i2.mrna1,2.515568648
TRINITY_GG_122872_c0_g1_i7.mrna1,9.993553116
TRINITY_GG_18380_c0_g1_i1.mrna1,-4.455484395
TRINITY_GG_69309_c0_g2_i11.mrna1,-6.927214772
TRINITY_GG_68534_c7_g1_i1.mrna1,-3.149415191
TRINITY_GG_95195_c0_g1_i11.mrna1,7.607035309

cat file.csv | wc -l
   10687

#To get >=2.5 
cat file.csv | awk -F, '{if($2>=2.5)print $1}'| wc -l
    3308


#Between -2.5 and 2.5
cat file.csv | awk -F, '{if($2>-2.5 &&  $2 < 2.5)print $1}'| wc -l
    5451

#To get <=-2.5
cat file.csv| awk -F, '{if($2<=-2.5)print $1}'| wc -l
    1929

But I have checked manually, and the results are all over the place.

#This should only print when column 2 <= -2.5
cat file.csv | awk -F, '{if($2<=-2.5)print $1,$2}'| head
TRINITY_GG_63049_c0_g1_i1.mrna1 -0.397269608
TRINITY_GG_148283_c0_g1_i1.mrna1 -0.410665303
TRINITY_GG_107346_c0_g1_i3.mrna1 -0.444588319
TRINITY_GG_25844_c1_g1_i1.mrna1 -0.455797238
TRINITY_GG_95_c1_g1_i1.mrna1 -0.467825233
TRINITY_GG_138461_c2_g1_i1.mrna1 -0.471162154
TRINITY_GG_111467_c0_g1_i4.mrna1 -0.473621231

Can anyone suggest what the problem is?

Tags: unix, awk

Solution
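
The output in the question is consistent with awk comparing $2 as a string rather than as a number: as strings, "-0.397269608" sorts before "-2.5" because the character "0" precedes "2". One possible cause, which the post does not confirm, is that the file has Windows (CRLF) line endings or quoted values, so the last field no longer looks purely numeric to awk. The header row is also always compared as a string. Below is a minimal sketch of a workaround under that assumption: strip carriage returns, skip the header, and add 0 to force a numeric comparison. The function name select_down and the three-row sample are made up for illustration.

```shell
# Hypothetical helper: print names whose column 2 is numerically <= -2.5.
# tr -d '\r' removes carriage returns (an assumed cause, not confirmed by the post).
# NR > 1 skips the header row; ($2 + 0) forces awk to compare numbers, not strings.
select_down() {
    tr -d '\r' | awk -F, 'NR > 1 && ($2 + 0) <= -2.5 { print $1 }'
}

# Small CRLF-terminated sample built from rows shown in the question.
printf 'targets,log2FoldChange\r\nTRINITY_GG_63049_c0_g1_i1.mrna1,-0.397269608\r\nTRINITY_GG_37271_c0_g1_i6.mrna1,-3.046035487\r\n' | select_down
```

On the real file this would be called as select_down < file.csv | wc -l. Running file file.csv or od -c file.csv | head should show whether carriage returns are actually present.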

