How to keep the first entry, the minimum, and the maximum of semicolon-separated fields in a table?

Problem description

I have a tab-delimited file in which every field holds several semicolon-separated values. This is the file:

Name First Last
foo3;foo3;foo3;foo3;foo3    11869;12010;12179;12613;12613   12227;12057;12227;12721;12697
bar10;bar10;bar10   14404;15005;15796   14501;15038;15947
locM;locM;locM;locM 29554;30267;30564;30976 30039;30667;30667;31109

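I would like to do the following with this file (in Bash or R):

(1) In the first column, keep only one entry.

(2) In the second column, keep only the minimum of the semicolon-separated list of numbers.

(3) In the third column, keep only the maximum of the semicolon-separated list of numbers.

This is the desired output: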

Name First Last
foo3    11869   12721
bar10   14404   15947
locM    29554   31109

Note that the lowest and highest values are not always the first and last values, respectively, in the semicolon-separated list.

Update (my amateur attempt at solving the problem):


Solution


Could you please try the following. It was written and tested only against the samples you have shown.

awk '
{
  split($1,array,";")
  num1=split($2,array1,";")
  num2=split($3,array2,";")
  min=array1[1]
  for(i=2;i<=num1;i++){
    min=(min<array1[i]?min:array1[i])
  }
  max=array2[1]
  for(i=2;i<=num2;i++){
    max=(max>array2[i]?max:array2[i])
  }
  print array[1],min,max
}'  Input_file
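
To try the command on the sample from the question, the following sketch writes the rows to a file and runs the same logic on it. Two things here are assumptions rather than part of the original answer: the file name sample.tsv is hypothetical, and FS and OFS are set to a tab on the assumption that tab-separated input and output are wanted (the original command relies on awk's default whitespace splitting and prints with single spaces).

```shell
# Hypothetical file name; the rows are the sample from the question.
printf 'Name\tFirst\tLast\n' > sample.tsv
printf 'foo3;foo3;foo3;foo3;foo3\t11869;12010;12179;12613;12613\t12227;12057;12227;12721;12697\n' >> sample.tsv
printf 'bar10;bar10;bar10\t14404;15005;15796\t14501;15038;15947\n' >> sample.tsv
printf 'locM;locM;locM;locM\t29554;30267;30564;30976\t30039;30667;30667;31109\n' >> sample.tsv

result=$(awk '
BEGIN { FS = OFS = "\t" }   # assumption: tab-separated input and output
{
  split($1, array, ";")
  num1 = split($2, array1, ";")
  num2 = split($3, array2, ";")
  min = array1[1]
  for (i = 2; i <= num1; i++) min = (min < array1[i] ? min : array1[i])
  max = array2[1]
  for (i = 2; i <= num2; i++) max = (max > array2[i] ? max : array2[i])
  print array[1], min, max
}' sample.tsv)

printf '%s\n' "$result"
```

The header line passes through unchanged because each of its fields contains a single value, so the loops never run for it.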

Explanation: a detailed, line-by-line explanation of the above.

awk '                                      ## Start the awk program.
{
  split($1,array,";")                      ## Split the 1st field into array, using ; as the separator.
  num1=split($2,array1,";")                ## Split the 2nd field into array1; num1 is its number of elements.
  num2=split($3,array2,";")                ## Split the 3rd field into array2; num2 is its number of elements.
  min=array1[1]                            ## Start min at the first element of array1.
  for(i=2;i<=num1;i++){                    ## Loop over the remaining elements of array1.
    min=(min<array1[i]?min:array1[i])      ## Keep min if it is smaller than the current element, otherwise take the current element.
  }
  max=array2[1]                            ## Start max at the first element of array2.
  for(i=2;i<=num2;i++){                    ## Loop over the remaining elements of array2.
    max=(max>array2[i]?max:array2[i])      ## Keep max if it is greater than the current element, otherwise take the current element.
  }
  print array[1],min,max                   ## Print the first element of array, then min and max.
}'  Input_file                             ## Name of the input file.
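
The comparisons above are numeric only when awk recognizes the split values as numbers. As a hedged variation, the sketch below forces numeric comparison by adding 0 to each value and passes the header line through untouched. It assumes the first line is always a header, that every value in columns 2 and 3 is a plain number, and that the table is tab-separated; the function name reduce_table is hypothetical.

```shell
# Hypothetical helper: reads the table on stdin, writes the reduced table on stdout.
reduce_table() {
  awk '
  BEGIN { FS = OFS = "\t" }            # assumption: tab-separated input and output
  NR == 1 { print; next }              # pass the header line through unchanged
  {
    split($1, name, ";")
    n1 = split($2, first, ";")
    n2 = split($3, last, ";")
    min = first[1] + 0                 # "+ 0" forces a numeric value
    for (i = 2; i <= n1; i++) if (first[i] + 0 < min) min = first[i] + 0
    max = last[1] + 0
    for (i = 2; i <= n2; i++) if (last[i] + 0 > max) max = last[i] + 0
    print name[1], min, max
  }'
}

# Values of different lengths, where a string comparison would pick the wrong ones.
out=$(printf 'Name\tFirst\tLast\nx;x;x\t9;10;100\t9;100;10\n' | reduce_table)
printf '%s\n' "$out"
```

One side effect of the numeric conversion is that values are printed in normalized form, so an input such as 0012 would come out as 12.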
