How to filter columns in a CSV file by column name

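Problem description

I am working with CSV data like the sample below. I do not want to use the user and timestamp columns from the CSV file, and I may add or remove a few columns later.

I did not find a suitable method for this in Text::CSV. Please let me know if there is a method or module that can do it.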

UniqueId, Name, description, user,timestamp     
1,jana,testing,janardar,12-10-2018:00:

use strict;
use warnings;
use Text::CSV_XS;

sub _filter_common_columns_from_csv {
    my $csvfile = shift;

    # allow_quotes is not a Text::CSV_XS attribute, so it is dropped here.
    my $CSV = Text::CSV_XS->new({
        binary    => 1,
        auto_diag => 3,
        eol       => $/,
    });

    # Read the header row and trim whitespace around each column name.
    open(my $fh, '<', $csvfile) or die "$csvfile: $!";
    my @columns = @{ $CSV->getline($fh) };
    close $fh or die $!;
    for (@columns) { s/^\s+//; s/\s+$//; }

    # Mark every column whose name matches one of the patterns.
    my %deleted;
    my @regexes = qw(user timestamp);
    foreach my $regex (@regexes) {
        foreach my $i (0 .. $#columns) {    # include the last column
            $deleted{$i} = 1 if $columns[$i] =~ /$regex/;
        }
    }

    my @wanted_columns = grep { !$deleted{$_} } 0 .. $#columns;
    my $input_temp = "$ENV{HOME}/output/temp_test.csv";

    open my $tem, '>', $input_temp or die "$input_temp: $!";
    open($fh, '<', $csvfile) or die "$csvfile: $!";

    # Copy every row, header included, keeping only the wanted columns.
    while (my $row = $CSV->getline($fh)) {
        $CSV->print($tem, [ @{$row}[@wanted_columns] ]) or $CSV->error_diag;
    }
    close $fh  or die $!;
    close $tem or die $!;

    return $input_temp;
}

Tags: perl, csv

Solution


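Because the columns are chosen by name, it helps to read each row into a hash keyed by column name. Text::CSV provides getline_hr for this; the program below gets the same effect with bind_columns.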

use warnings;
use strict;
use feature 'say';

use List::MoreUtils qw(any);
use Text::CSV;

my $file = shift @ARGV || die "Usage: $0 filename\n";

my @exclude_cols = qw(user timestamp);

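# allow_whitespace strips the blanks around field values, so header names
# such as " user" in the sample data compare equal to "user".
my $csv = Text::CSV->new ( { binary => 1, allow_whitespace => 1 } )
    or die "Cannot use CSV: ".Text::CSV->error_diag ();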

open my $fh, '<', $file or die "Can't open $file: $!";

my @cols  = @{ $csv->getline($fh) };

my @wanted_cols = grep { 
    my $name = $_; 
    not any { $name eq $_ } @exclude_cols;
} @cols;

my $row = {}; 
$csv->bind_columns (\@{$row}{@cols});

while ($csv->getline($fh)) {
    my @wanted_fields = @$row{ @wanted_cols };
    say "@wanted_fields";
}
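
If the filtered rows should end up as CSV again rather than as space-separated text, one possible variation is sketched below. It is only a sketch under some assumptions: the helper name filter_csv_columns, the in-memory sample data and the separate writer object are made up for illustration, and it uses column_names with getline_hr in place of bind_columns.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper: copy CSV rows from $in to $out, dropping the named columns.
sub filter_csv_columns {
    my ($in, $out, @exclude_cols) = @_;

    # Parser: allow_whitespace trims blanks around fields such as " user".
    my $csv = Text::CSV->new({ binary => 1, allow_whitespace => 1, auto_diag => 1 })
        or die "Cannot use CSV: " . Text::CSV->error_diag();
    # Writer: a separate object so that eol only affects the output.
    my $writer = Text::CSV->new({ binary => 1, eol => "\n" })
        or die "Cannot use CSV: " . Text::CSV->error_diag();

    # The first line holds the column names.
    my @cols        = @{ $csv->getline($in) };
    my %exclude     = map { $_ => 1 } @exclude_cols;
    my @wanted_cols = grep { !$exclude{$_} } @cols;

    # Register the names so that getline_hr can key each row by them.
    $csv->column_names(@cols);

    $writer->print($out, \@wanted_cols);                 # filtered header
    while (my $row = $csv->getline_hr($in)) {
        $writer->print($out, [ @$row{@wanted_cols} ]);   # keep only the wanted fields
    }
    return;
}

# Demo on in-memory data (the values are made up).
my $data = "UniqueId, Name, description, user,timestamp\n"
         . "1,jana,testing,janardar,12-10-2018\n";
my $result = '';
open my $in,  '<', \$data   or die $!;
open my $out, '>', \$result or die $!;
filter_csv_columns($in, $out, qw(user timestamp));
close $out or die $!;
print $result;
```

With the demo data this should print the header line UniqueId,Name,description followed by 1,jana,testing. For real files, pass handles opened on the input and output paths instead of the in-memory scalars.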

The syntax @$row{@wanted_cols} is a hash slice: it returns the list of values for the keys in @wanted_cols from the hashref $row.
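
To see the hash slice in isolation, here is a small sketch that needs no CSV module. The row contents are made up; only the slice syntax matches the program above.

```perl
use strict;
use warnings;

# Made-up row, keyed by column name the way the CSV program builds it.
my $row = {
    UniqueId    => 1,
    Name        => 'jana',
    description => 'testing',
    user        => 'janardar',
};
my @wanted_cols = qw(UniqueId Name description);

# Hash slice on a hashref: the leading @ asks for a list of values,
# the braces say the keys are looked up in a hash.
my @short = @$row{@wanted_cols};
my @long  = @{$row}{@wanted_cols};    # same slice, with explicit braces

# Equivalent to fetching one key at a time, in the order given.
my @by_hand = map { $row->{$_} } @wanted_cols;

print "@short\n";    # prints "1 jana testing"
```

The values come back in the order of @wanted_cols, not in the hash's internal order, which is why the output columns keep the order of the CSV header.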

