Perl: cut characters up to a regex in a loop, print to end of line

Question

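I have this data, from which I want to remove the date and print everything from the initials to the end of the line. I have mapped the initials.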

30th Mar 2020 5:53:18 pm Charlie Brown: BJ: Bloomberg Runs
30th Mar 2020 5:53:27 pm Charlie Brown: DS: ICE DATA = INC1018483661
30th Mar 2020 6:42:43 pm Boris Yeltsin: Cortese's ICE logs is for the Bloomberg Runs issue
30th Mar 2020 6:43:28 pm Charlie Brown: yeap
31st Mar 2020 4:11:22 am Ishtar Johnson: VK : RE: XS2018777099 & XS2018777172 - INC1018491954
31st Mar 2020 6:31:17 am Tommy Boy: NW: RE: SABSM 6.125 YTW - INC1018495843
31st Mar 2020 7:26:40 am Tommy Boy: AP: RE: Rolling 7yrs - INC1018497102
31st Mar 2020 7:45:36 am Tommy Boy: JK: RE: Chris White books - INC1018497380

Here is the code:

#!/usr/bin/perl

use strict;
use warnings;

my @team = ("AP","II","DS","WJ", "JK","LC","BJ") ;
my ( $team_regex ) = map {qr /$_/} join "|", map {quotemeta} @team;

my @orderdTeam ;
my $filename = shift @ARGV ;
open(my $fh, '<', $filename) or die "Could not open file $filename $!";
while (my $line = <$fh> ) {
        #$line =~ /($team_regex .*)/s  ;
        $line = /($team_regex .*)/s  ;
        print "$line\n";

}
close $fh;

For some reason, I get these uninitialized-value warnings.

johnswal@NYKPWM2037968 ~
$ ./cut_date_symphony.pl fooberry
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 1.
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 2.
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 3.
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 4.
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 5.
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 6.
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 7.
Use of uninitialized value $_ in pattern match (m//) at ./cut_date_symphony.pl line 14, <$fh> line 8.

The commented-out line just prints the entire line; it doesn't remove the date or time.

#$line =~ /($team_regex .*)/s  ;

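So this is what I'm looking for. "Tommy Boy NW:" and "Ishtar Johnson VK:" are on our team, but based in Europe. Only tickets from the US team members in the mapped array "@team_regex" should be shown, with the time and date removed.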

BJ: Bloomberg Runs
DS: ICE DATA = INC1018483661
AP: RE: Rolling 7yrs - INC1018497102
JK: RE: Chris White books - INC1018497380

Tags: regex, perl, cut

Answer


Line 14 is this line:

$line = /($team_regex .*)/s  ;

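The match operator (/.../) works on the variable bound to it with the =~ operator, or on $_ if no such variable is given. You don't use =~, so the match operator tries to match against $_. And $_ doesn't contain any data, so Perl gives you the "uninitialized value" warnings you're seeing. The plain = then assigns the result of that match to $line, overwriting the line you just read.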
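To make the difference concrete, here is a minimal sketch (not taken from the question's script) that runs the same pattern once bound with =~ and once unbound. The helper name compare_binding and the placeholder value given to $_ are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
sub compare_binding {
    my ($line) = @_;

    # Give $_ a defined placeholder so the unbound match does not warn.
    local $_ = 'placeholder text';

    my $bound   = ($line =~ /BJ:/) ? 1 : 0;   # tested against $line
    my $unbound = (/BJ:/)          ? 1 : 0;   # tested against $_

    return ($bound, $unbound);
}

my ($bound, $unbound) = compare_binding(
    '30th Mar 2020 5:53:18 pm Charlie Brown: BJ: Bloomberg Runs');
print "bound=$bound unbound=$unbound\n";   # bound=1 unbound=0
```

The unbound match fails here only because the placeholder in $_ does not contain "BJ:". In the question's script $_ was never set at all, which is why Perl warned instead.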

I think you meant to match the regex against $line. So you need to use =~ instead of =, as in your commented-out line.

$line =~ /($team_regex .*)/s  ;

But in a comment above, you explain that you commented it out because:

The commented-out line doesn't cut any characters; it prints the entire line

Of course it does, because you haven't written any code that changes $line. But the text you want is in $1 after the match, so print that.

$line =~ /($team_regex .*)/s  ;
print $1;

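But regex capture variables such as $1 are only set on a successful match, so it's important to check that the match succeeded before printing them. You can do that by putting the match operator in an if statement.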

if ($line =~ /($team_regex .*)/s) {
  print $1;
}
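As a small illustration of why the check matters, the sketch below runs two matches in the same scope. The first succeeds and the second fails, yet $1 still holds the text captured by the first. The helper name captured_after and both sample strings are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: records $1 after each of two matches.
sub captured_after {
    my ($first, $second) = @_;
    my @seen;

    $first =~ /(BJ:.*)/;     # expected to match and set $1
    push @seen, $1;

    $second =~ /(BJ:.*)/;    # expected to fail; $1 is left untouched
    push @seen, $1;

    return @seen;
}

my @seen = captured_after('Charlie Brown: BJ: Bloomberg Runs',
                          'Charlie Brown: yeap');
print "$_\n" for @seen;      # the same captured text is printed twice
```

Without the if, a line that does not match would silently reuse whatever an earlier match in the same scope captured.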

Update: Oh, and that doesn't work, because the team codes in your data are followed by a colon, not a space (as your regex assumes). So change it to this:

if ($line =~ /($team_regex:.*)/s) {
  print $1;
}
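Putting the pieces together, one possible end-to-end version might look like the sketch below. It is an assumption-laden illustration rather than the original script: the helper name strip_prefix is hypothetical, the sample lines are copied from the question into an array instead of being read from a file, and each line is chomped so the /s modifier is no longer needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Team codes copied from the question.
my @team = ("AP", "II", "DS", "WJ", "JK", "LC", "BJ");
my $alternation = join "|", map { quotemeta } @team;
my $team_regex  = qr/$alternation/;

# Hypothetical helper: returns the text from the team code onward,
# or undef when the line contains no team code followed by a colon.
sub strip_prefix {
    my ($line) = @_;
    chomp $line;
    return $line =~ /($team_regex:.*)/ ? $1 : undef;
}

# Sample lines from the question (normally read from the input file).
my @sample = (
    "30th Mar 2020 5:53:18 pm Charlie Brown: BJ: Bloomberg Runs",
    "30th Mar 2020 5:53:27 pm Charlie Brown: DS: ICE DATA = INC1018483661",
    "30th Mar 2020 6:42:43 pm Boris Yeltsin: Cortese's ICE logs is for the Bloomberg Runs issue",
    "30th Mar 2020 6:43:28 pm Charlie Brown: yeap",
    "31st Mar 2020 4:11:22 am Ishtar Johnson: VK : RE: XS2018777099 & XS2018777172 - INC1018491954",
    "31st Mar 2020 6:31:17 am Tommy Boy: NW: RE: SABSM 6.125 YTW - INC1018495843",
    "31st Mar 2020 7:26:40 am Tommy Boy: AP: RE: Rolling 7yrs - INC1018497102",
    "31st Mar 2020 7:45:36 am Tommy Boy: JK: RE: Chris White books - INC1018497380",
);

for my $line (@sample) {
    my $kept = strip_prefix($line);
    print "$kept\n" if defined $kept;   # skip lines without a team code
}
```

On these eight lines the loop prints the four BJ, DS, AP and JK lines requested in the question. The pattern is unanchored, so a name that happens to end in a team code right before a colon would also match; adding \b before the capture group is one way to tighten it if that turns out to matter.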
