In Perl, how can I emulate unix fmt?

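Problem description

Sometimes, in a Perl script, I want to do to a long string (or to a file containing many long lines) what the following does on the unix command line: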

fmt myfile
# or
echo Now is the time, and also all work and no play makes Jack a dull boy, over and over | fmt -20

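That is, I want to emulate the most basic task that unix fmt performs: break the lines of a file so that they don't run off the screen, while keeping paragraphs separate.

The script below does that. Am I missing anything? Is there an easier way to do this in Perl?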

#!/usr/bin/perl
#
use strict; use warnings;
use Getopt::Long; 

my $diagnose = 0; # not used
my $maxlimit   = 50;

#https://stackoverflow.com/questions/11526517/should-you-check-the-return-code-from-getoptlonggetoptions
die unless GetOptions (
        'diagnose!' => \$diagnose, # not used
        'maxlimit=i' => \$maxlimit,
    );

$/ = undef; # slurp entire file into one string
my $DATA = <>;
my $rgx_split_prg = qr/\n\s*\n/; # any all-whitespace line will break paragraphs 
my $rgx_split_line = qr/\n+/;

my @Paragraphs = split ( $rgx_split_prg, $DATA );
my $index_of_last_paragraph = scalar @Paragraphs;
my $countparagraph = 0;
foreach my $paragraph (@Paragraphs)
{
    $countparagraph++; 
    $paragraph =~ s/^\s*//; # remove leading whitespace
    my $multilinestring = '';
    my $localtotal = 0;
    foreach my $line (split ($rgx_split_line, $paragraph) )
    {
        foreach my $el (split('\s+', $line,))
        {
            next unless ($el=~/\S/);
            $localtotal+=length $el;
            $localtotal+=1; # interword space uses a column, so count it 
            $multilinestring = join('', $multilinestring, $el, ' ',);
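            # Note: the break is added only after the running total has
            # passed $maxlimit, so a line can end up longer than the limit.
            if($localtotal > $maxlimit)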
            {
                $multilinestring = join('', $multilinestring, "\n",);
                $localtotal=0; 
            }
        }
    
    }
    # 1st newline makes last (non-whitespace / nontrivial) line in the paragraph end in a newline, i.e., not pathological.
    # 2nd newline places a pure newline *BETWEEN* paragraphs --- so, not if it's the last paragraph.
    if($countparagraph == $index_of_last_paragraph)
    {
        $multilinestring =~ s/\s*\z/\n/s; # 
    }
    else
    {
        $multilinestring =~ s/\s*\z/\n\n/s; # separate paragraphs by exactly one (1) pure newline.
    }
    print $multilinestring; 
}


Example:

> echo Now is the time, and also all work and no play makes Jack a dull boy, over and over | myfmt -m 20
Now is the time, and 
also all work and no 
play makes Jack a dull 
boy, over and over

Tags: perl

Solution
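
One possibly simpler route is the core module Text::Wrap, which already does greedy word wrapping. The sketch below is only an illustration under a few assumptions: fmt_text is a hypothetical helper name that does not come from the question, paragraphs are split on all-whitespace lines as in the question's script, and words longer than the width are left intact rather than split. It does not claim to reproduce the exact line breaks that a given fmt implementation chooses.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper (not from the question): reflow $text so that no
# line is wider than $width characters, keeping paragraphs separate.
sub fmt_text {
    my ($text, $width) = @_;

    # Text::Wrap produces lines of at most $columns - 1 characters.
    local $Text::Wrap::columns  = $width + 1;
    local $Text::Wrap::huge     = 'overflow';  # never split an over-long word
    local $Text::Wrap::unexpand = 0;           # never turn spaces into tabs

    my @out;
    # Any all-whitespace line separates paragraphs.
    for my $para (split /\n\s*\n/, $text) {
        my @words = split ' ', $para;          # collapse runs of whitespace
        next unless @words;                    # skip empty paragraphs
        push @out, wrap('', '', join(' ', @words));
    }

    # Exactly one blank line between paragraphs, one newline at the end.
    return join("\n\n", @out) . "\n";
}

my $sample = "Now is the time, and also all work and no play "
           . "makes Jack a dull boy, over and over\n";
print fmt_text($sample, 20);
# Prints:
# Now is the time, and
# also all work and no
# play makes Jack a
# dull boy, over and
# over
```

To reflow a whole file instead of a literal string, the same helper could be fed the slurped input, for example fmt_text(do { local $/; <> }, 50). Unlike the script in the question, this version breaks before a word would cross the limit, so no line exceeds the requested width unless a single word is longer than that width.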

