Replace the double-quote qualifier in data using sed, awk, or perl


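I have a txt file that uses | as the delimiter and " as the text qualifier. I want to change the qualifier to the ~ symbol. The problem is that the actual column values also contain double quotes.

Sample data: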


"Live Your Dreams: Be You"|"20 Feb 2018"|"2 formats and editions"|"Are you being swept away by life being busy? Are things seemingly out of your control? Do you want to calm the chaos in your life? Are you ready to transform your life? In 
"Live Your Dreams"
now AMAZON BESTSELLER, readers are shown how to take immediate control of their mental, emotional, physical and entrepreneurial destiny."|"All this and more as you immerse yourself in the story that opens up like scenes from "a Bollywood movie""|"Indian Edition"

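I have tried sed and awk, following answers on Stack Overflow and unix.com, but the double quotes inside the columns cause problems.

Expected output: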


~Live Your Dreams: Be You~|~20 Feb 2018~|~2 formats and editions~|~Are you being swept away by life being busy? Are things seemingly out of your control? Do you want to calm the chaos in your life? Are you ready to transform your life? In 
"Live Your Dreams"
now AMAZON BESTSELLER, readers are shown how to take immediate control of their mental, emotional, physical and entrepreneurial destiny.~|~All this and more as you immerse yourself in the story that opens up like scenes from "a Bollywood movie"~|~Indian Edition~

Code attempted: sed 's_"([^*])"_~\1~_g' data.txt > tdata.txt

Result of the sed command above:

"Live Your Dreams: Be You~|~20 Feb 2018~|~2 formats and editions~|~Are you being swept away by life being busy? Are things seemingly out of your control? Do you want to calm the chaos in your life? Are you ready to transform your life? In 
"Live Your Dreams"
now AMAZON BESTSELLER, readers are shown how to take immediate control of their mental, emotional, physical and entrepreneurial destiny.~|~All this and more as you immerse yourself in the story that opens up like scenes from "a Bollywood movie"~|~Indian Edition~



Tags: perl, awk, sed


What you actually have is malformed CSV data in which the separator character is |.

It is malformed because the "inner" quotes are not escaped: in a CSV field that contains quotes, each quote should be doubled, like this:

1,2,"field,with,commas","this field ""contains quotes"" that are duplicated"
# ..................................^^...............^^

So your file should look like this:


"Live Your Dreams: Be You"|"20 Feb 2018"|"2 formats and editions"|"Are you being swept away by life being busy? Are things seemingly out of your control? Do you want to calm the chaos in your life? Are you ready to transform your life? In 
""Live Your Dreams""
now AMAZON BESTSELLER, readers are shown how to take immediate control of their mental, emotional, physical and entrepreneurial destiny."|"All this and more as you immerse yourself in the story that opens up like scenes from ""a Bollywood movie"""|"Indian Edition"

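If the inner quotes on lines 2 and 3 are properly escaped, then you can use a CSV parser to convert the quotes on output. Perl's CSV parser can handle fields that contain newlines: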

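perl -MText::CSV -e '
    # Open the file named on the command line and decode it as UTF-8
    open my $fh, "<:encoding(UTF-8)", shift(@ARGV) or die "Cannot open input: $!";
    binmode STDOUT, ":encoding(UTF-8)";
    # Reader: " is the qualifier, | the separator; binary allows embedded newlines
    my $csv_in = Text::CSV->new({ quote_char => "\"", sep_char => "|", binary => 1 });
    # Writer: ~ is the qualifier and its own escape; always_quote wraps every field
    my $csv_out = Text::CSV->new({ quote_char => "~", escape_char => "~", sep_char => "|", binary => 1, always_quote => 1 });
    while (my $row = $csv_in->getline($fh)) {
        $csv_out->say(STDOUT, $row);
    }
    # Report a parse error if the loop stopped before end of file
    $csv_in->eof or $csv_in->error_diag();
' file.csv

Output:
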
~Live Your Dreams: Be You~|~20 Feb 2018~|~2 formats and editions~|~Are you being swept away by life being busy? Are things seemingly out of your control? Do you want to calm the chaos in your life? Are you ready to transform your life? In 
"Live Your Dreams"
now AMAZON BESTSELLER, readers are shown how to take immediate control of their mental, emotional, physical and entrepreneurial destiny.~|~All this and more as you immerse yourself in the story that opens up like scenes from "a Bollywood movie"~|~Indian Edition~
