How do I take a regex match from the start of one line and copy it to the start of the next line?

Problem description

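I'm writing a script that converts text from a PDF document into CSV format for later use. I've run into a problem: some lines are missing information that I need to add to complete the data, and I don't know how to do this with sed. The document looks like this: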

# "date","description","cost","total"
"31 01 19","Purchase from SHOP","1.23","1.23"
"Direct debit to COMPANY","2.34","3.57"
"Purchase from SHOP","3.45","7.02"
"01 02 19","Received from PERSON","1.23","5.79"
"Purchase to SHOP","4.56","10.35"

when it should look like this:

# "date","description","cost","total"
"31 01 19","Purchase from SHOP","1.23","1.23"
"31 01 19","Direct debit to COMPANY","2.34","3.57"
"31 01 19","Purchase from SHOP","3.45","7.02"
"01 02 19","Received from PERSON","1.23","5.79"
"01 02 19","Purchase to SHOP","4.56","10.35"

How can I do this with sed?

I tried:

/^(\"[[:digit:]]{2} [[:digit:]]{2} [[:digit:]]{2}\",)/{
    h
    N
    /^(\"[^\"]*\",\"(0|[1-9][[:digit:]]{,2}(,[[:digit:]]{1,3})*)\.[[:digit:]]{2})\",?{2})/{
        G
        s/((.*))\n((.*))/\2,\1/
    }
}

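But this seems to do nothing, even though I tested the regular expressions to make sure they match what I'm after. Am I doing something wrong, or is there a better way to do this?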

Tags: linux, shell, sed, posix

Solution

This might work for you (GNU sed):

sed -E 'N;/\n".. .. .."/!s/^([^,]+,).*\n/&\1/;P;D' file

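N appends the next line to the pattern space. If that appended line does not start with a date, the substitution copies the first field of the previous line (its date and the trailing comma) to the front of the appended line. P then prints the first line of the pattern space, D deletes it, and the cycle repeats with the remaining line.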
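The date test in the command above uses `..`, which matches any two characters, so an eight-character description such as "ab cd ef" would be mistaken for a date. The sketch below tightens the test to digits, assuming dates always appear as three two-digit groups. It also uses `$!N` instead of `N`, so the last line is printed without relying on GNU sed's behaviour when `N` finds no next line. The function names `fill_dates` and `fill_dates_awk` are hypothetical, and the awk variant is an alternative technique rather than part of the original answer; it assumes comment lines start with `#`.

```shell
#!/bin/sh
# Sketch: carry the most recent date forward onto lines that lack one.
# fill_dates is a hypothetical helper name; it reads files or stdin.
fill_dates() {
    # $!N            append the next line unless this is the last line
    # /\n"dd dd dd",/!   only act when the appended line has no date
    # s/.../&\1/     re-insert the first field after the newline
    # P;D            print and drop the first line, then restart
    sed -E '$!N;/\n"[0-9]{2} [0-9]{2} [0-9]{2}",/!s/^([^,]+,).*\n/&\1/;P;D' "$@"
}

# Alternative sketch in awk: remember the last date seen.
fill_dates_awk() {
    awk '
        # A dated line: keep its first 11 characters ("dd dd dd",) and print it.
        /^"[0-9][0-9] [0-9][0-9] [0-9][0-9]",/ { date = substr($0, 1, 11); print; next }
        # Comment lines, and lines before any date, pass through unchanged.
        /^#/ || date == "" { print; next }
        # Everything else gets the remembered date prepended.
        { print date $0 }
    ' "$@"
}

# Demonstration on a shortened copy of the sample data from the question.
printf '%s\n' \
    '# "date","description","cost","total"' \
    '"31 01 19","Purchase from SHOP","1.23","1.23"' \
    '"Direct debit to COMPANY","2.34","3.57"' \
    '"01 02 19","Received from PERSON","1.23","5.79"' \
    '"Purchase to SHOP","4.56","10.35"' |
    fill_dates
```

Both functions accept file names as arguments (for example `fill_dates file`) or read standard input. The sed version takes whatever precedes the first comma of the previous line, so it depends on every dated line being followed only by undated lines that belong to it; the awk version keeps the date in a variable, which may be easier to extend if more fields need to be carried forward.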

