Trying to convert an ISO 8601 duration into something I can add to a date in bash

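Question

Sorry, this may seem very specific, but it boils down to a sed problem I can't figure out.

Context: I'm trying to write a bash function that takes an arbitrary ISO 8601 duration (e.g. P1D, PT12M1S, P15M, PT12M40S, etc.) and converts it into a string I can use to add to a date, like this: date -d "$(date) + 1 day + 1 minute"

Here's what I have so far: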

duration_parser() {
    duration=$1
    duration=$(sed 's/PT\(.*\)\([[:digit:]]\)S/PT\1 + \2 second/g' <<< $duration)
    duration=$(sed 's/PT\(.*\)\([[:digit:]]\)M/PT\1 + \2 minute/g' <<< $duration)
    duration=$(sed 's/PT\(.*\)\([[:digit:]]\)H/PT\1 + \2 hour/g' <<< $duration)
    duration=$(sed 's/PT//g' <<< $duration)
    duration=$(sed 's/P\(.*\)\([[:digit:]]\)D/P\1 + \2 day/g' <<< $duration)
    duration=$(sed 's/P\(.*\)\([[:digit:]]\)W/P\1 + \2 week/g' <<< $duration)
    duration=$(sed 's/P\(.*\)\([[:digit:]]\)M/P\1 + \2 month/g' <<< $duration)
    duration=$(sed 's/P\(.*\)\([[:digit:]]\)Y/P\1 + \2 year/g' <<< $duration)
    duration=$(sed 's/P//g' <<< $duration)
    echo $duration
}

date -d "$(date) $(duration_parser PT6M3S)"

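This works for durations in which every unit has a single digit, e.g.

date -d "$(date) $(duration_parser PT6M3S)"

works, but when a unit has more than one digit, e.g. 60 minutes:

date -d "$(date) $(duration_parser PT60M3S)"

it doesn't. I can't seem to get sed to pick up all the digits...

Is there a way to do this with sed, or is this not the best approach? Is there a simpler way? lol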
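
For reference, the behaviour described above can be reproduced in isolation. In the question's patterns, \([[:digit:]]\) matches exactly one digit, and the greedy \(.*\) in front of it absorbs every digit but the last. A minimal sketch follows; the helper names single_digit and anchored are made up for illustration, and the anchored variant is only one possible way to capture the whole number.

```shell
# The pattern from the question: greedy .* followed by a one-digit class.
single_digit() {
    printf '%s\n' "$1" | sed 's/PT\(.*\)\([[:digit:]]\)M/PT\1 + \2 minute/'
}

# Illustrative alternative: the digit run starts right after the prefix,
# so nothing in front of it can swallow digits.
anchored() {
    printf '%s\n' "$1" | sed -E 's/^PT([0-9]+)M/PT + \1 minute/'
}

single_digit PT60M   # prints: PT6 + 0 minute
anchored PT60M       # prints: PT + 60 minute
```

The first call shows the stray 6 left behind in front of the offset, which is what breaks the string later handed to date.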

Tags: bash, date, sed

Answer

sed has its own language, in which you can chain commands and even handle errors.
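
As a small illustration of both points, the sketch below chains two commands and uses q with an exit code to signal an error. It assumes GNU sed (the exit-code argument to q is a GNU extension); require_p is a hypothetical name used only here.

```shell
require_p() {
    printf '%s\n' "$1" | sed -E '
        # if the line does not start with P, turn it into an error message
        # and quit with exit status 1 (GNU extension)
        /^P/!{
            s/.*/ERROR: "&" does not start with P/
            q1
        }
        # otherwise strip the prefix; sed prints what is left
        s/^P//
    '
}

require_p P1D                 # prints: 1D
require_p 1D || echo failed   # prints the error message, then: failed
```

Because sed is the last command in the pipeline, its exit status becomes the function's status, so callers can test it with || or if.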

I ended up with this:

dur_to_dateadd() {
    # https://en.wikipedia.org/wiki/ISO_8601#Durations
    # PnYnMnDTnHnMnS <- we handle only this
    <<<"$1" sed -E '
        # it has to start with P
        /^P/!{
            s/.*/ERROR: Invalid input - it has to start with P: "&"/
            q1
        }
        s/^P//

        # add an unreadable 0x01 at the end
        # it serves as our "line separator"
        s/$/\x01/

        # parse from the beginning, add to the end after \x01
        s/^([0-9]*([,.][0-9]*)?)Y(.*)/\3 + \1 year/
        s/^([0-9]*([,.][0-9]*)?)M(.*)/\3 + \1 month/
        s/^([0-9]*([,.][0-9]*)?)D(.*)/\3 + \1 day/
        /^T/{
            s///
            s/^([0-9]*([,.][0-9]*)?)H(.*)/\3 + \1 hour/
            s/^([0-9]*([,.][0-9]*)?)M(.*)/\3 + \1 minute/
            s/^([0-9]*([,.][0-9]*)?)S(.*)/\3 + \1 second/
        }

        # we should have parsed it all
        # so our separator \x01 has to be the first character
        /^\x01/!{
          # there is something unparsed in the input
            s/\x01.*//
            s/.*/ERROR: Unparsable input: "&"/
            q1
        }
        # remove the \x01
        s///

        # just convert , to . in case of floats
        s/,/./g
    '
}

dur_to_dateadd "P3Y6M4DT12H30M5S"
dur_to_dateadd "P23DT23H"
dur_to_dateadd "P4Y"
dur_to_dateadd "PT0S"
dur_to_dateadd "P0D"
dur_to_dateadd "P1M"
dur_to_dateadd "PT1M"
dur_to_dateadd "P0,5Y"
dur_to_dateadd "P0.5Y"
dur_to_dateadd "PT36H"
dur_to_dateadd "P1DT12H"
dur_to_dateadd "invalid" || echo error
dur_to_dateadd "P1Dinvalid" || echo error
dur_to_dateadd "PinvalidDT" || echo error

Output:

 + 3 year + 6 month + 4 day + 12 hour + 30 minute + 5 second
 + 23 day + 23 hour
 + 4 year
 + 0 second
 + 0 day
 + 1 month
 + 1 minute
 + 0.5 year
 + 0.5 year
 + 36 hour
 + 1 day + 12 hour
ERROR: Invalid input - it has to start with P: "invalid"
error
ERROR: Unparsable input: "invalid"
error
ERROR: Unparsable input: "invalidDT"
error

Tested on repl.
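
The strings printed above are meant to be appended to a base date and passed to date -d, as in the question. A minimal sketch, assuming GNU date; add_to_date is a hypothetical helper, and a fixed base date in UTC is used so that the result is reproducible.

```shell
# add_to_date BASE OFFSET: append an offset string such as
# " + 1 day + 12 hour" to BASE and let GNU date do the arithmetic.
add_to_date() {
    date -u -d "$1$2" '+%Y-%m-%d %H:%M:%S'
}

add_to_date "2024-01-01" " + 1 day + 12 hour"       # prints: 2024-01-02 12:00:00
add_to_date "2024-01-01" " + 60 minute + 3 second"  # prints: 2024-01-01 01:00:03
```

With the function from this answer in scope, the call from the question would read date -d "$(date)$(dur_to_dateadd PT60M3S)".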

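Short description: first, I remove the leading ^P and append an unreadable character \x01 to the end of the input. It acts as a "line separator" that separates the not-yet-parsed input from the already-parsed output string. Then we parse the input from the beginning: we handle ^<number>Y, then ^<number>M, and so on. If, for example, ^<number>Y matches, we append + \1 year to the end of the string, after the \x01 and anything else that is already there. The parsed parts are removed as we parse them. Then comes some error checking: if all of the input was parsed, \x01 should be the first character in the pattern space. If it is, we remove it and finish, which prints the pattern space.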
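
The separator idea can be watched in action with a visible character. The sketch below is a cut-down variant, not the full script: it uses | instead of \x01, handles only years and months, and stops before the error check, so the intermediate pattern space gets printed. step_demo is a hypothetical name.

```shell
step_demo() {
    printf '%s\n' "$1" | sed -E '
        # drop the leading P and append a visible separator
        s/^P//
        s/$/|/
        # move each parsed token from the front to the end of the line
        s/^([0-9]+)Y(.*)/\2 + \1 year/
        s/^([0-9]+)M(.*)/\2 + \1 month/
    '
}

step_demo P3Y6M4D   # prints: 4D| + 3 year + 6 month
step_demo P3Y6M     # prints: | + 3 year + 6 month
```

Whatever remains to the left of the separator is unparsed input. In the second call nothing remains, which is the condition the final check in the full script looks for.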

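Just for fun, below I also added support for the PnW, PYYYYMMDDThhmmss and PYYYY-MM-DDThh:mm:ss formats. They are easy to parse: each one can be matched in full with a single regular expression.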

dur_to_dateadd() {
    # https://en.wikipedia.org/wiki/ISO_8601#Durations
    # We support formats:
    # PnYnMnDTnHnMnS
    # PnW
    # PYYYYMMDDThhmmss 
    # PYYYY-MM-DDThh:mm:ss
    <<<"$1" sed -E '
        # it has to start with P
        /^P/!{
            s/.*/ERROR: Invalid input - it has to start with P: "&"/
            q1
        }
        s///

        # add an unreadable 0x01 at the end
        # it serves as our "line separator"
        s/$/\x01/

        # handle PnW format
        /^([0-9]*([,.][0-9]*)?)W(.*)/{
            s//\3 + \1 week/
            b finish
        }

        # handle PYYYYMMDDThhmmss format
        /^([0-9]{4})([0-9]{2})([0-9]{2})T([0-9]{2})([0-9]{2})([0-9]{2})(.*)/{
            s//\7 + \1 year + \2 month + \3 day + \4 hour + \5 minute + \6 second/
            b finish
        }

        # handle PYYYY-MM-DDThh:mm:ss format
        /^([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})(.*)/{
            s//\7 + \1 year + \2 month + \3 day + \4 hour + \5 minute + \6 second/
            b finish
        }


        # PnYnMnDTnHnMnS format
        # parse from the beginning, add to the end after \x01
        s/^([0-9]*([,.][0-9]*)?)Y(.*)/\3 + \1 year/
        s/^([0-9]*([,.][0-9]*)?)M(.*)/\3 + \1 month/
        s/^([0-9]*([,.][0-9]*)?)D(.*)/\3 + \1 day/
        /^T/{
            s///
            s/^([0-9]*([,.][0-9]*)?)H(.*)/\3 + \1 hour/
            s/^([0-9]*([,.][0-9]*)?)M(.*)/\3 + \1 minute/
            s/^([0-9]*([,.][0-9]*)?)S(.*)/\3 + \1 second/
        }

        : finish

        # we should have parsed it all
        # so our separator \x01 has to be the first character
        /^\x01/!{
          # there is something unparsed in the input
            s/\x01.*//
            s/.*/ERROR: Unparsable input: "&"/
            q1
        }
        # remove the \x01
        s///

        # just convert , to . in case of floats
        s/,/./g
    '
}
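
As a standalone illustration of the single-match idea, the sketch below handles only the PYYYY-MM-DDThh:mm:ss form, without the separator or the error handling. alt_format_demo is a hypothetical name, and whether date accepts the zero-padded numbers is not checked here.

```shell
alt_format_demo() {
    # one substitution captures all six fields and rewrites them as offsets
    printf '%s\n' "$1" | sed -E \
        's/^P([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})$/+ \1 year + \2 month + \3 day + \4 hour + \5 minute + \6 second/'
}

alt_format_demo "P0003-06-04T12:30:05"
# prints: + 0003 year + 06 month + 04 day + 12 hour + 30 minute + 05 second
```

Input that does not match the pattern passes through unchanged in this sketch, whereas the full function reports it as an error.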
