JavaScript regex: block up to the second occurrence: ABC.js music notation

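Question

ABC is a music notation; I am working on a pattern to parse it as part of an application.

Sometimes several renditions of a tune sit in one ABC file, and I only need the first rendition or, in an ideal world, any rendition I specify. The start of a rendition is marked by the string X:.

There is no way of knowing in advance how many renditions a file contains.

In JavaScript, how can I return the first rendition in the examples below (from the first X: inclusive up to the start of the second), return the first rendition if there is no second, and still return the first if there are more than two renditions?

What I have so far, ([\s\S]*)(?=X:), succeeds on the two-rendition example but fails with one rendition or with more than two.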

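Adding an "or end of file" condition to the lookahead makes the single-rendition case work, but the greedy match then runs to the end of the file, so it fails in the two- and three-rendition cases, e.g. ([\s\S]*)(?=X:|$)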

Any help is appreciated... a good way of parsing ABC would be used by many people.

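A two-rendition example is shown below. For a three-rendition example, just add a line X: at the end; for a single-rendition example, delete everything from the second X: onward.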

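Edit: People have kindly asked for better examples, and they do not fit in a comment, so here are some.

Broken Pledge is interesting because it has more than one ABC, and they are not numbered sequentially: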

X:56
T:Broken Pledge, The
R:reel
D:De Dannan: Selected Reels and Jigs.
Z:Also played in Edor, see #734
Z:id:hn-reel-56
M:C|
K:Ddor
dcAG ADDB|cAGF ECCE|D2 (3EFG Addc|AcGc Aefe|
dcAG FGAB|c2Bd cAGE|D2 (3EFG AddB|cAGE FDD2:|
|:dcAG Acde|~f3d ecAB|cAGE GAcd|ec~c2 eage|
dcAG Acde|fedf ecAG|~F3G AddB|cAGE FDD2:|
P:"Variations:"
|:dcAG ~A3B|cAGF ECCE|DEFG Addc|(3ABc Gc Aefe|
dcAG FGAB|c2Bd cAGE|DEFG AddB|A2GE FDD2:|
|:dcAG Acde|~f3d ecAB|cAGE GAcd|ec~c2 eage|
dcAG Acde|~f3d ecAG|FEFG AddB|A2GE FDD2:|

X:2
T:Broken Pledge, The
M:C
L:1/8
Q:250
K:D
dcAG A2 dB | cAGF EDC2 | DEFG Ad ~d2 | AcGc Adfe |
dcAG A2 dB | cAGF EDC2 | DEFG Ad ~d2 | AcGc ADD2 :|
|: dcAG A2 de | fedf edAB | cAGE GAcd | ec ~c2 eage |
dcAG A2 de | fedf edcA | F3 E FGAB | cAGE {F}ED D2 :||

Huish the Cat is interesting because it has many versions, all with the same number. You can see that X:whatever is entirely arbitrary:

X:1
T:Huish the Cat
M:6/8
L:1/8
N:”Author and date unknown.”
R:Air
Q:"Quick"
S:Byrne, the harper, 1802
B:Bunting – Ancient Music of Ireland (1840, p. 3)
Z:AK/Fiddler’s Companion
K:C
(G>A).G c2(e|d<).d.A c2z|(G>A).G .c2 d|(ec).A .A2G|
(G>A).G .c2(e|d<).d.A .c2e|(g>f).e .f2d|(ec).A A2G:|
|:(gf).e .f2d|(ed).c .f2d|(gf).e .f2d|(ec).A A2G|
(gf).e .f2d|(ed).c .f2.d|(G>A).G f2d|(ec).A [F2A2]G:|]

X:1
T:Hunt the Cat
M:6/8
L:1/8
R:Jig
Q:”Allegro”
B:William Forde – 300 National Melodies of the British Isles (c. 1841, p.  26, No. 87)
B: https://www.itma.ie/digital-library/text/300-national-melodies-of-the-british-isles.-vol.-3-100.-irish-airs
N:William Forde (c.1795–1850) was a musician, music collector and scholar from County Cork
Z:AK/Fiddler’s Companion
K:D
A>BA d2f|e<eB d3|A>BA d2e|fdB B2A|
A>BA d2f|e<eB d2f|a>gf g2e|fdB B2A:|
|:agf g2e|FED G2E|agf g2e|fdB B2A|
agf g2e|fed g2e|A>BA g2e|fdB B2A:|]

X:1
T:Huish the Cat
M:6/8
L:1/8
R:Jig
Q:"Quick"
B:P.M. Haverty – One Hundred Irish Airs vol. 1 (1858, No. 87, p. 37)
Z:AK/Fiddler’s Companion
K:C
(G>A).G .c2(e|d<).d.A c2z|(G>A).G .c2d|(ec).A .A2G|
(G>A).G .c2(e|d<).d.A .c2|(g>f).e .f2d|(cA).A A2G:|
|:(gf).e .f2d|(ed).c .f2d|(gf).e .f2d|(ec).A A2G|
(gf).e .f2d|(ed).c .f2.d|(G>A).G f2d|(ec).A [F2A2] G:|]

X:1
T:Huish the Cat
M:6/8
L:1/8
R:Single Jig
S:O'Neill - Dance Music of Ireland: 1001 Gems (1907), No. 382
Z:AK/Fiddler's Companion
K:C
G>AG c2e|d<dA c2e|G>AG c2d|ecA A2c|
G>AG c2e|d<dA c2e|g>fe f2d|ecA A2G:|
|:gfe f2d|edc f2d|gfe f2d|ecA A2G|
gfe f2d|edc f2d|G>AG f2d|ecA A2G:||

X:1
T:Hunt the Cat
M:6/8
L:1/8
B:Roche, vol. 3 (1927, p. 114)
K:Ddor
DED D2A|AGE c3|DED D2A|AGE E2D|
DED D2A|AGE c3|ABc d2B AGE E2D:|
|:dcA AGE|AGE c3|dcA AGE|AGE E2D|
dcA AGE|AGE c3|ABc d2c|AGE E2D:||

Lowbacked Car is messy, with percent signs and the like:

X:1
%
T:Lowbacked Car [1], The
M:6/8
L:1/8
R:Air
S:James Goodman (1828─1896) music manuscript collection, 
S:vol. 3, p. 133. Mid-19th century, County Cork
Z:AK/Fiddler’s Companion
K:G
G|G2B B2d|c2A z2F|G2B d2d|d3 z2G|
c2c A2A|B2B G2B|c2A G2F|G3 z2G|
G2c c2e|e2d d2G|G2c c2e|d3 z2G|
G2g !fermata!g2e|e2d dcB|A2G A2B|!fermata!d3 z2A|
GED G2G|G3 z2B|AGE A2A|A3z B/c/|
dcB dcB|gfe !fermata!d2 B/A/|GED G2G|(G3 G2)||
X:1
%
T:Low Backed Car (1)
M:6/8
L:1/8
B:Howe - Musicians's Omnibus No. 2 (p. 107)
Z:AK/Fiddler's Companion
R:G
G|G2B B2d|c3 A2d|G2 B2 d2d|(d3 d2)B|
c2c A2A |B3 G2G A2A F2F|(G3 G2)||d|
d2g g2e|e2d d2B|d2g g2e|(e3 d2)d|
d2g g2e|e2d d2B|BAG A2B|d2c B2A|
.G.E.E .G2G|(G3 G2)B|AGE A2A|A3 ABc|
(.d.c.B) (.d.c.B)|(.a.a.d) .e.d.B|.G.E.D|(G3 G2)|]
X:1
%
T:Low Backed Car [1], The
M:6/8
L:1/8
R:Jig
B:Kerr - Merry Melodies, vol. 2, No. 257  (c. 1880's)
Z:AK/Fiddler's Companion
K:G
D|G2B B2d|d2c A2F|G2B d2d|(d3 d2) B|
cBc A2A|BAB GAB|c2A G2F|(G3 G2):||
B|G2g g2e|e2d d2B|G2g g2e|d3 cBA|
G2g g2e|e2d dcB|A2G A2B|d3 cBA|
GED G2G|(G3 G2)B|AGE A2A|A3 (ABc)|
dcB dcB|Gfe dBA|GED G2G|(G3 G2)||

Lowbacked Car for 6 is the modal case of a single rendition, which we need to handle as the most common case:

X:1
T:Jaunting Car for Six
M:9/8
L:1/8
R:Slip Jig
S:Kerr - Merry Melodies, vol. 3, No. 233 (c. 1880's)
Z:AK/Fiddler's Companion
K:A
efe c2c c3|efe cde fga|efe c2c c3|BcB B2c def:|
|:e2a agf ecA|e2a agf e3|e2a agf ecA|BcB B2c def:|| 

Tags: javascript, regex, regex-lookarounds, abcjs

Answer


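This is a complete rewrite of the answer, sorry. The following function returns the information you are currently interested in (it could be extended to return more, for example the titles of the renditions as an array that shares indices with the renditions array).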

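function getAbcInfo(abc) {
    // Split before every line that starts with X:<digits> (spaces, tabs and
    // non-breaking spaces are tolerated). The prepended '\n' lets an X: at the
    // very start of the input match as well.
    let renditions = ('\n' + abc).split(/[\r\n]+(?=[ \t\u00a0]*X[ \t\u00a0]*:[ \t\u00a0]*\d+)/);
    // Trim trailing newlines from the last item and leading ones from the first.
    renditions.push(renditions.pop().replace(/[\r\n]+$/, ''));
    renditions.unshift(renditions.shift().replace(/^[\r\n]+/, ''));
    // x[i] is the n of rendition i's X:n line; index 0 has no such line.
    let x = [''];
    // indicesOfX[n] lists every index in renditions whose X: number is n.
    let indicesOfX = {'': [0]};
    for (let i = 1; i < renditions.length; i++) {
        let n = renditions[i].match(/^[ \t\u00a0]*X[ \t\u00a0]*:[ \t\u00a0]*(\d+)/)[1];
        x[i] = n;
        if (n in indicesOfX) {
            indicesOfX[n].push(i);
        } else {
            indicesOfX[n] = [i];
        }
    }
    return {renditions: renditions, x: x, indicesOfX: indicesOfX};
}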

console.log(JSON.stringify(getAbcInfo(brokenPledge)));
// {"renditions":["","X:56…","X:2…"],"x":["","56","2"],"indicesOfX":{"2":[2],"56":[1],"":[0]}}
console.log(JSON.stringify(getAbcInfo(huishTheCat)));
// {"renditions":["","X:1…","X:1….","X:1…","X:1…","X:1…"],"x":["","1","1","1","1","1"],"indicesOfX":{"1":[1,2,3,4,5],"":[0]}}
console.log(JSON.stringify(getAbcInfo(lowbackedCar)));
// {"renditions":["","X:1…","X:1…","X:1…"],"x":["","1","1","1"],"indicesOfX":{"1":[1,2,3],"":[0]}}
console.log(JSON.stringify(getAbcInfo(commonCase)));
// {"renditions":["","X:1…"],"x":["","1"],"indicesOfX":{"1":[1],"":[0]}}
console.log(JSON.stringify(getAbcInfo(brokenPledgeWithoutTheFirstLine)));
// {"renditions":["T:Broken Pledge…","X:2…"],"x":["","2"],"indicesOfX":{"2":[1],"":[0]}}

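The renditions array always holds, at index 0, whatever comes before the first X: (if there is one). This is usually the empty string, but it could be a header with fields the standard allows there, or even a complete rendition whose X: line was simply omitted (against the standard, but humans do not always follow standards).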

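From index 1 onward, the items of renditions are the renditions that start with X: (whitespace is actually allowed, see the regex), with trailing newlines stripped.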

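The x array shares indices with the renditions array and gives the n of each rendition's X:n line. Since the "rendition" at index 0 has no X:n line (it is "unnamed", or rather "unnumbered"), the x array always has the empty string at index 0.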

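The indicesOfX object lets you get the array of indices in renditions for a given n of X:n. In other words, it inverts the key-value relationship of the x array.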
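To connect this back to the question (get the first rendition, or whichever one is asked for), here is a minimal sketch built on the same split as getAbcInfo above. The helper names listRenditions and pickRendition are hypothetical, and the sketch assumes that anything before the first X: can be discarded whenever at least one X: line exists.

```javascript
// Same split as getAbcInfo: break before every line that starts with X:<digits>.
const X_LINE_AHEAD = /[\r\n]+(?=[ \t\u00a0]*X[ \t\u00a0]*:[ \t\u00a0]*\d+)/;

// Hypothetical helper: returns only the numbered renditions. The part before
// the first X: is dropped, unless the input has no X: line at all.
function listRenditions(abc) {
    const parts = ('\n' + abc).split(X_LINE_AHEAD)
        .map(part => part.replace(/^[\r\n]+|[\r\n]+$/g, ''));
    const numbered = parts.slice(1);
    // No X: line anywhere: treat the whole input as a single rendition.
    return numbered.length > 0 ? numbered : [parts[0]];
}

// Hypothetical helper: position is 0-based; returns undefined when out of range.
function pickRendition(abc, position = 0) {
    return listRenditions(abc)[position];
}

const sample = 'X:56\nT:Broken Pledge, The\nK:Ddor\ndcAG ADDB|\n\nX:2\nT:Broken Pledge, The\nK:D\ndcAG A2 dB |\n';
console.log(pickRendition(sample));     // the rendition starting at X:56
console.log(pickRendition(sample, 1));  // the rendition starting at X:2
```

Selecting by position rather than by X: number is deliberate here: as the Huish the Cat example shows, several renditions can share the same number, so the number alone does not identify one.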

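If you want to extend the function to add a titles array to the output, remember that you cannot simply match a T:. You have to allow for whitespace (the regexes I used allow spaces, tabs and non-breaking spaces; do not use \s*, because it includes \n), and the T: must be preceded by a newline, except in the rendition at index 0, where it can be at the very start of the string. The text of a T: field ends at a newline ([\r\n]).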
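The constraints just described could be sketched as follows. The function name getTitles is hypothetical, and returning every T: field of a rendition (rather than only the first) is an assumption of this sketch, not something the answer specifies.

```javascript
// Hypothetical helper: collect the text of every T: field in one rendition.
// (?:^|[\r\n]) lets T: match only at the start of the string or of a line;
// [ \t\u00a0]* allows spaces, tabs and non-breaking spaces but, unlike \s*,
// never crosses a line break. The title text runs up to the next \r or \n.
function getTitles(rendition) {
    const titleField = /(?:^|[\r\n])[ \t\u00a0]*T[ \t\u00a0]*:[ \t\u00a0]*([^\r\n]*)/g;
    return Array.from(rendition.matchAll(titleField), m => m[1].trimEnd());
}

console.log(getTitles('X:1\n%\nT:Lowbacked Car [1], The\nM:6/8\nK:G'));
```

To build a titles array that shares indices with renditions, one option would be to map this helper over the renditions array and keep the first title of each item.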

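By the way, you may want to "normalize" newlines by replacing every \r with nothing or, if you worry that old Mac Classic files (where the newline is a lone \r) may be around, by replacing every \r\n with \n and then every remaining \r with \n. Once you are sure there are no \r left, you can use ^ with the m (multiline) flag to match both the start of a new line and the start of the string.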
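A minimal sketch of that normalization, followed by a ^ plus m-flag match, might look like this. The name normalizeNewlines is hypothetical, and the input string is made up to mix all three newline styles.

```javascript
// Hypothetical helper: turn Windows (\r\n) and Mac Classic (\r) newlines into \n.
// \r\n must be handled first so that it does not become two newlines.
function normalizeNewlines(text) {
    return text.replace(/\r\n/g, '\n').replace(/\r/g, '\n');
}

const normalized = normalizeNewlines('X:1\r\nT:One\rX:2\nT:Two');

// With the m flag, ^ matches at the start of the string and after each newline,
// so the prepended '\n' and the [\r\n]+ prefix are no longer needed.
const xNumbers = Array.from(
    normalized.matchAll(/^[ \t\u00a0]*X[ \t\u00a0]*:[ \t\u00a0]*(\d+)/gm),
    m => m[1]
);
console.log(xNumbers); // the X: numbers found, here "1" and "2"
```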

