C# Regex: extracting the featured artist ("feat.") from an audio filename

Question

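I have an audio file named "The Kemist, Nyanda ft. Braindead - Mayhem 2 (Dj Reg Refix)". I want to extract the featured artist with Regex (yes, Regex, because this sample is only the first of several operations, and Regex will make them simpler), so that I get "Braindead".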

What I have so far:

    public const string Feature1 = "ft?.\\s";
    public const string Feature2 = "feat?.\\s";
    public const string Feature3 = "featuring\\s";

    public const string Hyphen1 = "-";
    public const string Comma1 = ",";
    public const string And = "&";

    public const string BracketOpen1 = "(";
    public const string BracketOpen2 = "[";
    public const string BracketOpen3 = "{";

    public const string BracketClosed1 = ")";
    public const string BracketClosed2 = "]";
    public const string BracketClosed3 = "}";

    /// <summary>
    /// The words / Signs / Chars which indicate a new Artist / Feature / Title
    /// </summary>
    public static List<string> WordStopper = new List<string>()
    {
        Feature1, Feature2, Feature3,
        BracketOpen1, BracketOpen2, BracketOpen3,
        BracketClosed1, BracketClosed2, BracketClosed3,
        Hyphen1, Comma1
    };

    /// <summary>
    /// The start of a new feature
    /// </summary>
    public static List<string> FeatureBeginning = new List<string>()
    {
        Feature1,
        Feature2,
        Feature3
    };

    private static List<string> GetFeatures(string filename)
    {
        // Set the left side
        string starter = string.Join("|", FeatureBeginning.Select(w => w));

        // Set the right side
        string stopper = string.Join("|", WordStopper.Select(w => w));

        // Get the matches
        MatchCollection matches = Regex.Matches(filename, $"{starter}(\\.+){stopper}", RegexOptions.IgnoreCase);

        return null;
    }

This gives me the following error: "{System.ArgumentException: parsing 'ft?.\s|feat?.\s|featuring\s(.+)ft?.\s|feat?.\s|featuring\s| (|[|{|)|]|}|-|,' - Not enough )'s."

What am I doing wrong here?

Tags: c#, regex

Answer


This should work:

    // Requires: using System.Collections.Generic; using System.Linq;
    //           using System.Text.RegularExpressions;

    // The dot is escaped and made optional, so these match "ft ", "ft. ", "feat " and "feat. ".
    public const string Feature1 = @"ft\.?\s";
    public const string Feature2 = @"feat\.?\s";
    public const string Feature3 = @"featuring\s";

    public const string Hyphen1 = "-";
    public const string Comma1 = ",";
    public const string And = "&";

    // Brackets are regex metacharacters and must be escaped to match literally.
    public const string BracketOpen1 = @"\(";
    public const string BracketOpen2 = @"\[";
    public const string BracketOpen3 = @"\{";

    public const string BracketClosed1 = @"\)";
    public const string BracketClosed2 = @"\]";
    public const string BracketClosed3 = @"\}";

    /// <summary>
    /// The words / Signs / Chars which indicate a new Artist / Feature / Title
    /// </summary>
    public static List<string> WordStopper = new List<string>()
    {
        Feature1, Feature2, Feature3,
        BracketOpen1, BracketOpen2, BracketOpen3,
        BracketClosed1, BracketClosed2, BracketClosed3,
        Hyphen1, Comma1
    };

    /// <summary>
    /// The start of a new feature
    /// </summary>
    public static List<string> FeatureBeginning = new List<string>()
    {
        Feature1,
        Feature2,
        Feature3
    };

    public static List<string> GetFeatures(string filename)
    {
        // Set the left side: each alternative is wrapped in its own group
        string starter = "(" + string.Join(")|(", FeatureBeginning.ToArray()) + ")";

        // Set the right side
        string stopper = "(" + string.Join(")|(", WordStopper.ToArray()) + ")";

        // Get the matches: lookbehind for a starter, lazy (.+?) up to the first stopper
        MatchCollection matches = Regex.Matches(filename, "(?<=(" + starter + "))(.+?)(?=(" + stopper + "))", RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Return the matched text without surrounding whitespace
        return matches.Cast<Match>().Select(m => m.Value.Trim()).ToList();
    }

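You have to check your expressions: some of them were not escaped. Characters such as "(", "[" and "{" are regex metacharacters, so the unescaped "(" in your stopper list opens a group that is never closed, which is what the "Not enough )'s" error reports. The alternations also need to be wrapped in groups, otherwise each "|" splits the whole pattern. Finally, the greedy (.+) matched everything up to the last stopper rather than up to the first one; the lazy (.+?) stops at the first.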
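As an alternative to escaping each stopper by hand, the literal characters can be passed through Regex.Escape when the pattern is built. The following is only a minimal sketch of that idea, not the answer's own code: the class name FeatureExtractor, the two arrays, the leading \b, and the "or end of string" branch in the lookahead are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class FeatureExtractor
{
    // Marker fragments are real regex syntax, so the dot is escaped by hand.
    private static readonly string[] FeatureMarkers = { @"ft\.?\s", @"feat\.?\s", @"featuring\s" };

    // Literal stop characters; Regex.Escape turns "(" into "\(" and so on.
    private static readonly string[] LiteralStoppers = { "(", "[", "{", ")", "]", "}", "-", "," };

    public static List<string> GetFeatures(string filename)
    {
        string starter = string.Join("|", FeatureMarkers);
        string stopper = string.Join("|", FeatureMarkers.Concat(LiteralStoppers.Select(Regex.Escape)));

        // (?:...) keeps each alternation contained so "|" cannot split the whole pattern.
        // (.+?) is lazy and stops at the first stopper; "$" (an assumption) lets a
        // feature run to the end of the name when no stopper follows.
        string pattern = $@"\b(?:{starter})(.+?)(?=\s*(?:{stopper}|$))";

        return Regex.Matches(filename, pattern, RegexOptions.IgnoreCase)
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value.Trim())
                    .ToList();
    }
}
```

Called as FeatureExtractor.GetFeatures("The Kemist, Nyanda ft. Braindead - Mayhem 2 (Dj Reg Refix)"), this sketch returns a list containing the single entry "Braindead". Reading capture group 1 instead of using a lookbehind keeps the marker itself out of the result without relying on variable-length lookbehind.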

