Concatenated columns should have the same type (how to convert or cast DateTime to the correct type) ML.NET C# SQL

Problem description

I want to use a DateTime column as a feature for my machine learning model.

It produces the following error:

System.InvalidOperationException: Concatenated columns should have the same type. Column 'DateTime' has type of DateTime, but expected column type is Single.

I have the following code:

public static IEstimator<ITransformer> BuildTrainingPipeLine(MLContext mLContext)
{
    // Data process configuration with pipeline data transformations 
    var dataProcessPipeline = mLContext.Transforms.Categorical.OneHotEncoding(outputColumnName: "LatitudeEncoded", inputColumnName:"Latitude")
        .Append(mLContext.Transforms.Categorical.OneHotEncoding(outputColumnName:"LongitudeEncoded", inputColumnName:"Longitude"))
        .Append(mLContext.Transforms.Concatenate("Features", new[] { "LatitudeEncoded", "LongitudeEncoded", "DateTime", "Temperature", "Unit" }));

    // Set the training algorithm 
    var trainer = mLContext.Regression.Trainers.FastTree(new FastTreeRegressionTrainer.Options()
    {
        NumberOfLeaves = 20,
        MinimumExampleCountPerLeaf = 10,
        NumberOfTrees = 500,
        LearningRate = 0.2822519f,
        Shrinkage = 2.151229f,
        LabelColumnName = "FillLevel",
        FeatureColumnName = "Features"
    });

    var trainingPipeline = dataProcessPipeline.Append(trainer);

    return trainingPipeline;
}

The error occurs on this line of code:

ITransformer model = trainingPipeline.Fit(dataView);

public static ITransformer Train(MLContext mLContext, IDataView dataView, IEstimator<ITransformer> trainingPipeline)
{
    Console.WriteLine("Training start");
    ITransformer model = trainingPipeline.Fit(dataView);
    Console.WriteLine("Training done");
    return model;
}

My Main method looks like this:

static void Main(string[] args)
{
    var mLContext = new MLContext();
    var loader = mLContext.Data.CreateDatabaseLoader<Message>();
    var connectionString = GetDbConnection();

    var sqlCommand = "SELECT CAST(MessageId as REAL) as MessageId, CAST(DateTime as string) as DateTime, CAST(FillLevel as REAL) as FillLevel, " +
        "CAST(Temperature as REAL) as Temperature, CAST(Latitude as REAL) as Latitude, CAST(Longitude as REAL) as Longitude, " +
        "CAST(MessageType as REAL) as MessageType, CAST(Unit as REAL) as Unit from Test WHERE Unit = 1";

    var dbSource = new DatabaseSource(SqlClientFactory.Instance, connectionString, sqlCommand);
    Console.WriteLine("Loading data from database");
    IDataView data = loader.Load(dbSource);
    var set = mLContext.Data.TrainTestSplit(data, testFraction: 0.2);
    Console.WriteLine("Preparing training operations");
    var trainingData = set.TrainSet;
    var testData = set.TestSet;
    IEstimator<ITransformer> trainingPipeline = BuildTrainingPipeLine(mLContext);
    ITransformer model = Train(mLContext, trainingData, trainingPipeline);
    Evaluate(mLContext, model, testData, trainingPipeline);
}

The GetDbConnection() function:

private static string GetDbConnection()
{
    var builder = new ConfigurationBuilder().SetBasePath(Directory.GetCurrentDirectory()).AddJsonFile("appsettings.json", optional: true, reloadOnChange: true);
    return builder.Build().GetConnectionString("DbConnection");
}

My Message class looks like this:

public class Message
{
    [ColumnName("MessageId"), LoadColumn(0)]
    public float Messageid;

    [ColumnName("DateTime"), LoadColumn(1)]
    public DateTime DateTime;

    [ColumnName("FillLevel"), LoadColumn(2)]
    public float FillLevel;

    [ColumnName("Temperature"), LoadColumn(3)]
    public float Temperature;

    [ColumnName("Latitude"), LoadColumn(4)]
    public float Latitude;

    [ColumnName("Longitude"), LoadColumn(5)]
    public float Longitude;

    [ColumnName("MessageType"), LoadColumn(6)]
    public float MessageType;

    [ColumnName("Unit"), LoadColumn(7)]
    public float Unit;
}

Tags: c#, sql-server, ml.net

Solution


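Concatenate only accepts columns of a single type, so the DateTime column has to be turned into a float first. You can do this with either a custom mapping function or a general type conversion: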

Add a custom output class alongside your model input class:

public class Message
{
    [ColumnName("MessageId"), LoadColumn(0)]
    public float Messageid;

    [ColumnName("DateTime"), LoadColumn(1)]
    public DateTime DateTime;

    [ColumnName("FillLevel"), LoadColumn(2)]
    public float FillLevel;

    [ColumnName("Temperature"), LoadColumn(3)]
    public float Temperature;

    [ColumnName("Latitude"), LoadColumn(4)]
    public float Latitude;

    [ColumnName("Longitude"), LoadColumn(5)]
    public float Longitude;

    [ColumnName("MessageType"), LoadColumn(6)]
    public float MessageType;

    [ColumnName("Unit"), LoadColumn(7)]
    public float Unit;
}

public class CustomMappingOutput
{
    [ColumnName("CustomMappingOutput")]
    public float CustomDateHour { get; set; }
}

Create the custom mapping:

    [CustomMappingFactoryAttribute("CustomDateMapping")]
    private class CustomDate : CustomMappingFactory<Message, CustomMappingOutput>
    {
        // Copies the hour of day (0-23) from the DateTime field into the float output column.
        public static void CustomAction(Message input, CustomMappingOutput output)
        {
            output.CustomDateHour = (float)input.DateTime.Hour;
        }

        // Returns the mapping registered under the "CustomDateMapping" contract name.
        public override Action<Message, CustomMappingOutput> GetMapping()
            => CustomAction;
    }

Add it to the pipeline:

    var dataProcessPipeline =
        _mlContext.Transforms.CustomMapping(new CustomDate().GetMapping(), "CustomDateMapping")
            // Concatenate float columns only; the mapped output stands in for the raw DateTime column.
            .Append(_mlContext.Transforms.Concatenate("Features",
                "CustomMappingOutput",
                nameof(Message.Temperature),
                nameof(Message.Unit)))
            .AppendCacheCheckpoint(_mlContext);
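The mapping above keeps only the hour. As a rough sketch of how more signal could be pulled out of the same timestamp, the helper below derives a few float values from a DateTime using nothing but the base class library. `DateTimeFeatures` and `Extract` are hypothetical names introduced for illustration, not part of ML.NET, and the choice of features is an assumption rather than anything the original answer prescribes.

```csharp
using System;

// Hypothetical helper (not part of ML.NET): turns one timestamp into float features.
public static class DateTimeFeatures
{
    // Returns { hour, dayOfWeek, dayOfYear, sin(hour angle), cos(hour angle) }.
    public static float[] Extract(DateTime timestamp)
    {
        // Map the hour onto a circle so that 23:00 and 00:00 end up close together.
        double angle = 2.0 * Math.PI * timestamp.Hour / 24.0;

        return new[]
        {
            (float)timestamp.Hour,
            (float)(int)timestamp.DayOfWeek, // Sunday = 0 ... Saturday = 6
            (float)timestamp.DayOfYear,
            (float)Math.Sin(angle),
            (float)Math.Cos(angle),
        };
    }
}
```

Inside a custom mapping, each of these values could be assigned to its own float property on the output class, in the same way CustomDateHour is filled in.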

See the ML.NET Cookbook.

General conversion:

_mlContext.Transforms.Conversion.ConvertType(nameof(Message.DateTime), outputKind: DataKind.Single)

See the conversions extensions catalog.
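Whichever conversion is used, a Single has a 24-bit significand (roughly seven decimal digits), so a large absolute timestamp value can lose resolution once it is narrowed to float. A minimal sketch of one way to keep the number small, assuming a caller-chosen reference date: subtract first at full precision, then cast. `DateTimeScaling` and `DaysSince` are hypothetical names for illustration, not ML.NET APIs.

```csharp
using System;

// Hypothetical helper (not part of ML.NET).
public static class DateTimeScaling
{
    // Days elapsed since a caller-chosen reference date, as a float.
    public static float DaysSince(DateTime timestamp, DateTime reference)
    {
        // Subtract at DateTime/TimeSpan precision first, then narrow to float.
        return (float)(timestamp - reference).TotalDays;
    }
}
```

A value produced this way could be written to a float property inside a custom mapping, with the reference date set near the start of the training data.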

