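c# - Concatenated columns should have the same type (how to convert or cast DateTime to the correct type) ML.NET C# SQL
Question
I want to use a DateTime value as a feature of my machine learning model.
It produces the following error:
System.InvalidOperationException: Concatenated columns should have the same type. Column 'DateTime' has type of DateTime, but expected column type is Single.
I have the following code: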
public static IEstimator<ITransformer> BuildTrainingPipeLine(MLContext mLContext)
{
    // Data process configuration with pipeline data transformations
    var dataProcessPipeline = mLContext.Transforms.Categorical.OneHotEncoding(outputColumnName: "LatitudeEncoded", inputColumnName: "Latitude")
        .Append(mLContext.Transforms.Categorical.OneHotEncoding(outputColumnName: "LongitudeEncoded", inputColumnName: "Longitude"))
        .Append(mLContext.Transforms.Concatenate("Features", new[] { "LatitudeEncoded", "LongitudeEncoded", "DateTime", "Temperature", "Unit" }));

    // Set the training algorithm
    var trainer = mLContext.Regression.Trainers.FastTree(new FastTreeRegressionTrainer.Options()
    {
        NumberOfLeaves = 20,
        MinimumExampleCountPerLeaf = 10,
        NumberOfTrees = 500,
        LearningRate = 0.2822519f,
        Shrinkage = 2.151229f,
        LabelColumnName = "FillLevel",
        FeatureColumnName = "Features"
    });

    var trainingPipeline = dataProcessPipeline.Append(trainer);
    return trainingPipeline;
}
The error occurs on this line of code:
ITransformer model = trainingPipeline.Fit(dataView);
public static ITransformer Train(MLContext mLContext, IDataView dataView, IEstimator<ITransformer> trainingPipeline)
{
    Console.WriteLine("Training start");
    ITransformer model = trainingPipeline.Fit(dataView);
    Console.WriteLine("Training done");
    return model;
}
My Main method looks like this:
static void Main(string[] args)
{
    var mLContext = new MLContext();
    var loader = mLContext.Data.CreateDatabaseLoader<Message>();
    var connectionString = GetDbConnection();
    var sqlCommand = "SELECT CAST(MessageId as REAL) as MessageId, CAST(DateTime as string) as DateTime, CAST(FillLevel as REAL) as FillLevel, " +
        "CAST(Temperature as REAL) as Temperature, CAST(Latitude as REAL) as Latitude, CAST(Longitude as REAL) as Longitude, " +
        "CAST(MessageType as REAL) as MessageType, CAST(Unit as REAL) as Unit from Test WHERE Unit = 1";
    var dbSource = new DatabaseSource(SqlClientFactory.Instance, connectionString, sqlCommand);

    Console.WriteLine("Loading data from database");
    IDataView data = loader.Load(dbSource);
    var set = mLContext.Data.TrainTestSplit(data, testFraction: 0.2);

    Console.WriteLine("Preparing training operations");
    var trainingData = set.TrainSet;
    var testData = set.TestSet;
    IEstimator<ITransformer> trainingPipeline = BuildTrainingPipeLine(mLContext);
    ITransformer model = Train(mLContext, trainingData, trainingPipeline);
    Evaluate(mLContext, model, testData, trainingPipeline);
}
The GetDbConnection() function:
private static string GetDbConnection()
{
    var builder = new ConfigurationBuilder()
        .SetBasePath(Directory.GetCurrentDirectory())
        .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true);
    return builder.Build().GetConnectionString("DbConnection");
}
My Message class looks like this:
public class Message
{
    [ColumnName("MessageId"), LoadColumn(0)]
    public float Messageid;

    [ColumnName("DateTime"), LoadColumn(1)]
    public DateTime DateTime;

    [ColumnName("FillLevel"), LoadColumn(2)]
    public float FillLevel;

    [ColumnName("Temperature"), LoadColumn(3)]
    public float Temperature;

    [ColumnName("Latitude"), LoadColumn(4)]
    public float Latitude;

    [ColumnName("Longitude"), LoadColumn(5)]
    public float Longitude;

    [ColumnName("MessageType"), LoadColumn(6)]
    public float MessageType;

    [ColumnName("Unit"), LoadColumn(7)]
    public float Unit;
}
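Solution
You can solve this with either a custom mapping function or a general type conversion. Concatenate requires every input column to have the same type (Single here), so the DateTime column has to be turned into a float first.
Custom mapping: add a custom output class next to your model input class: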
public class Message
{
    [ColumnName("MessageId"), LoadColumn(0)]
    public float Messageid;

    [ColumnName("DateTime"), LoadColumn(1)]
    public DateTime DateTime;

    [ColumnName("FillLevel"), LoadColumn(2)]
    public float FillLevel;

    [ColumnName("Temperature"), LoadColumn(3)]
    public float Temperature;

    [ColumnName("Latitude"), LoadColumn(4)]
    public float Latitude;

    [ColumnName("Longitude"), LoadColumn(5)]
    public float Longitude;

    [ColumnName("MessageType"), LoadColumn(6)]
    public float MessageType;

    [ColumnName("Unit"), LoadColumn(7)]
    public float Unit;
}

public class CustomMappingOutput
{
    [ColumnName("CustomMappingOutput")]
    public float CustomDateHour { get; set; }
}
Create the custom mapping:
[CustomMappingFactoryAttribute("CustomDateMapping")]
private class CustomDate : CustomMappingFactory<Message, CustomMappingOutput>
{
    // Copies the hour of the day (0-23) into the float output column.
    // Message.DateTime is already a DateTime, so no Convert.ToDateTime call is needed.
    public static void CustomAction(Message input, CustomMappingOutput output)
    {
        output.CustomDateHour = input.DateTime.Hour;
    }

    public override Action<Message, CustomMappingOutput> GetMapping()
        => CustomAction;
}
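The contract name passed to CustomMappingFactoryAttribute matters mainly when a saved model is loaded again: ML.NET has to find the factory class by that name. A minimal sketch of the loading side follows, assuming the model was saved to a file; the helper name LoadWithCustomMapping and the modelPath and factoryType parameters are hypothetical and only serve the illustration.

```csharp
using System;
using Microsoft.ML;

public static class ModelLoading
{
    // Hypothetical helper: factoryType is the class carrying the
    // [CustomMappingFactoryAttribute("CustomDateMapping")] attribute, e.g. typeof(CustomDate).
    public static ITransformer LoadWithCustomMapping(MLContext mlContext, string modelPath, Type factoryType)
    {
        // Register the assembly that contains the factory so the loader
        // can resolve the custom mapping by its contract name.
        mlContext.ComponentCatalog.RegisterAssembly(factoryType.Assembly);

        // Load the saved model; the input schema is returned but not used here.
        return mlContext.Model.Load(modelPath, out DataViewSchema inputSchema);
    }
}
```

If the model is only trained and used in the same process and never saved, this registration step should not be needed.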
Add it to the pipeline:
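var dataProcessPipeline =
    _mlContext.Transforms.CustomMapping(new CustomDate().GetMapping(), "CustomDateMapping")
        // "CustomMappingOutput" is the float column produced by the mapping;
        // list the other float feature columns of Message after it.
        .Append(_mlContext.Transforms.Concatenate("Features",
            "CustomMappingOutput",
            nameof(Message.Temperature)))
        .AppendCacheCheckpoint(_mlContext);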
See also: the ML.NET Cookbook.
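The hour alone may not carry enough signal for every model. As a rough sketch of the same idea, a lambda-based custom mapping can emit several date parts at once. The classes DateInput and DateParts and the helper DateFeaturePipeline below are hypothetical names made up for this example, not part of the question's code, and passing contractName: null assumes the model will not be saved to disk.

```csharp
using System;
using Microsoft.ML;

// Hypothetical input type for illustration; only the columns used below are declared.
public class DateInput
{
    public DateTime DateTime { get; set; }
    public float Temperature { get; set; }
}

// Hypothetical output type: every date part becomes its own float column.
public class DateParts
{
    public float Hour { get; set; }
    public float DayOfWeek { get; set; }
    public float Month { get; set; }
}

public static class DateFeaturePipeline
{
    public static IEstimator<ITransformer> Build(MLContext mlContext)
    {
        // Split the timestamp into numeric parts that Concatenate can accept.
        Action<DateInput, DateParts> mapping = (input, output) =>
        {
            output.Hour = input.DateTime.Hour;
            output.DayOfWeek = (float)input.DateTime.DayOfWeek; // Sunday = 0
            output.Month = input.DateTime.Month;
        };

        // contractName: null means the resulting model cannot be saved;
        // use a factory class with a contract name if saving is required.
        return mlContext.Transforms
            .CustomMapping(mapping, contractName: null)
            .Append(mlContext.Transforms.Concatenate("Features",
                nameof(DateParts.Hour),
                nameof(DateParts.DayOfWeek),
                nameof(DateParts.Month),
                nameof(DateInput.Temperature)));
    }
}
```

Which date parts are useful depends on the data; these three are only an example.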
General conversion:
_mlContext.Transforms.Conversion.ConvertType(nameof(Message.DateTime), outputKind: DataKind.Single)
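To show where the conversion sits relative to Concatenate, here is a minimal sketch that writes the converted value to a separate column and then builds the feature vector. The column name "DateTimeAsSingle" and the helper class ConvertTypePipeline are assumptions made for this example; the other column names come from the question.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

public static class ConvertTypePipeline
{
    public static IEstimator<ITransformer> Build(MLContext mlContext)
    {
        return mlContext.Transforms.Conversion
            // Convert the DateTime column into a new float column
            // ("DateTimeAsSingle" is a made-up name for this sketch).
            .ConvertType(outputColumnName: "DateTimeAsSingle",
                         inputColumnName: "DateTime",
                         outputKind: DataKind.Single)
            // All inputs are now Single, so Concatenate no longer complains.
            .Append(mlContext.Transforms.Concatenate("Features",
                "DateTimeAsSingle", "Temperature", "Unit"));
    }
}
```

As far as I know, the converted value is the timestamp's tick count stored as a float. A float keeps only about seven significant digits, so much of the time-of-day detail is likely lost, and extracting date parts with a custom mapping may give the model more usable features.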