python - Getting ValueError: could not convert string to float: "'H4'" when loading a dataset of strings and numbers
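Question
I am building a neural network to classify my poker bot's actions from my poker games. I am using a simple neural network script for the task, but when I feed it my own dataset I get an error. Can a neural network accept a dataset that mixes strings and numbers, like mine?
The error says:
ValueError: could not convert string to float: "'H4'"
Here is my code: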
from keras.models import Sequential
from keras.layers import Dense, Dropout
from sklearn.model_selection import train_test_split
import numpy
import matplotlib.pyplot as plt
numpy.random.seed(2)
# load the dataset
dataset = numpy.loadtxt("monteCarlo.csv", delimiter=",")
# split input (X) and output (Y) variables, splitting csv data
X = dataset[:,0:8]
Y = dataset[:,8]
#split x,y train,test
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
# create the model, adding dense layers one by one and specifying each activation function
model = Sequential()
model.add(Dense(15, input_dim=8, activation='relu')) # input layer requires input_dim param
model.add(Dense(10, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dropout(.2))
model.add(Dense(1, activation='sigmoid'))
# compile the model, adam gradient descent (optimized)
# adam or adamax
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=['accuracy'])
# call the function to fit to the data (training the network)
history = model.fit(x_train, y_train, epochs=1000, batch_size=20, validation_data=(x_test, y_test))
# save the model
model.save('pokerClassifier.h5')
#evaluate model
scores = model.evaluate(X, Y, verbose=1)
print('Test loss: ',scores[0])
print('accuracy: ',scores[1]*100 ,'%')
#plot accuracy
plt.figure(1)
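plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])  # plot the validation curve too, so the legend has two entries
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epochs')
plt.legend(['train','test'], loc='upper left')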
plt.show()
plt.figure(2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epochs')
plt.legend(['train','test'], loc='upper left')
plt.show()
Here is my CSV dataset:
'H4','D7','D3','C5','C6',0.82,'C'
'H4','D1','D3','C2','C6',0.22,'F'
'H4','D7','D9','C9','C9',0.55,'C'
'H4','D7','D3','C5','C6',0.82,'C'
'H4','D1','D3','C2','C6',0.22,'F'
'H4','D7','D9','C9','C9',0.55,'C'
'H4','D7','D3','C5','C6',0.82,'C'
'H4','D1','D3','C2','C6',0.22,'F'
'H4','D7','D9','C9','C9',0.55,'C'
'H4','D7','A3','C5','C6',0.84,'C'
'H4','D1','D3','C9','C6',0.44,'F'
Answer
With loadtxt, use a string dtype:
In [4]: data = np.loadtxt('h4.csv', delimiter=',', dtype='U4')
In [5]: data
Out[5]:
array([["'H4'", "'D7'", "'D3'", "'C5'", "'C6'", '0.82', "'C'"],
["'H4'", "'D1'", "'D3'", "'C2'", "'C6'", '0.22', "'F'"],
["'H4'", "'D7'", "'D9'", "'C9'", "'C9'", '0.55', "'C'"],
["'H4'", "'D7'", "'D3'", "'C5'", "'C6'", '0.82', "'C'"],
["'H4'", "'D1'", "'D3'", "'C2'", "'C6'", '0.22', "'F'"],
["'H4'", "'D7'", "'D9'", "'C9'", "'C9'", '0.55', "'C'"],
["'H4'", "'D7'", "'D3'", "'C5'", "'C6'", '0.82', "'C'"],
["'H4'", "'D1'", "'D3'", "'C2'", "'C6'", '0.22', "'F'"],
["'H4'", "'D7'", "'D9'", "'C9'", "'C9'", '0.55', "'C'"],
["'H4'", "'D7'", "'A3'", "'C5'", "'C6'", '0.84', "'C'"],
["'H4'", "'D1'", "'D3'", "'C9'", "'C6'", '0.44', "'F'"]],
dtype='<U4')
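The array above still contains the literal single quotes from the file, and the numeric column is held as text. A minimal sketch of one way to clean that up, assuming the same seven-column layout as the question's data; the inline sample stands in for the CSV file, and the names `cards`, `score`, and `labels` are hypothetical:

```python
import io
import numpy as np

# Two sample rows standing in for the CSV file (assumption: same layout as the question's data).
sample = io.StringIO(
    "'H4','D7','D3','C5','C6',0.82,'C'\n"
    "'H4','D1','D3','C2','C6',0.22,'F'\n"
)

# Load everything as 4-character strings, as in the transcript above.
data = np.loadtxt(sample, delimiter=",", dtype="U4")

# Strip the single quotes that are part of each stored string.
clean = np.char.strip(data, "'")

cards = clean[:, :5]                # the five card columns, still strings
score = clean[:, 5].astype(float)   # the numeric column, converted explicitly
labels = clean[:, 6]                # the action column ('C' or 'F' in the sample)
```

The card columns remain strings after this step, so they would still need a numeric encoding before being passed to a Keras model.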
With genfromtxt, use dtype=None:
In [7]: data = np.genfromtxt('h4.csv', delimiter=',', dtype=None, encoding=None)
...:
In [8]: data
Out[8]:
array([("'H4'", "'D7'", "'D3'", "'C5'", "'C6'", 0.82, "'C'"),
("'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"),
("'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"),
("'H4'", "'D7'", "'D3'", "'C5'", "'C6'", 0.82, "'C'"),
("'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"),
("'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"),
("'H4'", "'D7'", "'D3'", "'C5'", "'C6'", 0.82, "'C'"),
("'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"),
("'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"),
("'H4'", "'D7'", "'A3'", "'C5'", "'C6'", 0.84, "'C'"),
("'H4'", "'D1'", "'D3'", "'C9'", "'C6'", 0.44, "'F'")],
dtype=[('f0', '<U4'), ('f1', '<U4'), ('f2', '<U4'), ('f3', '<U4'), ('f4', '<U4'), ('f5', '<f8'), ('f6', '<U3')])
In [9]: data['f0']
Out[9]:
array(["'H4'", "'H4'", "'H4'", "'H4'", "'H4'", "'H4'", "'H4'", "'H4'",
"'H4'", "'H4'", "'H4'"], dtype='<U4')
In [11]: data['f5']
Out[11]: array([0.82, 0.22, 0.55, 0.82, 0.22, 0.55, 0.82, 0.22, 0.55, 0.84, 0.44])
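This array does not have columns; it has named fields instead. It is a structured array. Note, however, that the 'f5' field is now loaded as floats, while the other fields are strings.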
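A network needs a plain numeric matrix, so the string fields have to be encoded. The sketch below shows one possible approach, not the only one: it maps each distinct string in a field to an integer code with `np.unique`. The inline sample stands in for the CSV file, and treating `'C'` as the positive class is an assumption made for illustration.

```python
import io
import numpy as np

# Three sample rows standing in for the CSV file (assumption: same layout as the question's data).
sample = io.StringIO(
    "'H4','D7','D3','C5','C6',0.82,'C'\n"
    "'H4','D1','D3','C2','C6',0.22,'F'\n"
    "'H4','D7','D9','C9','C9',0.55,'C'\n"
)

data = np.genfromtxt(sample, delimiter=",", dtype=None, encoding=None)

columns = []
for name in ["f0", "f1", "f2", "f3", "f4"]:
    # return_inverse gives, for every row, the index of its value
    # within the sorted unique values of that field.
    _, codes = np.unique(data[name], return_inverse=True)
    columns.append(codes.astype(float))

columns.append(data["f5"])  # already float, used as-is

X = np.column_stack(columns)              # shape: (rows, 6)
y = (data["f6"] == "'C'").astype(float)   # assumption: 'C' is class 1, anything else class 0
```

Integer codes imply an ordering that the cards may not have, so a one-hot encoding may suit a neural network better. Note also that `X` has 6 columns here, so the `input_dim` of the first layer would have to match that.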
With pandas:
In [15]: df = pd.read_csv('h4.csv')
In [16]: df
Out[16]:
'H4' 'D7' 'D3' 'C5' 'C6' 0.82 'C'
0 'H4' 'D1' 'D3' 'C2' 'C6' 0.22 'F'
1 'H4' 'D7' 'D9' 'C9' 'C9' 0.55 'C'
2 'H4' 'D7' 'D3' 'C5' 'C6' 0.82 'C'
3 'H4' 'D1' 'D3' 'C2' 'C6' 0.22 'F'
4 'H4' 'D7' 'D9' 'C9' 'C9' 0.55 'C'
5 'H4' 'D7' 'D3' 'C5' 'C6' 0.82 'C'
6 'H4' 'D1' 'D3' 'C2' 'C6' 0.22 'F'
7 'H4' 'D7' 'D9' 'C9' 'C9' 0.55 'C'
8 'H4' 'D7' 'A3' 'C5' 'C6' 0.84 'C'
9 'H4' 'D1' 'D3' 'C9' 'C6' 0.44 'F'
In [17]: df.dtypes
Out[17]:
'H4' object
'D7' object
'D3' object
'C5' object
'C6' object
0.82 float64
'C' object
dtype: object
In [18]: df.values
Out[18]:
array([["'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"],
["'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"],
["'H4'", "'D7'", "'D3'", "'C5'", "'C6'", 0.82, "'C'"],
["'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"],
["'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"],
["'H4'", "'D7'", "'D3'", "'C5'", "'C6'", 0.82, "'C'"],
["'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"],
["'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"],
["'H4'", "'D7'", "'A3'", "'C5'", "'C6'", 0.84, "'C'"],
["'H4'", "'D1'", "'D3'", "'C9'", "'C6'", 0.44, "'F'"]],
dtype=object)
Note that this is object dtype, to accommodate the mix of strings and floats (in addition, pandas by default uses object rather than numpy string dtypes).
Or as something closer to a structured array:
In [19]: df.to_records()
Out[19]:
rec.array([(0, "'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"),
(1, "'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"),
(2, "'H4'", "'D7'", "'D3'", "'C5'", "'C6'", 0.82, "'C'"),
(3, "'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"),
(4, "'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"),
(5, "'H4'", "'D7'", "'D3'", "'C5'", "'C6'", 0.82, "'C'"),
(6, "'H4'", "'D1'", "'D3'", "'C2'", "'C6'", 0.22, "'F'"),
(7, "'H4'", "'D7'", "'D9'", "'C9'", "'C9'", 0.55, "'C'"),
(8, "'H4'", "'D7'", "'A3'", "'C5'", "'C6'", 0.84, "'C'"),
(9, "'H4'", "'D1'", "'D3'", "'C9'", "'C6'", 0.44, "'F'")],
dtype=[('index', '<i8'), ("'H4'", 'O'), ("'D7'", 'O'), ("'D3'", 'O'), ("'C5'", 'O'), ("'C6'", 'O'), ('0.82', '<f8'), ("'C'", 'O')])
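In the pandas transcript above, `read_csv` took the first data row as the header, which is why only 10 rows remain and the quotes are still part of the strings. A minimal sketch of avoiding both, assuming the same seven-column layout; the inline sample stands in for the CSV file, the column names are hypothetical, and treating `C` as the positive class is an assumption:

```python
import io
import pandas as pd

# Three sample rows standing in for the CSV file (assumption: same layout as the question's data).
sample = io.StringIO(
    "'H4','D7','D3','C5','C6',0.82,'C'\n"
    "'H4','D1','D3','C2','C6',0.22,'F'\n"
    "'H4','D7','D9','C9','C9',0.55,'C'\n"
)

# Hypothetical column names; the file itself has no header row.
names = ["c1", "c2", "c3", "c4", "c5", "value", "action"]

# header=None keeps the first row as data; quotechar="'" strips the single quotes.
df = pd.read_csv(sample, header=None, names=names, quotechar="'")

# One-hot encode the five card columns and append the numeric column.
features = pd.get_dummies(df[names[:5]]).assign(value=df["value"])
X = features.to_numpy(dtype=float)

# Assumption: 'C' is class 1, anything else class 0.
y = (df["action"] == "C").astype(float).to_numpy()
```

With this encoding the number of columns in `X` depends on how many distinct cards appear in each position, so the first layer's `input_dim` should be taken from `X.shape[1]` rather than hard-coded.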