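python - How can I use arrays as feature elements?
Problem description
I built a feature made up of boolean arrays, so that the weights are adjusted according to the strings they represent. However, these arrays are not being accepted.
These are the two features I am using. The target is "genres" and the training feature is "install_per_review".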
9516 23.8
7236 20.4
5781 4.0
2409 12.5
6052 59.4
...
9648 inf
6125 196.5
3305 11.7
8841 7.1
720 29.0
Name: install_per_review, Length: 500, dtype: float64
9516 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
7236 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
5781 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
2409 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
6052 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
...
9648 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
6125 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
3305 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
8841 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
720 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
Name: genres, Length: 500, dtype: object
This converts the array of strings into an array of boolean arrays:
genre_features = apps_dataframe["genres"]
# Multi-hot encoding
genres = set()
for genre in genre_features:
    genres.add(genre)
encoded_features = []
for genre in genre_features:
    genre_feature = []
    for g in genres:
        if g == genre:
            genre_feature.append(1)
        else:
            genre_feature.append(0)
    encoded_features.append(genre_feature)
apps_dataframe["genres"] = encoded_features
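The loop above compares each genre string with every distinct genre, which amounts to a one-hot encoding of a single categorical column. As a rough sketch of an equivalent, assuming `apps_dataframe` is a pandas DataFrame whose `genres` column holds one string per row (the toy data below is made up for illustration), `pd.get_dummies` yields one 0/1 column per genre instead of a list stored inside each cell:

```python
import pandas as pd

# Hypothetical toy data standing in for the real apps_dataframe.
apps_dataframe = pd.DataFrame({
    "install_per_review": [23.8, 20.4, 4.0, 12.5],
    "genres": ["Tools", "Arcade", "Tools", "Puzzle"],
})

# One 0/1 column per distinct genre. Each row has exactly one 1,
# in the column that matches that row's genre string.
genre_one_hot = pd.get_dummies(apps_dataframe["genres"], dtype=int)

print(genre_one_hot.columns.tolist())
print(genre_one_hot.to_numpy().shape)
```

Keeping the encoding as separate numeric columns (or as the 2-D array returned by `to_numpy()`) avoids an `object`-dtype column, which most numeric libraries cannot convert to a float array.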
The data is then used for model training:
my_feature = input_feature
my_feature_data = apps_dataframe[[my_feature]]
my_label = "genres"
targets = apps_dataframe[my_label]
.....
sample = apps_dataframe.sample(n=500)
feature_sample = sample[my_feature]
label_sample = sample[my_label]
print(feature_sample)
print(label_sample)
plt.scatter(feature_sample, label_sample)
But TensorFlow complains because I set each element to an array (even though that is what I want):
File "/Users/lukire01/.vscode/extensions/ms-python.python-2019.3.6558/pythonFiles/ptvsd_launcher.py", line 45, in <module>
main(ptvsdArgs)
File "/Users/lukire01/.vscode/extensions/ms-python.python-2019.3.6558/pythonFiles/lib/python/ptvsd/__main__.py", line 391, in main
run()
File "/Users/lukire01/.vscode/extensions/ms-python.python-2019.3.6558/pythonFiles/lib/python/ptvsd/__main__.py", line 272, in run_file
runpy.run_path(target, run_name='__main__')
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/lukire01/Code/tf_demo/app_store_model.py", line 207, in <module>
input_feature="install_per_review"
File "/Users/lukire01/Code/tf_demo/app_store_model.py", line 97, in train_model
plt.scatter(feature_sample, label_sample)
File "/usr/local/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2862, in scatter
is not None else {}), **kwargs)
File "/usr/local/lib/python3.7/site-packages/matplotlib/__init__.py", line 1810, in inner
return func(ax, *args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 4297, in scatter
alpha=alpha
File "/usr/local/lib/python3.7/site-packages/matplotlib/collections.py", line 899, in __init__
Collection.__init__(self, **kwargs)
File "/usr/local/lib/python3.7/site-packages/matplotlib/collections.py", line 155, in __init__
offsets = np.asanyarray(offsets, float)
File "/usr/local/lib/python3.7/site-packages/numpy/core/numeric.py", line 591, in asanyarray
return array(a, dtype, copy=False, order=order, subok=True)
ValueError: setting an array element with a sequence.
Solution
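The last frames of the traceback are in matplotlib (`plt.scatter` calling `np.asanyarray(offsets, float)`), so the error appears to come from the plotting call rather than from TensorFlow: a column whose cells are lists cannot be converted to a flat float array. A minimal sketch of one possible workaround, not taken from the original thread and using made-up toy data, is to stack the lists into a 2-D matrix for training and reduce each row to an integer class index for plotting:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each cell of "genres" holds a one-hot list,
# mirroring the column built in the question.
sample = pd.DataFrame({
    "install_per_review": [23.8, 20.4, 4.0, 12.5],
    "genres": [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]],
})

# Stack the per-row lists into a 2-D array of shape (n_rows, n_genres).
label_matrix = np.array(sample["genres"].tolist(), dtype=float)

# For plotting, collapse each one-hot row to its integer class index.
label_index = label_matrix.argmax(axis=1)

# The numeric feature column converts to a plain 1-D float array.
feature_values = sample["install_per_review"].to_numpy()

print(label_matrix.shape)
print(label_index)
```

With these arrays, `plt.scatter(feature_values, label_index)` receives two 1-D numeric arrays of equal length, which is the input shape it expects, while `label_matrix` keeps the full one-hot representation for the model.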