How do I use arrays as feature elements?

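Problem description

I built a feature made up of boolean arrays so that the weights would be adjusted according to the strings they represent. However, these arrays are not being accepted.

These are the two features I am using. The target is "genres" and the training feature is "install_per_review":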

9516    23.8
7236    20.4
5781     4.0
2409    12.5
6052    59.4
        ... 
9648     inf
6125   196.5
3305    11.7
8841     7.1
720     29.0
Name: install_per_review, Length: 500, dtype: float64
9516    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
7236    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
5781    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
2409    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
6052    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
                              ...                        
9648    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
6125    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
3305    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
8841    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
720     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
Name: genres, Length: 500, dtype: object

This converts the array of strings into an array of boolean arrays:

genre_features = apps_dataframe["genres"]

# Multi-hot encoding
genres = set()
for genre in genre_features:
    genres.add(genre)

encoded_features = []
for genre in genre_features:
    genre_feature = []
    for g in genres:
        if g == genre:
            genre_feature.append(1)
        else:
            genre_feature.append(0)
    encoded_features.append(genre_feature)

apps_dataframe["genres"] = encoded_features
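One detail of the encoding loop above is worth noting: iterating over a Python set of strings has no order that is guaranteed to be the same from one run to the next, so the position of each genre inside the arrays may change between runs. A minimal sketch of the same encoding with a fixed column order follows. The sample genre names are hypothetical stand-ins for the real `apps_dataframe["genres"]` column, which the post does not show.

```python
# Hypothetical stand-in for apps_dataframe["genres"]; the real values are not shown.
genre_features = ["Action", "Puzzle", "Action", "Tools"]

# Sort the distinct genres so every run produces the same column order.
genres = sorted(set(genre_features))

# Map each genre to its column position for constant-time lookup.
index_of = {genre: position for position, genre in enumerate(genres)}

encoded_features = []
for genre in genre_features:
    row = [0] * len(genres)      # all zeros ...
    row[index_of[genre]] = 1     # ... except the column of this row's genre
    encoded_features.append(row)

print(genres)
print(encoded_features)
```

Keeping the `genres` list around also makes it possible to map a predicted column index back to a genre name later.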

The data is then used for model training:

    my_feature = input_feature
    my_feature_data = apps_dataframe[[my_feature]]
    my_label = "genres"
    targets = apps_dataframe[my_label]
    .....
    sample = apps_dataframe.sample(n=500)
    feature_sample = sample[my_feature]
    label_sample = sample[my_label]
    print(feature_sample)
    print(label_sample)
    plt.scatter(feature_sample, label_sample)

But TensorFlow complains because I made each element an array (even though that is what I want):

  File "/Users/lukire01/.vscode/extensions/ms-python.python-2019.3.6558/pythonFiles/ptvsd_launcher.py", line 45, in <module>
    main(ptvsdArgs)
  File "/Users/lukire01/.vscode/extensions/ms-python.python-2019.3.6558/pythonFiles/lib/python/ptvsd/__main__.py", line 391, in main
    run()
  File "/Users/lukire01/.vscode/extensions/ms-python.python-2019.3.6558/pythonFiles/lib/python/ptvsd/__main__.py", line 272, in run_file
    runpy.run_path(target, run_name='__main__')
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/lukire01/Code/tf_demo/app_store_model.py", line 207, in <module>
    input_feature="install_per_review"
  File "/Users/lukire01/Code/tf_demo/app_store_model.py", line 97, in train_model
    plt.scatter(feature_sample, label_sample)
  File "/usr/local/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2862, in scatter
    is not None else {}), **kwargs)
  File "/usr/local/lib/python3.7/site-packages/matplotlib/__init__.py", line 1810, in inner
    return func(ax, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 4297, in scatter
    alpha=alpha
  File "/usr/local/lib/python3.7/site-packages/matplotlib/collections.py", line 899, in __init__
    Collection.__init__(self, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/matplotlib/collections.py", line 155, in __init__
    offsets = np.asanyarray(offsets, float)
  File "/usr/local/lib/python3.7/site-packages/numpy/core/numeric.py", line 591, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
ValueError: setting an array element with a sequence.

Tags: python, tensorflow

Solution
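No answer was recorded for this question, so what follows is only a hedged reading of the traceback. The last frames are in matplotlib and NumPy, not TensorFlow: `plt.scatter` calls `np.asanyarray(offsets, float)`, and that conversion fails because every element of the "genres" column is itself a list. The sketch below reproduces the failure on a tiny hypothetical sample and shows one possible workaround, assuming the goal is to plot one value per row: stack the lists into a 2-D array and reduce each row to the index of its 1. All names and values in it are illustrative and do not come from the original script.

```python
import numpy as np

# Hypothetical miniature of the sampled columns; values are illustrative only.
feature_sample = np.array([23.8, 20.4, 4.0])
label_lists = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Build a 1-D object array whose elements are lists, similar to a pandas
# column with dtype=object such as the "genres" column in the question.
label_sample = np.empty(len(label_lists), dtype=object)
for i, row in enumerate(label_lists):
    label_sample[i] = row

# Converting such an array to float is the step that fails inside plt.scatter.
error_message = None
try:
    np.asarray(label_sample, dtype=float)
except ValueError as err:
    error_message = str(err)

# Possible workaround: stack the lists into a 2-D integer array, then take
# the column index of the 1 in each row as a plottable class id.
label_matrix = np.array(label_sample.tolist())   # shape (rows, genres)
label_index = label_matrix.argmax(axis=1)        # one integer per row

print(error_message)
print(label_matrix.shape, label_index)
```

Under these assumptions, `plt.scatter(feature_sample, label_index)` receives two numeric 1-D arrays of equal length, while `label_matrix` keeps the full encoded form for use as a training target.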

