Python: unable to filter a CSV, ValueError: can only convert an array of size 1 to a Python scalar

Problem description

I have a CSV like this:

| path                            | artists | item | id                 |  
| ------------------------------- | ------- | ---- | ------------------ |
| ../gifs/dwight\_harry\_0.gif    | dh      | 0    | wUIh5rHf5QCyhIk8Ay |
| ../gifs/dwight\_beatles\_0.gif  | db      | 0    | OqBPFGQkA2rmoouv4A |
| ../gifs/michael\_harry\_0.gif   | mh      | 0    | ITAra7ShPMXQecbFGZ |
| ../gifs/michael\_beatles\_0.gif | mb      | 0    | ryKfGjpxOvDM38b3sk |
| ../gifs/michael\_beatles\_0.gif | mb      | 1    | hgKfGjpxOdfM38b3sk |

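I want to filter it by item first and then select the id for each value of artists, one at a time. For example, for item = 1 I would only select mb. This is what I tried: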

item = 0 
i = str(item)

df = pd.read_csv("../data/urls.csv")
# Select Item
df = df.loc[df['item'] == i]
    
dh_id = df.loc[df['artists'] == 'dh']['id'].item()
db_id = df.loc[df['artists'] == 'db']['id'].item()
mh_id = df.loc[df['artists'] == 'mh']['id'].item()
mb_id = df.loc[df['artists'] == 'mb']['id'].item()

This gives me the following error:

  File "D:\write_pages.py", line 67, in writepage
    dh_id = df.loc[df['artists'] == 'dh']['id'].item()

ValueError: can only convert an array of size 1 to a Python scalar

What am I doing wrong?

Tags: python, pandas, dataframe

Solution


国际大学联盟:

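Try:

# Strip surrounding whitespace from the string cells, row by row
df = df.apply(lambda x: x.str.strip(), axis=1)
# For each artist, keep the id that occurs most often
out = df.groupby('artists')['id'].agg(lambda x: x.value_counts().idxmax())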

Output of out:

artists
db    OqBPFGQkA2rmoouv4A
dh    wUIh5rHf5QCyhIk8Ay
mb    hgKfGjpxOdfM38b3sk
mh    ITAra7ShPMXQecbFGZ
Name: id, dtype: object
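
For readers who want to reproduce this without the original file, here is a minimal sketch of the same approach. The inline CSV text (including the stray spaces) and the use of dtype=str are assumptions made for illustration; they are not taken from the asker's actual file. Because mb has two ids that each occur once, the sketch does not rely on which of the two is returned; that choice can depend on how value_counts orders ties.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for ../data/urls.csv (note the stray spaces).
CSV_TEXT = """path,artists,item,id
../gifs/dwight_harry_0.gif, dh ,0,wUIh5rHf5QCyhIk8Ay
../gifs/dwight_beatles_0.gif,db,0,OqBPFGQkA2rmoouv4A
../gifs/michael_harry_0.gif,mh,0,ITAra7ShPMXQecbFGZ
../gifs/michael_beatles_0.gif,mb,0,ryKfGjpxOvDM38b3sk
../gifs/michael_beatles_0.gif,mb ,1,hgKfGjpxOdfM38b3sk
"""

# Assumption: read every column as text so that .str.strip() applies to all cells.
df = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)
df = df.apply(lambda row: row.str.strip(), axis=1)

# One id per artist: the most frequent id within each group.
out = df.groupby("artists")["id"].agg(lambda ids: ids.value_counts().idxmax())
print(out)
```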

Or

Drop the duplicate values of 'artists':

out = df.sort_values('item', ascending=False).drop_duplicates('artists')

Output:

     path                               artists     item    id
0   ../gifs/michael\_beatles\_0.gif     mb          1       hgKfGjpxOdfM38b3sk
1   ../gifs/dwight\_harry\_0.gif        dh          0       wUIh5rHf5QCyhIk8Ay
2   ../gifs/dwight\_beatles\_0.gif      db          0       OqBPFGQkA2rmoouv4A
3   ../gifs/michael\_harry\_0.gif       mh          0       ITAra7ShPMXQecbFGZ
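
The drop_duplicates approach can be sketched end to end as follows. The DataFrame is built in memory from the rows in the question, and the final step that turns the result into a dictionary (the name ids_by_artist) is an addition for illustration, not part of the original answer.

```python
import pandas as pd

# Rows copied from the question; the path column is omitted for brevity.
df = pd.DataFrame(
    {
        "artists": ["dh", "db", "mh", "mb", "mb"],
        "item": [0, 0, 0, 0, 1],
        "id": [
            "wUIh5rHf5QCyhIk8Ay",
            "OqBPFGQkA2rmoouv4A",
            "ITAra7ShPMXQecbFGZ",
            "ryKfGjpxOvDM38b3sk",
            "hgKfGjpxOdfM38b3sk",
        ],
    }
)

# Sort so the highest item comes first, then keep the first row per artist.
out = df.sort_values("item", ascending=False).drop_duplicates("artists")

# Hypothetical convenience step: look ids up by artist code.
ids_by_artist = out.set_index("artists")["id"].to_dict()
print(ids_by_artist["mb"])
```

With exactly one row per artist left, each lookup yields a single value, so no call to .item() is needed.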

Note: you get this error because df.loc[df['artists'] == 'mb']['id'] gives you 2 values, and according to the documentation, Series.item() returns the single element of the data as a Python scalar and raises a ValueError if the data is not of length 1.
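
The behaviour described in the note can be checked with a small sketch. The helper try_item is hypothetical and exists only to capture the exception for display. Note that an empty selection is also "not length 1", so it fails in the same way as a selection with two values.

```python
import pandas as pd

ids = pd.Series(["ryKfGjpxOvDM38b3sk", "hgKfGjpxOdfM38b3sk"])


def try_item(series):
    """Hypothetical helper: return series.item(), or the error text if it raises."""
    try:
        return series.item()
    except ValueError as exc:
        return f"ValueError: {exc}"


one = try_item(ids.iloc[:1])   # exactly one element: the scalar is returned
two = try_item(ids)            # two elements: ValueError
none = try_item(ids.iloc[:0])  # zero elements: ValueError as well

print(one)
print(two)
print(none)
```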

