Manipulating a pandas DataFrame and then saving the changes

Question

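I am trying to manipulate a pandas DataFrame: pick a random row index from the output of a random integer generator, take the voucher code at that index, set its "Used" column to "Yes", and then save the CSV again.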

My code is as follows:

import random
import pandas

# Read in the dataframe:
df = pandas.read_csv('./data/voucher_codes.csv', encoding='utf-8')

# Generate a random integer (this would be nice if the max value was the number of rows - but I'll figure this out later!)
index = random.randint(0, 3)

# Store this code in memory selecting only when Used = 'No':
voucher_code = df.loc[df['Used'] == 'No'].iloc[[index]]['Voucher Code'].values[0]

# Update the "Used" column for the above voucher code to "Yes"
df.loc[df['Voucher Code'] == voucher_code]['Used'] = 'Yes'

# Save said dataframe, to be consistent with what voucher codes have been used:
df.to_csv('./data/voucher_codes.csv', sep=',')

However, the .csv file does not get updated!

Sample data, in case it is useful:

000001,No
000002,No
000003,No
000004,No

This is the DataFrame after applying @ALollz's suggestion:

,Unnamed: 0,Voucher Code,Used
0,0,000001,No
1,1,000002,No
2,2,000003,No
3,3,000004,No

Tags: python, python-3.x, pandas

Answer


I ran your code and got the following warning, which explains why the DataFrame was not modified:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

So, to match the .loc[row_indexer, col_indexer] form, I changed that line to:

df.loc[df['Voucher Code'] == voucher_code, 'Used'] = 'Yes'
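The difference between the two forms can be seen on a small in-memory DataFrame. The following is a minimal sketch, assuming the column names from the question and voucher codes stored as strings. The original line first selects rows with df.loc[mask], which returns a copy, and then assigns to that copy. The single .loc call below assigns directly into df.

```python
import pandas as pd

# Sample data mirroring the question (codes kept as strings, an assumption)
df = pd.DataFrame({
    "Voucher Code": ["000001", "000002", "000003", "000004"],
    "Used": ["No", "No", "No", "No"],
})
voucher_code = "000002"

# Row mask and column label in ONE .loc call: this writes into df itself
df.loc[df["Voucher Code"] == voucher_code, "Used"] = "Yes"

print(df["Used"].tolist())  # ['No', 'Yes', 'No', 'No']
```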

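P.S. You can use df.shape to bound the random integer by the number of rows. Note that random.randint includes both endpoints, so the upper bound must be the row count minus one. Also, because the index is applied to the rows where Used == 'No', the bound should really be the number of unused rows once some codes have been used.

index = random.randint(0, df.shape[0] - 1)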
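One way to avoid computing the bound by hand is to choose directly among the index labels of the unused rows. The sketch below is only an illustration under stated assumptions: redeem_random_voucher is a hypothetical helper name, the CSV is simulated with io.StringIO instead of ./data/voucher_codes.csv, and the file is assumed to have a "Voucher Code,Used" header row. Reading with dtype=str keeps the leading zeros of the codes, and writing with index=False avoids the extra "Unnamed: 0" column that appears when the row index is saved and read back.

```python
import io
import random

import pandas as pd

# Stand-in for the CSV file on disk (assumed header row and contents)
CSV_TEXT = "Voucher Code,Used\n000001,No\n000002,Yes\n000003,No\n000004,No\n"


def redeem_random_voucher(df, rng=random):
    """Mark one random unused voucher as used and return its code.

    Hypothetical helper for illustration; it modifies df in place.
    """
    unused_labels = list(df.index[df["Used"] == "No"])
    if not unused_labels:
        raise ValueError("no unused voucher codes left")
    label = rng.choice(unused_labels)   # pick among unused rows only
    df.loc[label, "Used"] = "Yes"       # single .loc call, writes into df
    return df.loc[label, "Voucher Code"]


# dtype=str keeps codes such as "000001" from being parsed as the integer 1
df = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)
code = redeem_random_voucher(df)

# index=False keeps the row index out of the saved file
out = io.StringIO()
df.to_csv(out, index=False)

print(code)
print(out.getvalue())
```

With a real file, the two io.StringIO objects would be replaced by the path used in the question for both read_csv and to_csv.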

Hope this helps.

