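python - Make this code run for multiple Excel files
Problem description
I want to run this script for several Excel files: instead of the single df3, I will import several Excel files and combine all the results into one dataframe.
This is the main code example: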
import pandas as pd
import numpy as np  # needed for np.inf in the clip() call further down
d = {'City': ['Tokyo','Tokyo','Lisbon','Tokyo','Tokyo','Lisbon','Lisbon','Lisbon','Tokyo','Lisbon','Tokyo','Tokyo','Tokyo','Lisbon','Tokyo','Tokyo','Lisbon','Lisbon','Lisbon','Tokyo','Lisbon','Tokyo'],
'Card': ['Visa','Visa','Master Card','Master Card','Visa','Master Card','Visa','Visa','Master Card','Visa','Master Card','Visa','Visa','Master Card','Master Card','Visa','Master Card','Visa','Visa','Master Card','Visa','Master Card'],
'Colateral':['Yes','No','Yes','No','No','No','No','Yes','Yes','No','Yes','Yes','No','Yes','No','No','No','Yes','Yes','No','No','No'],
'Client Number':[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22],
'DebtPaid':[0.8,0.1,0.5,0.30,0,0.2,0.4,1,0.60,1,0.5,0.2,0,0.3,0,0,0.2,0,0.1,0.70,0.5,0.1]}
df = pd.DataFrame(data=d)
df2=df.groupby(['City','Card','Colateral'])['DebtPaid'].\
value_counts(bins=[-0.001,0,0.25,0.5,0.75,1,1.001,2],normalize=True)
d = {'City': ['Tokyo','Tokyo','Lisbon','Tokyo','Tokyo','Lisbon','Lisbon','Lisbon','Tokyo','Lisbon','Tokyo','Tokyo','Tokyo','Lisbon','Tokyo','Tokyo','Lisbon','Lisbon','Lisbon','Tokyo','Lisbon','Tokyo'],
'Card': ['Visa','Visa','Master Card','Master Card','Visa','Master Card','Visa','Visa','Master Card','Visa','Master Card','Visa','Visa','Master Card','Master Card','Visa','Master Card','Visa','Visa','Master Card','Visa','Master Card'],
'Colateral':['Yes','No','Yes','No','No','No','No','Yes','Yes','No','Yes','Yes','No','Yes','No','No','No','Yes','Yes','No','No','No'],
'Client Number':[23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44],
'Total Debt':[100,240,200,1000,50,20,345,10,600,40,50,20,100,30,100,600,200,200,150,700,50,120]}
df3 = pd.DataFrame(data=d)
#First merge dataframes
df_out = df2.rename('Prob').reset_index().merge(df3, on=['City', 'Card', 'Colateral'])
#Use the right and left attributes of pd.Interval
df_out['lower'] = [x.left for x in df_out['DebtPaid']]
df_out['upper'] = [x.right for x in df_out['DebtPaid']]
#Calculate lower and upper partial prices
df_out['l_partial'] = df_out[['lower', 'Prob', 'Total Debt']].prod(axis=1)
df_out['u_partial'] = df_out[['upper', 'Prob', 'Total Debt']].prod(axis=1)
#Sum partial prices to get lower and upper price grouped on Client Number
final = df_out.groupby('Client Number')[['l_partial', 'u_partial']]\
.agg(lower_price=('l_partial', 'sum'),
upper_price=('u_partial', 'sum')).clip(0,np.inf)
w = (final['upper_price'].sum() + final['lower_price'].sum()) / 2
y = 1000
z = ((w/y)-1)*100
d1 = {'1': [w,y,z],
'Index':['Estimate','Real','Error']}
results = pd.DataFrame(data=d1).set_index('Index')
print(results)
This is what I tried in order to run the script over several Excel files, without success:
files = [1,2,3,4,5]
for x in files:
    df3 = pd.read_excel(str(x) + '.xlsx')
    #First merge dataframes
    df_out = df2.rename('Prob').reset_index().merge(df3, on=['City', 'Card', 'Colateral'])
    #Use the right and left attributes of pd.Interval
    df_out['lower'] = [x.left for x in df_out['DebtPaid']]
    df_out['upper'] = [x.right for x in df_out['DebtPaid']]
    #Calculate lower and upper partial prices
    df_out['l_partial'] = df_out[['lower', 'Prob', 'Total Debt']].prod(axis=1)
    df_out['u_partial'] = df_out[['upper', 'Prob', 'Total Debt']].prod(axis=1)
    #Sum partial prices to get lower and upper price grouped on Client Number
    final = df_out.groupby('Client Number')[['l_partial', 'u_partial']]\
        .agg(lower_price=('l_partial', 'sum'),
             upper_price=('u_partial', 'sum')).clip(0,np.inf)
    w = (final['upper_price'].sum() + final['lower_price'].sum()) / 2
    y = 1000
    z = ((w/y)-1)*100
    d1 = {x : [w,y,z],
          'Index':['Estimate','Real','Error']}
    results = pd.DataFrame(data=d1).set_index('Index')
results
It only shows the results for one Excel file. Do you know how I can fix this?
Solution
The first problem is here:
df3 = pd.read_excel(x &".xlsx").format(x)
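In Visual Basic and VBA, strings are concatenated with &. In Python the operator is +, but you need to make sure there is a string on both sides.
Since files contains only numbers, x will be a number as well. To convert it to a string, use str(x). The trailing .format(x) should also be dropped: read_excel returns a DataFrame, which has no format method.
df3 = pd.read_excel(str(x) + ".xlsx")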
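As a small illustration of the point about string concatenation, the sketch below builds the file names in two equivalent ways and shows what happens when the number is not converted first. The variable names are only for illustration.

```python
files = [1, 2, 3, 4, 5]

# Explicit conversion followed by concatenation with +
names_plus = [str(x) + ".xlsx" for x in files]

# An f-string does the conversion implicitly
names_fstring = [f"{x}.xlsx" for x in files]

# Adding an int and a str directly is an error in Python
try:
    bad = 1 + ".xlsx"
except TypeError as exc:
    error_message = str(exc)

print(names_plus)
print(names_fstring)
print(error_message)
```

Either form can be passed to pd.read_excel; the f-string is usually easier to read when the name has several parts.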
The next problem is probably here:
results = pd.DataFrame(data=d1).set_index('Index')
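On each pass through the loop this assignment replaces the results of the previous file, so only the last file's results survive. You need to find a way to combine the data from all the files.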
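One possible way to combine the results is to compute one column per file and assemble them into a single DataFrame after the loop. The sketch below is only an illustration under assumptions: estimate_for_file is a hypothetical helper wrapping the steps from the question, probs stands in for df2.rename('Prob').reset_index(), and two small in-memory tables stand in for the Excel files so the example runs without any workbooks on disk.

```python
import numpy as np
import pandas as pd


def estimate_for_file(probs, df3, real=1000):
    """Hypothetical helper: Estimate/Real/Error for one debt table."""
    df_out = probs.merge(df3, on=["City", "Card", "Colateral"])
    df_out["lower"] = [iv.left for iv in df_out["DebtPaid"]]
    df_out["upper"] = [iv.right for iv in df_out["DebtPaid"]]
    df_out["l_partial"] = df_out[["lower", "Prob", "Total Debt"]].prod(axis=1)
    df_out["u_partial"] = df_out[["upper", "Prob", "Total Debt"]].prod(axis=1)
    final = (
        df_out.groupby("Client Number")
        .agg(lower_price=("l_partial", "sum"), upper_price=("u_partial", "sum"))
        .clip(0, np.inf)
    )
    w = (final["upper_price"].sum() + final["lower_price"].sum()) / 2
    return pd.Series({"Estimate": w, "Real": real, "Error": (w / real - 1) * 100})


# Stand-in for df2.rename('Prob').reset_index() (made-up values)
probs = pd.DataFrame({
    "City": ["Tokyo", "Tokyo"],
    "Card": ["Visa", "Visa"],
    "Colateral": ["Yes", "Yes"],
    "DebtPaid": [pd.Interval(0.0, 0.5), pd.Interval(0.5, 1.0)],
    "Prob": [0.5, 0.5],
})

# Stand-ins for the contents of 1.xlsx and 2.xlsx (made-up values)
tables = {
    1: pd.DataFrame({"City": ["Tokyo"], "Card": ["Visa"], "Colateral": ["Yes"],
                     "Client Number": [23], "Total Debt": [100]}),
    2: pd.DataFrame({"City": ["Tokyo"], "Card": ["Visa"], "Colateral": ["Yes"],
                     "Client Number": [24], "Total Debt": [200]}),
}

columns = {}
for name, df3 in tables.items():
    # With real workbooks this would be: df3 = pd.read_excel(f"{name}.xlsx")
    columns[name] = estimate_for_file(probs, df3)

# One column per file, rows Estimate / Real / Error
results = pd.DataFrame(columns)
results.index.name = "Index"
print(results)
```

Collecting the per-file Series in a dictionary and building the DataFrame once keeps the loop body free of any state that the next iteration could overwrite.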