python - 在 Pandas 中使用相同格式从一个 Excel 获取数据并写入另一个 Excel
问题描述
我正在尝试在 python 中使用 pandas 读取两个 Excel 文件。从一个获取数据,然后在某个条件下将数据写入另一个。我们将文件称为sourcefile1.xlsx和sourcefile2.xlsx。
以下是 Excel 文件的内容:
sourcefile1.xlsx 对和字段结果是合并的主要标头。cable_type、cable_name、cable_pair、caller_id、result 是代表每一列的二级标题。
-----------------Pair--------------- -----Field Result----- cable_type cable_name cable_pair caller_id result primary 2 103 n/a not match primary 1 33 22222222 match primary 5 342 22222222 match secondary 2 12 n/a not match secondary 4 144 44444444 match
源文件2.xlsx
-blank- -----Secondary Pairs---- ------Primary Pairs------ caller_id caller_id adsl result caller_id adsl result 11111111 4/144 2/103 22222222 2/12 4/144 44444444 7/55 4/144 NULL 8/123 1/11 NULL NULL 2/22
预期的输出将基于此伪代码写入sourcefile2 :
if caller_id(sourcefile1) != 'N/A':
if cable_type(sourcefile1) = 'primary':
if caller_id(sourcefile1) = caller_id(sourcefile2) - primary pairs:
write caller_id(sourcefile1) to caller_id(sourcefile2) - primary pairs
write result(sourcefile1) to result(sourcefile2) - primary pairs
elif caller_id(sourcefile1) != 'N/A':
if cable_type(sourcefile1) = 'secondary':
if caller_id(sourcefile1) = caller_id(sourcefile2) - secondary pairs:
write caller_id(sourcefile1) to caller_id(sourcefile2) - secondary pairs
write result(sourcefile1) to result(sourcefile2) - secondary pairs
elif caller_id(sourcefile1) = 'N/A':
if cable_type(sourcefile1) = 'primary':
if cable_name + cable_pair(sourcefile1) = adsl(sourcefile2) - primary pairs:
write caller_id(sourcefile1) to caller_id(sourcefile2) - primary pairs
write result(sourcefile1) to result(sourcefile2) - primary pairs
elif caller(sourcefile1) = 'N/A':
if cable_type(sourcefile1) = 'secondary':
if cable_name + cable_pair(sourcefile1) = adsl(sourcefile2) - secondary pairs:
write caller_id(sourcefile1) to caller_id(sourcefile2) - secondary pairs
write result(sourcefile1) to result(sourcefile2) - secondary pairs
这是我想要达到的输出。
-blank- -----Secondary Pairs---- ------Primary Pairs------
caller_id caller_id adsl result caller_id adsl result
11111111 4/144 n/a 2/103 not match
22222222 n/a 2/12 not match 22222222 4/144 match
44444444 44444444 7/55 match 4/144
NULL 8/123 1/11
NULL NULL 2/22
我正在尝试将 caller_id 从sourcefile1匹配到sourcefile2并根据它们的cable_type将它们写入主对或辅助对。如果caller_id是n/a,那么我需要匹配的是adsl。结果是给定的数据,我只需要获取与caller_id或adsl在同一行中的任何内容。
到目前为止,我能够匹配caller_id但我重新创建了sourcefile1和sourcefile2并删除了主标题。这是我的代码:
import pandas as pd
df1 = pd.read_excel('sourcefile2.xlsx')
df2 = pd.read_excel('sourcefile1.xlsx', 'v0.02')
forPrimary1 = df1.columns[40]
forSecondary1 = df1.columns[23]
ComparisonResult = df2.columns[22]
forAdsl = df1.columns[39]
CallerID = df2.columns[13]
forPrimary = df1.columns[37]
forSecondary = df1.columns[16]
df3 = pd.read_excel('PrimarySecondary.xlsx')
df4 = pd.read_excel('adslFile.xlsx')
df5 = pd.read_excel('PrimarySecondary2.xlsx')
# df1['svc_no'] = df1['svc_no']
df2['Adsl'] = df2[['cable_name', 'pair']].apply(lambda x: '/'.join(x.astype(str)), axis=1)
newPrim = df2[[caller_id, 'result', 'Adsl']] [(df2['cable_type'] == 'Primary')]
newSec = df2[[caller_id, 'result']] [(df2['cable_type'] == 'Secondary')]
newPrim.to_excel('newPrimary.xlsx')
newSec.to_excel('newSecondary.xlsx')
frame = pd.read_excel('newPrimary.xlsx')
frame1 = pd.read_excel('newSecondary.xlsx')
df1['b_line_stat'] = df1['b_line_stat'].fillna('NULL')
df1['DP_e_pr'] = df1['DP_e_pr'].fillna('NULL')
df1['DP_e_st'] = df1['DP_e_st'].fillna('NULL')
df1['DP'] = df1['DP'].fillna('NULL')
df1['CAB_d_st'] = df1['CAB_d_st'].fillna('NULL')
df1['CAB_d_pr'] = df1['CAB_d_pr'].fillna('NULL')
df1['port_status'] = df1['port_status'].fillna('NULL')
name1 = df1.columns[17]
name2 = df1.columns[18]
name3 = df1.columns[19]
name4 = df1.columns[20]
name5 = df1.columns[21]
name6 = df1.columns[22]
name7 = df1.columns[38]
df1[name1] = df1['b_line_stat']
df1[name2] = df1['CAB_d_st']
df1[name3] = df1['CAB_d_pr']
df1[name4] = df1['DP_e_st']
df1[name5] = df1['DP_e_pr']
df1[name6] = df1['DP']
df1[name7] = df1['port_status']
frame = frame[frame['caller_id'].isin(df1['caller_id'])]
df1[forPrimary1] = frame['result']
frame1 = frame1[frame1['caller_id'].isin(df1['caller_id'])]
df1[forSecondary1] = frame1['result']
df1[df1['caller_id'].isin(df3['Primary'])]
df1[forPrimary] = df1['caller_id'].fillna('n/a')
df1[df1['adsl'].isin(df2['Adsl'])]
df1[forAdsl] = df1['adsl'].fillna('NULL')
df1[df1['caller_id'].isin(df3['Secondary'])]
df1[forSecondary] = df1['caller_id'].fillna('n/a')
df1['caller_id'] = df1['caller_id'].fillna('NULL')
df1['adsl'] = df1['adsl'].fillna('NULL')
df1.to_excel('dp_util_ANT715-M.xlsx', index=False)
writer = pd.ExcelWriter('dp_util_ANT715-M.xlsx', engine='xlsxwriter')
df1.to_excel(writer, sheet_name='Sheet1')
编辑:我在脚本中使用的变量来匹配我的问题。
解决方案
推荐阅读
- javascript - 如何创建上次活动记录器?
- mongodb - 如何在 Mongoose 中使用枚举验证字符串数组?
- python - Python 计算器 - 使用十进制模块评估语句字符串
- javascript - 如何使未选中的复选框值为 0 和选中的值为 1?
- php - 如何在 Laravel 8 中将异常存储在数据库中
- wpf - WritableBitmap:内存分配不匹配 - 如何解决?
- c++ - 我的自定义向量有一个运行时错误调试断言失败:_CrtIsValidHeapPointer(block)
- ruby-on-rails - Bundler 在 Rails 中的 bundle install 上找不到兼容的版本
- vue.js - 使用 nmp 运行构建后的空站点(vue 2 cli)
- firebase - Flutter 通过 Firebase 的纬度和经度显示附近的用户