首页 > 解决方案 > 熊猫多重合并创建多维重复列

问题描述

我的目标是根据类似的主机名、序列号、类别将 4 个 excel 工作表合并为 1 个......我正在使用下面的 pandas 合并功能。

InventoryDf = pd.read_excel("Inventory.xlsx", sheet_name='Inventory')
SoftwareDf = pd.read_excel("Inventory.xlsx", sheet_name='Software')
HardwarewareDf = pd.read_excel("Inventory.xlsx", sheet_name='Hardware')
CoverageDf = pd.read_excel("Inventory.xlsx", sheet_name='Coverage')
data_frames = [InventoryDf, SoftwareDf, HardwarewareDf, CoverageDf]
merge = partial(pd.merge, on=['Priority','Category','Product Family','Host Name','Serial Number'], how='outer')
merge = reduce(merge, data_frames)

问题是每个工作表都有一个“IP 地址”列,其中包含大部分相似的 IP。出于某种原因,合并数据框包含 4 列,有 2 个重复名称:“IP 地址_x”、“IP 地址_x”、“IP 地址_y”、“IP 地址_y”

我想将这 4 列合并为 1,但我不能,因为它们的名称相似。我没有手动重命名它们,因为有大约 30 个数据框列,而且很乏味。

是否有合并它们的说法,以便:

  1. 如果ip相同,合并
  2. 如果 IP 不同,请使用左侧的第一个“IP 地址_x”列
  3. 如果缺少一列,如果 IP 不为空,则只有第一个“IP Address_x”

这是工作表的示例,我有更多列,例如:名称、网址、站点名称、城市...

库存Df

+-----------+---------------+------------+----------+----------+
| Host Name | Serial Number | IP Address | Priority | Category |
+-----------+---------------+------------+----------+----------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   |
+-----------+---------------+------------+----------+----------+

硬件Df

+-----------+---------------+------------+----------+----------+
| Host Name | Serial Number | IP Address | Priority | Category |
+-----------+---------------+------------+----------+----------+
| SwitchA   | 1230          | 1.1.0.1    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchD   | 1250          | 1.2.2.2    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchE   | 1260          | 1.3.3.3    | 2        | Switch   |
+-----------+---------------+------------+----------+----------+

软件Df

+-----------+---------------+------------+----------+----------+---------+
| Host Name | Serial Number | IP Address | Priority | Category | Version |
+-----------+---------------+------------+----------+----------+---------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   | X       |
+-----------+---------------+------------+----------+----------+---------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   | X       |
+-----------+---------------+------------+----------+----------+---------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   | Y       |
+-----------+---------------+------------+----------+----------+---------+

覆盖率

+-----------+---------------+------------+----------+----------+-------------+-------+
| Host Name | Serial Number | IP Address | Priority | Category | Coverage    | Price |
+-----------+---------------+------------+----------+----------+-------------+-------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   | Not Covered | 100   |
+-----------+---------------+------------+----------+----------+-------------+-------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   | Covered     | 300   |
+-----------+---------------+------------+----------+----------+-------------+-------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   | Not Covered | 200   |
+-----------+---------------+------------+----------+----------+-------------+-------+

预期结果(IP 地址被合并,即使 SwitchA 的某些地址不同)

+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| Host Name | Serial Number | IP Address | Priority | Category | Version | Coverage    | Price |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   | X       | Not Covered | 100   |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   | X       | Covered     | 300   |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   | Y       | Not Covered | 200   |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchD   | 1250          | 1.2.2.2    | 1        | Switch   |         |             |       |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchE   | 1260          | 1.3.3.3    | 2        | Switch   |         |             |       |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+

结果的原始提取。注意冗余列丢失,IP 地址_x

 Source.Name_x  Priority Item Type_x  Category                            Product Family       Product ID_x Software Type_x OS Version_x Suggested Version 1_x                       Host Name  IP Address_x Serial Number                                                  Source.Name_y       Product ID_y Software Type_y OS Version_y           Current Milestone_x Suggested Version 1_y Suggested Version 2 Suggested Version 3  IP Address_y SW End of Life SW End of Sale                                                                                                                    URL_x                                                  Source.Name_x IP Address_x Item Type_y      Product ID_x Current Milestone_y   Hardware Lifecycle Status    Replacement PID Replacement PID Info  Replacement PID Price  Replacement PID Price Discount Replacement PID Service Level  Replacement PID Service Price  Current PID Service Price  Replacement PID Service Price Discount HW End of Life HW End of Sale                                                                                                                    URL_y                                                  Source.Name_y Item Type       Product ID_y  IP Address_y      Coverage Status    Contract Status    Contract Number Coverage Start Date Coverage End Date SLA type
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches   WS-C2960X-24PS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF5-A1    10.1.1.8   XXXXX  software_02_Jul_07_54_15.xlsx   WS-C2960X-24PS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>    10.1.1.8     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis   WS-C2960X-24PS-L    10.1.1.8    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches  WS-C2960X-48LPS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF6-A1    10.1.1.9   YYYYY  software_02_Jul_07_54_15.xlsx  WS-C2960X-48LPS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>    10.1.1.9     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis  WS-C2960X-48LPS-L    10.1.1.9    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches   WS-C2960X-24PS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF7-A1   10.1.1.11   ZZZZZZ  software_02_Jul_07_54_15.xlsx   WS-C2960X-24PS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.11     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis   WS-C2960X-24PS-L   10.1.1.11    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches   WS-C2960X-24PS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF8-A1   10.1.1.12   QQQQQ  software_02_Jul_07_54_15.xlsx   WS-C2960X-24PS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.12     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis   WS-C2960X-24PS-L   10.1.1.12    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches  WS-C2960X-48LPS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF9-A1   10.1.1.13   WWWWW  software_02_Jul_07_54_15.xlsx  WS-C2960X-48LPS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.13     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis  WS-C2960X-48LPS-L   10.1.1.13    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_30_08.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-C Series Switches   WS-C2960C-12PC-L             IOS    15.2(4)E6             15.2(7)E2                   SWITCH-MGK-A1   10.1.1.39   EEEEEE  software_02_Jul_08_14_40.xlsx   WS-C2960C-12PC-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.39     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html  hardware_02_Jul_07_25_04.xlsx  10.1.1.39     Chassis  WS-C2960C-12PC-L  EoL Date Announced  EOL in more than 24 months  WS-C2960L-16PS-LL                 None                  920.7                              50                          PSUT                         215.16                     122.76                                       0     2025-10-31     2020-10-30           https://www.cisco.com/c/en/us/products/switches/catalyst-2960-c-series-switches/eos-eol-notice-c51-743071.html  coverage_24_Jul_10_37_26.xlsx   Chassis   WS-C2960C-12PC-L   10.1.1.39  Uncovered with ELLW        No Contract        No Contract                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches  WS-C2960X-48LPS-L             IOS    15.2(4)E6             15.2(7)E2               SWITCH-SRVROOM-A1    10.1.1.2   RRRRRR  software_02_Jul_07_54_15.xlsx  WS-C2960X-48LPS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>    10.1.1.2     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis  WS-C2960X-48LPS-L    10.1.1.2    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_20_39.xlsx         3     Chassis  Switches       Cisco Catalyst 3850 Series Switches     WS-C3850-24P-S          IOS-XE       16.3.7                16.9.5               SWITCH-SRVROOM-C1    10.2.1.254   TTTTTT  software_02_Jul_07_54_33.xlsx     WS-C3850-24P-S          IOS-XE       16.3.7            End of Engineering                16.9.5                <NA>                <NA>    10.2.1.3     2023-07-31     2018-08-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-3850-series-switches/eos-eol-notice-c51-740255.html  software_02_Jul_07_02_48.xlsx   10.2.1.254        <NA>              <NA>  End of Engineering                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>     2023-07-31     2018-08-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-3850-series-switches/eos-eol-notice-c51-740255.html  coverage_24_Jul_10_07_28.xlsx   Chassis     WS-C3850-24P-S    10.2.1.254    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 software_30_Jun_15_21_13.xlsx         1        <NA>  Security  Cisco ASA 5500-X with FirePOWER Services               <NA>             ASA      9.7(1)4        9.12.3 Interim  SRVROOM-FW2.umbrellacorp.com  10.60.127.19   YYYYYY  software_02_Jul_07_55_54.xlsx         ASA5506-K9             ASA      9.7(1)4            End of Engineering        9.12.3 Interim       9.8.4 Interim                <NA>  10.1.122.9     2022-08-31     2017-08-25          http://www.cisco.com/c/en/us/products/collateral/security/asa-firepower-services/eos-eol-notice-c51-738646.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_07_48.xlsx   Chassis         ASA5506-K9  10.60.127.19    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 software_30_Jun_15_21_13.xlsx         1        <NA>  Security  Cisco ASA 5500-X with FirePOWER Services               <NA>             ASA      9.7(1)4        9.12.3 Interim  FW2.umbrellacorp.com  10.60.127.18   GGGGGGG  software_02_Jul_07_55_54.xlsx         ASA5506-K9             ASA      9.7(1)4            End of Engineering        9.12.3 Interim       9.8.4 Interim                <NA>  10.1.122.8     2022-08-31     2017-08-25          http://www.cisco.com/c/en/us/products/collateral/security/asa-firepower-services/eos-eol-notice-c51-738646.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_07_48.xlsx   Chassis         ASA5506-K9  10.60.127.18    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None

标签: pythonpandasdataframemergeduplicates

解决方案


从使用functools. 添加inspect到混合获取变量名称

  1. 遍历您的数据框列表。捕获名称并重命名IP地址
  2. 一旦合并数据框重命名最左边的IP地址
  3. fillna()从其他IP 地址列中删除它们
import inspect
import functools

def retrieve_name(var):
    callers_local_vars = inspect.currentframe().f_back.f_locals.items()
    return [var_name for var_name, var_val in callers_local_vars if var_val is var]

data_frames = [InventoryDf, SoftwareDf, HardwareDf, CoverageDf]
names = []
for df in data_frames:
    n = retrieve_name(df)[1].replace("Df", "")
    names.append(n)
    df.columns = [f"{n} {c}" if c=="IP Address" else c for c in df.columns]
# merge = functools.partial(pd.merge, on=['Priority','Category','Product Family','Host Name','Serial Number'], how='outer')
merge = functools.partial(pd.merge, on=['Priority','Category','Host Name','Serial Number'], how='outer')

merge = functools.reduce(merge, data_frames)

# take column LHS IP Address and rename it to "IP Address", fillna() from all subsequent columns
# then drop them
merge.rename(columns={f"{names[0]} IP Address":"IP Address"}, inplace=True)
for n in names[1:]:
    merge.loc[:,"IP Address"].fillna(merge.loc[:,f"{n} IP Address"], inplace=True)
    merge.drop(columns=f"{n} IP Address", inplace=True)
    


推荐阅读