How to select columns from a DataFrame with a condition

Problem description

Spreadsheet

DATE_ID Site    RRC_Fail#   S1_Fail#    pmCellDowntimeAuto
27-07-2021 03   S1  0   0   2
27-07-2021 03   S2  0   0   900
27-07-2021 03   S3  0   0   900
27-07-2021 03   S4  0   0   900
27-07-2021 03   S5  0   0   2
27-07-2021 03   S6  0   0   2
27-07-2021 03   S7  1   1   26
27-07-2021 03   S8  0   0   4
27-07-2021 03   S9  0   0   1800
27-07-2021 03   S10 0   0   5
27-07-2021 03   S11 0   0   1800
27-07-2021 03   S12 0   0   2
27-07-2021 03   S13 0   0   25
27-07-2021 03   S14 0   0   900
27-07-2021 03   S15 0   0   900

Expected output:

DATE_ID Site    RRC_Fail#   S1_Fail#    pmCellDowntimeAuto
27-07-2021 03   S2  0   0   900
27-07-2021 03   S3  0   0   900
27-07-2021 03   S4  0   0   900
27-07-2021 03   S9  0   0   1800
27-07-2021 03   S11 0   0   1800
27-07-2021 03   S14 0   0   900
27-07-2021 03   S15 0   0   900

Code:

import pandas as pd


df=pd.read_csv("test.csv", index_col=("DATE_ID"), names =['DATE_ID','Site','Zone','Status_AOL','ERBS','EUtranCellFDD','RRC_Fail#','RRC_Failure_Rate%','S1_Fail#','S1_Failure_Rate%','Downtime'], header=0 )
df2 = df.loc(['Zone == West'] | df['Downtime']>50)

df2.plot.bar(color = 'blue')
print(df2)

Error:

File "pandas\_libs\ops.pyx", line 233, in pandas._libs.ops.vec_binop

ValueError: Arrays were different lengths: 30 vs 1

Tags: python, pandas

Solution


Your input dataframe:

>>> df
            DATE_ID Site   Zone  RRC_Fail#  S1_Fail#  pmCellDowntimeAuto
27-07-2021        3   S1   East          0         0                   2
27-07-2021        3   S2   East          0         0                 900
27-07-2021        3   S3   East          0         0                 900
27-07-2021        3   S4   East          0         0                 900
27-07-2021        3   S5  North          0         0                   2
27-07-2021        3   S6   East          0         0                   2
27-07-2021        3   S7  North          1         1                  26
27-07-2021        3   S8  North          0         0                   4
27-07-2021        3   S9   East          0         0                1800
27-07-2021        3  S10  North          0         0                   5
27-07-2021        3  S11   East          0         0                1800
27-07-2021        3  S12   East          0         0                   2
27-07-2021        3  S13   West          0         0                  25
27-07-2021        3  S14   East          0         0                 900
27-07-2021        3  S15   East          0         0                 900

Select the rows that match your conditions:

>>> df.loc[(df['Zone'] == 'West') | (df['pmCellDowntimeAuto'] > 50)]
            DATE_ID Site  Zone  RRC_Fail#  S1_Fail#  pmCellDowntimeAuto
27-07-2021        3   S2  East          0         0                 900
27-07-2021        3   S3  East          0         0                 900
27-07-2021        3   S4  East          0         0                 900
27-07-2021        3   S9  East          0         0                1800
27-07-2021        3  S11  East          0         0                1800
27-07-2021        3  S13  West          0         0                  25
27-07-2021        3  S14  East          0         0                 900
27-07-2021        3  S15  East          0         0                 900
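
The attempt in the question most likely fails for three reasons: `df.loc` is called with parentheses instead of being indexed with square brackets, `['Zone == West']` is a plain one-element list rather than a comparison on the `Zone` column, and `|` binds more tightly than `>`, so pandas tries to combine that one-element list with the 30-row column first (hence "30 vs 1"). A minimal sketch of the working pattern follows, using a small made-up sample that mirrors a few rows of the table above; the variable names `is_west`, `long_downtime` and `selected` are illustrative only.

```python
import pandas as pd

# Hypothetical four-row sample mirroring the answer's table (not the real file).
df = pd.DataFrame(
    {
        "Site": ["S1", "S2", "S9", "S13"],
        "Zone": ["East", "East", "East", "West"],
        "pmCellDowntimeAuto": [2, 900, 1800, 25],
    }
)

# Each comparison produces a boolean Series with one value per row.
is_west = df["Zone"] == "West"
long_downtime = df["pmCellDowntimeAuto"] > 50

# Combine the masks element-wise with | and index .loc with square brackets.
selected = df.loc[is_west | long_downtime]
print(selected["Site"].tolist())
```

When the comparisons are written inline instead of being stored in variables, each one needs its own parentheses, as in the `df.loc[...]` line above.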

Old answer

Your code and your input sample do not match.

If your condition is that Zone is West or Downtime is greater than 50, then:

df2 = df.loc[(df['Zone'] == 'West') | (df['Downtime'] > 50)]
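
Since the string in the question (`'Zone == West'`) resembles a query expression, it may be worth noting that `DataFrame.query` accepts the condition as text. This is only a sketch on a made-up sample; it assumes the column names are valid Python identifiers (as `Zone` and `Downtime` are), and the string value still has to be quoted inside the expression.

```python
import pandas as pd

# Hypothetical sample using the column names from the question's read_csv call.
df = pd.DataFrame(
    {
        "Site": ["S1", "S2", "S13"],
        "Zone": ["East", "East", "West"],
        "Downtime": [2, 900, 25],
    }
)

# Same condition written as a query string; 'West' must be quoted.
by_query = df.query("Zone == 'West' or Downtime > 50")

# Equivalent boolean-mask form for comparison.
by_mask = df.loc[(df["Zone"] == "West") | (df["Downtime"] > 50)]

print(by_query["Site"].tolist())
```

Both forms select the same rows here; the mask form is usually the safer choice when column names contain characters such as `#` or `%`.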

If your condition is that pmCellDowntimeAuto is greater than or equal to 900:

>>> df.loc[df['pmCellDowntimeAuto'] >= 900]
            DATE_ID Site  RRC_Fail#  S1_Fail#  pmCellDowntimeAuto
27-07-2021        3   S2          0         0                 900
27-07-2021        3   S3          0         0                 900
27-07-2021        3   S4          0         0                 900
27-07-2021        3   S9          0         0                1800
27-07-2021        3  S11          0         0                1800
27-07-2021        3  S14          0         0                 900
27-07-2021        3  S15          0         0                 900
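
As a quick check, the sketch below rebuilds the fifteen Site and pmCellDowntimeAuto values from the question's spreadsheet by hand (the other columns are omitted, and the CSV file itself is not read) and confirms that the `>= 900` filter returns the sites listed in the expected output.

```python
import pandas as pd

# Site and pmCellDowntimeAuto values copied from the question's sample table.
downtime = [2, 900, 900, 900, 2, 2, 26, 4, 1800, 5, 1800, 2, 25, 900, 900]
df = pd.DataFrame(
    {
        "Site": [f"S{i}" for i in range(1, 16)],
        "pmCellDowntimeAuto": downtime,
    }
)

# Keep only the rows whose downtime is at least 900.
result = df.loc[df["pmCellDowntimeAuto"] >= 900]
print(result["Site"].tolist())
```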
