Conditional search in a pandas DataFrame

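Problem description

I am trying to select values from a dataframe of precipitation and streamflow. I need to select the flow data that follows a 5-day period of low precipitation (say, 2 mm) and determine whether the flow decreases every day. Can anyone help me with how to code this? The dataframe looks like the one below, but spans several decades: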

Date       Flow(mm)  Precip(mm)
12/31/59      2.588        2.54
1/1/60        1.861        0.00
1/2/60        1.578        0.76

Tags: pandas

Solution


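Assuming the "Date" column is the index, the following conditional search function selects every run of n consecutive days in which precipitation stays below a threshold and decreases from one day to the next: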

def conditional_search(dataframe, threshold=0.2, n=5):
    """Return each window of n consecutive rows with low, strictly decreasing precipitation."""
    searched = []

    dataframe = dataframe.reset_index()  # moves "Date" into a column; index becomes 0, 1, 2, ...

    for index in dataframe.index:  # tries each row as the start of a window
        consecutives = dataframe.loc[index:index + n - 1]  # this row and the next n - 1 (.loc slices are inclusive)

        # skips incomplete windows at the end of the data and windows with any value at or above the threshold
        if len(consecutives) == n and (consecutives['Precip(mm)'] < threshold).all():
            # subtracts the following day's precipitation from each day's; positive means precipitation fell
            consecutives_decreasing = (consecutives['Precip(mm)'] - consecutives['Precip(mm)'].shift(-1)) > 0

            # the last row has no following day, so a fully decreasing window has n - 1 positive differences
            if consecutives_decreasing.sum() == n - 1:
                searched.append(consecutives.set_index("Date"))

    return searched  # list of dataframes, one per window that satisfies the condition
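
The decreasing test relies on `shift(-1)`, which lines each day up with the following day. As a minimal sketch of that one step in isolation, the snippet below applies it to five illustrative precipitation values (the variable names are only for illustration):

```python
import pandas as pd

# Five strictly decreasing precipitation values (illustrative)
precip = pd.Series([2.54, 2.53, 2.52, 2.51, 2.50], name="Precip(mm)")

# shift(-1) moves every value up one row, so each row holds the next day's value;
# the last row has no next day and becomes NaN
next_day = precip.shift(-1)

# A positive difference means precipitation fell; NaN > 0 evaluates to False
decreasing = (precip - next_day) > 0

print(decreasing.tolist())    # [True, True, True, True, False]
print(int(decreasing.sum()))  # 4, which is n - 1 for n = 5
```

Because the last row can never count as a decrease, a window of n rows is fully decreasing when exactly n - 1 comparisons are True.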

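Example:

Input (the output below requires a call with a threshold above 2.54, for example threshold=3; with the default of 0.2 no window matches):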

            Flow(mm)  Precip(mm)
Date                            
12/31/1959     2.588        2.54
01/01/1960     1.861        2.53
01/02/1960     1.578        2.52
01/03/1960     1.578        2.51
01/04/1960     1.578        2.50
01/05/1960     1.578        3.89

Output:

[            Flow(mm)  Precip(mm)
 Date                            
 12/31/1959     2.588        2.54
 01/01/1960     1.861        2.53
 01/02/1960     1.578        2.52
 01/03/1960     1.578        2.51
 01/04/1960     1.578        2.50]
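
The function above tests the Precip(mm) column for the day-to-day decrease. If the decrease should instead be tested on Flow(mm), as the question describes, one possible variant is sketched below using rolling sums instead of an explicit loop. The name `low_precip_decreasing_flow` is hypothetical, the default threshold of 2.0 is taken from the question's "say, 2 mm", and the sketch assumes that "decreasing" means strictly lower than the previous day and that n is at least 2. The demo values are made up for illustration.

```python
import pandas as pd


def low_precip_decreasing_flow(df, threshold=2.0, n=5):
    """Hypothetical variant: windows of n low-precipitation days whose flow falls every day."""
    if n < 2:
        raise ValueError("n must be at least 2")

    # 1 where the day's precipitation is below the threshold, else 0
    dry = (df["Precip(mm)"] < threshold).astype(int)
    # 1 where flow is lower than on the previous day (the first row compares against NaN, giving 0)
    falling = (df["Flow(mm)"].diff() < 0).astype(int)

    # A window ending at row i qualifies when all n days are dry
    # and all n - 1 day-to-day changes inside the window are decreases
    all_dry = dry.rolling(n).sum() == n
    all_falling = falling.rolling(n - 1).sum() == n - 1

    ends = (all_dry & all_falling).to_numpy().nonzero()[0]
    return [df.iloc[end - n + 1:end + 1] for end in ends]


# Made-up demo data: flow falls for five days, then rises on the sixth
demo = pd.DataFrame(
    {
        "Flow(mm)": [3.0, 2.5, 2.0, 1.5, 1.0, 1.2],
        "Precip(mm)": [0.0, 0.5, 0.0, 1.0, 0.2, 0.0],
    },
    index=pd.date_range("1960-01-01", periods=6, name="Date"),
)
windows = low_precip_decreasing_flow(demo, threshold=2.0, n=5)
for window in windows:
    print(window)
```

Like `conditional_search`, this returns a list of dataframes, one per qualifying window, so overlapping windows are reported separately.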
