Pandas resample: applying a formula to convert 10-minute values to 15-minute values

Problem description

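I know there are several questions on this topic, but I have not found one that fits, nor a satisfying solution. I have a DataFrame of 10-minute average data and want to resample it to 15-minute average data using a specific formula (from DENA-Netzstudie II). I tried the apply function (df.resample('15T').apply(cal_ten_to_fifteen_min)), but that failed because the function is not handed the right rows to recompute the 10-minute data as 15-minute data.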
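
To illustrate why the apply approach does not receive the right rows, here is a minimal sketch. It assumes each timestamp marks the start of its 10-minute interval, and the sample values are made up for illustration. Counting the rows per 15-minute bin shows that the bins alternate between two rows and one row, so the bin starting at 00:15 never sees the 00:10 value it partly overlaps.

```python
import pandas as pd

# Hypothetical sample data: 10-minute averages, stamped at the interval start.
index = pd.date_range("2020-01-01 00:00", periods=6, freq="10min")
ten_min = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=index)

# Count how many 10-minute rows fall into each 15-minute bin.
rows_per_bin = ten_min.resample("15min").size()
print(rows_per_bin.tolist())  # [2, 1, 2, 1]
```

A function passed to apply only sees the rows of its own bin, so it cannot combine neighbouring 10-minute values across bin boundaries.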

Here is my solution so far; maybe someone has a better idea :)

#!/usr/bin/env python3
import pandas as pd


def cal_ten_to_fifteen_min(ten_min_value: list) -> list:
    """Return 15-min averages computed from 10-min averages (DENA Netzstudie II (2010), p. 109).

    :param ten_min_value: 1-dim. sequence of 10-min average values
    :type ten_min_value: list[float]
    :return: 1-dim. list of 15-min average values
    :rtype: list[float]
    """

    def index_m(_n: int) -> int:
        """Return the index m used to convert 10-min values to 15-min values

        :param _n: 1-based index of the 15-min value
        :type _n: int
        :rtype: int
        :return: index m for the calculation
        """
        # 3 * _n + (_n % 2) is always even, so integer division is exact.
        return (3 * _n + (_n % 2)) // 2 - 1

    def weighting_g(_m: int) -> int:
        """Return the weight g used to convert 10-min values to 15-min values

        :param _m: index
        :type _m: int
        :return: weighting value g for the calculation (1 or 2)
        :rtype: int
        """
        return abs((2 * _m) % 3 - 1) + 1

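    fifteen_min_value = list()
    # Every three 10-min values yield two 15-min values.
    for n in range(1, len(ten_min_value) * 2 // 3 + 1):
        m = index_m(n)
        try:
            fifteen_min_value.append(
                (ten_min_value[m - 1] * weighting_g(m) + ten_min_value[m] * weighting_g(m + 1)) / 3)
        except (IndexError, TypeError):
            # Use a real NaN instead of the string "NaN" so the column stays numeric.
            fifteen_min_value.append(float("nan"))

    return fifteen_min_value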


def recalc_10_to_15_df(frame, columns, r_one_col: bool = False):
    """Recalculate a 10-min average DataFrame to a 15-min average DataFrame

    :param frame: DataFrame with 10-min average values
    :param columns: name(s) of the column(s) holding the average values
    :param r_one_col: if True, return only the recalculated column (only applies when columns is a str)

    :type frame: pd.DataFrame
    :type columns: list[str] or str
    :type r_one_col: bool

    :return: DataFrame with 15-min average data
    :rtype: pd.DataFrame
    """
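    frame = frame.copy()

    # Start at the first timestamp on a full hour, where the 10-min and
    # 15-min grids coincide.
    first = next((ts for ts in frame.index if ts.minute == 0), None)
    if first is None:
        raise ValueError("frame index contains no timestamp on a full hour")
    index = pd.date_range(first, frame.last_valid_index(), freq='15min')
    frame = frame[(frame.index >= first)]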

    if isinstance(columns, str):
        output = cal_ten_to_fifteen_min(list(frame[columns].values))

        output = pd.DataFrame(output, index[:len(output)], [columns, ])
        if r_one_col or len(frame.columns) == 1:
            return output
        # drop() returns a new frame; reassign it so the recalculated column
        # is not overwritten by its raw 10-min values in the loop below.
        frame = frame.drop(columns=columns)

    elif isinstance(columns, list):
        output = list()
        for column in columns:
            output.append(recalc_10_to_15_df(frame, column, True))
        frame = frame.drop(columns=columns)
        output = pd.concat(output, join='inner', axis=1)

    else:
        raise ValueError("columns must be a str or a list of str")

    # Remaining columns are copied row by row (first len(output) rows), not resampled.
    for column in frame:
        output[column] = list(frame[column].values)[:len(output)]

    return output


index = pd.date_range('2020-01-01 00:00', '2020-01-01 02:00', freq='10min')
values = list(range(1, 14))
column_name = '10_min_avg'
df = pd.DataFrame(values, index, [column_name, ])

avg_15_min = recalc_10_to_15_df(df, column_name)
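
To make the index and weight helpers easier to follow, the sketch below lifts them out of cal_ten_to_fifteen_min as top-level functions (an illustrative rearrangement, not part of the code above) and prints, for the first few 15-minute values, which pair of 10-minute positions is combined and with which weights.

```python
def index_m(n: int) -> int:
    # Same arithmetic as the nested helper in the question.
    return (3 * n + n % 2) // 2 - 1


def weighting_g(m: int) -> int:
    # Evaluates to 1 or 2 depending on m.
    return abs((2 * m) % 3 - 1) + 1


for n in range(1, 5):
    m = index_m(n)
    # Zero-based list positions (m - 1, m) and their weights.
    print(n, (m - 1, m), (weighting_g(m), weighting_g(m + 1)))
```

The weights alternate between (2, 1) and (1, 2): a 15-minute value covers either a full 10-minute interval followed by half of the next one, or half of one interval followed by a full one.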

Tags: resampling

Solution
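
One possible alternative, offered as a sketch rather than a definitive answer: the weights in the question's code amount to (2a + b) / 3 or (a + 2b) / 3 for neighbouring 10-minute values a and b. The same numbers can be obtained by splitting each 10-minute value into two 5-minute slices and averaging three slices per 15-minute bin. This assumes each timestamp marks the start of its averaging interval and that the index is a regular DatetimeIndex; the variable names are illustrative.

```python
import pandas as pd

# Same sample data as in the question: 13 ten-minute averages.
index = pd.date_range("2020-01-01 00:00", "2020-01-01 02:00", freq="10min")
ten_min = pd.Series(range(1, 14), index=index, dtype=float)

# Split every 10-minute value into two 5-minute slices. limit=1 keeps real
# gaps in the data as NaN instead of filling them.
five_min = ten_min.resample("5min").ffill(limit=1)

# Average three 5-minute slices per 15-minute bin and keep only complete bins.
bins = five_min.resample("15min")
fifteen_min = bins.mean()[bins.count() == 3]

print(fifteen_min.head(3))
```

For the sample data the first value is (2 * 1 + 2) / 3 and the second is (2 + 2 * 3) / 3, which matches the weighting used in cal_ten_to_fifteen_min. The final, incomplete bin at 02:00 is dropped by the count filter.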

