python - Comparing values on x-axis & referring to values on row before in Dataframes
Problem description
I have a dataframe that looks like this:
import pandas
df = pandas.DataFrame({"R1": [8,2,3], "R2": [-21,-24,4], "R3": [-9,46,6],"R4": [16,-14,-1],"R5": [-3,36,76]})
I want to compare every value in a row against the other values in the same row, and then apply a function depending on the result (for example, whether value 1 in row x is bigger than value 2 in row x). I am trying to do something like this:
if value1 in row1 > value2 in row1:
    return based_on_previous_value(value1)  # trying to put results in a new dataframe
else:
    return previous_row(value1)  # trying to put results in a new dataframe
def based_on_previous_value(x):
    x in row_before + 1
def previous_row(x):
    x in row_before
--> This code doesn't work (I am just trying to show in code what I want to do).
# results put in a new dataframe
df_new = pandas.DataFrame({"R1": [8,10,11], "R2": [-21,-21,-19], "R3": [-9,-5,-2],"R4": [16,17,17],"R5": [-3,0,4]})
--> "R1" in 2nd row: 2 > -24 and 2 > -14, so value("R1" in first row) + 2 = 10
--> "R2" in 2nd row: -24 < all the other 4 values, so value("R2" in first row) + 0 = -21
--> "R3" in 2nd row: 46 > all the other 4 values, so value("R3" in first row) + 4 = -5
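The rule can be checked by hand for the second row with a short plain-Python sketch. The dictionary names (`row_one`, `row_two`, `increments`, `new_row_two`) are illustrative only and do not come from the original code; the values are copied from `df` above.

```python
# First and second rows of df, written out as plain dictionaries (illustrative names)
row_one = {"R1": 8, "R2": -21, "R3": -9, "R4": 16, "R5": -3}
row_two = {"R1": 2, "R2": -24, "R3": 46, "R4": -14, "R5": 36}

# For each value in the second row, count how many values in that row are strictly smaller
increments = {
    col: sum(other < val for other in row_two.values())
    for col, val in row_two.items()
}

# The new second row is the first row plus those counts
new_row_two = {col: row_one[col] + increments[col] for col in row_two}

print(increments)
print(new_row_two)
```

Under this reading of the rule, the counts for "R1", "R2" and "R3" are 2, 0 and 4, which matches the three cases listed above.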
Solution
Here's some code that solves your problem. It includes both the expected output and the produced one, with a comparison at the end to assert that they are equal. The code builds an intermediate dataframe holding the change needed for each row using a helper function (skipping the first row!), then applies those changes to a copy of the initial dataframe row by row.
import pandas as pd
df = pd.DataFrame({"R1": [8,2,3], "R2": [-21,-24,4], "R3": [-9,46,6],"R4": [16,-14,-1],"R5": [-3,36,76]})
expected_df = pd.DataFrame({"R1": [8,10,11], "R2": [-21,-21,-19], "R3": [-9,-5,-2],"R4": [16,17,17],"R5": [-3,0,4]})
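def reevaluate(series):
    # For each value, count how many values in the same row are strictly smaller
    return series.apply(lambda x: sum(series < x))
# Compute the increments for every row except the first
df_changes = df.iloc[1:, :].apply(reevaluate, axis=1)
df_changes.reset_index(drop=True, inplace=True)
produced_df = df.copy()
# Each new row is the previous new row plus that row's increments
for row in df_changes.index:
    produced_df.iloc[row + 1, :] = produced_df.iloc[row, :] + df_changes.iloc[row, :]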
print(expected_df.equals(produced_df))
True
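If the row-by-row loop becomes slow on a larger dataframe, the same result can probably be obtained without an explicit loop. The sketch below is an alternative, not the method used above: it assumes that "number of strictly smaller values in the row" is the only rule needed, gets that count from `DataFrame.rank(axis=1, method="min") - 1`, and then accumulates the counts down the columns with `cumsum`. The names `steps` and `result` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"R1": [8, 2, 3], "R2": [-21, -24, 4], "R3": [-9, 46, 6],
                   "R4": [16, -14, -1], "R5": [-3, 36, 76]})

# With method="min", a value's rank within its row is 1 + the number of
# strictly smaller values in that row, so rank - 1 is the increment we need.
steps = (df.rank(axis=1, method="min") - 1).astype(int)

# The first row is the starting point, so it keeps its original values
# instead of its increments.
steps.iloc[0] = df.iloc[0]

# A running total down each column gives: first row, first row + increments
# of row 2, then + increments of row 3, and so on.
result = steps.cumsum()

print(result)
```

For the example data this should give the same three rows as `expected_df`. Using `method="min"` matters when a row contains ties: tied values then all receive the count of strictly smaller values, which is what `sum(series < x)` does in the loop version.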