Creating an if statement with multiple conditions involving specific df columns and string characters in a list

Question

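Basically, I have a list of survey questions, list_of_columns, that are also used as columns in my DataFrame. At the start of my code I replace all blank responses with "Empty". However, some respondents did not answer questions they were supposed to answer: they have a 1 in the 'EOPS/CARE' or 'CalWORKs' column of my DataFrame, but "Empty" in the questions related to those programs. In those cases I want to recode "Empty" as "Missing" to reflect this accurately.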

Here is the code I have tried so far:

list_of_columns = ['E1', 'E2', 'E3', 'E5', 'E11', 'E13', 'E14', 'E17', 'E18', 'E20', 'C2', 'C7', 'C8', 'C9', 'C11', 'C12', 'NU2', 'NU7', 'NU8', 'NU10', 'NU11', 'CAL1', 'CAL2', 'CAL3', 'CAL5', 'CAL10', 'CAL12', 'CAL14', 'CAL15', 'O1'] # list of survey questions that are also columns in my df. Questions with 'E' indicate they are related to EOPS/CARE, questions with 'CAL' indicated they are related to CalWORKs, etc. 

for question in list_of_columns:
    if 'E' in question and data_final['EOPS/CARE'] == 1: # if 'E' is in the question, and the column 'EOPS/CARE' in my df is equal to 1, replace all instances of "Empty" with "Missing"
        data_final[question] = np.where(data_final[question] == "Empty", "Missing", data_final[question])
    elif 'CAL' in question and data_final['CalWORKs'] == 1: # similarly, if 'CAL' is in the question, and the column 'CalWORKs' in my df is equal to 1, replace all instances of "Empty" with "Missing"
        data_final[question] = np.where(data_final[question] == "Empty", "Missing", data_final[question])
    else:
        pass

When I try to run this, I keep getting this message: "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

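This is easy to do in Stata, but I am determined to do it in Python because the rest of my code is already in Python. I am still learning the language, so it may just be a syntax issue. Thank you very much!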

Tags: python, pandas, for-loop, if-statement, replace

Answer


This is just a quick way to get the column values where you want them.

# as indicated in your question, list_of_columns is also a column in df

df.loc[(df['list_of_columns'].str.contains('E')) & (df['EOPS/CARE'] == 1) & (df['Column name where Empty would be present'] == 'Empty'), 'Column name where Empty would be present'] = 'Missing'

Do the same thing to make the other conditions work. I could not follow the rest of the question, but if you clarify it I can help further.

.loc will help you the most here. Check the documentation.
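As a rough sketch of how the .loc approach might be applied to every question column in a loop: the ValueError in the question arises because data_final['EOPS/CARE'] == 1 is a whole boolean Series, and Python's `and` tries to reduce it to a single True/False. One way around this is to test the column name with plain Python and put the row-level conditions into a boolean mask combined with `&`. The sample data and the prefix_to_program mapping below are made up for illustration, and the sketch assumes question names start with their program prefix ('E' or 'CAL'), so it uses startswith rather than `in`.

```python
import pandas as pd

# Hypothetical sample data standing in for data_final
data_final = pd.DataFrame({
    "EOPS/CARE": [1, 0, 1],
    "CalWORKs": [0, 1, 1],
    "E1": ["Empty", "Empty", "Yes"],
    "CAL1": ["Empty", "Empty", "No"],
    "O1": ["Empty", "A", "Empty"],
})

list_of_columns = ["E1", "CAL1", "O1"]

# Assumed mapping from question-name prefix to the program flag column
prefix_to_program = {"E": "EOPS/CARE", "CAL": "CalWORKs"}

for question in list_of_columns:
    for prefix, program in prefix_to_program.items():
        # Column-name test: ordinary Python string check, a single True/False
        if question.startswith(prefix):
            # Row-level test: element-wise boolean mask combined with &
            mask = (data_final[program] == 1) & (data_final[question] == "Empty")
            data_final.loc[mask, question] = "Missing"

print(data_final)
```

Only rows flagged 1 for the matching program are recoded. Columns such as O1 that match no prefix are left untouched, which plays the role of the else: pass branch in the original loop.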

