python - Removing regex, square brackets, single quotes and double quotes from a Pandas DataFrame

Problem description

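I'm trying to remove regex, square brackets, single quotes and double quotes and replace them with an empty string. I'm doing something wrong. The input looks like this: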

Accident_type                      Injury_classification        
                          
['Strike fixed/station obj']     ["Assault in PI Cases", 'Other Injuries']
['Slip, trip, fall']             ["Work Related Injury", 'Other Injuries']
etc

I tried df['Injury_classification'].str.replace(r" \(.*\)","") and it didn't remove anything. The code ran, but the result was the same and nothing was removed.

Then I tried

df['Injury_classification'] = pd.DataFrame([str(line).strip('[').strip(']').strip('\'').strip('\'').strip('"') for line in df['Injury_classification']])

Current output:

Accident_type                      Injury_classification      
                                 
empty                       Assault in PI Cases", 'Other Injuries
empty                       Work Related Injury", 'Other Injuries
etc

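As you can see, there are still some single quotes left, and sometimes double quotes too. How should I handle this? I have about 20-30 columns with a similar structure. Right now I'm running the same command one column at a time, which isn't efficient for that many columns. How can I write a loop that removes the regex, single quotes and double quotes from all of the columns?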

Expected output:

Accident_type                      Injury_classification      
                                 
Strike fixed/station obj    Assault in PI Cases, Other Injuries
Slip, trip, fall            Work Related Injury, Other Injuries
etc

Thanks

Tags: python pandas dataframe

I would use a character class with str.replace here:

df['Injury_classification'] = df['Injury_classification'].str.replace(r"[\[\]\"']", "", regex=True)

This turns the input ['Slip', 'trip', "fall"] into Slip, trip, fall.
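
For the 20-30 similar columns mentioned in the question, the same replacement could be applied in a loop. The following is only a minimal sketch: the helper name strip_brackets_and_quotes and the small sample frame are made up for illustration, and it assumes each cell is either a string or a Python list (astype(str) is there so that list cells are turned into their text form first). Missing values may need separate handling.

```python
import pandas as pd


def strip_brackets_and_quotes(df, columns):
    """Return a copy of df with [, ], ' and " removed from the given columns.

    Hypothetical helper for illustration; not part of pandas.
    """
    cleaned = df.copy()
    for col in columns:
        cleaned[col] = (
            cleaned[col]
            .astype(str)  # lists become text such as "['a', 'b']"
            .str.replace(r"[\[\]\"']", "", regex=True)
        )
    return cleaned


# Sample data modelled on the question (cells stored as strings here).
df = pd.DataFrame({
    "Accident_type": [
        "['Strike fixed/station obj']",
        "['Slip, trip, fall']",
    ],
    "Injury_classification": [
        "[\"Assault in PI Cases\", 'Other Injuries']",
        "[\"Work Related Injury\", 'Other Injuries']",
    ],
})

result = strip_brackets_and_quotes(df, ["Accident_type", "Injury_classification"])
print(result)
```

Passing list(df.columns) as the second argument would clean every column, which is only appropriate if all of them hold this kind of bracketed text.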
