Remove integers and special characters from a column

Problem description

I'm new to Python and would like to remove special characters and integers from the values in a column, so that only alphabetic characters remain. In this case it is column C that needs cleaning: I want to strip special characters (such as the | separators) and the numbers. See the table:

import pandas as pd 

data = {'A':['NW', 'NB', 'UK', 'CAN', 'der'],'B':['Tom', 'nick', 'krish', 'jack','mark'], 'C':['|20|Empty,', 'Yes| -1', 'Male|-1|2-female|0', 'Yes| 1', 79]} 
df = pd.DataFrame(data) 
print(df)

If a row has only an integer in column C, that row should be dropped. I have tried the following, which does not work well:

df['C'].map(lambda x: re.sub(r'\-,+', '', x))

EXPECTED OUTPUT

import pandas as pd 

data = {'A':['NW', 'NB', 'UK', 'CAN'],'B':['Tom', 'nick', 'krish', 'jack'], 'C':['Empty', 'Yes', 'Male female', 'Yes']} 
df = pd.DataFrame(data) 
print(df)

Tags: python, pandas, dataframe

Solution


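You can use str.replace followed by str.strip, and finally dropna. The .str accessor returns NaN for non-string values such as the integer 79, so dropna removes that row. Pass regex=True explicitly, because recent pandas versions treat the pattern as a literal string by default.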

df['C'] = df.C.str.replace(r'(?i)[^a-z]', ' ', regex=True).str.replace(r'\s+', ' ', regex=True).str.strip()
print(df.dropna())

Output

     A      B            C
0   NW    Tom        Empty
1   NB   nick          Yes
2   UK  krish  Male female
3  CAN   jack          Yes
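
For a self-contained version of the same idea, here is a minimal sketch. The helper name keep_letters and the sixth sample row ('42|-') are hypothetical additions for illustration and are not part of the original question. It collapses every run of non-letters into a single space in one pass. It also treats values that end up empty as missing, since a string containing only digits and symbols would otherwise survive dropna as an empty string.

```python
import pandas as pd

data = {
    'A': ['NW', 'NB', 'UK', 'CAN', 'der', 'xx'],
    'B': ['Tom', 'nick', 'krish', 'jack', 'mark', 'ann'],
    # The last value is a hypothetical extra row: a string with no letters at all.
    'C': ['|20|Empty,', 'Yes| -1', 'Male|-1|2-female|0', 'Yes| 1', 79, '42|-'],
}
df = pd.DataFrame(data)


def keep_letters(column):
    """Hypothetical helper: keep only ASCII letters, separated by single spaces."""
    # Non-string entries (such as the integer 79) become NaN via the .str accessor.
    cleaned = (
        column.str.replace(r'[^A-Za-z]+', ' ', regex=True)
        .str.strip()
    )
    # Strings that contained no letters are now empty; mark them as missing too.
    return cleaned.mask(cleaned.eq(''))


df['C'] = keep_letters(df['C'])
result = df.dropna(subset=['C']).reset_index(drop=True)
print(result)
```

Using dropna(subset=['C']) limits the row removal to missing values in column C, so NaN values in other columns would not cause rows to be dropped.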
