python - Replace a word or set of letters from a string in a dataframe only if the string starts with that word
Problem description
Assume I have the following toy DataFrame df:
Line Sentence
1 A MAN TAUGHT ME HOW TO DANCE.
2 WE HAVE TO CHOOSE A CAKE.
3 X RAYS CAN BE HARMFUL.
4 MY HERO IS MALCOLM X FROM THE USA.
5 THE BEST ACTOR IS JENNIFER A FULTON.
6 A SOUND THAT HAS A BIG IMPACT.
If I were to do the following:
df['Sentence'] = df['Sentence'].str.replace('A ',' ')
This would replace every occurrence of 'A ' in every sentence, not just the leading one. However, I only need 'A ' removed from sentences that start with 'A '. Similarly, I would like to remove 'X ' from Line 3, but not from Malcolm X in Line 4.
The final output df should look like the following:
Line Sentence
1 MAN TAUGHT ME HOW TO DANCE.
2 WE HAVE TO CHOOSE A CAKE.
3 RAYS CAN BE HARMFUL.
4 MY HERO IS MALCOLM X FROM THE USA.
5 THE BEST ACTOR IS JENNIFER A FULTON.
6 SOUND THAT HAS A BIG IMPACT.
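Solution
You can use a regular expression anchored to the start of the string. Here ^ matches only at the beginning of each sentence, (?:A|X) matches either letter, and the lookahead (?=\s) requires whitespace to follow without consuming it: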
df["Sentence"] = df["Sentence"].str.replace(r"^(?:A|X)(?=\s)", "", regex=True)
print(df)
Prints:
Line Sentence
0 1 MAN TAUGHT ME HOW TO DANCE.
1 2 WE HAVE TO CHOOSE A CAKE.
2 3 RAYS CAN BE HARMFUL.
3 4 MY HERO IS MALCOLM X FROM THE USA.
4 5 THE BEST ACTOR IS JENNIFER A FULTON.
5 6 SOUND THAT HAS A BIG IMPACT.
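One detail worth checking: because the lookahead does not consume the whitespace, the space after the removed letter stays at the front of the string (the right-aligned printout hides it). The following is a minimal, self-contained sketch of a variant that also strips that whitespace. The `prefixes` list and the `cleaned` and `with_lookahead` names are illustrative assumptions, not part of the original answer.

```python
import re

import pandas as pd

# Rebuild the toy DataFrame from the question.
df = pd.DataFrame({
    "Line": [1, 2, 3, 4, 5, 6],
    "Sentence": [
        "A MAN TAUGHT ME HOW TO DANCE.",
        "WE HAVE TO CHOOSE A CAKE.",
        "X RAYS CAN BE HARMFUL.",
        "MY HERO IS MALCOLM X FROM THE USA.",
        "THE BEST ACTOR IS JENNIFER A FULTON.",
        "A SOUND THAT HAS A BIG IMPACT.",
    ],
})

# Hypothetical list of leading words to drop; extend as needed.
prefixes = ["A", "X"]

# ^ anchors to the start of each string; \s+ consumes the whitespace
# that follows, so no leading space is left behind.
pattern = r"^(?:" + "|".join(re.escape(p) for p in prefixes) + r")\s+"
cleaned = df["Sentence"].str.replace(pattern, "", regex=True)

# For comparison: the lookahead version keeps the space after the prefix.
with_lookahead = df["Sentence"].str.replace(r"^(?:A|X)(?=\s)", "", regex=True)

print(repr(cleaned[0]))         # 'MAN TAUGHT ME HOW TO DANCE.'
print(repr(with_lookahead[0]))  # ' MAN TAUGHT ME HOW TO DANCE.'
```

If the leading space is acceptable, or is stripped later with `.str.lstrip()`, the lookahead version works just as well.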