pandas: find an exact word and the word before it (multiple occurrences) in a column of strings and append them to a new column in Python

Problem description

The DataFrame looks like this:

col_a
Python PY is a general purpose PY language

Programming PY language in Python PY 

Its easier to understand  PY

The syntax of the language is clean PY

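I tried to implement this with the code below but could not get the expected output. Any help is appreciated.

This is the code I used, which relies on a regular expression: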

df['col_a'].str.extract(r"([a-zA-Z'-]+\s+PY)\b")

Expected output:

col_a                                       col_b_PY     
Python PY is a general purpose language         Python PY purpose PY
Programming PY language in Python PY            Python PY Programming PY     
Its easier to understand  PY                    understand PY 
The syntax of the language is clean PY          clean  PY

Tags: python, regex, pandas

Solution


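A simple pattern extracts the required strings: \w+\s+PY

Explanation: \w+ matches one or more word characters, \s+ then matches one or more whitespace characters, and the literal PY follows.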
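One way to apply this pattern to the whole column is sketched below. It rests on assumptions not stated in the answer: str.extract keeps only the first match per row, so the sketch swaps in str.findall to collect every match and then joins them with a space. The trailing \b and the sample DataFrame construction are additions for illustration; the column name col_b_PY is taken from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col_a": [
        "Python PY is a general purpose PY language",
        "Programming PY language in Python PY",
        "Its easier to understand  PY",
        "The syntax of the language is clean PY",
    ]
})

# str.findall returns a list of every non-overlapping match in each row,
# whereas str.extract would return only the first one.
matches = df["col_a"].str.findall(r"\w+\s+PY\b")

# Join each row's list of matches into a single space-separated string.
df["col_b_PY"] = matches.str.join(" ")

print(df)
```

Matches are returned in the order they occur in each string, and the whitespace between the word and PY is kept exactly as it appears in the input. If a different ordering or normalized spacing is needed, that would require an extra step after findall.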


