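python - Pattern matching in a column in Python
Question
I have two data frames, df and df1. I want to search df for patterns based on the values given in df1. The data frames are as follows: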
import pandas as pd
data={"id":["I983","I873","I526","I721","I536","I327","I626","I213","I625","I524"],
"coltext":[ "I could take my comment back, I would do so in a second. I have addressed my teammates and coaches and while many understand my actions were totall", "We’re just trying to see if he can get on the field as a football player, and then we’ll make decision",
"TextNow offers low-cost, international calling to over 230 countries. Stay connected longer with rates starting at less than",
"Wi-Fi can provide you with added coverage in places where cell networks don't always work - like basements and apartments. No roaming fees for Wi-Fi connection",
"Send messages and make calls on your compute",
"even have a free, Wi-Fi only version of TextNow, available for download on you",
"the rest of the players accepted apologies this spring and are welcoming him back",
"was really looking at him and watching how much this really means to him and how much he really missed us",
"I’ll deal with the problem and I’ll remedy the problem",
"The first step was for him to be able to complete what we call our bottom line program which has been completed"]}
df=pd.DataFrame(data=data)
data1={"col1":["addressed teammates coaches","football player decision","watching really missed", "bottom line program","messages make calls"],
"col2":["international calling over","download on you","rest players accepted","deal problem remedy","understand actions totall"],
"col3":["first step him","Wi-Fi only version","cell network works","accepted apologies","stay connected longer"]}
df1=pd.DataFrame(data=data1)
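For example, the first element of df1['col1'], "addressed teammates coaches", is found in the first element of df['coltext']. In the same way, I want to search df['coltext'] for every element of every column in df1. If a pattern is found, create a third column in df that records which df1 column(s) matched.
Expected output: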
id    coltext                              patternMatch
I983  I could take my comment back,        col1, col2
I873  We’re just trying to see if he can   col1
I526  TextNow offers low-cost,             col3, col2
I721  Wi-Fi can provide you with           col3
I536  Send messages and make calls         col1
Solution
There may be other, more efficient ways; one possible approach is the following:
# Build a reverse lookup from each phrase in data1 to the column it came from
my_dict = {item: k for k, v in data1.items() for item in v}
# For each text in 'coltext', collect the columns that have a phrase whose words
# all occur in the text (case-insensitive substring check, so "stay" matches "Stay")
df['patternMatch'] = df['coltext'].apply(lambda text:
    {col for phrase, col in my_dict.items()
     if all(word in text.lower() for word in phrase.lower().split())})
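The same idea can also be written as a named helper that returns a comma-separated string, which is closer to the expected output than a set. This is only a minimal sketch: the data below are shortened stand-ins for df and df1, and the helper name matching_columns is hypothetical. It lowercases both the text and the phrases, on the assumption that "Stay connected longer" should match "stay connected longer".

```python
import pandas as pd

# Shortened stand-ins for df and df1 from the question (illustrative only)
df = pd.DataFrame({
    "id": ["I983", "I526", "I721"],
    "coltext": [
        "I have addressed my teammates and coaches and while many understand my actions were totall",
        "international calling to over 230 countries. Stay connected longer",
        "places where cell networks don't always work",
    ],
})
df1 = pd.DataFrame({
    "col1": ["addressed teammates coaches", "bottom line program"],
    "col2": ["understand actions totall", "international calling over"],
    "col3": ["stay connected longer", "cell network works"],
})


def matching_columns(text, patterns):
    """Return a comma-separated list of the columns in `patterns` that hold at
    least one phrase whose words all occur in `text`.

    Matching is case-insensitive and substring-based, like `word in text`.
    """
    lowered = text.lower()
    hits = [
        col
        for col in patterns.columns
        if any(
            all(word in lowered for word in phrase.lower().split())
            for phrase in patterns[col]
        )
    ]
    return ", ".join(hits)


# One result string per row of df, in the column order of df1
df["patternMatch"] = df["coltext"].apply(lambda text: matching_columns(text, df1))
print(df[["id", "patternMatch"]])
```

Note that the check is a substring test, so "works" counts as found inside "networks". If whole-word matching is wanted instead, the text would need to be split into words (or matched with word-boundary regular expressions) before comparing.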