python - How do I join on lists in pandas?
Problem description
Suppose I have these two data frames:
df1:
ID Strings
1 'hello, how are you?'
2 'I like the red one.'
3 'You? I think so.'
df2:
range Strings
[1] 'hello, how are you?'
[2,3] 'I like the red one. You? I think so.'
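My goal is to take the sentences in df1 and group them so that they match df2. To do this, I have found a way to label the group each sentence should belong to, so in this example sentence 1 stands alone, but sentences 2 and 3 need to be combined.
Can I do this with a join?
Solution
Assuming you already have your join list, you could do something like this: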
import pandas as pd
df = pd.DataFrame(['hello, how are you?', 'I like the red one.', 'You? I think so.'], columns=['sentence'])
# rows 1 and 2 are to be merged
join = [[0], [1,2]]
# for each row index, record which join groups contain it (stored as a string so it can be grouped on)
df['joincol'] = pd.Series(df.index).apply(lambda x: [x in j for j in join]).astype(str)
df
sentence joincol
0 hello, how are you? [True, False] # this is your grouping column
1 I like the red one. [False, True]
2 You? I think so. [False, True]
# group by the membership column, join each group's sentences, and keep one row per group
df.groupby('joincol')['sentence'].transform(lambda x: ' '.join(x)).drop_duplicates()
# result
0 hello, how are you?
1 I like the red one. You? I think so.
Name: sentence, dtype: object
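The same grouping can also be sketched with an explicit group label per row instead of a stringified boolean list. This is only an illustrative variant, not the answer's own method: the `group_of` dictionary and the `merged` name are made up here, and it assumes every row index appears in exactly one inner list of `join`.

```python
import pandas as pd

df = pd.DataFrame(
    {"sentence": ["hello, how are you?", "I like the red one.", "You? I think so."]}
)

# Same join list as above: each inner list holds the row indexes to merge.
join = [[0], [1, 2]]

# Hypothetical helper mapping: row index -> number of the group it belongs to.
group_of = {idx: group_id for group_id, members in enumerate(join) for idx in members}

# Group on the mapped label and concatenate the sentences within each group.
merged = (
    df.groupby(df.index.map(group_of))["sentence"]
    .agg(" ".join)
    .reset_index(drop=True)
)
print(merged)
```

Because the rows are grouped on a label rather than de-duplicated afterwards, two different groups that happen to produce the same joined text would both be kept.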