python - Position of a word in a pandas column
Question
I am trying to count how many times a word appears at a given position in a column. For example:
Text
Manchester United, finally, won ...
Arsenal is one of the best ...
Beckham played for Manchester United
Manchester is a city in the UK
and so on.
I think I should apply something like this, but working on words rather than characters:
from collections import defaultdict

max_len = max(map(len, sequences))
# d[char][i] counts how often char occurs at position i
d = defaultdict(lambda: [0] * max_len)
for seq in sequences:
    for i, char in enumerate(seq):
        d[char][i] += 1
If possible, I would like to get the position of the word Manchester in each text.
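Answer
Try this list comprehension, which splits each text on whitespace and returns the 0-based index of the first exact match, or -1 if the word is absent: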
df["position"] = [ent.index("Manchester")
                  if "Manchester" in ent else -1
                  for ent in df.Text.str.split()]
df
                                   Text  position
0   Manchester United, finally, won ...         0
1        Arsenal is one of the best ...        -1
2  Beckham played for Manchester United         3
3        Manchester is a city in the UK         0
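
The question also asks how often the word appears at each position, which the position column alone does not give. A minimal sketch of one way to get those counts follows. It assumes tokens are whitespace-separated and that surrounding punctuation should be ignored, so "Manchester," would still match. The helper name word_positions and the variable position_counts are hypothetical, and the sample frame simply reproduces the four rows from the question.

```python
import string
from collections import Counter

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Text": [
    "Manchester United, finally, won ...",
    "Arsenal is one of the best ...",
    "Beckham played for Manchester United",
    "Manchester is a city in the UK",
]})


def word_positions(text, word):
    """Return every 0-based token index at which `word` occurs in `text`.

    Assumption: tokens are split on whitespace and stripped of
    leading/trailing punctuation before comparison.
    """
    return [
        i for i, token in enumerate(text.split())
        if token.strip(string.punctuation) == word
    ]


# Tally, over all rows, how many times the word occurs at each position.
position_counts = Counter(
    pos
    for text in df["Text"]
    for pos in word_positions(text, "Manchester")
)
print(position_counts)
```

Unlike list.index, this records every occurrence in a row rather than only the first, so a text that mentions the word twice contributes two positions to the tally.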