My code writes list output to a CSV as a DataFrame column, but then breaks

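Problem description

I have a dataset with two columns. I want to match the strings in the two columns against each other and produce a match percentage in a third column, then write all three columns to a CSV. Here is my code.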

    Data: 

    **RoS  FCRA**
    pink pinky 
    rose grass 
    thick thin 

Code:

from fuzzywuzzy import fuzz, process
import pandas as pd
import csv

df = pd.read_excel("/Users/shreyaagarwal/Desktop/fcra test.xlsx")
with open("myfile.csv", "w") as fh:
     writer = csv.writer(fh)
     for i in (df["RoS"]):
        for p in (df["FCRA"]):
            s = p.encode('ascii', 'ignore').decode('ascii')
            match = fuzz.partial_ratio(i,s)
            df["Fuzzymatch"] = match
            writer.writerow([i,s,match])



Desired Output: 
    **RoS  FCRA  Match**
    pink pinky 20
    pink grass 0
    pink thin 0
    rose pinky 0
    rose grass 0
    rose thin 0

Tags: python, pandas, csv

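Solution


It looks like you are looping over the wrong thing and introducing variables that you never use. I am guessing you want something like this: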

from fuzzywuzzy import fuzz
import pandas as pd
import csv

df = pd.read_excel("test.xlsx")
# newline="" stops the csv module from emitting blank lines on Windows
with open("myfile.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    # Compare every RoS value with every FCRA value
    for i in df["RoS"]:
        for p in df["FCRA"]:
            match = fuzz.partial_ratio(i, p)
            writer.writerow([i, p, match])
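
If you also want the header row from your desired output, the same nested loop can be written with `itertools.product`. This is only a sketch: `write_pairs` is a hypothetical helper, the lists stand in for the two spreadsheet columns, and the exact-match lambda is a placeholder scorer so the snippet runs without fuzzywuzzy. You would presumably pass `fuzz.partial_ratio` instead.

```python
import csv
import io
import itertools

# Stand-ins for df["RoS"] and df["FCRA"] (assumption: same values as the sample data)
ros = ['pink', 'rose', 'thick']
fcra = ['pinky', 'grass', 'thin']

def write_pairs(out, left, right, scorer):
    """Write one CSV row per (left, right) pair and return the number of data rows."""
    writer = csv.writer(out)
    writer.writerow(['RoS', 'FCRA', 'Match'])  # header row
    count = 0
    # itertools.product yields the same pairs, in the same order, as the nested loops
    for a, b in itertools.product(left, right):
        writer.writerow([a, b, scorer(a, b)])
        count += 1
    return count

# Placeholder scorer (assumption): 1 for identical strings, 0 otherwise
buf = io.StringIO()
rows_written = write_pairs(buf, ros, fcra, lambda a, b: int(a == b))
```

To write to disk, pass a file opened with `open("myfile.csv", "w", newline="")` in place of the `StringIO` buffer.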

Here is an attempt at an MCVE:

import pandas as pd

df = pd.DataFrame(
    [['pink', 'pinky'], ['rose', 'grass'], ['thick', 'thin']],
    columns=['RoS', 'FCRA'])
for i in df["RoS"]:
    for p in df["FCRA"]:
        print((i, p))  # print each pair as a tuple

Result:

('pink', 'pinky')
('pink', 'grass')
('pink', 'thin')
('rose', 'pinky')
('rose', 'grass')
('rose', 'thin')
('thick', 'pinky')
('thick', 'grass')
('thick', 'thin')
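
If the goal is to end up with the three columns in a DataFrame, one option is to build the pairs with a cross join and compute one score per row, instead of assigning a single scalar to the whole column as `df["Fuzzymatch"] = match` does. The sketch below makes two assumptions: pandas 1.2 or later (for `how='cross'`), and a hypothetical `similarity` function based on `difflib` that stands in for `fuzz.partial_ratio`, so its scores will differ from fuzzywuzzy's.

```python
import difflib
import pandas as pd

def similarity(a, b):
    """Stand-in scorer (assumption): 0-100 similarity from difflib, not fuzzywuzzy."""
    return round(100 * difflib.SequenceMatcher(None, a, b).ratio())

df = pd.DataFrame(
    [['pink', 'pinky'], ['rose', 'grass'], ['thick', 'thin']],
    columns=['RoS', 'FCRA'])

# Cross join: every RoS value paired with every FCRA value (pandas >= 1.2)
pairs = df[['RoS']].merge(df[['FCRA']], how='cross')

# One score per row, computed from that row's two strings
pairs['Match'] = [similarity(a, b) for a, b in zip(pairs['RoS'], pairs['FCRA'])]

# With no path, to_csv returns the CSV text; pass a filename to write a file instead
csv_text = pairs.to_csv(index=False)
```

Swapping `similarity` for `fuzz.partial_ratio` should be a one-line change, since both take two strings and return a number.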
