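python - Delete all rows if the row is duplicated in Python
Problem description
I am trying to delete duplicated rows, but I get the error: 'Series' object has no attribute 'remove'.
How can I replace the "remove" call or fix the attribute error?
A row must be deleted if it is duplicated in allMYemail.csv. Here is my code: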
import csv
import re
import json
import pandas as pd

df1 = pd.read_csv('allMYemail.csv')
df2 = pd.read_csv('MYallmatchagain.csv')

emailSet = set()
for i, row in df1.dropna().iterrows():
    emailSet.add(row['0'])
# print(emailSet)

output = []
for i,row in df2.iterrows():
    # print(row)
    Birthdate = row['Birthdate']
    Gender = row['Gender']
    Mobile2 = row['Mobile2']
    Salutation = row['Salutation']
    email = row['email']
    firstName = row['firstName']
    lastName = row['lastName']
    name = row['name']
    areaCode = row['areaCode']
    errorCode = row['errorCode']
    localNumber = row['localNumber']
    Status = row['Status']
    Domain = row['Domain']
    ReturnCode = row['ReturnCode']
    matched = False
    for emails in emailSet:
        if emails == email:
            matched = True
            break
    if matched:
        row.remove('Birthdate')
        row.remove('Gender')
        row.remove('Mobile2')
        row.remove('Salutation')
        row.remove('email')
        row.remove('firstName')
        row.remove('lastName')
        row.remove('name')
        row.remove('areaCode')
        row.remove('errorCode')
        row.remove('localNumber')
        row.remove('Status')
        row.remove('Domain')
        row.remove('ReturnCode')
    else:
        pass
    output_obj = {}
    output_obj['Birthdate'] = Birthdate
    output_obj['Gender'] = Gender
    output_obj['Mobile2'] = Mobile2
    output_obj['Salutation'] = Salutation
    output_obj['email'] = email
    output_obj['firstName'] = firstName
    output_obj['lastName'] = lastName
    output_obj['name'] = name
    output_obj['areaCode'] = areaCode
    output_obj['errorCode'] = errorCode
    output_obj['localNumber'] = localNumber
    output_obj['Status'] = Status
    output_obj['Domain'] = Domain
    output_obj['ReturnCode'] = ReturnCode
    output.append(output_obj)

df = pd.read_json(json.dumps(output))
# print(json.dumps(output))
df.to_csv(r'MYfinish.csv', index = None)
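Any help would be greatly appreciated.
Solution
Your question does not make clear what you want to do. If you only want to drop rows that are exact duplicates within a single df, @Renaud's solution will do the job. If you want to drop rows based on duplicates in the single column 'email', keeping the first row for each email, try the following: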
def firstline(d):
    return d.reset_index(drop=True).loc[0]

result_df = df.groupby('email').apply(firstline)
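
As for the error itself: iterrows() yields each row as a pandas Series, which has no remove method, and the yielded row is a copy, so changing it would not alter df2 anyway. Filtering the whole DataFrame avoids the loop. The sketch below is only one possible reading of the question, and it uses small made-up frames in place of allMYemail.csv and MYallmatchagain.csv (the sample values and the variable names known, unmatched, first_per_email and no_repeats are assumptions for illustration). It shows three variants: dropping df2 rows whose email appears in df1, keeping the first row per email (roughly what the groupby above does, but with the original index preserved), and dropping every row whose email occurs more than once.

```python
import pandas as pd

# Hypothetical stand-ins for allMYemail.csv (one column named '0')
# and MYallmatchagain.csv (only two of its columns are shown here).
df1 = pd.DataFrame({'0': ['a@x.com', 'b@x.com', None]})
df2 = pd.DataFrame({
    'email': ['a@x.com', 'c@x.com', 'c@x.com', 'd@x.com'],
    'name': ['Ann', 'Cy', 'Cy again', 'Dee'],
})

# 1) Drop the rows of df2 whose email also appears in df1.
known = set(df1['0'].dropna())
unmatched = df2[~df2['email'].isin(known)]

# 2) Keep only the first row for each email.
first_per_email = df2.drop_duplicates(subset='email', keep='first')

# 3) Drop every row whose email occurs more than once in df2.
no_repeats = df2.drop_duplicates(subset='email', keep=False)

# To write a result out as in the question:
# unmatched.to_csv('MYfinish.csv', index=False)
```

Which variant fits depends on whether "duplicated" means "also present in the other file" or "repeated within the same file"; the question can be read either way.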