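python - How do I filter data that has already been filtered?
Problem description
For example, some customers have an invoice number, and some customers have several invoice numbers.
I have already worked out the number of unique customers by doing the following: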
m = list(set(map(lambda x: x.Name, data)))
print("There are", len(m), "customers")
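How can I tell how many customers are female and how many are male? If a customer appears more than once, their gender should be counted only once.
Sample data in the CSV
The columns are: state, first name, last name, gender, age, invoiceNo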
state firstname lastname gender age invoiceNo
TX Jane DOE Female 52 36524
TX Jane DOE Female 52 65142
NY John Williams Male 68 24536
How do I find the average age?
m = customer(row[0], row[1] + " " + row[2], row[3], int(row[4]), int(row[5]))
data.append(m)
m = list(set(map(lambda x: x.Name, data)))
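Solution
Here is how I would do the various things you asked. Note that I have added back the class definition you removed from an earlier version of your program, and modified it by adding unique_id() and __repr__() methods.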
import csv
from pprint import pprint
class Customer:
    def __init__(self, state, name, gender, age, invoice):
        self.State = state
        self.Name = name
        self.Gender = gender
        self.Age = age
        self.Invoice = invoice

    def unique_id(self):
        """ Return identifier unique to this customer. """
        return (self.State, self.Name, self.Gender, self.Age)

    def __repr__(self):
        classname = type(self).__name__
        return (f'{classname}(State={self.State!r}, Name={self.Name!r}, '
                f'Gender={self.Gender!r}, Age={self.Age!r}, invoice={self.Invoice!r})')
filename = 'salesinfo.csv'
data = []
with open(filename, 'r', newline='') as file:
    reader = csv.reader(file, delimiter='\t')
    next(reader)  # Skip header.
    for row in reader:
        if not row:  # Skip blank lines.
            continue
        customer = Customer(row[0], row[1]+" "+row[2], row[3], int(row[4]), int(row[5]))
        data.append(customer)
#pprint(data); print() # Show what was read.
# Determine number of unique customers (by calling class unique_id() method).
m = list(set(map(lambda c: c.unique_id(), data)))
print("There are", len(m), "customers")
# Determine how many of each gender there are *and* the overall average age.
seen = set() # To avoid counting a customer more than once.
genders = dict()
average_age = 0
for customer in data:
    unique_id = customer.unique_id()
    if unique_id not in seen:
        genders[customer.Gender] = genders.setdefault(customer.Gender, 0) + 1
        average_age += customer.Age
        seen.add(unique_id)
average_age = average_age / len(m)
pprint(genders) # Total number of each gender.
print(f"Average customer's age: {average_age:.1f}")
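For comparison, the same counts can be computed without a class, using only the standard library. This is a rough sketch, not part of the original answer. It assumes the file is tab-delimited with the header names shown in the sample data, and that a customer is identified by state, name, gender, and age, as in unique_id() above. The SAMPLE string and the summarize() helper are hypothetical names introduced for illustration.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Sample rows from the question, written out as tab-separated text
# (the delimiter is an assumption taken from the answer's csv.reader call).
SAMPLE = (
    "state\tfirstname\tlastname\tgender\tage\tinvoiceNo\n"
    "TX\tJane\tDOE\tFemale\t52\t36524\n"
    "TX\tJane\tDOE\tFemale\t52\t65142\n"
    "NY\tJohn\tWilliams\tMale\t68\t24536\n"
)


def summarize(lines):
    """Return (customer count, gender counts, average age) over unique customers."""
    reader = csv.DictReader(lines, delimiter="\t")
    # Build a set of identity tuples that leave out the invoice number, so a
    # customer with several invoices collapses into a single entry.
    unique = {
        (row["state"], row["firstname"], row["lastname"],
         row["gender"], int(row["age"]))
        for row in reader
    }
    genders = Counter(key[3] for key in unique)   # gender is field 3 of the key
    average_age = mean(key[4] for key in unique)  # age is field 4 of the key
    return len(unique), genders, average_age


count, genders, average_age = summarize(io.StringIO(SAMPLE))
print("There are", count, "customers")
print(dict(genders))
print(f"Average customer's age: {average_age:.1f}")
```

To run this against a real file, pass an open file object (opened with newline='') to summarize() in place of the io.StringIO wrapper.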