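python - For each unique value in one column, get the total count of each unique value in another column
Question
I have pyodbc row objects that look like this: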
('Emp1', 'Absent')
('Emp1', 'Absent')
('Emp1', 'Present')
('Emp2', 'Present')
('Emp2', 'Present')
('Emp2', 'Absent')
('Emp2', 'Present')
('Emp2', 'Absent')
I want to count the number of "Present" and "Absent" entries for each unique employee, for example:
Emp1: Absent = 2, Present = 1
Emp2: Absent = 2, Present = 3
This is what I tried:
new = []
for row in cursor.fetchall():
    if row[0] not in new:
        new.append(row[0])
for x in new:
    print(x, row[1].count("Present"))
    print(x, row[1].count("Absent"))
But it returned a line of 000000.
Thanks in advance for your help.
Answer
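A note on why the attempt in the question does not produce per-employee totals: row[1] is the status string of one single row (by the time the printing loop runs, most likely the last row fetched), so str.count only looks for a substring inside that one string and never sees the other rows. A minimal sketch, using a plain tuple as a stand-in for the pyodbc row (an assumption for illustration):

```python
# Stand-in for the single pyodbc row still bound to `row` after the fetch loop.
row = ('Emp2', 'Absent')

# str.count counts substring occurrences inside this one string only,
# so each result can be at most 1 here, never a per-employee total.
present_hits = row[1].count("Present")
absent_hits = row[1].count("Absent")
print(present_hits, absent_hits)
```

The counting therefore has to happen across all rows, grouped by employee.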
It should look something like this:
import collections
import itertools

data = [
    ('Emp1', 'Absent'),
    ('Emp1', 'Absent'),
    ('Emp1', 'Present'),
    ('Emp2', 'Present'),
    ('Emp2', 'Present'),
    ('Emp2', 'Absent'),
    ('Emp2', 'Present'),
    ('Emp2', 'Absent'),
]

# groupby only merges adjacent items, so sort by (employee, category) first
sorted_data = sorted(data, key=lambda x: (x[0], x[1]))
employees = collections.defaultdict(dict)

# group by employee
for employee, employee_group in itertools.groupby(sorted_data, lambda item: item[0]):
    # group by category
    for category, category_group in itertools.groupby(employee_group, lambda item: item[1]):
        employees[employee][category] = sum(1 for _ in category_group)

print('employees', employees)
# employees defaultdict(<class 'dict'>, {'Emp1': {'Absent': 2, 'Present': 1}, 'Emp2': {'Absent': 2, 'Present': 3}})
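As an alternative to the groupby approach above, the same tally can be sketched with collections.Counter, which avoids the sort. This is only a sketch under assumptions: the name rows is hypothetical and holds the sample tuples from the question, and with pyodbc it is assumed that each fetched row unpacks into exactly two values (employee, status).

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for cursor.fetchall(); same sample data as the question.
rows = [
    ('Emp1', 'Absent'),
    ('Emp1', 'Absent'),
    ('Emp1', 'Present'),
    ('Emp2', 'Present'),
    ('Emp2', 'Present'),
    ('Emp2', 'Absent'),
    ('Emp2', 'Present'),
    ('Emp2', 'Absent'),
]

# One Counter per employee; rows can arrive in any order.
counts = defaultdict(Counter)
for employee, status in rows:
    counts[employee][status] += 1

# A Counter returns 0 for a missing key, so an employee with no
# 'Absent' rows would still print cleanly.
for employee in sorted(counts):
    tally = counts[employee]
    print(f"{employee}: Absent = {tally['Absent']}, Present = {tally['Present']}")
```

For the sample data this prints the two lines asked for in the question. If the table is large, counting in the database with a GROUP BY query may be preferable to fetching every row, though that depends on the schema, which the question does not show.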