python - Filtering a list in Python
Question
I have a Python list with these fields:
Username, Function, Project, Description, Date, Time, Year, Version
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:48:54', '2021', '2']
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:56:49', '2021', '2']
['erinil01', 'Prosjektadmin', '920208', 'Lastet prosjektet', '06/07/21', '12:59:09', '2021', '2']
['erinil01', 'Prosjektadmin', '920208', 'Lagret prosjektet', '06/07/21', '12:59:17', '2021', '2']
['erh4021', 'Oppstart', '', 'Startet programmet', '06/07/21', '13:02:38', '2021', '2']
['erinil01', 'Prosjektadmin', '921106', 'Lagt til nytt prosjekt', '06/07/21', '13:06:45', '2021', '2']
['erinil01', 'Prosjektadmin', '921107', 'Lagt til nytt prosjekt', '06/07/21', '13:07:02', '2021', '2']
['erinil01', 'Prosjektadmin', '921106', 'Lastet prosjektet', '06/07/21', '13:07:08', '2021', '2']
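Suppose I want to filter this list by different criteria such as username, function, project, date, or year. If a filter is left empty, all rows should be shown according to the remaining criteria.
Any hints?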
Answer
You didn't say how you get this list, but it looks like a nested list (a list of lists).
data = [
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:48:54', '2021', '2'],
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:56:49', '2021', '2'],
['erinil01', 'Prosjektadmin', '920208', 'Lastet prosjektet', '06/07/21', '12:59:09', '2021', '2'],
['erinil01', 'Prosjektadmin', '920208', 'Lagret prosjektet', '06/07/21', '12:59:17', '2021', '2'],
['erh4021', 'Oppstart', '', 'Startet programmet', '06/07/21', '13:02:38', '2021', '2'],
['erinil01', 'Prosjektadmin', '921106', 'Lagt til nytt prosjekt', '06/07/21', '13:06:45', '2021', '2'],
['erinil01', 'Prosjektadmin', '921107', 'Lagt til nytt prosjekt', '06/07/21', '13:07:02', '2021', '2'],
['erinil01', 'Prosjektadmin', '921106', 'Lastet prosjektet', '06/07/21', '13:07:08', '2021', '2'],
]
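For a nested list you have to use a for-loop to process each row separately.
For each row you can use an index to check a value.
This gets all rows that have an empty value in Project, which has index [2]: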
filtered_data = []
for row in data:
    if not row[2]:
        #print('empty:', row)
        filtered_data.append(row)

print('--- filtered_data ---')
for row in filtered_data:
    print(row)
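For more complex filters you have to write a more complex if condition.
To make it more general you can create a function that takes a single row and returns True if you want to keep the row, or False otherwise.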
def selected(row):
    #if not row[2]:
    #    return True
    #else:
    #    return False
    # shorter
    return not row[2]

filtered_data = []
for row in data:
    if selected(row):
        #print('empty:', row)
        filtered_data.append(row)
Then you can even reduce it to a list comprehension:
filtered_data = [row for row in data if selected(row)]
or use the built-in function filter():
filtered_data = list(filter(selected, data))
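This way you can create different selected() functions and combine filters. Applying them one after another keeps only the rows that pass every filter (a logical and):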
filtered_data = list(filter(selected_1, data))
filtered_data = list(filter(selected_2, filtered_data))
filtered_data = list(filter(selected_3, filtered_data))
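The question also asks that an empty filter should be ignored. Here is a minimal sketch of one way to do that. The helper `filter_rows` and the `COLUMNS` mapping are hypothetical names, not part of the original answer, and the column positions are assumed to follow the header order given in the question. Any criterion left as `None` or `''` is skipped.

```python
# Column positions, assuming the header order given in the question.
COLUMNS = {
    'username': 0, 'function': 1, 'project': 2, 'description': 3,
    'date': 4, 'time': 5, 'year': 6, 'version': 7,
}

def filter_rows(rows, **criteria):
    """Keep rows that match every non-empty criterion (hypothetical helper)."""
    # Drop criteria that were left empty so they do not restrict the result.
    active = {COLUMNS[name]: value for name, value in criteria.items() if value}
    return [
        row for row in rows
        if all(row[index] == value for index, value in active.items())
    ]

sample = [
    ['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:48:54', '2021', '2'],
    ['erh4021', 'Oppstart', '', 'Startet programmet', '06/07/21', '13:02:38', '2021', '2'],
    ['erinil01', 'Prosjektadmin', '921106', 'Lastet prosjektet', '06/07/21', '13:07:08', '2021', '2'],
]

# The empty `function` filter is ignored, so only `username` restricts the rows.
by_user = filter_rows(sample, username='erinil01', function='')
by_user_and_function = filter_rows(sample, username='erinil01', function='Prosjektadmin')
everything = filter_rows(sample)  # no criteria at all: every row is kept
```

One consequence of this design is that an empty string always means "no filter", so this sketch cannot select rows whose Project is empty; for that you would still need a dedicated function like selected() above.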
By the way:
If you get the data from a database, you can filter it directly in the SQL query when you fetch it.
If you can keep the data in a pandas.DataFrame, you can use the column names Username, Function, Project, Description, Date, Time, Year, Version to filter it.
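As an illustration of the database remark, here is a small sketch using the standard `sqlite3` module. The table name `events`, its column names, and the in-memory database are assumptions made only for this example; the question does not say which database or schema is actually used.

```python
import sqlite3

# In-memory database with a hypothetical `events` table (assumed schema).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE events (username TEXT, func TEXT, project TEXT, description TEXT)')
conn.executemany(
    'INSERT INTO events VALUES (?, ?, ?, ?)',
    [
        ('erinil01', 'Oppstart', '', 'Startet programmet'),
        ('erh4021', 'Oppstart', '', 'Startet programmet'),
        ('erinil01', 'Prosjektadmin', '921106', 'Lastet prosjektet'),
    ],
)

# The WHERE clause does the filtering; `?` placeholders keep values out of the SQL text.
query = 'SELECT username, func, project FROM events WHERE username = ? AND func = ?'
rows = conn.execute(query, ('erinil01', 'Prosjektadmin')).fetchall()
conn.close()

print(rows)
```

Only the matching rows leave the database, so no extra filtering loop is needed in Python.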
EDIT:
Minimal working example
data = [
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:48:54', '2021', '2'],
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:56:49', '2021', '2'],
['erinil01', 'Prosjektadmin', '920208', 'Lastet prosjektet', '06/07/21', '12:59:09', '2021', '2'],
['erinil01', 'Prosjektadmin', '920208', 'Lagret prosjektet', '06/07/21', '12:59:17', '2021', '2'],
['erh4021', 'Oppstart', '', 'Startet programmet', '06/07/21', '13:02:38', '2021', '2'],
['erinil01', 'Prosjektadmin', '921106', 'Lagt til nytt prosjekt', '06/07/21', '13:06:45', '2021', '2'],
['erinil01', 'Prosjektadmin', '921107', 'Lagt til nytt prosjekt', '06/07/21', '13:07:02', '2021', '2'],
['erinil01', 'Prosjektadmin', '921106', 'Lastet prosjektet', '06/07/21', '13:07:08', '2021', '2'],
]
# --- version 1 ---

filtered_data = []
for row in data:
    if (not row[2]) or (int(row[2]) > 920208):
        #print('empty:', row)
        filtered_data.append(row)

print('--- filtered_data ---')
for row in filtered_data:
    print(row)
# --- version 2 ---

def selected(row):
    #if (not row[2]) or (int(row[2]) > 920208):
    #    return True
    #else:
    #    return False
    # shorter
    return (not row[2]) or (int(row[2]) > 920208)

def selected_1(row):
    return not row[2]

def selected_2(row):
    # int('') raises ValueError, so call this only when Project is not empty
    return int(row[2]) > 920208

filtered_data = []
for row in data:
    # `or` short-circuits: selected_2() runs only if selected_1() is False
    if selected_1(row) or selected_2(row):
    #if selected(row):
        #print('empty:', row)
        filtered_data.append(row)

print('--- filtered_data ---')
for row in filtered_data:
    print(row)
# --- version 3 ---

def selected(row):
    return (not row[2]) or (int(row[2]) > 920208)

def selected_1(row):
    return not row[2]

def selected_2(row):
    return int(row[2]) > 920208

# two equivalent ways; the second assignment overwrites the first
filtered_data = [row for row in data if selected(row)]
filtered_data = [row for row in data if selected_1(row) or selected_2(row)]

print('--- filtered_data ---')
for row in filtered_data:
    print(row)
# --- version 4 ---

def selected(row):
    return (not row[2]) or (int(row[2]) > 920208)

def selected_1(row):
    return not row[2]

def selected_2(row):
    return int(row[2]) > 920208

#filtered_data = list(filter(selected, data))
filtered_data = list(filter(lambda row: selected_1(row) or selected_2(row), data))

print('--- filtered_data ---')
for row in filtered_data:
    print(row)
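The versions above join two conditions with `or`. When every condition has to hold, one possible sketch is to put the functions in a list and check them with `all()`. The names `after_time` and `has_project` are hypothetical, and the example assumes the Time column (index 5) always uses the `HH:MM:SS` format shown in the question.

```python
from datetime import datetime

def after_time(limit):
    """Build a function that keeps rows whose Time (index 5) is later than `limit`."""
    limit_time = datetime.strptime(limit, '%H:%M:%S').time()

    def predicate(row):
        return datetime.strptime(row[5], '%H:%M:%S').time() > limit_time

    return predicate

def has_project(row):
    """Keep rows whose Project (index 2) is not empty."""
    return bool(row[2])

sample = [
    ['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:48:54', '2021', '2'],
    ['erh4021', 'Oppstart', '', 'Startet programmet', '06/07/21', '13:02:38', '2021', '2'],
    ['erinil01', 'Prosjektadmin', '921106', 'Lastet prosjektet', '06/07/21', '13:07:08', '2021', '2'],
]

# A row is kept only if every function in the list returns True for it.
predicates = [after_time('13:00:00'), has_project]
late_with_project = [row for row in sample if all(p(row) for p in predicates)]

for row in late_with_project:
    print(row)
```

Because the functions live in an ordinary list, you can add or leave out a condition at run time, for example when the user leaves a filter empty.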
EDIT:
The same with pandas
data = [
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:48:54', '2021', '2'],
['erinil01', 'Oppstart', '', 'Startet programmet', '06/07/21', '12:56:49', '2021', '2'],
['erinil01', 'Prosjektadmin', '920208', 'Lastet prosjektet', '06/07/21', '12:59:09', '2021', '2'],
['erinil01', 'Prosjektadmin', '920208', 'Lagret prosjektet', '06/07/21', '12:59:17', '2021', '2'],
['erh4021', 'Oppstart', '', 'Startet programmet', '06/07/21', '13:02:38', '2021', '2'],
['erinil01', 'Prosjektadmin', '921106', 'Lagt til nytt prosjekt', '06/07/21', '13:06:45', '2021', '2'],
['erinil01', 'Prosjektadmin', '921107', 'Lagt til nytt prosjekt', '06/07/21', '13:07:02', '2021', '2'],
['erinil01', 'Prosjektadmin', '921106', 'Lastet prosjektet', '06/07/21', '13:07:08', '2021', '2'],
]
import pandas as pd
import numpy as np
df = pd.DataFrame(data, columns=['Username', 'Function', 'Project', 'Description', 'Date', 'Time', 'Year', 'Version'])
df = df.replace(r'', np.nan)  # replace empty strings with NaN so `Project` can be converted to `float` and compared with `920208`
print(df)
mask1 = df['Project'].isnull() # detect `np.nan`
#print(mask1)
mask2 = (df['Project'].astype(float) > 920208)
#print(mask2)
filtered_data = df[ mask1 | mask2 ] # `|` means `or` , `&` means `and`
print('--- filtered_data ---')
print(filtered_data)