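python - A user interface for filtering objects in Python
Problem description
In my application I have a Job class, defined/outlined as follows. An instance of this Job class represents one particular job run. A job can have multiple checkpoints, and each checkpoint can have multiple commands.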
Job
- JobName
- [JobCheckpoint]
- StartTime
- EndTime
- Status
- ...
JobCheckpoint
- JobCheckpointName
- [JobCommand]
- StartTime
- EndTime
- Status
- ...
JobCommand
- JobCommandName
- [Command]
- StartTime
- EndTime
- Status
- ...
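On any given day, roughly 100,000 different jobs run. The job information is stored on the file system. I want to design a user interface in Python for querying these job objects. For example, a user should be able to query:
- All jobs that ran between times x and y.
- All jobs that ran command x.
- All jobs in the failed state.
- All jobs in the failed and terminated states.
- All checkpoints/commands of a specific job.
- And many more...
To solve this, I am thinking of providing the following method in the user interface.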
get_jobs(Filter)
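I am not sure how to design this Filter class in Python so that it:
- supports all such queries on Job objects,
- and keeps the API simple/intuitive for users.
Any pointers here would be much appreciated.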
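Solution
These questions are partly subjective, but I will try to answer some of them to the best of my current knowledge and with the information available in the question.
What would the Filter class look like?
That depends on, for example, the storage mechanism: is the data held in memory as a bunch of Python objects, or is it first fetched from a SQL or NoSQL database?
If it comes from a SQL database, you can take advantage of SQL's own filtering mechanism. It is a (structured) query language, after all.
In that case your Filter class would act as a translator from field values into a set of SQL operators/conditions.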
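As a minimal sketch of that translation idea, assuming the jobs were stored in a SQL table: the `JobFilter` class, the `jobs` table and its column names are all hypothetical and not taken from the question. Only the fields the user filled in end up in the WHERE clause, and values are passed as bound parameters rather than pasted into the SQL string.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class JobFilter:
    """Hypothetical filter: fields left as None are ignored."""
    status_in: Optional[List[str]] = None
    started_after: Optional[str] = None  # ISO-8601 text compares correctly as a string
    ended_before: Optional[str] = None

    def to_sql(self) -> Tuple[str, list]:
        """Translate the filled-in fields into a parameterized query."""
        clauses, params = [], []
        if self.status_in:
            placeholders = ", ".join("?" for _ in self.status_in)
            clauses.append(f"status IN ({placeholders})")
            params.extend(self.status_in)
        if self.started_after is not None:
            clauses.append("start_time >= ?")
            params.append(self.started_after)
        if self.ended_before is not None:
            clauses.append("end_time <= ?")
            params.append(self.ended_before)
        where = " AND ".join(clauses) if clauses else "1 = 1"
        return f"SELECT name FROM jobs WHERE {where} ORDER BY name", params


def get_jobs(conn: sqlite3.Connection, job_filter: JobFilter) -> List[str]:
    sql, params = job_filter.to_sql()
    return [row[0] for row in conn.execute(sql, params)]


def demo() -> List[str]:
    # In-memory table with made-up rows, only to exercise the filter.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (name TEXT, start_time TEXT, end_time TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?, ?)", [
        ("backup", "2024-01-01T01:00", "2024-01-01T02:00", "FAILED"),
        ("report", "2024-01-01T03:00", "2024-01-01T04:00", "SUCCESS"),
        ("cleanup", "2024-01-02T01:00", "2024-01-02T02:00", "TERMINATED"),
    ])
    return get_jobs(conn, JobFilter(status_in=["FAILED", "TERMINATED"]))
```

With this shape the user-facing API stays a single `get_jobs(filter)` call, while the database does the actual filtering.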
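If it is a bunch of Python objects with no database mechanism for querying the data, you may need to come up with your own query/filter approach.
The Filter class could use a Condition class and an Operator class. Perhaps Operator is an abstract class, with "glue" operators that join conditions together (AND/OR), and another kind of operator that compares an attribute of a domain object with a value.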
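One possible shape for such conditions and glue operators is sketched below. All names here (`Condition`, `FieldCondition`, `And`, `Or`, the simplified `Job` with integer times) are illustrative assumptions, not an established API. Overloading `&` and `|` is one way to keep the calling code readable.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, List


class Condition:
    """Base class: anything that can say whether an object matches."""

    def matches(self, obj: Any) -> bool:
        raise NotImplementedError

    # Overload & and | so conditions can be glued together readably.
    def __and__(self, other: "Condition") -> "Condition":
        return And([self, other])

    def __or__(self, other: "Condition") -> "Condition":
        return Or([self, other])


@dataclass
class FieldCondition(Condition):
    """Compares one attribute of the object with a value."""
    field: str
    op: Callable[[Any, Any], bool]
    value: Any

    def matches(self, obj: Any) -> bool:
        return self.op(getattr(obj, self.field), self.value)


@dataclass
class And(Condition):
    conditions: List[Condition]

    def matches(self, obj: Any) -> bool:
        return all(c.matches(obj) for c in self.conditions)


@dataclass
class Or(Condition):
    conditions: List[Condition]

    def matches(self, obj: Any) -> bool:
        return any(c.matches(obj) for c in self.conditions)


@dataclass
class Job:
    """Simplified stand-in for the Job class; times are plain integers here."""
    name: str
    status: str
    start: int
    end: int


def get_jobs(jobs: List[Job], condition: Condition) -> List[Job]:
    return [job for job in jobs if condition.matches(job)]


JOBS = [
    Job("a", "FAILED", 5, 15),
    Job("b", "SUCCESS", 10, 20),
    Job("c", "TERMINATED", 12, 18),
]

failed_or_terminated = (
    FieldCondition("status", operator.eq, "FAILED")
    | FieldCondition("status", operator.eq, "TERMINATED")
)
in_window = (
    FieldCondition("start", operator.ge, 10)
    & FieldCondition("end", operator.le, 20)
)
```

Because every combination is itself a `Condition`, queries nest freely, for example `get_jobs(JOBS, failed_or_terminated & in_window)`.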
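For the latter, even if you don't design a "filter language" for it, you can take some inspiration from the API query format specified for Flask-Restless here: https://flask-restless.readthedocs.io/en/stable/searchformat.html#query-format
Of course, if you are designing a query interface for, say, a REST API, Flask-Restless's query format can give you some inspiration on how to handle queries.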
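To illustrate what a data-driven query could look like, here is a small evaluator for filter specs written as plain dictionaries. It is loosely modeled on the name/op/val filter objects described in the Flask-Restless search format; the operator names, the `and`/`or` nesting and the `get_jobs` helper are assumptions of this sketch and do not reproduce that library's behavior.

```python
import operator
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List

# Hypothetical operator table; the operator names are chosen for this sketch.
OPERATORS = {
    "eq": operator.eq,
    "neq": operator.ne,
    "ge": operator.ge,
    "le": operator.le,
    "in": lambda value, options: value in options,
}


def matches(obj: Any, spec: Dict[str, Any]) -> bool:
    """Evaluate one filter spec (possibly nested) against an object."""
    if "and" in spec:
        return all(matches(obj, sub) for sub in spec["and"])
    if "or" in spec:
        return any(matches(obj, sub) for sub in spec["or"])
    compare = OPERATORS[spec["op"]]
    return compare(getattr(obj, spec["name"]), spec["val"])


def get_jobs(jobs: Iterable[Any], filters: List[Dict[str, Any]]) -> List[Any]:
    """Top-level filters are combined with AND."""
    return [job for job in jobs if all(matches(job, f) for f in filters)]


# Made-up job records, only to exercise the evaluator.
JOBS = [
    SimpleNamespace(name="a", status="FAILED", start=5, end=15),
    SimpleNamespace(name="b", status="SUCCESS", start=10, end=20),
    SimpleNamespace(name="c", status="TERMINATED", start=12, end=18),
]

QUERY = [
    {"name": "status", "op": "in", "val": ["FAILED", "TERMINATED"]},
    {"or": [
        {"name": "start", "op": "ge", "val": 10},
        {"name": "end", "op": "le", "val": 10},
    ]},
]
```

A spec like `QUERY` is easy to serialize, so the same filter could be built in code, loaded from a file, or sent over HTTP.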
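Is it right to return a list of domain objects, or should I return a list of dictionaries?
Returning a list of domain objects has the advantage of being able to use inheritance. That is at least one possible advantage.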
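A small sketch of that trade-off, assuming dataclasses are acceptable for the domain objects: the shared `Run` base class and its helper methods are hypothetical. Shared behavior lives in the base class, and callers who prefer dictionaries can still get them through `dataclasses.asdict`.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class Run:
    """Hypothetical shared base for Job, JobCheckpoint and JobCommand."""
    name: str
    start: float
    end: float
    status: str

    def duration(self) -> float:
        return self.end - self.start

    def failed(self) -> bool:
        return self.status == "FAILED"


@dataclass
class JobCommand(Run):
    pass


@dataclass
class Job(Run):
    # Simplified: commands hang directly off the job in this sketch.
    commands: List[JobCommand] = field(default_factory=list)


def to_dicts(jobs: List[Job]) -> List[Dict[str, Any]]:
    """Convert domain objects to plain dictionaries on request."""
    return [asdict(job) for job in jobs]


job = Job("nightly", 0.0, 90.0, "FAILED",
          [JobCommand("copy", 0.0, 30.0, "SUCCESS")])
```

Here `job.duration()` and `job.failed()` come from the base class, while `to_dicts([job])` yields nested dictionaries for callers that want plain data.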
A rough sketch of some classes:
from abc import ABCMeta, abstractmethod
from typing import List


class DomainObjectOperatorGlue(metaclass=ABCMeta):
    """Glues several search criteria together (AND/OR)."""

    @abstractmethod
    def operate(self, haystack: 'JobsCollection',
                criteria: List['DomainObject']) -> List['DomainObject']:
        pass
class DomainObjectFieldGlueOperator(metaclass=ABCMeta):
    """Glues the results of several field comparisons into one bool."""

    @abstractmethod
    def operate(self, conditions: List[bool]) -> bool:
        pass


class DomainObjectFieldGlueOperatorAnd(DomainObjectFieldGlueOperator):
    def operate(self, conditions: List[bool]) -> bool:
        # True only if all conditions are True.
        return all(conditions)


class DomainObjectFieldGlueOperatorOr(DomainObjectFieldGlueOperator):
    def operate(self, conditions: List[bool]) -> bool:
        # True if at least one condition is True, False if none are.
        return any(conditions)
class DomainObjectOperatorAnd(DomainObjectOperatorGlue):
    def operate(self, haystack: 'JobsCollection',
                criteria: List['DomainObject']) -> List['DomainObject']:
        """
        Returns a list of haystack elements, or an empty list.
        A haystack element (DomainObject) is included only if it meets
        all (search) 'criteria' elements (DomainObjects).
        """
        result = []
        for haystackelement in haystack.jobs:
            # AND: every criterion must hold for this element (Job)
            # for it to be included in the search results.
            if all(haystackelement.matches(criterion) for criterion in criteria):
                result.append(haystackelement)
        return result
class DomainObjectOperatorOr(DomainObjectOperatorGlue):
    def operate(self, haystack: 'JobsCollection',
                criteria: List['DomainObject']) -> List['DomainObject']:
        """
        Returns a list of haystack elements, or an empty list.
        A haystack element (DomainObject) is included if it meets
        at least one of the (search) 'criteria' elements (DomainObjects).
        """
        result = []
        for haystackelement in haystack.jobs:
            # OR: at least ONE criterion must hold for this element
            # for it to be included in the search results.
            if any(haystackelement.matches(criterion) for criterion in criteria):
                result.append(haystackelement)
        return result
class DomainObjectFilter:
    # Concrete class: apply() is fully implemented, so there is nothing
    # abstract left for a subclass to fill in.
    def __init__(self, criteria: List['DomainObject'],
                 criteria_glue: DomainObjectOperatorGlue):
        self.criteria = criteria
        self.criteria_glue = criteria_glue

    def apply(self, haystack: 'JobsCollection') -> List['DomainObject']:
        """
        Applies the filter to the given 'haystack' (the jobs and their
        sub-objects); returns the filtered list of DomainObjects, or an
        empty list if nothing meets the criteria (and criteria glue).
        """
        return self.criteria_glue.operate(haystack, self.criteria)
class DomainObject(metaclass=ABCMeta):
    @abstractmethod
    def matches(self, domain_object: 'DomainObject') -> bool:
        """
        Returns True if this DomainObject matches the specified
        DomainObject, False otherwise.
        """
        pass

    def excludes(self, domain_object: 'DomainObject') -> bool:
        """
        Convenience method; the inverse of matches().
        """
        return not self.matches(domain_object)
class Job(DomainObject):
    def __init__(self, name, start, end, status,
                 job_checkpoints: List['JobCheckpoint']):
        self.name = name
        self.start = start
        self.end = end
        self.status = status
        self.job_checkpoints = job_checkpoints

    def matches(self, domain_object: 'DomainObject',
                field_glue: 'DomainObjectFieldGlueOperator' = None) -> bool:
        """
        Returns True if this Job matches the specified DomainObject
        (the search criteria), False otherwise.
        """
        # Only the fields the user did not leave empty (None) count as
        # search criteria; each one contributes a bool to condition_results.
        condition_results = []
        if isinstance(domain_object, Job):
            if domain_object.name is not None:
                condition_results.append(domain_object.name in self.name)
            # Assumption: a missing start or end means the time range is
            # unbounded on that side, and a job matches if it ran entirely
            # within the range.
            if domain_object.start is not None:
                condition_results.append(self.start >= domain_object.start)
            if domain_object.end is not None:
                condition_results.append(self.end <= domain_object.end)
            # (...) other fields, such as status, would be handled here.
        elif isinstance(domain_object, (JobCheckpoint, JobCommand)):
            # A checkpoint/command criterion can only match its own job.
            if domain_object.parent_job is not self:
                return False
            # (...) append one bool per filled-in field of the criterion.
        else:
            return False
        # Glue the field results together; AND is assumed when no
        # field_glue is passed in.
        if field_glue is None:
            return all(condition_results)
        return field_glue.operate(condition_results)
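class JobCheckpoint(DomainObject):
    def __init__(self, name, start, end, status,
                 job_commands: List['JobCommand'], parent_job: Job):
        self.name = name
        self.start = start
        self.end = end
        self.status = status
        self.job_commands = job_commands
        # For easier reference: when the search criteria match this
        # JobCheckpoint, the Job it belongs to can be found more easily.
        self.parent_job = parent_job

    def matches(self, domain_object: 'DomainObject') -> bool:
        # Assumed minimal implementation (not in the original sketch) so
        # the class can be instantiated; compares names only.
        return (isinstance(domain_object, JobCheckpoint)
                and (domain_object.name is None
                     or domain_object.name in self.name))


class JobCommand(DomainObject):
    def __init__(self, name, start, end, status,
                 parent_checkpoint: JobCheckpoint, parent_job: Job):
        self.name = name
        self.start = start
        self.end = end
        self.status = status
        # For easier reference: when the search criteria match this
        # JobCommand, the Job or JobCheckpoint it belongs to can be
        # found more easily.
        self.parent_checkpoint = parent_checkpoint
        self.parent_job = parent_job

    def matches(self, domain_object: 'DomainObject') -> bool:
        # Assumed minimal implementation, as for JobCheckpoint above.
        return (isinstance(domain_object, JobCommand)
                and (domain_object.name is None
                     or domain_object.name in self.name))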
class JobsCollection:
    # A plain container rather than a DomainObject: it is searched in,
    # not matched against.
    def __init__(self, jobs: List['Job']):
        self.jobs = jobs

    def get_jobs(self, filter: DomainObjectFilter) -> List[Job]:
        return filter.apply(self)

    def get_commands(self, job: Job) -> List[JobCommand]:
        """
        Returns all commands of the jobs matching the specified job
        (search criteria).
        """
        result = []
        for some_job in self.jobs:
            if some_job.matches(job):
                for job_checkpoint in some_job.job_checkpoints:
                    result.extend(job_checkpoint.job_commands)
        return result

    def get_checkpoints(self, job: Job) -> List[JobCheckpoint]:
        """
        Returns all checkpoints of the jobs matching the specified job
        (search criteria).
        """
        result = []
        for some_job in self.jobs:
            if some_job.matches(job):
                result.extend(some_job.job_checkpoints)
        return result