Django Gunicorn thread problem with model imports

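Problem description

This is actually related to a question I asked earlier. I went through the traceback and the Django source code, and I think I have found what is happening.

Here is the traceback: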

  File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 122, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 790, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 808, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1243, in add_q
    clause, _ = self._add_q(q_object, self.used_aliases)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1269, in _add_q
    allow_joins=allow_joins, split_subq=split_subq,
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1155, in build_filter
    value, lookups, used_joins = self.prepare_lookup_value(value, lookups, can_reuse, allow_joins)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1005, in prepare_lookup_value
    value = value.resolve_expression(self, reuse=can_reuse, allow_joins=allow_joins)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/expressions.py", line 466, in resolve_expression
    return query.resolve_ref(self.name, allow_joins, reuse, summarize)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1464, in resolve_ref
    self.get_initial_alias(), reuse)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1405, in setup_joins
    names, opts, allow_many, fail_on_missing=True)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1330, in names_to_path
    "Choices are: %s" % (name, ", ".join(available)))

Here is the Django source code where it fails:

def names_to_path(self, names, opts, allow_many=True, fail_on_missing=False):
    path, names_with_path = [], []
    for pos, name in enumerate(names):
        cur_names_with_path = (name, [])
        if name == 'pk':
            name = opts.pk.name

        field = None
        try:
            field = opts.get_field(name)
        except FieldDoesNotExist:
            if name in self.annotation_select:
                field = self.annotation_select[name].output_field

        if field is not None:
            # Fields that contain one-to-many relations with a generic
            # model (like a GenericForeignKey) cannot generate reverse
            # relations and therefore cannot be used for reverse querying.
            if field.is_relation and not field.related_model:
                raise FieldError(
                    "Field %r does not generate an automatic reverse "
                    "relation and therefore cannot be used for reverse "
                    "querying. If it is a GenericForeignKey, consider "
                    "adding a GenericRelation." % name
                )
            try:
                model = field.model._meta.concrete_model
            except AttributeError:
                model = None
        else:
            # We didn't find the current field, so move position back
            # one step.
            pos -= 1
            if pos == -1 or fail_on_missing:
                field_names = list(get_field_names_from_opts(opts))
                available = sorted(field_names + list(self.annotation_select))
                raise FieldError("Cannot resolve keyword %r into field. "
                                 "Choices are: %s" % (name, ", ".join(available)))
            break

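So it looks like field = opts.get_field(name) is where the problem lies. For some of the field names in the filter query arguments, it somehow fails to return the corresponding field object. field stays None, and the "Choices are: ..." exception is raised.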
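To make that failure path concrete, here is a stripped-down sketch of the branch described above. It is not Django's code: FakeOpts, resolve_name, and the two local exception classes are hypothetical stand-ins, and only the lookup-then-raise flow is mirrored.

```python
class FieldDoesNotExist(Exception):
    """Local stand-in for Django's exception of the same name."""


class FieldError(Exception):
    """Local stand-in for django.core.exceptions.FieldError."""


class FakeOpts(object):
    """Hypothetical stand-in for Model._meta: maps field names to fields."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def get_field(self, name):
        try:
            return self._fields[name]
        except KeyError:
            raise FieldDoesNotExist(name)

    def field_names(self):
        return sorted(self._fields)


def resolve_name(opts, name, annotations=None):
    """Try opts first, then annotations; raise FieldError if both miss."""
    annotations = annotations or {}
    field = None
    try:
        field = opts.get_field(name)
    except FieldDoesNotExist:
        if name in annotations:
            field = annotations[name]
    if field is None:
        available = sorted(opts.field_names() + list(annotations))
        raise FieldError("Cannot resolve keyword %r into field. "
                         "Choices are: %s" % (name, ", ".join(available)))
    return field
```

With two FakeOpts objects that differ only in whether from_config_type is registered, the same call succeeds on one and raises on the other, which is the pattern seen between healthy and failing instances.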

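The filter query arguments themselves are fine, because they work correctly on our other server instances; this happens on only one server instance.

My guess is that the order in which Django imports its internal modules breaks somehow, and everything goes wrong from there.

If so, what can we do to prevent this from happening?

If not, what might the problem be?

Sample query: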

sample_type = Type.objects.get(pk=322)
TypeMapping.objects.filter(from_config_type=sample_type)
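One way to check whether a given process sees from_config_type at all is to list the field names its copy of the model metadata knows about. The sketch below assumes Django's Model._meta.get_fields() (part of the model _meta API since Django 1.8) and that every returned object has a name attribute; field_names is a hypothetical helper name.

```python
def field_names(opts):
    """Return the sorted field names visible through a model's _meta.

    opts is expected to be something like TypeMapping._meta; the helper only
    relies on opts.get_fields() returning objects with a .name attribute.
    """
    return sorted(f.name for f in opts.get_fields())
```

Logging field_names(TypeMapping._meta) just before the failing filter call, on both a healthy and a failing request, would show whether the two see different field lists.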

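Thanks!

Update, September 6, 2019: OK, so yesterday we had another round of FieldErrors. We removed the instance from the load balancer, and I SSHed into that server and found that this is a problem with the gunicorn threads.

Our setup is that the Django server runs inside a docker container on each EC2 instance. Inside each container we have multiple threads running. For example, running ps -elf gives: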

F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root         1     0  0  80   0 - 17453 core_s 02:04 ?        00:00:01 /usr/bin/python /usr/local/bin/gunicorn -b 0.0.0.0:8082 -w 2 --threads 4 -t 150 --a
1 S root        13     1  8  80   0 - 557495 SyS_ep 02:04 ?       00:13:24 /usr/bin/python /usr/local/bin/gunicorn -b 0.0.0.0:8082 -w 2 --threads 4 -t 150 --a
1 S root        14     1  5  80   0 - 586760 SyS_ep 02:04 ?       00:09:14 /usr/bin/python /usr/local/bin/gunicorn -b 0.0.0.0:8082 -w 2 --threads 4 -t 150 --a
4 S root        37     0  1  80   0 -  4570 -      04:41 pts/0    00:00:00 bash
0 R root        48    37  0  80   0 -  8607 -      04:41 pts/0    00:00:00 ps -elf
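To read the parent/child relationships out of that listing, a small sketch like the following can group PIDs by their PPID. It assumes the ps -elf column layout shown above (PID in the fourth column, PPID in the fifth); children_by_parent and PS_LINES are hypothetical names, and the command lines are copied as they appear, truncation included.

```python
PS_LINES = """\
4 S root         1     0  0  80   0 - 17453 core_s 02:04 ?        00:00:01 /usr/bin/python /usr/local/bin/gunicorn -b 0.0.0.0:8082 -w 2 --threads 4 -t 150 --a
1 S root        13     1  8  80   0 - 557495 SyS_ep 02:04 ?       00:13:24 /usr/bin/python /usr/local/bin/gunicorn -b 0.0.0.0:8082 -w 2 --threads 4 -t 150 --a
1 S root        14     1  5  80   0 - 586760 SyS_ep 02:04 ?       00:09:14 /usr/bin/python /usr/local/bin/gunicorn -b 0.0.0.0:8082 -w 2 --threads 4 -t 150 --a
4 S root        37     0  1  80   0 -  4570 -      04:41 pts/0    00:00:00 bash
0 R root        48    37  0  80   0 -  8607 -      04:41 pts/0    00:00:00 ps -elf
"""


def children_by_parent(ps_text):
    """Map each PPID to the PIDs whose parent it is (ps -elf rows, no header)."""
    tree = {}
    for line in ps_text.strip().splitlines():
        cols = line.split()
        pid, ppid = int(cols[3]), int(cols[4])
        tree.setdefault(ppid, []).append(pid)
    return tree
```

Calling children_by_parent(PS_LINES) groups PIDs 13 and 14 under PID 1, and the ps command itself under the bash shell (PID 37).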

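So I guess that is one parent process (PID 1) managing two child processes (PIDs 13 and 14). Those two children are the workers requested with -w 2, and each worker runs 4 threads (--threads 4). That makes 8 threads in total on one instance, each of which can serve API requests on its own.

While this server was still out of the load balancer, I sent curl requests to the docker container URL. I got a FieldError about half of the time and a valid response the other half.

So my guess is that only one (or some) of the 8 threads inside this container is failing.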
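To narrow down which process or thread produces the errors, one option is to tag every response with the identity of whatever served it. The following is a minimal sketch using plain WSGI, not something from our current setup: WorkerTagMiddleware and the X-Served-By header name are hypothetical, and it assumes error responses still pass through the WSGI layer so that they carry the header too.

```python
import os
import threading


class WorkerTagMiddleware(object):
    """WSGI middleware that adds an X-Served-By header with pid and thread name."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Identify the process and thread handling this particular request.
        tag = "pid=%d thread=%s" % (os.getpid(),
                                    threading.current_thread().name)

        def tagged_start_response(status, headers, exc_info=None):
            # Append the tag without mutating the caller's header list.
            headers = list(headers) + [("X-Served-By", tag)]
            return start_response(status, headers, exc_info)

        return self.app(environ, tagged_start_response)
```

Wrapping the WSGI application object in wsgi.py with WorkerTagMiddleware and repeating the curl requests with -i would then show whether the FieldError responses all come from the same pid or thread.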

But we still don't know how to fix this.

Tags: python, django, python-2.7, gunicorn, django-1.9
