Circular dependency - when does it terminate?

Question

I am having trouble understanding how Python manages imports.

Let's say I have the following application structure:

application/
- application.py
- model/
-- __init__.py
-- user.py

Let's say that the application.py file imports the model module after creating the db like so:

db = SQLAlchemy(application) 
import model

Let's also say that the model module imports the user.py file like so:

import user

Finally, let's say that the user.py file imports the db instance from the application.py file like so:

from application import db

This seems like a circular dependency to me as the application.py file indirectly requires the user.py file but the user.py file requires the db instance from the application.py file.

I know that this code does work, as I tested it, but can someone explain exactly how Python handles this and when it terminates the cycle?

To summarize the problem: when user.py imports db from application.py, it seems to me that this would also run the import model statement again, which would create an infinite loop.

Tags: python, python-3.x, import

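Answer

Circular imports are not necessarily a problem in themselves; only circular dependencies are. You can see how a circular import is resolved with a few simple experiments: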

mymod1.py

import sys
print("1. from mymod1:", [x for x in sys.modules if x.startswith("mymod")])
import mymod2
print("2. from mymod1:", [x for x in sys.modules if x.startswith("mymod")])

mymod2.py

import sys
print("1. from mymod2:", [x for x in sys.modules if x.startswith("mymod")])
import mymod3
print("2. from mymod2:", [x for x in sys.modules if x.startswith("mymod")])

mymod3.py

import sys
print("1. from mymod3:", [x for x in sys.modules if x.startswith("mymod")])
import mymod1
print("2. from mymod3:", [x for x in sys.modules if x.startswith("mymod")])

Now we have a cycle mymod1 -> mymod2 -> mymod3 -> mymod1. Start a REPL and watch what happens:

>>> import mymod1
1. from mymod1: ['mymod1']  # before mymod1 imported mymod2. note mymod1 is already there!
1. from mymod2: ['mymod1', 'mymod2']  # in mymod2 now, before importing mymod3
1. from mymod3: ['mymod1', 'mymod2', 'mymod3']  # before mymod3 imports mymod1
2. from mymod3: ['mymod1', 'mymod2', 'mymod3']  # mymod3 exit
2. from mymod2: ['mymod1', 'mymod2', 'mymod3']  # mymod2 exit
2. from mymod1: ['mymod1', 'mymod3', 'mymod2']  # mymod1 exit

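The key insight here is that the module object itself is already in sys.modules before it has finished executing. That means it can be imported again, and the import returns the existing object without running the module-level code a second time.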
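A minimal sketch of that behaviour, assuming two throwaway modules named cyc_a and cyc_b (hypothetical names, written to a temporary directory only for this demonstration). It records what cyc_b can see of cyc_a while cyc_a is still halfway through executing:

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module names and contents, used only for this demonstration.
SOURCES = {
    "cyc_a": """
        events = ["cyc_a started"]
        import cyc_b
        events.append("cyc_a finished")
    """,
    "cyc_b": """
        import sys
        import cyc_a  # served from sys.modules; cyc_a's code is not run again
        same_object = cyc_a is sys.modules["cyc_a"]
        events_seen = list(cyc_a.events)  # snapshot taken mid-import
    """,
}


def run_demo():
    """Write both modules to a temp dir, import cyc_a, and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in SOURCES.items():
            Path(tmp, f"{name}.py").write_text(textwrap.dedent(source))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            cyc_a = importlib.import_module("cyc_a")
            cyc_b = importlib.import_module("cyc_b")
            return list(cyc_a.events), cyc_b.events_seen, cyc_b.same_object
        finally:
            # Clean up so the demo can be run repeatedly.
            sys.path.remove(tmp)
            sys.modules.pop("cyc_a", None)
            sys.modules.pop("cyc_b", None)


if __name__ == "__main__":
    events, events_seen, same_object = run_demo()
    print(events)       # what cyc_a recorded in total
    print(events_seen)  # what cyc_b saw while cyc_a was still executing
    print(same_object)
```

If the module-level code of cyc_a ran again on the second import, the "started" entry would appear more than once; in this sketch it should appear exactly once.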

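It is natural for the components in a package's submodules to depend on one another. Problems usually arise when module-scope code starts actually doing things, such as trying to connect to a database, so try to avoid writing script-like code directly at module level.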
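One common way to follow that advice is to move the work into a function that runs on first use rather than at import time. The sketch below is only an illustration: get_connection, load_user and the calls list are hypothetical names, and the "connection" is a plain dict standing in for a real database handle.

```python
import functools

calls = []  # records each time the (hypothetical) expensive setup really runs


@functools.lru_cache(maxsize=None)
def get_connection():
    """Stand-in for real work such as connecting to a database."""
    calls.append("connect")
    return {"connected": True}


def load_user(user_id):
    # The connection is created on first use, not when the module is imported,
    # so importing this module has no side effects beyond defining names.
    conn = get_connection()
    return {"id": user_id, "via": conn}
```

Importing a module written this way only defines names, so it is much less sensitive to the order in which modules in a cycle happen to be loaded.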

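A circular import error occurs when something from a module's namespace is needed before that module has finished initializing. In that case the module itself exists, but the name you are trying to access may not exist yet.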

# mymod.py
import sys
print("var" in vars(sys.modules["mymod"]))
var = "I'm a name in mymod namespace"
print("var" in vars(sys.modules["mymod"]))

Importing mymod prints False and then True: the module's namespace is still being mutated because the import is in the middle of executing.
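This is also why the layout in the question works: db is assigned before import model runs, so the name already exists when user.py asks for it. A rough sketch of both orderings, assuming hypothetical stand-in modules demo_app and demo_user and a plain string in place of the real SQLAlchemy instance:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for application.py and user.py from the question.
USER_SRC = "from demo_app import db\n"
APP_OK = "db = 'db instance'\nimport demo_user\n"   # name defined before the import
APP_BAD = "import demo_user\ndb = 'db instance'\n"  # name defined after the import


def try_import(app_source):
    """Write the two modules to a temp dir, import demo_app, report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_app.py").write_text(app_source)
        Path(tmp, "demo_user.py").write_text(USER_SRC)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_app")
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            # Clean up so both orderings start from a fresh state.
            sys.path.remove(tmp)
            sys.modules.pop("demo_app", None)
            sys.modules.pop("demo_user", None)


if __name__ == "__main__":
    print(try_import(APP_OK))
    print(try_import(APP_BAD))
```

With the second ordering, demo_app is already in sys.modules when demo_user runs, but db has not been assigned yet, so the from-import fails. The exact wording of the error message varies between Python versions.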

Astute readers may have noticed that mymod2 and mymod3 switched places in the output:

2. from mymod2: ['mymod1', 'mymod2', 'mymod3']
2. from mymod1: ['mymod1', 'mymod3', 'mymod2']
                           |_____ wtf? _____|

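This is actually no accident! As the final step of loading a module, the import machinery takes the actual module out of sys.modules and puts it back, which in CPython moves that entry to the end of the dict. If you check again in the REPL, mymod1 will now be last.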

>>> import mymod1
...
>>> import sys
>>> [x for x in sys.modules if x.startswith("mymod")]
['mymod3', 'mymod2', 'mymod1']
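The reordering can be reproduced with an ordinary dict. The sketch below is a simplified model rather than the real import machinery: registry stands in for sys.modules, and start_loading and finish_loading are hypothetical helpers that mimic registering a module before its code runs and then popping and re-inserting it once loading completes.

```python
def start_loading(registry, name):
    """A module is registered before its code runs."""
    registry[name] = f"<module {name}>"


def finish_loading(registry, name):
    """Final step: take the entry out and put it back, moving it to the end."""
    module = registry.pop(name)
    registry[name] = module
    return module


def simulate():
    registry = {}  # stand-in for sys.modules
    # mymod1 imports mymod2, which imports mymod3: all three get registered
    # before any of them has finished executing.
    for name in ("mymod1", "mymod2", "mymod3"):
        start_loading(registry, name)
    snapshots = []
    # The modules then finish in the reverse order.
    for name in ("mymod3", "mymod2", "mymod1"):
        finish_loading(registry, name)
        snapshots.append(list(registry))
    return snapshots


if __name__ == "__main__":
    for snapshot in simulate():
        print(snapshot)
```

The second snapshot matches the order printed by "2. from mymod1" above, and the last one matches the final REPL check.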

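I won't describe why the import system does this, since it isn't really relevant to the question, but readers interested in the reason should look at this mailing list post from Guido.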

