Storing a tree of arbitrary objects in a YAML file

Question

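I simply want to use YAML to store dedicated Python objects that represent a configuration, and to load them back when needed.
My application lets the user define a scenario made up of nested Python objects.
The root object (named Scenario) contains some attributes and lists of objects (named Level1Type1, Level1Type2, ...), and each Level1 object in turn consists of some attributes and lists of objects (named Level2Type1, Level2Type2, ...).
All in all, it is a tree. Each leaf has attributes and a list of other objects, which are leaves themselves.
In addition, only some of the objects' attributes have to be saved in the file (dynamic attributes are irrelevant to the configuration file).
I decided to define explicitly which attributes are saved.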

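Reading the documents Google returned on the topic "Python serialize objects with yaml" gave me some hints, but left me confused about what is really needed.
Most of them were provided by Anthon (many thanks to him). That is mainly why I use ruamel.yaml.
One regret: for lack of explanation and documentation on serializing arbitrary Python objects, I spent far too much time on this subject.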

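I noticed that the list objects in a scenario created with the object constructors become CommentedSeq objects when the scenario is reloaded from the YAML file.
I also wonder about the purpose of the __repr__ definition I have seen in many examples. Is it used by the serialization mechanism?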

Below is the code that validates my application's requirements.

import ruamel.yaml
from io import StringIO

yaml = ruamel.yaml.YAML()

class Base:
    """
    Base class for every class in the tree.
    """
    # class variables can be necessary
    cst_value = "common"

    def __init__(self, elt_name=None, comment=None):
        self.elt_name = elt_name
        self.comment = comment

    def treatment(self):
        raise NotImplementedError('Base class should not be implemented')

@yaml.register_class
class Scenario(Base):

    yaml_tag = u'!Scenario'

    def __init__(self, elt_name=None, comment=None, level1_objs=None):
        super().__init__(elt_name, comment)
        # List of level1 objects: to be saved in yaml file.
        self.level1_objs = [] if level1_objs is None else level1_objs

    @classmethod
    def to_yaml(cls, representer, node):
        dict_representation = {
            'elt_name': node.elt_name,
            'comment': node.comment,
            'level1_objs': node.level1_objs
        }
        return representer.represent_mapping(cls.yaml_tag, dict_representation)

    @classmethod
    def from_yaml(cls, constructor, node):
        m = {}
        for m in constructor.construct_yaml_map(node):
            pass
        elt_name = m['elt_name'] if 'elt_name' in m else None
        comment = m['comment'] if 'comment' in m else None
        level1_objs = m['level1_objs'] if 'level1_objs' in m else None
        return cls(elt_name, comment, level1_objs)

    def treatment(self):
        pass


class BunchOfData:

    def __init__(self):
        self.data_frame = None
        self.data1 = None
        self.data2 = None
        self.data3 = None


@yaml.register_class
class Level1Type1(Base):

    yaml_tag = u'!Level1Type1'

    def __init__(self, elt_name=None, comment=None, level2_objs=None, l1_t1_attr1=None):
        super().__init__(elt_name, comment)
        # List of level2 objects: to be saved in yaml file.
        self.level2_objs = [] if level2_objs is None else level2_objs
        # Attribute: to be saved in yaml file.
        self.l1_t1_attr1 = l1_t1_attr1
        # Dynamic attribute: Not to be saved in yaml file
        self.dyn_data = BunchOfData()

    @classmethod
    def to_yaml(cls, representer, node):
        dict_representation = {
            'elt_name': node.elt_name,
            'comment': node.comment,
            'l1_t1_attr1': node.l1_t1_attr1,
            'level2_objs': node.level2_objs
        }
        return representer.represent_mapping(cls.yaml_tag, dict_representation)

    @classmethod
    def from_yaml(cls, constructor, node):
        m = {}
        for m in constructor.construct_yaml_map(node):
            pass
        elt_name = m['elt_name'] if 'elt_name' in m else None
        comment = m['comment'] if 'comment' in m else None
        level2_objs = m['level2_objs'] if 'level2_objs' in m else None
        l1_t1_attr1 = m['l1_t1_attr1'] if 'l1_t1_attr1' in m else None
        return cls(elt_name, comment, level2_objs, l1_t1_attr1)

    def treatment(self):
        pass


@yaml.register_class
class Level1Type2(Base):

    yaml_tag = u'!Level1Type2'

    def __init__(self, elt_name=None, comment=None, level2_objs=None, l1_t2_attr1=None, l1_t2_attr2=None):
        super().__init__(elt_name, comment)
        # List of level2 objects: to be saved in yaml file.
        self.level2_objs = [] if level2_objs is None else level2_objs
        # Attribute: to be saved in yaml file.
        self.l1_t2_attr1 = l1_t2_attr1
        self.l1_t2_attr2 = l1_t2_attr2
        # Dynamic attribute: Not to be saved in yaml file
        self.dyn_data = BunchOfData()

    @classmethod
    def to_yaml(cls, representer, node):
        dict_representation = {
            'elt_name': node.elt_name,
            'comment': node.comment,
            'level2_objs': node.level2_objs,
            'l1_t2_attr1': node.l1_t2_attr1,
            'l1_t2_attr2': node.l1_t2_attr2
        }
        return representer.represent_mapping(cls.yaml_tag, dict_representation)

    @classmethod
    def from_yaml(cls, constructor, node):
        m = {}
        for m in constructor.construct_yaml_map(node):
            pass
        elt_name = m['elt_name'] if 'elt_name' in m else None
        comment = m['comment'] if 'comment' in m else None
        level2_objs = m['level2_objs'] if 'level2_objs' in m else None
        l1_t2_attr1 = m['l1_t2_attr1'] if 'l1_t2_attr1' in m else None
        l1_t2_attr2 = m['l1_t2_attr2'] if 'l1_t2_attr2' in m else None
        return cls(elt_name, comment, level2_objs, l1_t2_attr1, l1_t2_attr2)

    def treatment(self):
        pass


@yaml.register_class
class Level2Type1(Base):

    yaml_tag = u'!Level2Type1'

    def __init__(self, elt_name=None, comment=None, l2_t1_attr1=None):
        super().__init__(elt_name, comment)
        # Attribute: to be saved in yaml file.
        self.l2_t1_attr1 = l2_t1_attr1

    @classmethod
    def to_yaml(cls, representer, node):
        dict_representation = {
            'elt_name': node.elt_name,
            'comment': node.comment,
            'l2_t1_attr1': node.l2_t1_attr1
        }
        return representer.represent_mapping(cls.yaml_tag, dict_representation)

    @classmethod
    def from_yaml(cls, constructor, node):
        m = {}
        for m in constructor.construct_yaml_map(node):
            pass
        elt_name = m['elt_name'] if 'elt_name' in m else None
        comment = m['comment'] if 'comment' in m else None
        l2_t1_attr1 = m['l2_t1_attr1'] if 'l2_t1_attr1' in m else None
        return cls(elt_name, comment, l2_t1_attr1)

    def treatment(self):
        pass


@yaml.register_class
class Level2Type2(Base):

    yaml_tag = u'!Level2Type2'

    def __init__(self, elt_name=None, comment=None, l2_t2_attr1=None, l2_t2_attr2=None):
        super().__init__(elt_name, comment)
        # Attribute: to be saved in yaml file.
        self.l2_t2_attr1 = l2_t2_attr1
        self.l2_t2_attr2 = l2_t2_attr2

    @classmethod
    def to_yaml(cls, representer, node):
        dict_representation = {
            'elt_name': node.elt_name,
            'comment': node.comment,
            'l2_t2_attr1': node.l2_t2_attr1,
            'l2_t2_attr2': node.l2_t2_attr2
        }
        return representer.represent_mapping(cls.yaml_tag, dict_representation)

    @classmethod
    def from_yaml(cls, constructor, node):
        m = {}
        for m in constructor.construct_yaml_map(node):
            pass
        elt_name = m['elt_name'] if 'elt_name' in m else None
        comment = m['comment'] if 'comment' in m else None
        l2_t2_attr1 = m['l2_t2_attr1'] if 'l2_t2_attr1' in m else None
        l2_t2_attr2 = m['l2_t2_attr2'] if 'l2_t2_attr2' in m else None
        return cls(elt_name, comment, l2_t2_attr1, l2_t2_attr2)

    def treatment(self):
        pass


@yaml.register_class
class Level2Type3(Base):

    yaml_tag = u'!Level2Type3'

    def __init__(self, elt_name=None, comment=None, l2_t3_attr1=None, l2_t3_attr2=None, l2_t3_attr3=None):
        super().__init__(elt_name, comment)
        # Attribute: to be saved in yaml file.
        self.l2_t3_attr1 = l2_t3_attr1
        self.l2_t3_attr2 = l2_t3_attr2
        self.l2_t3_attr3 = l2_t3_attr3

    @classmethod
    def to_yaml(cls, representer, node):
        dict_representation = {
            'elt_name': node.elt_name,
            'comment': node.comment,
            'l2_t3_attr1': node.l2_t3_attr1,
            'l2_t3_attr2': node.l2_t3_attr2,
            'l2_t3_attr3': node.l2_t3_attr3
        }
        return representer.represent_mapping(cls.yaml_tag, dict_representation)

    @classmethod
    def from_yaml(cls, constructor, node):
        m = {}
        for m in constructor.construct_yaml_map(node):
            pass
        elt_name = m['elt_name'] if 'elt_name' in m else None
        comment = m['comment'] if 'comment' in m else None
        l2_t3_attr1 = m['l2_t3_attr1'] if 'l2_t3_attr1' in m else None
        l2_t3_attr2 = m['l2_t3_attr2'] if 'l2_t3_attr2' in m else None
        l2_t3_attr3 = m['l2_t3_attr3'] if 'l2_t3_attr3' in m else None
        return cls(elt_name, comment, l2_t3_attr1, l2_t3_attr2, l2_t3_attr3)

    def treatment(self):
        pass

# Make this run.
test = Scenario("my_scenario", "what a scenario may look like after yaml dump",
                [Level1Type1("l1_t1_object", "I am a Level1 Type1 object", [
                    Level2Type1("l2_t1_object", "I am a Level2 Type1 object", 11211),
                    Level2Type2("l2_t2_object", "I am a Level2 Type2 object", 11221, 11222),
                    Level2Type3("l2_t3_object", "I am a Level2 Type3 object", 11231, 11232, 11233),
                ], 111),
                 Level1Type2("l1_t2_object", "I am a Level1 Type2 object", [
                     Level2Type2("l2_t2_object", "I am a Level2 Type2 object", 12221, 12222),
                     Level2Type1("l2_t1_object", "I am a Level2 Type1 object", 12211),
                     Level2Type3("l2_t3_object", "I am a Level2 Type3 object", 12231, 12232, 12233),
                 ], 121, 122)
                 ]
                )
# serialize
dump_buf = StringIO()
yaml.dump(test, dump_buf)
test_serialized = dump_buf.getvalue()
print(test_serialized)
# deserialize
test_is_back = yaml.load(test_serialized)
print(test_is_back)

The resulting YAML file looks like this:

!Scenario
elt_name: my_scenario
comment: what a scenario may look like after yaml dump
level1_objs:
- !Level1Type1
  elt_name: l1_t1_object
  comment: I am a Level1 Type1 object
  l1_t1_attr1: 111
  level2_objs:
  - !Level2Type1
    elt_name: l2_t1_object
    comment: I am a Level2 Type1 object
    l2_t1_attr1: 11211
  - !Level2Type2
    elt_name: l2_t2_object
  ....

Tags: python, serialization, yaml, ruamel.yaml

Answer


This is not a code review site, so I will restrict myself to the real question and the (IMO) implied one:

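As for the real question: no, __repr__ is not used by the serialization process. It is only there so that you can "print" an instance and get some human-interpretable representation, instead of the <__module__.Type object at 0xaddress> you would get otherwise.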
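A minimal sketch of that difference, using only the standard library. The class below is a cut-down, illustrative version of the question's Level2Type1, and Plain is a made-up class with no __repr__ at all:

```python
class Level2Type1:
    """Cut-down illustrative class with an explicit __repr__."""

    def __init__(self, elt_name=None, l2_t1_attr1=None):
        self.elt_name = elt_name
        self.l2_t1_attr1 = l2_t1_attr1

    def __repr__(self):
        # Only affects print()/repr()/debugger output, not serialization.
        return (f"{type(self).__name__}(elt_name={self.elt_name!r}, "
                f"l2_t1_attr1={self.l2_t1_attr1!r})")


class Plain:
    """Made-up class without __repr__: falls back to object.__repr__."""


with_repr = repr(Level2Type1("l2_t1_object", 11211))
without_repr = repr(Plain())

print(with_repr)     # Level2Type1(elt_name='l2_t1_object', l2_t1_attr1=11211)
print(without_repr)  # something like <...Plain object at 0x...>
```

Adding such a __repr__ to the classes in the question would make the final print(test_is_back) readable, but it would not change what ends up in the YAML file.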

As for the implied question: you get a CommentedSeq instead of a "normal" list because you use the default round-trip loader/dumper, by using

yaml = ruamel.yaml.YAML()

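That loader/dumper needs to be able to attach comments somewhere (as well as anchor/alias information, and tag information for unregistered tags). It can do so on a CommentedSeq instance, whereas it cannot do so on the built-in list.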
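The underlying Python constraint can be sketched without ruamel.yaml at all. AnnotatedList below is a hypothetical stand-in, not the library's CommentedSeq, and the attribute name comment is made up for illustration:

```python
class AnnotatedList(list):
    """Hypothetical list subclass: instances can carry extra attributes."""


plain = [1, 2, 3]
try:
    plain.comment = "cannot be stored"   # built-in list instances have no __dict__
    plain_accepts_attribute = True
except AttributeError:
    plain_accepts_attribute = False

annotated = AnnotatedList([1, 2, 3])
annotated.comment = "kept next to the data"  # works on the subclass

print(plain_accepts_attribute)      # False
print(annotated == plain)           # True: it still compares like a list
print(isinstance(annotated, list))  # True
```

A round-trip loader that has to remember comments therefore needs some container type of its own; returning a bare list would leave it nowhere to put that extra information.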

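A CommentedSeq behaves like a list in most respects, but if this is a problem, or if you don't need any of the round-trip capabilities (as seems to be the case for you), you should just use: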

yaml = ruamel.yaml.YAML(typ='safe')

(this gives you the faster, but not fully compatible, C-based YAML 1.1 loader/dumper)

or use:

yaml = ruamel.yaml.YAML(typ='safe', pure=True)

This gives you a loader/dumper without the "overhead" of full round-tripping, which therefore loads back to a list instead of a CommentedSeq.


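It is possible to write the from_yaml methods so that they do this even when using the round-trip loader, but that is not trivial. In any case, if you only use the YAML document to dump and then load again, you should not be storing any information in comments, and you should use the safe dumper/loader.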

