Deserializing a sequence with Serde while skipping invalid elements

Question

Using Serde, I want to deserialize a sequence of elements, keeping the valid ones and skipping the invalid ones.

I have the following payload:

{
    "nhits": 30,
    "parameters": {
        "dataset": "occupation-parkings-temps-reel",
        "timezone": "UTC",
        "rows": 50,
        "start": 0,
        "format": "json",
        "facet": [
            "etat_descriptif"
        ]
    },
    "records": [
        {
            "datasetid": "occupation-parkings-temps-reel",
            "recordid": "1436c55a76fc7910b5a0336eb74cc0957870a8fd",
            "fields": {
                "nom_parking": "P1 Esplanade - Centre commercial",
                "etat": 1,
                "ident": 27,
                "infousager": "220",
                "idsurfs": "1703_DEP_27",
                "libre": 229,
                "total": 251,
                "etat_descriptif": "Ouvert"
            },
            "record_timestamp": "2020-12-20T12:51:00.704000+00:00"
        },
        {
            "datasetid": "occupation-parkings-temps-reel",
            "recordid": "2b15689c04478fcad8c964a5d9f3c0148eb70126",
            "fields": {
                "etat": 1,
                "ident": 30,
                "infousager": "LIBRE",
                "libre": 719,
                "total": 719,
                "etat_descriptif": "Ouvert"
            },
            "record_timestamp": "2020-12-20T12:51:00.704000+00:00"
        }
    ],
    "facet_groups": [
        {
            "facets": [
                {
                    "count": 28,
                    "path": "Ouvert",
                    "state": "displayed",
                    "name": "Ouvert"
                },
                {
                    "count": 1,
                    "path": "Ferm\u00e9",
                    "state": "displayed",
                    "name": "Ferm\u00e9"
                },
                {
                    "count": 1,
                    "path": "frequentation temps reel indisponible",
                    "state": "displayed",
                    "name": "frequentation temps reel indisponible"
                }
            ],
            "name": "etat_descriptif"
        }
    ]
}

I have several structs that correspond to it:

/// The container for the API response
#[derive(Debug, Deserialize)]
pub struct OpenDataResponse<T> {
    /// The parameters relative to the response
    pub parameters: Parameters,

    /// The parameters relative to the pagination
    #[serde(flatten)]
    pub pagination: Pagination,

    /// The sets of records inside the response
    #[serde(bound(deserialize = "T: Deserialize<'de>"))]
    #[serde(deserialize_with = "deserialize::failable_records")]
    pub records: Vec<Record<T>>,
}

/// A record represents an item of some data
/// with a specific id.
#[derive(Debug, Deserialize)]
pub struct Record<T> {
    /// The identifier of the record
    #[serde(rename(deserialize = "recordid"))]
    pub id: String,

    #[serde(rename(deserialize = "fields"))]
    pub(crate) inner: T,
}

#[derive(Debug, Deserialize)]
pub struct StatusOpenData {
    #[serde(rename(deserialize = "idsurfs"))]
    pub id: String,

    #[serde(rename(deserialize = "nom_parking"))]
    pub name: String,

    #[serde(rename(deserialize = "etat"))]
    pub status: i8,

    #[serde(rename(deserialize = "libre"))]
    pub free: u16,

    pub total: u16,

    #[serde(rename(deserialize = "etat_descriptif"))]
    pub users_info: Option<String>,
}

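Given these definitions, a StatusOpenData element has some required fields. So in the
example records, the first element is valid and the second is invalid (it lacks idsurfs and nom_parking).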

I implemented my own deserialization function, deserialize::failable_records:

struct FailableDeserialize<T> {
    inner: Option<T>,
}

impl<'de, T: Deserialize<'de>> Deserialize<'de> for FailableDeserialize<T> {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        let value: Option<T> = Deserialize::deserialize(deserializer).ok();
        Ok(FailableDeserialize { inner: value })
    }
}

pub(super) fn failable_records<'de, D, T>(deserializer: D) -> Result<Vec<T>, D::Error>
where
    D: Deserializer<'de>,
    T: Deserialize<'de>,
{
    // Error returned from the line below
    let elements: Vec<FailableDeserialize<T>> = Deserialize::deserialize(deserializer)?;
    let result = elements.into_iter().filter_map(|f| f.inner).collect();
    Ok(result)
}

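This fails with an error such as: should take errors into account: Error("expected `,` or `]`",

I don't understand why an error is returned. The line let elements: Vec<FailableDeserialize<T>> = Deserialize::deserialize(deserializer)?; tries to deserialize a sequence of FailableDeserialize<T> elements, but that type implements Deserialize in a way that can never return an error.

Where did I go wrong?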

Tags: json, rust, deserialization, serde

Answer


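The naive approach of discarding the error with ok() leaves the deserializer out of sync. A deserialization error can occur at any token; it does not have to happen only after a complete element has been consumed. If deserializing a Record fails, serde is stuck somewhere inside the Record object, while the Vec deserializer expects a "," or "]" next and cannot find one.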

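If you want to stick with this approach, I recommend using a crate that already provides something similar to FailableDeserialize. With serde_with::DefaultOnError you can write failable_records like this: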

fn failable_records<'de, D, T>(deserializer: D) -> Result<Vec<T>, D::Error>
where
    D: Deserializer<'de>,
    T: Deserialize<'de>,
{
    #[serde_with::serde_as]
    #[derive(Deserialize)]
    #[serde(bound(deserialize = "T: Deserialize<'de>"))]
    struct Wrapper<T>(#[serde_as(deserialize_as = "Vec<serde_with::DefaultOnError>")] Vec<Option<T>>);

    // Deserialize through the wrapper; elements that fail to parse become None
    let elements: Wrapper<T> = Deserialize::deserialize(deserializer)?;
    let result = elements.0.into_iter().flatten().collect();
    Ok(result)
}

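Internally, it uses an untagged enum to solve the desynchronization problem, because an untagged enum consumes a complete object before the real deserialization starts.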

