Haskell, Aeson: is there a better way to parse historical data?

Question

By "historical data" I just mean data where the date is the key and that day's value is the value.

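For example, government agencies or university research departments often compile data about earthquakes, rainfall, market movements and so on in this format: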


{
    "Meta Data": {
        "1: Country": "SomeCountry",
        "2: Region": "SomeRegion",
        "3: Latest Recording": "2018-11-16"
    },
    "EarthQuakes": {
        "2018-11-16": {
            "Richter": "5.2508"
        },
        "2018-11-09": {
            "Richter": "4.8684"
        },
        "2018-11-02": {
            "Richter": "1.8399"
        },
        ...
        ...
        ...
        "1918-11-02": {
            "Richter": "1.8399"
        }
    }
}

Usually there is a "Meta Data" section and another section containing the values/data.


As a beginner, I know of two ways to parse this kind of document.

You can use the general parsing shown in the Aeson documentation, where you define a data type like this

data MetaData = MetaData { country :: String, region :: String, latestRec :: String } deriving (Show, Eq, Generic)

and make it an instance of FromJSON

instance FromJSON MetaData where
  parseJSON = withObject "MetaData" $
    \v -> do
       metaData  <- v        .: pack "Meta Data"
       country   <- metaData .: pack "1: Country"
       region    <- metaData .: pack "2: Region"
       latestRec <- metaData .: pack "3: Latest Recording"
       return MetaData{..}

with the RecordWildCards and DeriveGeneric extensions enabled, of course.


The problem I see with this approach is that it cannot easily be applied to the "EarthQuakes" section.

I would have to define every single date

earthQuakes <- v .: "EarthQuakes"
date1 <- earthQuakes .: "2018-11-16"
date2 <- earthQuakes .: "2018-11-06"
date3 <- earthQuakes .: "2018-11-02"
...
...
dateInfinity <- earthQuakes .: "1918-11-16"

A better way is to parse all the data as default JSON values, by decoding the link's contents into an Object

thisFunction = do
    linksContents <- simpleHttp "somelink"
    let y = fromJust (decode linksContents :: Maybe Object)
        z = aLotOfFunctionCompositions y
    return z

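where aLotOfFunctionCompositions first converts the Object into, say, a HashMap and then into a list of [(k, v)] pairs. Then I would map an unConstruct function over it to get the values out of the default constructors, like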

unConstruct (DefaultType value) = case (DefaultType value) of
             DefaultType x -> x

And in the end you get a nice list!

The problem with this approach is aLotOfFunctionCompositions.

That was just an example! In reality it can look as ugly and unreadable as this

let y = Prelude.map (\(a, b) -> (decode (encode a) :: Maybe String, decode (encode (snd (Prelude.head b))) :: Maybe String)) x
    z = Prelude.map (\(a, b) -> (fromJust a, fromJust b)) y
    a = Prelude.map (\(a, b) -> (a, read b :: Double)) z
    b = Prelude.map (\(a, b) -> (Prelude.filter (/= '-') a, b)) a
    c = Prelude.map (\(a, b) -> (read a :: Int, b)) b

This is a snippet of working code I wrote.


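So my question is: is there a better/cleaner way to decode this kind of JSON file, where you have lots of "date" keys that need to be parsed into usable data types?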

Tags: json, haskell, aeson

Answer


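Put a Map into your data type. Aeson converts Map k v to and from JSON objects: the vs are encoded/decoded through their own ToJSON/FromJSON instances, and the ks through ToJSONKey/FromJSONKey. It turns out that Day (from the time package) has perfectly suitable ToJSONKey/FromJSONKey instances.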
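To see the key handling in isolation, here is a minimal sketch that decodes only a shortened "EarthQuakes" object. The `sample` value and the choice of `Map String String` for the inner objects are assumptions made for illustration; no custom instances are involved.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Aeson (decode)
import qualified Data.ByteString.Lazy as BL
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Time (Day, fromGregorian)

-- Hypothetical sample: a shortened "EarthQuakes" object from the question.
sample :: BL.ByteString
sample = "{\"2018-11-16\": {\"Richter\": \"5.2508\"}, \"2018-11-09\": {\"Richter\": \"4.8684\"}}"

-- The date keys are parsed into Day values by the FromJSONKey instance;
-- the inner objects are left as plain string-to-string maps.
readings :: Maybe (Map Day (Map String String))
readings = decode sample

main :: IO ()
main = print (readings >>= Map.lookup (fromGregorian 2018 11 16))
```

No date is ever named in the parser itself, so the same code handles any number of entries.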

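{-# LANGUAGE DeriveGeneric, OverloadedStrings, RecordWildCards #-}

import Data.Aeson
import Data.Aeson.Types (typeMismatch)
import Data.Map (Map)
import Data.Time (Day)
import GHC.Generics (Generic)
import Text.Read (readMaybe)

data EarthquakeData = EarthquakeData {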
    metaData :: MetaData,
    earthquakes :: Map Day Earthquake
} deriving (Eq, Show, Generic)

instance FromJSON EarthquakeData where
    parseJSON = withObject "EarthquakeData" $ \v ->
        EarthquakeData <$> v .: "Meta Data"
        -- Map k v has a FromJSON instance that just does the right thing
        -- so just get the payloads with (.:)
        -- all this code is actually just because your field names are really !#$@~??
        -- not an Aeson expert, maybe there's a better way
                       <*> v .: "EarthQuakes"
instance ToJSON EarthquakeData where
    toJSON EarthquakeData{..} = object [ "Meta Data"   .= metaData
                                       , "EarthQuakes" .= earthquakes
                                       ]

data MetaData = MetaData { country :: String, region :: String, latestRec :: Day } deriving (Eq, Show)
instance FromJSON MetaData where
    parseJSON = withObject "MetaData" $ \v ->
        -- if you haven't noticed, applicative style is much neater than do
        -- using OverloadedStrings avoids all the pack-ing noise
        MetaData <$> v .: "1: Country"
                 <*> v .: "2: Region"
                 <*> v .: "3: Latest Recording"
instance ToJSON MetaData where
    toJSON MetaData{..} = object [ "1: Country"          .= country
                                 , "2: Region"           .= region
                                 , "3: Latest Recording" .= latestRec
                                 ]
    toEncoding MetaData{..} = pairs $ "1: Country"          .= country
                                   <> "2: Region"           .= region
                                   <> "3: Latest Recording" .= latestRec

data Earthquake = Earthquake { richter :: Double } deriving (Eq, Show)
-- Earthquake is a bit funky because your JSON apparently has
-- numbers inside strings?
-- only here do you actually need monadic operations
instance FromJSON Earthquake where
    parseJSON = withObject "Earthquake" $ \v ->
        do string <- v .: "Richter"
           stringNum <- parseJSON string
           case readMaybe stringNum of
             Just num -> return $ Earthquake num
             Nothing -> typeMismatch "Double inside a String" string
instance ToJSON Earthquake where
    toJSON = object . return . ("Richter" .=) . show . richter
    toEncoding = pairs . ("Richter" .=) . show . richter

I've tested this against your example JSON, and it seems to round-trip through encode and decode successfully.
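A round-trip check of this kind can be sketched as follows. This is not the test that was actually run: it uses a hypothetical `Map Day String` value rather than the full EarthquakeData type, and only illustrates the idea that decoding an encoded value should give back the original.

```haskell
module Main (main) where

import Data.Aeson (decode, encode)
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Time (Day, fromGregorian)

-- Hypothetical readings keyed by date, kept as strings like in the sample JSON.
readings :: Map Day String
readings = Map.fromList
  [ (fromGregorian 2018 11 16, "5.2508")
  , (fromGregorian 2018 11 9,  "4.8684")
  ]

-- True when decoding the encoded value gives back the original map.
roundTrips :: Bool
roundTrips = decode (encode readings) == Just readings

main :: IO ()
main = print roundTrips
```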

