Reading elements from a JSON file with Python

Question

I am new to reading JSON files in Python. I want to get the URLs from the file. This is my JSON file.

[
    {
        "author": "[{'name': 'Ahmed Osman'}, {'name': 'Wojciech Samek'}]",
        "day": 1,
        "id": "1802.00209v1",
        "link": "[{'rel': 'alternate', 'href': 'http://arxiv.org/abs/1802.00209v1', 'type': 'text/html'}, {'rel': 'related', 'href': 'http://arxiv.org/pdf/1802.00209v1', 'type': 'application/pdf', 'title': 'pdf'}]",
        "month": 2,
        "summary": "We propose an architecture for VQA which utilizes recurrent layers to\ngenerate visual and textual attention. The memory characteristic of the\nproposed recurrent attention units offers a rich joint embedding of visual and\ntextual features and enables the model to reason relations between several\nparts of the image and question. Our single model outperforms the first place\nwinner on the VQA 1.0 dataset, performs within margin to the current\nstate-of-the-art ensemble model. We also experiment with replacing attention\nmechanisms in other state-of-the-art models with our implementation and show\nincreased accuracy. In both cases, our recurrent attention mechanism improves\nperformance in tasks requiring sequential or relational reasoning on the VQA\ndataset.",
        "tag": "[{'term': 'cs.AI', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'cs.CL', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'cs.CV', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'cs.NE', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'stat.ML', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}]",
        "title": "Dual Recurrent Attention Units for Visual Question Answering",
        "year": 2018
    },
    {
        "author": "[{'name': 'Ji Young Lee'}, {'name': 'Franck Dernoncourt'}]",
        "day": 12,
        "id": "1603.03827v1",
        "link": "[{'rel': 'alternate', 'href': 'http://arxiv.org/abs/1603.03827v1', 'type': 'text/html'}, {'rel': 'related', 'href': 'http://arxiv.org/pdf/1603.03827v1', 'type': 'application/pdf', 'title': 'pdf'}]",
        "month": 3,
        "summary": "Recent approaches based on artificial neural networks (ANNs) have shown\npromising results for short-text classification. However, many short texts\noccur in sequences (e.g., sentences in a document or utterances in a dialog),\nand most existing ANN-based systems do not leverage the preceding short texts\nwhen classifying a subsequent one. In this work, we present a model based on\nrecurrent neural networks and convolutional neural networks that incorporates\nthe preceding short texts. Our model achieves state-of-the-art results on three\ndifferent datasets for dialog act prediction.",
        "tag": "[{'term': 'cs.CL', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'cs.AI', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'cs.LG', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'cs.NE', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}, {'term': 'stat.ML', 'scheme': 'http://arxiv.org/schemas/atom', 'label': None}]",
        "title": "Sequential Short-Text Classification with Recurrent and Convolutional\n  Neural Networks",
        "year": 2016
    }
]

I read the file with the following code.

import json

# The with block closes the file on exit, so no explicit close() is needed.
with open(args.filename, 'r') as myfile:
    data = json.load(myfile)

I want to get the second href with data[0]["link"][1]["href"]. However, data[0]["link"] is a string, not a list, so that indexing fails. How should I handle this?

Tags: python, json

Answer


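You can use ast.literal_eval() to turn the string-formatted list in your JSON into a real list, and then index it exactly the way you described. The "link" value uses single quotes and None, which makes it a Python literal rather than valid JSON, so json.loads() cannot parse it but ast.literal_eval() can.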

Starting from your data, this worked for me:

import ast
print(ast.literal_eval(data[0]['link'])[1]['href'])

Output:

http://arxiv.org/pdf/1802.00209v1
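
If you need the PDF link for every record rather than just the first one, the same idea extends to a loop. The following is a minimal sketch, not a definitive implementation: the function name pdf_urls and the inline sample string raw are made up for illustration, and it assumes every record has a "link" field formatted like the ones in the question. It selects links by their 'type' value instead of relying on position [1].

```python
import ast
import json

# Inline sample shaped like the question's file (hypothetical stand-in for the
# real file): "link" holds a Python-literal string, not a nested JSON array.
raw = '''[
  {"id": "1802.00209v1",
   "link": "[{'rel': 'alternate', 'href': 'http://arxiv.org/abs/1802.00209v1', 'type': 'text/html'}, {'rel': 'related', 'href': 'http://arxiv.org/pdf/1802.00209v1', 'type': 'application/pdf', 'title': 'pdf'}]"},
  {"id": "1603.03827v1",
   "link": "[{'rel': 'alternate', 'href': 'http://arxiv.org/abs/1603.03827v1', 'type': 'text/html'}, {'rel': 'related', 'href': 'http://arxiv.org/pdf/1603.03827v1', 'type': 'application/pdf', 'title': 'pdf'}]"}
]'''


def pdf_urls(records):
    """Return the href of every link whose type is application/pdf."""
    urls = []
    for record in records:
        # Convert the string into a real list of dicts.
        links = ast.literal_eval(record["link"])
        for link in links:
            if link.get("type") == "application/pdf":
                urls.append(link["href"])
    return urls


data = json.loads(raw)
print(pdf_urls(data))
```

ast.literal_eval() only evaluates Python literals (strings, numbers, lists, dicts, None and so on), so it is a safer choice than eval() for data read from a file. The "author" and "tag" fields in the question are encoded the same way and can be converted with the same call.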
