json - How do I read JSON data with pandas?
Problem description
I am trying to fit a CNN to the HuffPost news dataset https://www.kaggle.com/rmisra/news-category-dataset. The dataset I am using is in JSON format. My data looks like this:
[
{
"category": "CRIME",
"headline": "There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV",
"authors": "Melissa Jeltsen",
"link": "https://www.huffingtonpost.com/entry/texas-amanda-painter-mass-shooting_us_5b081ab4e4b0802d69caad89",
"short_description": "She left her husband. He killed their children. Just another day in America.",
"date": "2018-05-26"
},
{
"category": "ENTERTAINMENT",
"headline": "Will Smith Joins Diplo And Nicky Jam For The 2018 World Cup's Official Song",
"authors": "Andy McDonald",
"link": "https://www.huffingtonpost.com/entry/will-smith-joins-diplo-and-nicky-jam-for-the-official-2018-world-cup-song_us_5b09726fe4b0fdb2aa541201",
"short_description": "Of course it has a song.",
"date": "2018-05-26"
}
]
This is the code I am trying. It is taken from https://www.kaggle.com/kredy10/simple-lstm-for-text-classification
import pandas as pd
import json
df = pd.read_json('News_Category_Dataset_v2.json', lines=True)
But I get the following error on the line that reads the data:
Traceback (most recent call last):
  File "./Hpnews.py", line 37, in <module>
    df = pd.read_json('News_Category_Dataset_v2.json', lines=True)
  File "C:\Users\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 608, in read_json
    result = json_reader.read()
  File "C:\Users\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 729, in read
    obj = self._get_object_parser(self._combine_lines(data.split("\n")))
  File "C:\Users\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 753, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
Solution
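The traceback in the question is cut off before the final error message, so the exact cause is not confirmed. One likely explanation, assuming the file really looks like the sample shown (a single JSON array spread over many lines), is that lines=True tells pandas to parse every line as its own JSON object, which fails on lines such as "[" or "{". In that case dropping lines=True, i.e. pd.read_json('News_Category_Dataset_v2.json'), may be enough. The sketch below is a more defensive variant that handles both layouts. The helper name load_news_json is hypothetical and not part of pandas.

```python
import json

import pandas as pd


def load_news_json(path):
    """Load a JSON array file or a JSON Lines file into a DataFrame.

    Hypothetical helper for illustration; it reads the whole file
    into memory, which is assumed to be acceptable for this dataset.
    """
    with open(path, encoding="utf-8") as f:
        text = f.read()

    if text.lstrip().startswith("["):
        # The whole file is one JSON array of records.
        records = json.loads(text)
    else:
        # JSON Lines: one JSON object per non-empty line.
        records = [json.loads(line) for line in text.splitlines() if line.strip()]

    return pd.DataFrame(records)
```

Usage would be df = load_news_json('News_Category_Dataset_v2.json'). Unlike pd.read_json, this sketch leaves the date column as plain strings; pd.to_datetime(df['date']) converts it if needed.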