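python - JSONDecodeError when opening a multi-line text file in Python
Problem description
I am trying to open a text file extracted from HDFS, pull certain values out of it, and write them to a single-row CSV file. Below are the "contents" of the text file and the code I am using to extract the data and write the output: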
#file.txt
{"timestamp": someInt, "videoId": someString, "overridden": someInt, "scores": [{"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}]}
{"timestamp": someInt, "videoId": someString, "overridden": someInt, "scores": [{"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}, {"bucket": someString, "name": someString, "value": someInt}]}
...
Initial code:
import csv
import json

wanted_data = []
with open('file.txt', 'r') as f:
    for line in f:
        json_data = json.loads(line)
        wanted_data.append(json_data['videoId'])
        for i in range(6):
            wanted_data.append(json_data['scores'][i]['bucket'])
            wanted_data.append(json_data['scores'][i]['value'])

# Write everything collected above as a single CSV row
with open('file.csv', 'w+') as f_out:
    write = csv.writer(f_out)
    write.writerow(wanted_data)
This results in a JSONDecodeError:
/usr/lib/python3.7/json/decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end
JSONDecodeError: Expecting value: line 2 column 1 (char 1)
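What is the correct way to load this text file?
Solution
It looks like you have blank lines between the JSON strings. A blank line reaches json.loads as just "\n", which contains no JSON value, hence "Expecting value". Check that each line actually contains some text before processing it: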
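import json

wanted_data = []
with open('file.txt', 'r') as f:
    for line in f:
        # Skip blank or whitespace-only lines; they are not valid JSON
        if line.strip():
            json_data = json.loads(line)
            wanted_data.append(json_data['videoId'])
            # Iterate over the scores directly instead of assuming exactly six
            for score in json_data['scores']:
                wanted_data.append(score['bucket'])
                wanted_data.append(score['value'])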
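For completeness, here is a minimal sketch that combines the blank-line check with the CSV output step from the question. The helper names extract_row and convert are hypothetical, and the UTF-8 encoding is an assumption about the file, not something stated in the question. The field names (videoId, scores, bucket, value) are taken from the sample file shown above.

```python
import csv
import json


def extract_row(lines):
    """Collect videoId plus each score's bucket and value from JSON lines.

    Hypothetical helper: `lines` is any iterable of strings, for example
    an open text file.
    """
    row = []
    for line in lines:
        line = line.strip()
        if not line:
            # Blank separator lines hold no JSON value; skip them.
            continue
        record = json.loads(line)
        row.append(record["videoId"])
        for score in record["scores"]:
            row.append(score["bucket"])
            row.append(score["value"])
    return row


def convert(src_path, dst_path):
    """Read JSON lines from src_path and write them as one CSV row."""
    # Assumption: the input file is UTF-8 encoded.
    with open(src_path, "r", encoding="utf-8") as f:
        row = extract_row(f)
    # newline="" is what the csv module documentation recommends for
    # files handed to csv.writer.
    with open(dst_path, "w", newline="", encoding="utf-8") as f_out:
        csv.writer(f_out).writerow(row)
    return row
```

Calling convert('file.txt', 'file.csv') would then produce the single-row file the question asks for. As a sanity check on the diagnosis, json.loads("\n") on its own raises the same "Expecting value: line 2 column 1 (char 1)" message that appears in the traceback.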