Pandas: using a column's values as the keys of a JSON file

Problem description

import requests
import pandas as pd
import json

url = 'http://www.fundamentus.com.br/resultado.php'
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"}
fundamentus = requests.get(url, headers=headers)


dfs = pd.read_html(fundamentus.text)
table = dfs[0]
table.to_json('table7.json', orient='records', indent=2)

This gives me the following output:

[{
    "Papel":"VNET3",
    "Cota\u00e7\u00e3o":0.0,
    "P\/L":0.0,
    "P\/VP":0.0,
    "PSR":0.0,
    "Div.Yield":"0,00%",
    "P\/Ativo":0.0,
    "P\/Cap.Giro":0,
    "P\/EBIT":0.0,
    "P\/Ativ Circ.Liq":0,
    "EV\/EBIT":0.0,
    "EV\/EBITDA":0.0,
    "Mrg Ebit":"0,00%",
    "Mrg. L\u00edq.":"0,00%",
    "Liq. Corr.":0,
    "ROIC":"0,00%",
    "ROE":"12,99%",
    "Liq.2meses":"000",
    "Patrim. L\u00edq":"9.257.250.00000",
    "D\u00edv.Brut\/ Patrim.":0.0,
    "Cresc. Rec.5a":"-2,71%"
  },
  {
    "Papel":"CFLU4",
    "Cota\u00e7\u00e3o":1.0,
    "P\/L":0.0,
    "P\/VP":0.0,
    "PSR":0.0,
    "Div.Yield":"0,00%",
    "P\/Ativo":0.0,
    "P\/Cap.Giro":0,
    "P\/EBIT":0.0,
    "P\/Ativ Circ.Liq":0,
    "EV\/EBIT":0.0,
    "EV\/EBITDA":0.0,
    "Mrg Ebit":"8,88%",
    "Mrg. L\u00edq.":"10,72%",
    "Liq. Corr.":110,
    "ROIC":"17,68%",
    "ROE":"32,15%",
    "Liq.2meses":"000",
    "Patrim. L\u00edq":"60.351.00000",
    "D\u00edv.Brut\/ Patrim.":6.0,
    "Cresc. Rec.5a":"8,14%"
  }
]

But I need the following:

[ VNET3 = {
    "Cota\u00e7\u00e3o":0.0,
    "P\/L":0.0,
    "P\/VP":0.0,
    "PSR":0.0,
    "Div.Yield":"0,00%",
    "P\/Ativo":0.0,
    "P\/Cap.Giro":0,
    "P\/EBIT":0.0,
    "P\/Ativ Circ.Liq":0,
    "EV\/EBIT":0.0,
    "EV\/EBITDA":0.0,
    "Mrg Ebit":"0,00%",
    "Mrg. L\u00edq.":"0,00%",
    "Liq. Corr.":0,
    "ROIC":"0,00%",
    "ROE":"12,99%",
    "Liq.2meses":"000",
    "Patrim. L\u00edq":"9.257.250.00000",
    "D\u00edv.Brut\/ Patrim.":0.0,
    "Cresc. Rec.5a":"-2,71%"
  },
  CFLU4 = {
    "Cota\u00e7\u00e3o":1.0,
    "P\/L":0.0,
    "P\/VP":0.0,
    "PSR":0.0,
    "Div.Yield":"0,00%",
    "P\/Ativo":0.0,
    "P\/Cap.Giro":0,
    "P\/EBIT":0.0,
    "P\/Ativ Circ.Liq":0,
    "EV\/EBIT":0.0,
    "EV\/EBITDA":0.0,
    "Mrg Ebit":"8,88%",
    "Mrg. L\u00edq.":"10,72%",
    "Liq. Corr.":110,
    "ROIC":"17,68%",
    "ROE":"32,15%",
    "Liq.2meses":"000",
    "Patrim. L\u00edq":"60.351.00000",
    "D\u00edv.Brut\/ Patrim.":6.0,
    "Cresc. Rec.5a":"8,14%"
  }
]

The encoding also looks wrong. For example: "Cota\u00e7\u00e3o"

I tried:

table.to_json('table7.json', force_ascii=True, orient='records', indent=2)

I also tried:

table.to_json('table7.json', encoding='utf8', orient='records', indent=2)

but neither worked.

So I tried reading the file with the json module, the idea being to read it and then convert it.

This is the code that reads the JSON:

jasonfile = open('table7.json', 'r')
stocks = jasonfile.read()
jason_object = json.loads(stocks)
print(str(jason_object['Papel']))

But I get:

    print(str(jason_object['Papel']))
TypeError: list indices must be integers or slices, not str

Thanks in advance.

Tags: python, json

Solution


You have a list of many dictionaries, so you have to use an index such as [0] to get a single dictionary:

print( jason_object[0]['Papel'] )
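To illustrate why the original lookup fails, here is a small self-contained sketch. It uses an inline string as a stand-in for the contents of table7.json (the sample values are shortened from the output above, and the name `papers` is only illustrative):

```python
import json

# Shortened stand-in for the contents of table7.json (orient='records').
sample = '[{"Papel": "VNET3", "ROE": "12,99%"}, {"Papel": "CFLU4", "ROE": "32,15%"}]'

jason_object = json.loads(sample)

# The top level is a list, so it is indexed by position, not by key.
print(type(jason_object).__name__)
print(jason_object[0]['Papel'])

# To collect one field from every record, loop over the list.
papers = [item['Papel'] for item in jason_object]
print(papers)
```

Indexing with a string only works one level down, on the individual dictionaries.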

The text Cota\u00e7\u00e3o is actually correct. This is how JSON stores non-ASCII characters when they are escaped.

But if you print it


print('Cota\u00e7\u00e3o')

then you should get

Cotação 

When I run

for key in jason_object[0].keys():
    print(key)

then I see on screen

Papel
Cotação
P/L
P/VP
PSR
Div.Yield
P/Ativo
P/Cap.Giro
P/EBIT
P/Ativ Circ.Liq
EV/EBIT
EV/EBITDA
Mrg Ebit
Mrg. Líq.
Liq. Corr.
ROIC
ROE
Liq.2meses
Patrim. Líq
Dív.Brut/ Patrim.
Cresc. Rec.5a

But if I open table7.json in a text editor, I still see Cota\u00e7\u00e3o, because the escapes are only decoded when the file is parsed.
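The following sketch, using only the standard json module, shows that the escaped and the readable forms are two spellings of the same data. The one-key `record` is a made-up example, not the real table:

```python
import json

record = {"Cotação": 0.0}

escaped = json.dumps(record)                      # default: ensure_ascii=True
literal = json.dumps(record, ensure_ascii=False)  # keep the characters as they are

print(escaped)
print(literal)

# Both spellings decode back to exactly the same dictionary.
assert json.loads(escaped) == json.loads(literal) == record
```

If the file itself should contain the readable characters, pandas' to_json accepts force_ascii=False. Its default is True, which is why passing force_ascii=True in the question changed nothing.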


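The listing [ VNET3 = { .. } ] is not valid JSON, and it is not a valid Python structure either.

The structure that is valid in both JSON and Python is a dictionary: { "VNET3": { .. } }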

new_data = dict()

for item in jason_object:
    key = item['Papel']
    item.pop('Papel')
    val = item
    new_data[key] = val

print(new_data)
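Note that the loop above removes 'Papel' from each original record as it goes. If the original list should stay untouched, a dictionary comprehension is one possible variant. This is only a sketch with a shortened, hand-written `records` list standing in for jason_object:

```python
import json

# Shortened stand-in for the list loaded from table7.json.
records = [
    {"Papel": "VNET3", "ROE": "12,99%", "PSR": 0.0},
    {"Papel": "CFLU4", "ROE": "32,15%", "PSR": 0.0},
]

# Build {ticker: remaining fields} without modifying the input records.
new_data = {
    item["Papel"]: {k: v for k, v in item.items() if k != "Papel"}
    for item in records
}

# ensure_ascii=False keeps accented characters readable in the output.
print(json.dumps(new_data, ensure_ascii=False, indent=2))
```

Either way, if two records shared the same 'Papel' value, the later one would overwrite the earlier one.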

Minimal working code

import requests
import pandas as pd
import json

url = 'http://www.fundamentus.com.br/resultado.php'

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"}

response = requests.get(url, headers=headers)

dfs = pd.read_html(response.text)
table = dfs[0]
table.to_json('table7.json', orient='records', indent=2)

# Read the file back; the with-block closes it automatically.
with open('table7.json', 'r') as jasonfile:
    jason_object = json.load(jasonfile)

#print(jason_object[0]['Papel'])

#for key in jason_object[0].keys():
#    print(key)

new_data = dict()

for item in jason_object:
    key = item['Papel']
    item.pop('Papel')
    val = item
    new_data[key] = val

print(new_data)

Tested on Python 3.7 on Linux Mint, where the console/terminal uses UTF-8 by default.
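As a possible alternative that skips the intermediate list of records, pandas can produce the keyed structure directly by making 'Papel' the index and writing with orient='index'. The sketch below uses a small hand-built DataFrame in place of the scraped table, and assumes the 'Papel' values are unique and a pandas version that supports the indent argument (as the code above already does):

```python
import json
import pandas as pd

# Small stand-in for the scraped table (values shortened from the output above).
table = pd.DataFrame({
    "Papel": ["VNET3", "CFLU4"],
    "Cotação": [0.0, 1.0],
    "ROE": ["12,99%", "32,15%"],
})

# Use the ticker column as the index, then serialise one object per index label.
# With no path given, to_json returns the JSON text as a string.
text = table.set_index("Papel").to_json(orient="index", force_ascii=False, indent=2)
print(text)

# Parse it back to confirm the shape: {ticker: {column: value, ...}, ...}
new_data = json.loads(text)
print(new_data["VNET3"])
```

Passing a file name as the first argument of to_json would write the same text to disk instead of returning it.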

