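python - How to split data into two DataFrames
Problem description
I have a link that returns JSON data, and I want to split that data into two DataFrames.
My code is as follows: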
import pandas as pd
import requests
pd.set_option('display.max_rows', 50000)
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 10000)
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'}
url = "https://www.nseindia.com/api/option-chain-indices?symbol=BANKNIFTY"
data = requests.get(url, headers=headers).json()
# print every record in the option chain
for row in data['records']['data']:
    print(row)
The output contains rows like the following, in which both the 'CE' and 'PE' data are present:
{'strikePrice': 13900, 'expiryDate': '23-Apr-2020', 'PE': {'strikePrice': 13900, 'expiryDate': '23-Apr-2020', 'underlying': 'BANKNIFTY', 'identifier': 'OPTIDXBANKNIFTY23-04-2020PE13900.00', 'openInterest': 1597, 'changeinOpenInterest': 1589, 'pchangeinOpenInterest': 19862.5, 'totalTradedVolume': 9101, 'impliedVolatility': 110.76, 'lastPrice': 1.65, 'change': -1.5, 'pChange': -47.61904761904762, 'totalBuyQuantity': 49800, 'totalSellQuantity': 17560, 'bidQty': 40, 'bidprice': 1.75, 'askQty': 460, 'askPrice': 2.35, 'underlyingValue': 20681.45}, 'CE': {'strikePrice': 13900, 'expiryDate': '23-Apr-2020', 'underlying': 'BANKNIFTY', 'identifier': 'OPTIDXBANKNIFTY23-04-2020CE13900.00', 'openInterest': 0, 'changeinOpenInterest': 0, 'pchangeinOpenInterest': 0, 'totalTradedVolume': 2, 'impliedVolatility': 162.07, 'lastPrice': 6901.1, 'change': 3502.4000000000005, 'pChange': 103.05116662253218, 'totalBuyQuantity': 2620, 'totalSellQuantity': 2620, 'bidQty': 200, 'bidprice': 6629.85, 'askQty': 200, 'askPrice': 7208.75, 'underlyingValue': 20681.45}}
{'strikePrice': 13900, 'expiryDate': '30-Apr-2020', 'PE': {'strikePrice': 13900, 'expiryDate': '30-Apr-2020', 'underlying': 'BANKNIFTY', 'identifier': 'OPTIDXBANKNIFTY30-04-2020PE13900.00', 'openInterest': 989, 'changeinOpenInterest': 12, 'pchangeinOpenInterest': 1.2282497441146367, 'totalTradedVolume': 134, 'impliedVolatility': 98.26, 'lastPrice': 16.3, 'change': -4.899999999999999, 'pChange': -23.113207547169807, 'totalBuyQuantity': 32900, 'totalSellQuantity': 4100, 'bidQty': 20, 'bidprice': 16.3, 'askQty': 20, 'askPrice': 17, 'underlyingValue': 20681.45}, 'CE': {'strikePrice': 13900, 'expiryDate': '30-Apr-2020', 'underlying': 'BANKNIFTY', 'identifier': 'OPTIDXBANKNIFTY30-04-2020CE13900.00', 'openInterest': 1, 'changeinOpenInterest': 0, 'pchangeinOpenInterest': 0, 'totalTradedVolume': 0, 'impliedVolatility': 0, 'lastPrice': 5000, 'change': -5000, 'pChange': -100, 'totalBuyQuantity': 2640, 'totalSellQuantity': 2840, 'bidQty': 20, 'bidprice': 6242.05, 'askQty': 20, 'askPrice': 7401.65, 'underlyingValue': 20681.45}}
{'strikePrice': 13900, 'expiryDate': '14-May-2020', 'PE': {'strikePrice': 13900, 'expiryDate': '14-May-2020', 'underlying': 'BANKNIFTY', 'identifier': 'OPTIDXBANKNIFTY14-05-2020PE13900.00', 'openInterest': 0, 'changeinOpenInterest': 0, 'pchangeinOpenInterest': 0, 'totalTradedVolume': 0, 'impliedVolatility': 0, 'lastPrice': 0, 'change': 0, 'pChange': -100, 'totalBuyQuantity': 100, 'totalSellQuantity': 0, 'bidQty': 100, 'bidprice': 0.3, 'askQty': 0, 'askPrice': 0, 'underlyingValue': 20681.45}, 'CE': {'strikePrice': 13900, 'expiryDate': '14-May-2020', 'underlying': 'BANKNIFTY', 'identifier': 'OPTIDXBANKNIFTY14-05-2020CE13900.00', 'openInterest': 0, 'changeinOpenInterest': 0, 'pchangeinOpenInterest': 0, 'totalTradedVolume': 0, 'impliedVolatility': 0, 'lastPrice': 0, 'change': 0, 'pChange': -100, 'totalBuyQuantity': 2420, 'totalSellQuantity': 2420, 'bidQty': 2420, 'bidprice': 6223.45, 'askQty': 2420, 'askPrice': 7565.05, 'underlyingValue': 20681.45}}
I want to store the CE and PE values in two separate DataFrames, with these column names:
['strikePrice','expiryDate', 'underlying', 'identifier', 'openInterest', 'changeinOpenInterest', 'pchangeinOpenInterest', 'totalTradedVolume','impliedVolatility', 'lastPrice', 'change', 'pChange', 'totalBuyQuantity', 'totalSellQuantity', 'bidQty', 'bidprice', 'askQty', 'askPrice', 'underlyingValue']
Solution
A nested list comprehension, with some help from itertools and collections, should get the data into separate DataFrames:
import pandas as pd
import requests
from collections import defaultdict
from itertools import chain

headers = {
    'User-Agent':
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'
}
url = "https://www.nseindia.com/api/option-chain-indices?symbol=BANKNIFTY"
data = requests.get(url, headers=headers).json()

# nested list comprehension: each entry in data['records']['data'] is a dict,
# and the values we want sit under its 'PE' and 'CE' keys
# each match becomes a (key, one-row DataFrame) tuple,
# since the value under each key is itself a dictionary
res = [[(key, pd.DataFrame.from_dict(value, orient='index').T)
        for key, value in entry.items()
        if key in ['PE', 'CE']]
       for entry in data['records']['data']]

d = defaultdict(list)
# group the one-row DataFrames by key,
# giving one list of DataFrames for PE and one for CE
for k, v in chain.from_iterable(res):
    d[k].append(v)

# concatenate each list into a single DataFrame
# and keep the result as a dictionary,
# so PE and CE can be accessed by key
result = {key: pd.concat(values) for key, values in d.items()}

# now we can access either the PE or the CE DataFrame
# the DataFrame is quite long, so this shows only a small part of it
result['PE'].iloc[:3, :5]
   strikePrice   expiryDate underlying                           identifier openInterest
0        13900  23-Apr-2020  BANKNIFTY  OPTIDXBANKNIFTY23-04-2020PE13900.00         1597
0        13900  30-Apr-2020  BANKNIFTY  OPTIDXBANKNIFTY30-04-2020PE13900.00          989
0        13900  14-May-2020  BANKNIFTY  OPTIDXBANKNIFTY14-05-2020PE13900.00            0
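Note that every row in the output above carries the index label 0, because each one-row DataFrame keeps its own index when concatenated. As a possible alternative (a sketch, not the approach from the answer above), the inner 'CE' and 'PE' dictionaries could be collected into lists and passed straight to the pd.DataFrame constructor, which builds one row per dictionary and assigns a fresh 0..n-1 index. The `records` list below is a hypothetical, heavily trimmed stand-in for data['records']['data'] so the example runs without network access, and the helper name `split_option_sides` is made up for illustration. It also assumes some records may lack one of the two sides, which is why the membership check is there.

```python
import pandas as pd

# Hypothetical, trimmed stand-in for data['records']['data'].
# The last record deliberately has no 'CE' entry (an assumption about the feed).
records = [
    {'strikePrice': 13900, 'expiryDate': '23-Apr-2020',
     'PE': {'strikePrice': 13900, 'expiryDate': '23-Apr-2020', 'openInterest': 1597},
     'CE': {'strikePrice': 13900, 'expiryDate': '23-Apr-2020', 'openInterest': 0}},
    {'strikePrice': 13900, 'expiryDate': '30-Apr-2020',
     'PE': {'strikePrice': 13900, 'expiryDate': '30-Apr-2020', 'openInterest': 989},
     'CE': {'strikePrice': 13900, 'expiryDate': '30-Apr-2020', 'openInterest': 1}},
    {'strikePrice': 13900, 'expiryDate': '14-May-2020',
     'PE': {'strikePrice': 13900, 'expiryDate': '14-May-2020', 'openInterest': 0}},
]


def split_option_sides(records, sides=('CE', 'PE')):
    """Return a dict mapping each side to a DataFrame of its nested dicts.

    Records that do not contain a given side are skipped for that side.
    """
    return {
        side: pd.DataFrame([rec[side] for rec in records if side in rec])
        for side in sides
    }


frames = split_option_sides(records)
ce_df, pe_df = frames['CE'], frames['PE']
print(pe_df)
```

With the real response, the call would be split_option_sides(data['records']['data']). The columns appear in the order the keys occur in the dictionaries, which for the sample rows shown in the question matches the requested column list.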