How to extract JSON nested in a Request response into a Pandas DataFrame

Problem description

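I am trying to use the Graph API to fetch campaign data from Facebook Ads and put it into a Pandas DataFrame.

I do get the data back, but I don't know how to extract it from the JSON.

I tried the code below, and now I don't know how to continue.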

import requests
import pandas as pd
import json

graphAPI = "https://graph.facebook.com/v3.3/#/ads?fields=name,insights{reach,impressions,clicks,frequency,cpm,cpc},ads{insights.time_range({'since':'2019-06-01','until':'2019-06-30'}).time_increment(1)}&access_token=#"

req = requests.get(graphAPI)

ReqDict = req.json()

print(ReqDict)

This is the result I get:

{'data': [
    {'name': 'inverno_0160702', 'id': '213544564564'}, 
    {'name': 'instagram_ads_conversao_postinsta_post2_adidasoriginals_smith', 
     'insights': 
        {'data': 
         [
             {'reach': '2619', 
              'impressions': '2625', 
              'clicks': '43', 
              'frequency': '1.002291', 
              'cpm': '9.939048', 
              'cpc': '0.606744', 
              'date_start': '2019-06-02', 
              'date_stop': '2019-07-01'}
         ], 
              'paging': {'cursors': {'before': 'MAZDZD', 'after': 'MAZDZD'}}}, 'id': '23843373097230145'}, 
    {'name': 'instagram_ads_conversao_postinsta_', 'id': '2256589465461212'}, 
    {'name': 'instagram_ads_conversao_postinsta', 'id': '23123546545644546'}, 
    {'name': 'instagram_ads_conversao_postinsta_20190628', 
     'insights': 
        {'data': 
         [
             {'reach': '23610', 
              'impressions': '37099',
              'clicks': '1815', 
              'frequency': '1.571326', 
              'cpm': '4.492574', 
              'cpc': '0.091829', 
              'date_start': '2019-06-02',
              'date_stop': '2019-07-01'}
         ], 
              'paging': {'cursors': {'before': 'MAZDZD', 'after': 'MAZDZD'}}}, 'id': '2132653545313545313222'}], 
              'paging': {'cursors': {'before':'QVFIUlBJdHFYY1RqYnk3TTFSUDVQemh0bTBXY1BrazdrWXY2WTI5LXc5R2hUVTdnWnRiYzNnTl96azdjVWZAxamcycVVCOXM4ZAUJidV9HUzlUYUNuV25PQ0x3', 'after': 'QVFIUldvei1tRTZAUVGk1N3hhQTJUX1dQbWVSSnV0d0dTY0ctTmQ0ZAnFRdlg4NTVFbHNrazVUZA2NqTk5aMVI2UVdjM0dWUWltenVlY3Rna0N4aFdNeHA1SFRn'}, 'next': 'https://graph.facebook.com/v3.3/#'}
}

I would like my DataFrame to look like this:

Name              | id     |  reach | impressions | Clicks | frequency | cpm  | cpc   | date_start  | date_stop

inverno...        |null    | 2619   |   26554     | 43     |   1.002   | 9.93 | 0.60  | '2019-06-02'| '2019-06-02'   
instagram_ads...  |222562..| null   |   null      | null   |   null    | null | null  |    null     |    null
instagram_ads...  |null    | 23610  |   37099     | 1815   | 1.571326  | 4.49 | 0.09  | '2019-06-02'| '2019-07-01'   
instagram_ads...  |231235..| null   |   null      | null   |   null    | null | null  |    null     |    null

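Some entries in this JSON have no insights attribute, but that is not a problem; those values can be null.

Can someone help me with this? I am a beginner.

Edit

Now I tried the following, but it does not work: the columns are created, but their data is empty.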

jsonDf = json_normalize(ReqDict,record_path='data',meta=['reach','impressions','clicks','frequency','cpm','cpc','date_start','date_stop'], errors='ignore')

Result

id                                                 insights                  name   reach    impressions    clicks  frequency   cpm cpc date_start  date_stop
23843368620640145   {'data': [{'reach': '6726', 'impressions': '79...   facebook_ads_trafego_singlead_LKL_promocionado...   NaN NaN NaN NaN NaN NaN NaN NaN
23843337666290145   {'data': [{'reach': '12797', 'impressions': '1...   facebook_ads_trafego_singlead_LKL_inverno19_fe...   NaN NaN NaN NaN NaN NaN NaN NaN
23843339836870145   {'data': [{'reach': '24720', 'impressions': '2...   facebook_ads_trafego_singlead_LKL_promocionado...   NaN NaN NaN NaN NaN NaN NaN NaN
23843337719810145   {'data': [{'reach': '7766', 'impressions': '88...   facebook_ads_trafego_singlead_LKL_fitness_femi...   NaN NaN NaN NaN NaN NaN NaN NaN
23843337726230145   {'data': [{'reach': '579459', 'impressions': '...   facebook_ads_trafego_singlead_LKL_fitness_femi...   NaN NaN NaN NaN NaN NaN NaN NaN
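
A likely reason for the NaN columns, sketched on a small made-up sample: the keys listed in meta are looked up on the parent of record_path (here the top-level dict), not inside each record's insights, so with errors='ignore' they are filled with NaN. The sample data and the use of pd.json_normalize (pandas 1.0 or later) are assumptions for illustration; column names for the nested part may differ between pandas versions.

```python
import pandas as pd

# Made-up sample shaped like the Graph API response above (values shortened).
sample = {
    "data": [
        {"name": "ad_a", "id": "1",
         "insights": {"data": [{"reach": "2619", "clicks": "43"}]}},
    ],
    "paging": {"next": "..."},
}

# 'reach' is searched for on the top-level dict (the parent of 'data'),
# which has no such key, so errors='ignore' fills the column with NaN.
flat = pd.json_normalize(sample, record_path="data", meta=["reach"], errors="ignore")

print(flat.columns.tolist())
print(flat["reach"].isna().all())
```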

Thanks!!!

Tags: python, json, pandas, facebook-graph-api

Solution

I arrived at this solution:

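from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize
df = json_normalize(ReqDict['data'], record_path=['insights', 'data'], meta=['id', 'name'])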
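
One caveat: record_path expects every element to contain that path, and in recent pandas versions an entry without insights raises a KeyError. Below is a minimal sketch of one way to keep those entries as rows with null values, as the question asks. It builds the rows by hand instead of using json_normalize; the sample data is made up, and the list of metric columns is an assumption chosen for illustration.

```python
import pandas as pd

# Made-up sample shaped like the response in the question (values shortened).
ReqDict = {
    "data": [
        {"name": "ad_without_insights", "id": "1"},
        {"name": "ad_with_insights", "id": "2",
         "insights": {"data": [{"reach": "2619", "impressions": "2625",
                                "clicks": "43", "date_start": "2019-06-02",
                                "date_stop": "2019-07-01"}]}},
    ]
}

rows = []
for ad in ReqDict["data"]:
    base = {"name": ad["name"], "id": ad["id"]}
    # Fall back to one empty record so ads without insights still get a row.
    records = ad.get("insights", {}).get("data") or [{}]
    for rec in records:
        rows.append({**base, **rec})

df = pd.DataFrame(rows)

# The API returns numbers as strings; convert the metric columns.
metric_cols = ["reach", "impressions", "clicks"]
df[metric_cols] = df[metric_cols].apply(pd.to_numeric)

print(df)
```

Ads without insights end up with NaN in the metric and date columns, while name and id are always filled.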
