pandas json_normalize creates columns with dtype object

Problem description

I have a JSON object returned by an API, which looks like this:

{
  "workouts": [
    {
      "id": 92527291,
      "starts": "2021-06-28T15:42:44.000Z",
      "minutes": 30,
      "name": "Indoor Cycling",
      "created_at": "2021-06-28T16:12:57.000Z",
      "updated_at": "2021-06-28T16:12:57.000Z",
      "plan_id": null,
      "workout_token": "ELEMNT BOLT A1B3:59",
      "workout_type_id": 12,
      "workout_summary": {
        "id": 87540207,
        "heart_rate_avg": "152.0",
        "calories_accum": "332.0",
        "created_at": "2021-06-28T16:12:58.000Z",
        "updated_at": "2021-06-28T16:12:58.000Z",
        "power_avg": "185.0",
        "distance_accum": "17520.21",
        "cadence_avg": "87.0",
        "ascent_accum": "0.0",
        "duration_active_accum": "1801.0",
        "duration_paused_accum": "0.0",
        "duration_total_accum": "1801.0",
        "power_bike_np_last": "186.0",
        "power_bike_tss_last": "27.6",
        "speed_avg": "9.73",
        "work_accum": "332109.0",
        "file": {
          "url": "https://cdn.wahooligan.com/wahoo-cloud/production/uploads/workout_file/file/FPoJBPZo17BvTmSomq5Y_Q/2021-06-28-154244-ELEMNT_BOLT_A1B3-59-0.fit"
        }
      }
    }
  ],
  "total": 55,
  "page": 1,
  "per_page": 1,
  "order": "descending",
  "sort": "starts"
}

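I want to put the data into a DataFrame. However, many of the columns end up with dtype object. I think this is because some of the numeric values in the JSON are wrapped in double quotes. What is the best and most efficient way to avoid this (the JSON may contain many workout elements)?

Is it to fix the returned JSON? Or to iterate over the DataFrame columns and convert the object columns to float?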

Thanks

Martin

Tags: json, pandas, dataframe

Solution


IIUC, you can try:

import pandas as pd

df = pd.json_normalize(json_data, record_path='workouts',
                       meta=['total', 'page', 'per_page', 'order', 'sort']).convert_dtypes()
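
One caveat: as far as I know, convert_dtypes() only picks nullable dtypes for the values it is given and does not parse quoted numbers, so a value like "152.0" would most likely end up as a string column rather than a float. If that is what you see, a minimal sketch of the "convert the columns" route from the question is below. It assumes the quoted numbers all sit under workout_summary and that the timestamp and URL columns there should be left alone; the json_data literal is a trimmed copy of the response in the question, and summary_cols is just an illustrative name.

```python
import pandas as pd

# Trimmed copy of the API response shown in the question.
json_data = {
    "workouts": [
        {
            "id": 92527291,
            "starts": "2021-06-28T15:42:44.000Z",
            "minutes": 30,
            "name": "Indoor Cycling",
            "plan_id": None,
            "workout_summary": {
                "id": 87540207,
                "heart_rate_avg": "152.0",
                "distance_accum": "17520.21",
                "speed_avg": "9.73",
            },
        }
    ],
    "total": 55,
    "page": 1,
}

df = pd.json_normalize(json_data, record_path="workouts", meta=["total", "page"])

# Assumption: quoted numbers only appear under workout_summary, and columns
# ending in "_at" (timestamps) or ".url" should not be converted.
summary_cols = [
    c for c in df.columns
    if c.startswith("workout_summary.") and not c.endswith(("_at", ".url"))
]

# Parse the quoted numbers; anything that cannot be parsed becomes NaN.
df[summary_cols] = df[summary_cols].apply(pd.to_numeric, errors="coerce")

print(df[summary_cols].dtypes)
```

This converts whole columns at once, so it should scale to many workout elements without touching the JSON itself. errors="coerce" silently turns unparseable values into NaN; drop that argument if you would rather get an error for unexpected data.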
