Why are some attributes in an .xlsx sheet not showing up in my dataframe created in Python?

Problem description

I imported the Canada.xlsx file from my PC into the df_can dataframe in a Jupyter notebook. However, some important attributes of the xlsx file, such as 'OdName', are not visible. Also, the data in the xlsx file starts with the country "Afghanistan", whereas my df_can dataframe starts with North-America.

The code is as follows:

import pandas as pd
df_can = pd.read_excel('C:\\Users\\datasets\\UN_MigFlow_A_to_E\\Canada.xlsx', sheet_name='Canada by Citizenship', skiprows=range(20), skip_footer=2 )
df_can.head()

I searched the internet but couldn't find a solution to either problem. I also tried modifying the sheet in the xlsx file itself by clearing the top 20 unwanted rows, but that didn't work either.

I am attaching an image of Canada.xlsx (i.e. the expected outcome). In case anyone is interested, the data comes from:

https://www.un.org/en/development/desa/population/migration/data/empirical2/migrationflows.asp

Canada.xlsx

The actual output is a data frame starting with North America and missing important attributes like 'OdName'. What could be the problem?

Tags: python-3.x, dataframe, jupyter-notebook

Solution


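I realized that the problem was not with the dataset but with the code I was following as part of an online course. Much of that code is outdated and not up to standard. Here is the correct code: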

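import pandas as pd

# skipfooter is the current spelling; older pandas releases also accepted skip_footer
df_can = pd.read_excel('C:\\Users\\datasets\\UN_MigFlow_A_to_E\\Canada.xlsx', 'Canada by Citizenship', skiprows=range(20), skipfooter=2)
df_can.head()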

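I had mistakenly selected a different sheet, so the data was not what I originally expected. All the attributes are now present as well.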
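
A minimal sketch of how to check which sheets a workbook contains before reading one, so that a wrong sheet is caught early. The helper names list_sheets and load_sheet and their parameters are hypothetical, and the sketch assumes a pandas version where the footer argument is spelled skipfooter:

```python
import pandas as pd


def list_sheets(path):
    """Return the names of all sheets in the workbook at `path`."""
    with pd.ExcelFile(path) as workbook:
        return workbook.sheet_names


def load_sheet(path, sheet_name, skip_top=20, skip_bottom=2):
    """Read one sheet, dropping leading and trailing non-data rows.

    The defaults mirror the values used in the question (20 rows on top,
    2 rows at the bottom); they are assumptions about the file layout.
    """
    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        skiprows=range(skip_top),   # rows above the header row
        skipfooter=skip_bottom,     # rows below the last data row
    )


# Example usage (path as in the question):
# path = 'C:\\Users\\datasets\\UN_MigFlow_A_to_E\\Canada.xlsx'
# print(list_sheets(path))                  # confirm the exact sheet name first
# df_can = load_sheet(path, 'Canada by Citizenship')
# print(df_can.columns.tolist())            # check that 'OdName' is present
```

Printing the sheet names and the resulting column list right after loading makes it easy to see whether the intended sheet was read.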

