python - Faulty Pandas dataframe read_json sorting on python3.5.9
Problem description
A DataFrame with more than 10 rows is incorrectly sorted on Python 3.5.9 after being converted to JSON and back to a pandas.DataFrame.
from pandas import DataFrame, read_json
columns = ['a', 'b', 'c']
data = [[1*i, 2*i, 3*i] for i in range(11)]
df = DataFrame(columns=columns, data=data)
print(df)
# a b c
# 0 0 0 0
# 1 1 2 3
# 2 2 4 6
# 3 3 6 9
# 4 4 8 12
# 5 5 10 15
# 6 6 12 18
# 7 7 14 21
# 8 8 16 24
# 9 9 18 27
# 10 10 20 30
new_df = read_json(df.to_json())
print(new_df)
# a b c
# 0 0 0 0
# 1 1 2 3
# 10 10 20 30 # this should be the last line
# 2 2 4 6
# 3 3 6 9
# 4 4 8 12
# 5 5 10 15
# 6 6 12 18
# 7 7 14 21
# 8 8 16 24
# 9 9 18 27
So the DataFrame created with read_json seems to sort its index labels as strings (1, 10, 2, 3, ...) instead of as ints (1, 2, 3, ...).
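The order in the second printout matches what a lexicographic sort of the index labels would give. With the default orient, df.to_json() writes the index labels as JSON object keys, and JSON object keys are always strings. The sketch below only illustrates that string ordering in plain Python; it is not a confirmed account of what pandas does internally, and the variable names are chosen for illustration.

```python
# Index labels as they would appear as JSON object keys: always strings.
labels = [str(i) for i in range(11)]

# Lexicographic (string) sort: '10' lands between '1' and '2'.
string_sorted = sorted(labels)
print(string_sorted)

# Numeric sort: compare the labels by their integer value instead.
numeric_sorted = sorted(labels, key=int)
print(numeric_sorted)
```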
Behaviour generated with Python 3.5.9 (default, Jan 4 2020, 04:09:01) (docker image python:3.5-stretch)
Everything seems to be working fine on my local machine (Python 3.8.1 (default, Dec 21 2019, 20:57:38)).
pandas==0.25.3 was used on both instances.
Is there a way to fix this without upgrading Python?
Solution
Use sort_values to sort the DataFrame on column a, or sort_index to restore the original index order, as shown below:
new_df = read_json(df.to_json())
# sort by column a
print(new_df.sort_values('a'))
# sort by index
print(new_df.sort_index())
# output (the same for both calls)
a b c
0 0 0 0
1 1 2 3
2 2 4 6
3 3 6 9
4 4 8 12
5 5 10 15
6 6 12 18
7 7 14 21
8 8 16 24
9 9 18 27
10 10 20 30
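An alternative that avoids re-sorting is to serialise with orient='split', which writes the index, columns and data as JSON arrays rather than as string-keyed objects, so the row order travels with the data. This is a minimal sketch, assuming the same df as in the question; it has not been verified on Python 3.5.9 with pandas 0.25.3, and the names json_text and round_tripped are made up for illustration.

```python
from io import StringIO

from pandas import DataFrame, read_json

columns = ['a', 'b', 'c']
data = [[1 * i, 2 * i, 3 * i] for i in range(11)]
df = DataFrame(columns=columns, data=data)

# orient='split' stores index, columns and data as JSON arrays,
# so no string-keyed object has to be ordered when reading back.
json_text = df.to_json(orient='split')

# Wrapping the text in StringIO passes it as a file-like object.
round_tripped = read_json(StringIO(json_text), orient='split')
print(round_tripped)
```

The same orient value must be passed to both to_json and read_json, since the two layouts are not interchangeable.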