Python sort_values (inplace=True) but not really?

Problem description

I am trying to write a loop in Python because I need to compare the rows of a table with each other. I first sort the data with 'sort_values', and the DataFrame appears to be sorted, yet when I step through it with a 'for' loop the rows still come out in the original order. I am clearly misunderstanding how pandas handles this. Sorting into a separate DataFrame gives the same result.

import pandas as pd

data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
        'date1': ['2000-04-18', '2000-04-16', '2000-04-15', '2000-04-25', '2000-04-16', '2000-04-17'],
        'stat1': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

frame = pd.DataFrame(data) 
 
frame
Original (unsorted) output:
    state   date1       stat1
0   Ohio    2000-04-18  1.5
1   Ohio    2000-04-16  1.7
2   Ohio    2000-04-15  3.6
3   Nevada  2000-04-25  2.4
4   Nevada  2000-04-16  2.9
5   Nevada  2000-04-17  3.2

frame.sort_values(by=['state','date1'], inplace=True)

frame
Sorted output:
    state   date1       stat1
4   Nevada  2000-04-16  2.9
5   Nevada  2000-04-17  3.2
3   Nevada  2000-04-25  2.4
2   Ohio    2000-04-15  3.6
1   Ohio    2000-04-16  1.7
0   Ohio    2000-04-18  1.5
for i1 in range(0, len(frame)):
    state1=frame['state'][i1]
    print(frame['state'][i1],' ', frame['date1'][i1])
Loop output (still in the original order):
Ohio   2000-04-18
Ohio   2000-04-16
Ohio   2000-04-15
Nevada   2000-04-25
Nevada   2000-04-16
Nevada   2000-04-17

Tags: python, pandas

Solution


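You need to reset the index to see the sorted order in the loop:

frame = frame.sort_values(by=['state', 'date1']).reset_index(drop=True)

Note that sort_values(..., inplace=True) returns None, so reset_index cannot be chained onto it. Either drop inplace=True and chain the calls as above, or call the two methods in separate statements. Passing drop=True discards the old index instead of inserting it as a new column.

Without the reset, every row keeps its original index label after sorting. The loop reads frame['state'][i1], which looks rows up by label rather than by position, so i1 = 0 still returns the row labeled 0 (Ohio, 2000-04-18) and the output follows the original order. You can verify this by looking at the index column of your sorted output, which reads 4, 5, 3, 2, 1, 0.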
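As a minimal sketch of the label-versus-position point, the snippet below rebuilds the frame from the question and reads the sorted rows in three ways: by position with .iloc, by iterating with itertuples, and by label after renumbering the index. The variable names sorted_frame, renumbered, by_position, by_iteration and by_label are made up for this illustration and do not come from the question.

```python
import pandas as pd

data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
        'date1': ['2000-04-18', '2000-04-16', '2000-04-15',
                  '2000-04-25', '2000-04-16', '2000-04-17'],
        'stat1': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}
frame = pd.DataFrame(data)

# Sorting reorders the rows, but each row keeps its original index label.
sorted_frame = frame.sort_values(by=['state', 'date1'])

# Option 1: access rows by position with .iloc, ignoring the labels.
by_position = [(sorted_frame['state'].iloc[i], sorted_frame['date1'].iloc[i])
               for i in range(len(sorted_frame))]

# Option 2: iterate over the rows directly, which follows the row order.
by_iteration = [(row.state, row.date1)
                for row in sorted_frame.itertuples(index=False)]

# Option 3: renumber the index so that label i is also position i.
renumbered = sorted_frame.reset_index(drop=True)
by_label = [(renumbered['state'][i], renumbered['date1'][i])
            for i in range(len(renumbered))]

for state, date in by_position:
    print(state, date)
```

All three lists hold the rows in sorted order. Which option fits best depends on whether the original index labels are still needed later: .iloc and itertuples leave them intact, while reset_index(drop=True) throws them away.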

