Pandas pd.merge gives NaN

Problem description

I have two DataFrames that I need to merge/join on a column. When I try to join/merge them, the new columns contain only NaN.

Basically, I need to perform a left join on the DataFrames, with df_user as the left DataFrame.

PS: The join column has the same dtype in both DataFrames.

The DataFrames are shown below:

df_user.dtypes

App                       category
Sentiment                     int8
Sentiment_Polarity         float64
Sentiment_Subjectivity     float64

df_play.dtypes
App               category
Category          category
Rating             float64
Reviews            float64
Size               float64
Installs             int64
Type                  int8
Price              float64
Content Rating        int8
Installs_Cat          int8


df_play.head()

   App         Category        Rating  Reviews  Size  Installs  Type  Price  Content Rating  Installs_Cat
0  SPrapBook   ART_AND_DESIGN  4.1     159      19    10000     0     0      0               9
1  U Launcher  ART_AND_DESIGN  4.5     87510    25    5000000   0     0      0               14
2  Sketch -    ART_AND_DESIGN  4.3     215644   2.8   50000000  0     0      1               16
3  Pixel Dra   ART_AND_DESIGN  4.4     967      5.6   100000    0     0      0               11
4  Paper flo   ART_AND_DESIGN  3.8     167      19    50000     0     0      0               10


df_user.head()


                App           Sentiment     Sentiment_Polarity  Sentiment_Subjectivity
0   10 Best Foods for You         2                1.00              0.533333
1   10 Best Foods for You         2                0.25              0.288462
3   10 Best Foods for You         2                0.40              0.875000
4   10 Best Foods for You         2                1.00              0.300000
5   10 Best Foods for You         2                1.00              0.300000

I tried both of the following:

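# Attempt 1: left merge on the shared 'App' column
result = pd.merge(df_user, df_play, how='left', on='App')
# Attempt 2: the same left join via DataFrame.join, with df_play indexed by 'App'
result = df_user.join(df_play.set_index('App'), on='App', how='left', rsuffix='_y')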

But all I got was:

   App                    Sentiment  Sentiment_Polarity  Sentiment_Subjectivity  Category  Rating  Reviews  Size  Installs  Type  Price  Content Rating  Installs_Cat
0  10 Best Foods for You  2          1.00                0.533333                NaN       NaN     NaN      NaN   NaN       NaN   NaN    NaN             NaN
1  10 Best Foods for You  2          0.25                0.288462                NaN       NaN     NaN      NaN   NaN       NaN   NaN    NaN             NaN
2  10 Best Foods for You  2          0.40                0.875000                NaN       NaN     NaN      NaN   NaN       NaN   NaN    NaN             NaN
3  10 Best Foods for You  2          1.00                0.300000                NaN       NaN     NaN      NaN   NaN       NaN   NaN    NaN             NaN
4  10 Best Foods for You  2          1.00                0.300000                NaN       NaN     NaN      NaN   NaN       NaN   NaN    NaN             NaN

Please excuse the formatting.

Tags: python, pandas, dataframe, left-join

Solution
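
The cause cannot be determined from the output shown in the question alone. What is certain is that a left join fills the right-hand columns with NaN for every left row whose key has no exactly equal key on the right, and that two columns sharing a dtype does not mean they share values. The sketch below is one way to investigate, not a confirmed fix: it uses small made-up frames (the values, including the trailing space, are assumptions for illustration) to show how `indicator=True` reveals unmatched keys, and how normalizing the key on both sides can help when the mismatch is only stray whitespace. `normalize_key` is a hypothetical helper, not a pandas function.

```python
import pandas as pd

# Toy stand-ins for df_user and df_play (hypothetical values, not the asker's data).
df_user = pd.DataFrame({
    "App": pd.Categorical(["10 Best Foods for You", "10 Best Foods for You", "Sketch"]),
    "Sentiment": [2, 2, 1],
})
df_play = pd.DataFrame({
    # Assumed defect for illustration: a trailing space in one app name.
    "App": pd.Categorical(["10 Best Foods for You ", "Sketch"]),
    "Rating": [4.0, 4.3],
})

# indicator=True adds a "_merge" column: "both" for matched rows,
# "left_only" for left rows that found no partner on the right.
checked = pd.merge(df_user, df_play, how="left", on="App", indicator=True)
mask = checked["_merge"] == "left_only"
unmatched = sorted(set(checked.loc[mask, "App"].astype(str)))
print("Keys with no match in df_play:", unmatched)


def normalize_key(s: pd.Series) -> pd.Series:
    """Cast the key to plain strings and trim surrounding whitespace."""
    return s.astype(str).str.strip()


# Merge again on the normalized key, leaving the original frames untouched.
left = df_user.assign(App=normalize_key(df_user["App"]))
right = df_play.assign(App=normalize_key(df_play["App"]))
result = pd.merge(left, right, how="left", on="App")
print(result)
```

If `unmatched` lists every app in the real data, printing a few keys from each frame with `repr()` is a quick way to spot invisible differences such as whitespace or casing. If the keys genuinely do not exist in df_play, the NaN values are the expected result of a left join.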

