Attempt to convert NaT and NaN to strings before comparison fails in Pandas

Problem description

To avoid any comparison issues with NaT, NaN, and None, I tried converting these values to the string '__NULL__' before comparing.

    if not frames_equal:
        print(file_name, " value by value check for differences:")
        source_columns = df.columns
        print(file_name, " columns:")
        print(source_columns)
        for source_index, source_row in df.iterrows():

            for source_col in source_columns:

                source_value = source_row[source_col]
                target_value = df_file.loc[source_index, source_col]

                if pd.isna(source_value) or pd.isnull(source_value):
                    source_value = '__NULL__'
                elif pd.isna(target_value) or pd.isnull(target_value):
                    target_value = '__NULL__'

                if source_value != target_value:
                    values_equal = False
                    print("~" * 50)
                    print(file_name, " value differences in column ", source_col)
                    print("MISMATCH AT INDEX: ", source_index)
                    print("REGISTRATION_UID:  ", source_row["REGISTRATION_UID"])
                    print("Column: ", source_col)
                    print("Source Value: ", source_value)
                    print("Source Type: ", type(source_value))
                    print("Target Value: ", target_value)
                    print("Target Type: ", type(target_value))
                    print("~" * 50)

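Before comparing, I check whether the source value or the target value is null by calling pd.isna() or pd.isnull() on each of them.

However, my output still reports mismatches.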
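As a side note on the check described above, a minimal sketch of what pd.isna() and pd.isnull() detect. The list `missing_values` is made up for illustration and is not taken from the question's data. In pandas, pd.isnull is an alias of pd.isna, so calling both on the same value tests the same thing twice.

```python
import numpy as np
import pandas as pd

# pd.isnull is an alias of pd.isna, so `pd.isna(x) or pd.isnull(x)`
# is equivalent to a single pd.isna(x) call.
same_function = pd.isnull is pd.isna

# Illustrative scalars (not from the question's data): each one counts as missing.
missing_values = [None, np.nan, np.float64("nan"), pd.NaT]
all_detected = all(pd.isna(value) for value in missing_values)

# NaN never compares equal to anything, including itself, which is why
# a raw `!=` comparison on unconverted NaN values reports a mismatch.
nan_differs_from_itself = np.nan != np.nan

print(same_function, all_detected, nan_differs_from_itself)
```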

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020_07_27__lu_volume.csv  value differences in column  LU_INSERT_YEAR
MISMATCH AT INDEX:  23740
REGISTRATION_UID:   ZOMI-00041736
Column:  LU_INSERT_YEAR
Source Value:  __NULL__
Source Type:  <class 'str'>
Target Value:  nan
Target Type:  <class 'numpy.float64'>

Does this mean my nan values are not being detected and converted to the '__NULL__' string before the comparison?

Tags: python, pandas

Solution
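One likely explanation, judging from the code and output in the question: the target check sits in an `elif` branch, so it is skipped whenever the source value is already null. When both values are NaN, only the source is converted to '__NULL__' and the target stays nan, which matches the reported mismatch. A minimal sketch of a fix follows, assuming each side should be normalized independently. The helper names `normalize` and `find_mismatches` and the sample frames are hypothetical, not part of the original code.

```python
import numpy as np
import pandas as pd

NULL_MARKER = "__NULL__"


def normalize(value):
    """Return NULL_MARKER for None/NaN/NaT scalars, otherwise the value unchanged."""
    return NULL_MARKER if pd.isna(value) else value


def find_mismatches(df, df_file):
    """Compare two frames cell by cell, treating all missing values as equal."""
    mismatches = []
    for source_index, source_row in df.iterrows():
        for source_col in df.columns:
            # Normalize each side on its own (two independent checks, no elif).
            source_value = normalize(source_row[source_col])
            target_value = normalize(df_file.loc[source_index, source_col])
            if source_value != target_value:
                mismatches.append((source_index, source_col, source_value, target_value))
    return mismatches


# Hypothetical sample data: row 0 is NaN on both sides, row 1 really differs.
df = pd.DataFrame({"REGISTRATION_UID": ["A", "B"], "LU_INSERT_YEAR": [np.nan, 2020.0]})
df_file = pd.DataFrame({"REGISTRATION_UID": ["A", "B"], "LU_INSERT_YEAR": [np.nan, 2021.0]})

mismatches = find_mismatches(df, df_file)
print(mismatches)
```

With this change, the pair of NaN values in row 0 is no longer reported, while the real difference in row 1 still is. In the original loop, the equivalent edit is to turn the `elif` into a second, separate `if`.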

