Iterating over DataFrame rows to compute values and add them to new columns

Problem description

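Steps to reach the goal: create a for loop that iterates over each row of the DataFrame and:

  1. Gets the X and Y column values to use them in a function
  2. Has the function generate the longitude and latitude values
  3. Adds these values to the same row in new columns named "Lat" and "Lon"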

So far, steps 1 and 2 work, but I can't get each value appended to its column.

What I tried:

The function used in the loop:

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=28, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    print( lonlat[1],lonlat[0])

The loop:

for _, row in df.iterrows():
    xy_to_lonlat(row['X'],row['Y'])

This gives exactly the output I expect:

28.667978631874004 -17.96430510323817
28.67957708337043 -17.96589718293177
28.680075373251725 -17.96652237896143
28.696094952446764 -17.971279315586795

But I need to get these two values into the df, specifically into df['Lat'] and df['Lon'].

I tried to append() them to lists that I would later insert into the df, but it doesn't work:

aLongitud=[]
aLatitud=[]

for _, row in df.iterrows():
    xy_to_lonlat(row['X'],row['Y'])
    aLongitud.append(lonlat[1])
    aLatitud.append(lonlat[0])

This is what the df looks like: [image: how df.head prints]

The function works for all 52 rows; I just need to put the results into 2 new columns in the df:

28.667978631874004 -17.96430510323817
28.67957708337043 -17.96589718293177
28.680075373251725 -17.96652237896143
28.696094952446764 -17.971279315586795
28.69709953128404 -17.97089438970623
28.704102246479206 -17.97502030269029
28.714190480593878 -17.98059681820521
28.84284299081375 -17.943724718418043
28.85522495646711 -17.907748758676934
28.85497605095961 -17.915999785074945
28.834039353212727 -17.853402778875363
28.84368320877517 -17.790724992980966
28.8311955800612 -17.773218425619255
28.757725903465193 -17.735394629644425
28.75694932761218 -17.734865031953948
28.651232614536056 -17.75864104734293
28.647850336922037 -17.75586691138396
28.64510111053916 -17.756973867003158
28.54740295444906 -17.779646961686794
28.481011316595747 -17.871383348460515
28.598084805574075 -17.92779850800547
28.84869842152646 -17.898800401690675
28.730123181880874 -17.72687292142767
28.65501749037169 -17.759807688028065
28.586115587686052 -17.755714748146353
28.855549587948108 -17.90757529900783
28.62104314133748 -17.750679106650242
28.805231369924527 -17.76049570914483
28.842322764567797 -17.794590436117428
28.654662237239517 -17.761368473029265
28.652716177555675 -17.954686156568993
28.84441637529699 -17.789637146820752
28.812367721581616 -17.763087214328706
28.80648375432461 -17.75977264125206
28.713070037952928 -17.74394044409638
28.850159557661478 -17.898032389327415
28.84268417328949 -17.884610248902643
28.506075965709968 -17.87932721318885
28.60916367244466 -17.92715257476472
28.508055636889907 -17.879126662123344
28.593688218530882 -17.755496249789623
28.614870490264675 -17.753636080872226
28.453393338804933 -17.83975500058191
28.81927942283548 -17.97071265399719
28.632049774803967 -17.948276230580895
28.810197401802437 -17.7626526992656
28.81013751332894 -17.762176710792335
28.651195000175182 -17.757862000230173
28.491243000164914 -17.874658000300624
28.523693000166094 -17.87819700030406
28.56082500016691 -17.89452700031706
28.53126600016634 -17.878297000304332

What the df looks like after running the loop, showing the "None" problem: [image]

Tags: python, pandas, dataframe, for-loop

Solution


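The loop in the question fails because xy_to_lonlat() only prints its result: it returns None, and lonlat exists only inside the function, so there is nothing to append. This solution keeps your xy_to_lonlat() function, changed to return the values, and uses it with the pandas DataFrame apply method: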

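import pandas as pd
import pyproj

def xy_to_lonlat(x, y):
    # Source: UTM zone 28 (WGS84); target: geographic lon/lat (WGS84)
    proj_latlon = pyproj.Proj(proj='latlong', datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=28, datum='WGS84')
    # Note: pyproj.transform is deprecated in newer pyproj releases
    # in favor of pyproj.Transformer
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    # Return (lat, lon) instead of printing, so the caller can store the values
    return lonlat[1], lonlat[0]

# I just made up this data
xs = [21000, 21020, 23000]
ys = [3000000, 3000050, 3000100]
df = pd.DataFrame({'X': xs, 'Y': ys})

# Build a temporary column holding one (lat, lon) tuple per row
df['lat_lon'] = df.apply(lambda r: xy_to_lonlat(r['X'], r['Y']), axis=1)
# Split the tuples into the two target columns
df['Lat'] = df['lat_lon'].apply(lambda x: x[0])
df['Lon'] = df['lat_lon'].apply(lambda x: x[1])
# Drop the temporary column
df = df.drop('lat_lon', axis=1)

df

#        X        Y        Lat        Lon
# 0  21000  3000000  27.039540 -19.826207
# 1  21020  3000050  27.039996 -19.826026
# 2  23000  3000100  27.041129 -19.806152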
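
Two other ways to fill the columns may be worth knowing once the function returns its values: the list-collecting loop from the question, and apply with result_type='expand', which skips the temporary tuple column. The sketch below is illustrative only. convert_stub is a hypothetical stand-in that does simple arithmetic so the example runs without pyproj; in practice you would pass the returning version of xy_to_lonlat instead.

```python
import pandas as pd

def convert_stub(x, y):
    """Hypothetical stand-in for xy_to_lonlat: returns a (lat, lon) tuple.

    The arithmetic is meaningless; it only mimics a function that
    *returns* two values instead of printing them.
    """
    return y / 100000.0, x / 1000.0

df = pd.DataFrame({'X': [21000, 21020, 23000],
                   'Y': [3000000, 3000050, 3000100]})

# Variant 1: explicit loop, collecting the returned values into lists
lats, lons = [], []
for _, row in df.iterrows():
    lat, lon = convert_stub(row['X'], row['Y'])  # capture the return value
    lats.append(lat)
    lons.append(lon)
looped = df.assign(Lat=lats, Lon=lons)

# Variant 2: apply with result_type='expand' turns each returned tuple
# into separate columns, which are assigned to 'Lat' and 'Lon' in order
expanded = df.copy()
expanded[['Lat', 'Lon']] = expanded.apply(
    lambda r: convert_stub(r['X'], r['Y']), axis=1, result_type='expand')

print(looped)
print(expanded)
```

Both variants still call the function once per row, so they are mainly a readability choice; for 52 rows the cost is negligible.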

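Since pyproj.transform is deprecated in newer pyproj releases, a version based on pyproj.Transformer may be preferable. The sketch below is not part of the original answer. It assumes the X/Y columns are UTM zone 28 north on WGS84 (EPSG:32628), which is how proj="utm", zone=28, datum='WGS84' is interpreted above, and that the target is EPSG:4326. It also transforms the whole columns in one call rather than row by row.

```python
import pandas as pd
from pyproj import Transformer

# Assumption: source CRS is UTM zone 28N / WGS84 (EPSG:32628),
# target CRS is geographic WGS84 (EPSG:4326).
# always_xy=True fixes the axis order: input (x, y), output (lon, lat).
transformer = Transformer.from_crs("EPSG:32628", "EPSG:4326", always_xy=True)

# Same made-up data as in the answer
df = pd.DataFrame({'X': [21000, 21020, 23000],
                   'Y': [3000000, 3000050, 3000100]})

# Transform both columns at once; no Python-level loop over rows
lon, lat = transformer.transform(df['X'].to_numpy(dtype=float),
                                 df['Y'].to_numpy(dtype=float))
df['Lat'] = lat
df['Lon'] = lon

print(df)
```

Creating the Transformer once and reusing it also avoids rebuilding the projection objects on every row, which the per-row function does.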