python - How to copy the values of one DataFrame column into a column of another DataFrame in pandas?
Problem description
While copying the values of a string column into another DataFrame's column one at a time, I got output containing square brackets:
chk.at[index,'StartLocation1'] = chkn['StartLocation1'].values
chk.at[index,'EndLocation1'] = chkn['EndLocation1'].values
0 [Petrol Pump-Ramji Ambedkar Nagar]
1 [V Enterprises]
2 [Baola]
3 [Dharmajyot-Vapi]
4 [KINGSTON TOWER VASAI Dominos-(THANE)]
Name: StartLocation1, dtype: object
So I then tried to remove the [] brackets by applying this:
chk['EndLocation1'].str.strip('[]').astype(str)
0 nan
1 nan
2 nan
3 nan
4 nan
But I got NaN values instead. Please help!
Here is my whole code:
chk['StartLocation1'] = ''
chk['EndLocation1'] = ''
for index, row in chk.iterrows():
    start = row.StartTime
    end = row.EndTime
    reg = row.RegistrationNo
    # First moving GPS event in the window (ascending order)
    query = "SELECT TOP 1 RegistrationNo, GPSDateTime, Location FROM GPSEventsDataCurrentWeek where GPSDateTime Between 'start_date' and 'end_date' and RegistrationNo = 'reg' and GroundSpeed > 0 ORDER BY GPSDateTime ASC"
    query = query.replace('start_date', start.strftime('%m/%d/%Y %H:%M:%S'))
    query = query.replace('end_date', end.strftime('%m/%d/%Y %H:%M:%S'))
    query = query.replace('reg', str(reg))
    chk1 = pd.read_sql(query, con=engine)
    chk1 = chk1.rename({'Location': 'StartLocation1','GPSDateTime': 'StartTime'}, axis=1)
    # Last moving GPS event in the window (descending order)
    query2 = "SELECT TOP 1 RegistrationNo, GPSDateTime, Location FROM GPSEventsDataCurrentWeek where GPSDateTime Between 'start_date' and 'end_date' and RegistrationNo = 'reg' and GroundSpeed > 0 ORDER BY GPSDateTime DESC"
    query2 = query2.replace('start_date', start.strftime('%m/%d/%Y %H:%M:%S'))
    query2 = query2.replace('end_date', end.strftime('%m/%d/%Y %H:%M:%S'))
    query2 = query2.replace('reg', str(reg))
    chk2 = pd.read_sql(query2, con=engine)
    chk2 = chk2.rename({'Location': 'EndLocation1','GPSDateTime': 'EndTime'}, axis=1)
    chkn = pd.merge(chk1,chk2, on = ['RegistrationNo'], how = 'outer')
    print(chkn[['StartLocation1','EndLocation1']])
    chk.at[index,'StartLocation1'] = chkn['StartLocation1'].values
    chk.at[index,'EndLocation1'] = chkn['EndLocation1'].values
and this is my dataframe chk:
   Companyid  RegistrationNo        Date  Hour  Value  RunningDuration            StartTime              EndTime
0      236.0   MH-01-CJ-3571  2020-09-01   0.0   True         00:42:00  2020-09-01 00:08:00  2020-09-01 00:59:00
1      236.0   MH-01-CV-7460  2020-09-01   0.0   True         00:49:00  2020-09-01 00:09:00  2020-09-01 00:58:00
2      654.0   MH-04-JK-4102  2020-09-01   0.0   True         00:03:00  2020-09-01 00:11:00  2020-09-01 00:24:00
3      654.0    DN-09-R-9421  2020-09-01   0.0   True         00:02:00  2020-09-01 00:24:00  2020-09-01 00:54:00
4      236.0   MH-01-CV-7456  2020-09-01   0.0   True         00:04:00  2020-09-01 00:38:00  2020-09-01 00:42:00
Solution
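One possible solution is to pull the single value out of each list, for example by joining the list's elements (see "join list of lists" in Python). df.values returns an ndarray object, so the value you get back is an array of length 1 rather than a plain string, which is why the brackets appear. Note that the pandas documentation recommends df.to_numpy() instead of df.values (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.values.html).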
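To illustrate the point about length-1 arrays, here is a minimal sketch that stores the scalar itself rather than the array around it. The two small frames are toy stand-ins for chk and chkn, and first_or_none is a hypothetical helper name, not something from the question. The NaN result from .str.strip('[]') is consistent with this reading: the cells hold arrays rather than strings, so there is no text to strip.

```python
import pandas as pd

# Toy stand-ins for the question's frames (the values are illustrative only).
chk = pd.DataFrame({"RegistrationNo": ["MH-01-CJ-3571", "MH-01-CV-7460"]})
chk["StartLocation1"] = ""

# Pretend this is the one-row result of the per-row query.
chkn = pd.DataFrame({"StartLocation1": ["Baola"]})

# .to_numpy() (like .values) returns a whole array, even for a single row.
values = chkn["StartLocation1"].to_numpy()
print(type(values).__name__, values.shape)


def first_or_none(series):
    """Return the first element of a Series, or None if it is empty."""
    return series.iloc[0] if len(series) else None


# Store the scalar itself, not the length-1 array wrapped around it.
chk.at[0, "StartLocation1"] = first_or_none(chkn["StartLocation1"])
print(chk)
```

The emptiness check matters because a query that matches no rows would otherwise make .iloc[0] raise an IndexError.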
Either way, I don't think assigning the columns this way is the best approach.
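As one alternative to writing cell by cell, the per-row results can be collected in plain lists and assigned to each column once. This is only a sketch under assumptions: lookup_locations and FAKE_RESULTS are hypothetical stand-ins for the two SQL queries, and the location names are made up.

```python
import pandas as pd

chk = pd.DataFrame(
    {"RegistrationNo": ["MH-01-CJ-3571", "MH-01-CV-7460", "MH-04-JK-4102"]}
)

# Hypothetical lookup standing in for the two SQL queries; locations are made up.
FAKE_RESULTS = {
    "MH-01-CJ-3571": ("Depot A", "Yard A"),
    "MH-01-CV-7460": ("Depot B", "Yard B"),
}


def lookup_locations(registration_no):
    """Return (start, end) for a vehicle, or (None, None) if nothing matched."""
    return FAKE_RESULTS.get(registration_no, (None, None))


start_locations = []
end_locations = []
for row in chk.itertuples(index=False):
    start, end = lookup_locations(row.RegistrationNo)
    start_locations.append(start)
    end_locations.append(end)

# One assignment per column, with exactly one scalar per row.
chk["StartLocation1"] = start_locations
chk["EndLocation1"] = end_locations
print(chk)
```

Because each list holds exactly one scalar per row, the resulting columns contain plain strings (or missing values) and no bracketed arrays.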