python DataFrame: substring of one column (long string)

Problem description

I need to extract a substring from a column that holds a long string containing LF characters. The steps I followed are:

1. df['CF_Row_Start'] = df.Content.str.find('string_start', 0) # works fine
2. df['CF_Row_End'] = df.Content.str.find('\n', df['CF_Row_Start']) # does NOT work
3. df['CF_Row'] = df.Content.str[df['CF_Row_Start']:df['CF_Row_End']] # gives empty column because of step 2

Step 2 gives NaN as the result.

Any idea how to solve this?

This is what my df looks like (only 1 row):

Idx,Id,Type,Name,Path,Content,SubPath,CF_Row_Start,CF_Row_End,CF_Row
0,167404,rp_chart,Voice,/QoS Reporting/Voice - On-Net,
description {SMS - On-Net}
show_in_email_body common
email_attachments common
db_name kpi
db_table Main
column_filter {{errorText !vq*} && {OrderId 60978 | 60979 | 61178} && {errorId !990101|990102|990103|990104|130104|991142|443100|441100|110902}}
column_sql_filter {(NOT (Main.errorText LIKE 'vq%')) AND (Main.OrderId IN (60978,60979,61178)) AND (NOT (Main.errorId IN (990101,990102,990103,990104,130104,991142,443100,441100,110902)))}
column_tcl_filter {!([string match -nocase ""vq*"" $COL(errorText)]) && [lsearch -exact [list 60978 60979 61178] $COL(OrderId)] >= 0 && !([lsearch -exact [list 990101 990102 990103 990104 130104 991142 443100 441100 110902] $COL(errorId)] >= 0)}
background_color automatic
x_label_angle 30
bar_labels absolute
bar_style automatic
bar_effect off
bar_width_max 60
line_width 2,
250,,

My code is:

df['CF_Row_Start'] = df.Content.str.find('column_filter', 0)
df['CF_Row_End'] = df.Content.str.find('\n', df['CF_Row_Start'])
print(df['CF_Row_End'][0])

The result is NaN.
The expected result is 330 (the end of the whole column_filter line).

Thank you.

Tags: python, pandas, dataframe, find

Solution
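
A likely cause of the NaN in step 2 is that Series.str.find is documented with integer start and end arguments, so a Series passed as start is not matched row by row against the strings. The same applies to slicing with df.Content.str[...] in step 3. The sketch below is one possible fix, not a definitive answer: it pairs each string with its own offset in plain Python, and then shows a regex alternative with str.extract. The shortened content string is a hypothetical stand-in for the real data, and the column name CF_Row_regex is made up for illustration. It assumes every value in Content is a string (no missing values).

```python
import pandas as pd

# Hypothetical, shortened stand-in for the one-row frame in the question.
content = (
    "description {SMS - On-Net}\n"
    "db_table Main\n"
    "column_filter {{errorText !vq*}}\n"
    "background_color automatic\n"
)
df = pd.DataFrame({"Content": [content]})

# Step 1 is unchanged: the start offset here is a scalar (default 0).
df["CF_Row_Start"] = df["Content"].str.find("column_filter")

# Step 2: pair each string with its own start offset.
# Rows where the marker was not found (-1) keep -1 as the end.
df["CF_Row_End"] = [
    text.find("\n", start) if start >= 0 else -1
    for text, start in zip(df["Content"], df["CF_Row_Start"])
]

# Step 3: slice each string with its own bounds.
# Rows without the marker, or without a newline after it, get None.
df["CF_Row"] = [
    text[start:end] if 0 <= start <= end else None
    for text, start, end in zip(
        df["Content"], df["CF_Row_Start"], df["CF_Row_End"]
    )
]

# Alternative: capture everything from the marker up to the next newline.
df["CF_Row_regex"] = df["Content"].str.extract(
    r"(column_filter[^\n]*)", expand=False
)

print(df["CF_Row"][0])
```

If only the extracted line is needed and the offsets are not, the str.extract variant avoids the two helper columns altogether. Both variants pick the first occurrence of the marker text in each string.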

