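python - Delete all rows after the second occurrence of a column value
Problem description
I want to delete all data after the second instance of a column value in a .txt file that has been converted to a DataFrame. In this case the value is the delimiter "---".
The DataFrame is structured as follows: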
15 Leading Causes of Death 15 Code Deaths Population Crude Rate Crude Rate Lower 95% Confidence Interval Crude Rate Upper 95% Confidence Interval
#Accidents (unintentional injuries) (V01-X59,Y85-Y86) GR113-112 21 152430 13.8 8.5 21.1
#Intentional self-harm (suicide) (*U03,X60-X84,Y87.0) GR113-124 15 152430 Unreliable 5.5 16.2
---
Dataset: Underlying Cause of Death, 1999-2019
Query Parameters:
States: Marin County, CA (06041)
Ten-Year Age Groups: 25-34 years
Year/Month: 1999; 2000; 2001; 2002; 2003
Group By: 15 Leading Causes of Death
Show Totals: Disabled
Show Zero Values: Disabled
Show Suppressed: Disabled
Calculate Rates Per: 100,000
Rate Options: Default intercensal populations for years 2001-2009 (except Infant Age Groups)
---
Help: See http://wonder.cdc.gov/wonder/help/ucd.html for more information.
---
Query Date: Sep 23, 2021 6:51:59 PM
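I have seen plenty of solutions for cutting after the first instance of a column value, NaN, and so on, but nothing for the second or nth instance...
Here is the simple code I have so far for reading in the file.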
import pandas as pd
dl = pd.read_csv('Underlying Cause of Death, 1999-2019(3).txt', sep = '\t')
dl.to_csv('test.csv', index = False)
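Solution
Flag the rows whose first column starts with '---' and take the cumulative sum of those flags, so each row carries a count of the delimiters seen so far. Then find the index of the first row where that count equals 2 and slice the DataFrame up to (but not including) that row. Here df is the DataFrame read from the file (dl in the question). Because idxmax returns an index label while iloc expects a position, this relies on the default 0, 1, 2, ... index that read_csv produces.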
>>> df.iloc[:df.iloc[:, 0].str.startswith('---').cumsum().eq(2).idxmax()]
0 #Accidents (unintentional injuries) (V01-X59,Y... GR113-112 21.0 152430.0 13.8 8.5 21.1
1 #Intentional self-harm (suicide) (*U03,X60-X84... GR113-124 15.0 152430.0 Unreliable 5.5 16.2
2 --- NaN NaN NaN NaN NaN NaN
3 Dataset: Underlying Cause of Death, 1999-2019 NaN NaN NaN NaN NaN NaN
4 Query Parameters: NaN NaN NaN NaN NaN NaN
5 States: Marin County, CA (06041) NaN NaN NaN NaN NaN NaN
6 Ten-Year Age Groups: 25-34 years NaN NaN NaN NaN NaN NaN
7 Year/Month: 1999 2000 2001.0 2002.0 2003 NaN NaN
8 Group By: 15 Leading Causes of Death NaN NaN NaN NaN NaN NaN
9 Show Totals: Disabled NaN NaN NaN NaN NaN NaN
10 Show Zero Values: Disabled NaN NaN NaN NaN NaN NaN
11 Show Suppressed: Disabled NaN NaN NaN NaN NaN NaN
12 Calculate Rates Per: 100,000 NaN NaN NaN NaN NaN NaN
13 Rate Options: Default intercensal populations ... NaN NaN NaN NaN NaN NaN
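The same idea can be generalized to the nth occurrence, which the question also asks about. The following is a minimal sketch, not part of the original answer: the function name rows_before_nth_marker, its parameters, and the small sample DataFrame are hypothetical and only serve as an illustration. It works on positions rather than index labels, and it returns the DataFrame unchanged when there are fewer than n delimiter rows (the one-liner above would return an empty DataFrame in that case, because idxmax falls back to the first row).

```python
import pandas as pd


def rows_before_nth_marker(df, n, marker="---"):
    """Return the rows that come before the n-th row whose first column starts with `marker`.

    Hypothetical helper for illustration; assumes the first column holds strings (or NaN).
    """
    # na=False treats missing values in the first column as "not a marker".
    is_marker = df.iloc[:, 0].str.startswith(marker, na=False)
    # Running count of marker rows seen so far, one value per row.
    seen = is_marker.cumsum().to_numpy()
    # Positions of all rows where at least n markers have been seen.
    hits = (seen >= n).nonzero()[0]
    if len(hits) == 0:
        # Fewer than n markers in the data: nothing to cut off.
        return df
    # Positional slice, so it does not depend on the index labels.
    return df.iloc[:hits[0]]


# Toy data that only mimics the layout of the file in the question.
sample = pd.DataFrame({
    "15 Leading Causes of Death": [
        "#Accidents", "#Intentional self-harm", "---",
        "Dataset: ...", "---", "Help: ...", "---", "Query Date: ...",
    ],
    "Deaths": [21, 15, None, None, None, None, None, None],
})

trimmed = rows_before_nth_marker(sample, 2)
print(trimmed)
```

Calling rows_before_nth_marker(sample, 1) instead would keep only the data rows above the first "---", which may be closer to what is wanted if the query metadata is not needed.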