python - How do I split text in a DataFrame column and keep only the part before the first delimiter?
Problem description
I have a dataset with two columns: Industry Classifications and Stock Tickers. A company has several tags in its Industry Classifications column, separated by a ";" delimiter. I want to select only the first tag.
import pandas as pd
training = pd.read_excel('Training Data.xlsx')
Current file structure (a sample of the column):
Industry Classifications
Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);
Catalog Flowers, Gifts and Novelties (Primary); Catalog Hobbies, Games and Toy Retail (Primary);
Information Technology (Primary); Internet Software and Services (Primary);
Casualty (Primary); Financials (Primary); Fire and Marine Insurance (Primary);
Commercial and Professional Services (Primary); Commercial Services and Supplies (Primary);
Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary);
Application Software (Primary); Information Technology (Primary); Software (Primary);
Commercial and Professional Services (Primary); Consulting Services (Primary); Industrials (Primary);
Banks (Primary); Banks (Primary); Financials (Primary); National and State Commercial Banks (Primary);
Expected output:
Industry Classifications
Beauty Care Products (Primary)
Catalog Flowers
Information Technology (Primary)
Casualty (Primary)
Commercial and Professional Services (Primary)
Banks (Primary); Banks (Primary)
Application Software (Primary)
Commercial and Professional Services (Primary)
Banks (Primary); Banks (Primary)
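Solution
You can extract the column as you are already doing, then split each value on the semicolon and take the first element of the result. Because the column is a pandas Series rather than a single string, the split goes through the .str accessor.
col = training['Industry Classifications']
first_tag = col.str.split(';').str[0]  # first tag of every row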
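To make the approach easy to try without the Excel file, here is a minimal sketch. It assumes a small in-memory DataFrame built from three of the sample rows above in place of 'Training Data.xlsx'. Passing n=1 is an optional choice that stops splitting after the first semicolon, and the trailing .str.strip() is an added safeguard against stray whitespace.

```python
import pandas as pd

# Assumption: a small stand-in for the data read from 'Training Data.xlsx'.
training = pd.DataFrame({
    "Industry Classifications": [
        "Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);",
        "Information Technology (Primary); Internet Software and Services (Primary);",
        "Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary);",
    ]
})

col = training["Industry Classifications"]

# Split each string at the first ';' only (n=1), keep the piece before it,
# and trim surrounding whitespace.
first_tag = col.str.split(";", n=1).str[0].str.strip()

print(first_tag.tolist())
# ['Beauty Care Products (Primary)', 'Information Technology (Primary)', 'Banks (Primary)']
```

Assigning the result back with training["Industry Classifications"] = first_tag would overwrite the column in place. Note that splitting on ";" keeps any commas inside a tag, so a value such as "Catalog Flowers, Gifts and Novelties (Primary)" stays whole.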