python - Extract text from "Samuel L. JacksonJessica BielBrian Presley50 CentChristina RicciChad Michael Murray" using Python
Problem description
I have a string like this:
"Samuel L. JacksonJessica BielBrian Presley50 CentChristina RicciChad Michael Murray"
I want it like this:
Samuel L. Jackson,
Jessica Biel,
Brian Presley,
50 Cent,
Christina Ricci,
Chad Michael Murray,
using Python.
Solution
In pandas, you can do this:
import pandas as pd
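# regex=True is required on pandas 2.0+, where str.replace treats the pattern literally by default
a = pd.Series("Samuel L. JacksonJessica BielBrian Presley50 CentChristina RicciChad Michael Murray").str.replace(r'([a-z])([A-Z0-9])', r'\1,\2', regex=True)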
a.to_list()[0]
# 'Samuel L. Jackson,Jessica Biel,Brian Presley,50 Cent,Christina Ricci,Chad Michael Murray'
Or, to put each name on its own line:
a = pd.Series("Samuel L. JacksonJessica BielBrian Presley50 CentChristina RicciChad Michael Murray").str.replace(r'([a-z])([A-Z0-9])', r'\1,\n\2', regex=True)
print(a.to_list()[0])
Output:
Samuel L. Jackson,
Jessica Biel,
Brian Presley,
50 Cent,
Christina Ricci,
Chad Michael Murray
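The same idea works without pandas. The following is a minimal sketch using only the standard library `re` module; the helper name `split_names` is made up for illustration. It assumes, like the pattern above, that a new name starts wherever a lowercase letter is immediately followed by an uppercase letter or a digit, so a name with internal capitals (for example "McDonald") would be split incorrectly.

```python
import re


def split_names(fused):
    """Split a run of fused names at each lowercase -> uppercase/digit boundary.

    The pattern is zero-width (a lookbehind plus a lookahead), so no
    characters are consumed. Splitting on empty matches needs Python 3.7+.
    """
    return re.split(r"(?<=[a-z])(?=[A-Z0-9])", fused)


text = ("Samuel L. JacksonJessica BielBrian Presley50 Cent"
        "Christina RicciChad Michael Murray")
names = split_names(text)

# One name per line, separated by commas as in the question.
print(",\n".join(names))
```

Returning a list rather than a joined string keeps the names usable for further processing; joining with `",\n"` is only needed for display.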
Is this what you mean:
import requests
from bs4 import BeautifulSoup
link='https://en.wikipedia.org/wiki/Home_of_the_Brave_(2006_film)'
result1 = requests.get(link)
src1 = result1.content
soup = BeautifulSoup(src1,'lxml')
table = soup.find_all('ul')[3]
names = table.find_all('a')
for item in names:
    print(item.text)
Output:
Samuel L. Jackson
Jessica Biel
Brian Presley
50 Cent
Chad Michael Murray
Christina Ricci
Victoria Rowell
Vyto Ruginis
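If the fused string came from reading the text of a whole list element at once, collecting the text of each link separately avoids the problem entirely, which is what the scraping answer above does. As a rough sketch of that principle without third-party packages, the standard library `html.parser` can do the same on a small snippet. The `SAMPLE_HTML` markup, the `LinkTextCollector` class and the `extract_link_texts` helper are all hypothetical and stand in for the real page, whose structure is not shown here.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect the text of every <a> element, one list entry per link."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_link = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._parts = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> element.
        if self._in_link:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.names.append("".join(self._parts).strip())
            self._in_link = False


# Hypothetical markup, invented for this example (not copied from the page).
SAMPLE_HTML = (
    "<ul>"
    "<li><a href='#'>Samuel L. Jackson</a></li>"
    "<li><a href='#'>50 Cent</a></li>"
    "</ul>"
)


def extract_link_texts(html):
    """Return the text of each link in the given HTML string."""
    parser = LinkTextCollector()
    parser.feed(html)
    parser.close()
    return parser.names


print(extract_link_texts(SAMPLE_HTML))
```

Because each name is read from its own element, no guessing about capital letters is needed, and names with internal capitals survive intact.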