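python-3.x - How do I save results to a text file or Excel without square brackets?
Question
I am working on web scraping. I read names line by line from a text file, search Google for each one, and scrape the address from the results. I want to write each result next to its corresponding name. Here is my text file, a.txt: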
0.5BN FINHEALTH PRIVATE LIMITED
01 SYNERGY CO.
1 BY 0 SOLUTIONS
Here is my code:
import requests
from bs4 import BeautifulSoup

USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
out_fl = open('a.txt','r')
for line in out_fl:
    query = line
    query = query.replace(' ', '+')
    print(line)
    URL = f"https://google.com/search?q={query}"
    print(URL)
    headers = {"user-agent": USER_AGENT}
    resp = requests.get(URL, headers=headers)
    if resp.status_code == 200:
        soup = BeautifulSoup(resp.content, "html.parser")
        results = []
        newline = '\n'
        for g in soup.find_all('span', class_="i4J0ge"):
            x = f'{line}:{g.text}{newline}'
            results.append(x)
        print(results)
        with open("results.txt","a") as result:
            result.write(str(results))
I do get results this way, but they are not formatted properly. Please help. My expected result is:
0.5BN FINHEALTH PRIVATE LIMITED : Address: 2nd Floor, BHIVE Forum, GNS Towers #18, Dairy
Circle Road, Adugodi, Koramangala, Bengaluru, Karnataka 560029Hours: Closed ⋅ Opens 9:30AM
MonSaturdayClosedSundayClosedMonday9:30am–7:30pmTuesday9:30am–7:30pmWednesday9:30am–
7:30pmThursday9:30am–7:30pmFriday9:30am–7:30pmSuggest an editUnable to add this file.
Please check that it is a valid photo
01 SYNERGY CO. : 01 SYNERGY CO.\n:Located in: Punjab Agricultural UniversityAddress: 3rd
Floor Kartar Bhawan, Ferozpur Rd, Ludhiana, Punjab 141001Hours: Closes soon ⋅ 5PM ⋅ Opens
9:30AM MonSaturday10am–5pmSundayClosedMonday9:30am–7:30pmTuesday9:30am–
7:30pmWednesday9:30am–7:30pmThursday9:30am–7:30pmFriday9:30am–7:30pmSuggest an editUnable
to add this file. Please check that it is a valid photo.Phone: 098159 18807
Or written into Excel. Thanks.
Answer
You can load the results into a pandas DataFrame and then write it to Excel or CSV:
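import pandas as pd

# results is the list of strings built in the scraping loop; rename the column as required
df = pd.DataFrame(results, columns=["result"])
# Writing .xlsx needs an Excel engine such as openpyxl; use df.to_csv('filename.csv', index=False) for CSV
df.to_excel('filename.xlsx', sheet_name='sheet name', index=False)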
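For the plain text file, the square brackets most likely come from `result.write(str(results))`, which writes the list's printed form (brackets, quotes, and escaped `\n`) instead of its items. A minimal sketch of one way around this, using only the standard library, is to write each item on its own line. The helper names `format_results` and `append_results` are hypothetical and not part of the original code; the `name : address` layout is taken from the expected output in the question.

```python
def format_results(name, addresses):
    """Return one 'name : address' string per scraped address.

    Hypothetical helper: `name` is a line read from a.txt and
    `addresses` is the list of text values scraped for that name.
    """
    name = name.strip()  # drop the trailing newline that comes from the file
    return [f"{name} : {address.strip()}" for address in addresses]


def append_results(path, lines):
    """Append each string to the file as its own line of plain text."""
    with open(path, "a", encoding="utf-8") as fh:
        for line in lines:
            # Write the string itself, not str(list), so no brackets appear.
            fh.write(line + "\n")
```

In the scraping loop this would replace the `results.append(x)` and `result.write(str(results))` steps, for example `append_results("results.txt", format_results(line, [g.text for g in soup.find_all('span', class_="i4J0ge")]))`. Stripping `line` first also keeps the newline read from a.txt out of the middle of each output row.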