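python - Scrape addresses and phone numbers from this website
Problem description
How can I use the bs4 and pandas libraries to scrape data from the p tag and the contact-info class on this website and export it to a CSV file? I need help with how to scrape data from that tag and the contact-info class.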
import pandas as pd
import bs4
import requests
import re
full_dict={'Title':[],'Description':[],'Address':[]}
res=requests.get("https://cupcakemaps.com/cupcakes/cupcakes-near-me/p:2")
listings=soup.findAll(class_='media')
for listing in listings:
    listing_title=listing.find(True,{'title':True}).attrs['title']
    listing_Description=listing.find('p',{'class':'summary-desc'})
    listing_address=listing.find('p',{'class':'contact-info'}).text=re.compile(r'[0-9]{0,4}')
Solution
For example:
import pandas as pd
from bs4 import BeautifulSoup, Tag
import requests

res = requests.get("https://cupcakemaps.com/cupcakes/cupcakes-near-me/p:2")
# Parse the response; the question's code never created `soup`
soup = BeautifulSoup(res.text, 'lxml')
listings = soup.find_all(class_='media')
data = []
for listing in listings:
    # First descendant that carries a title attribute
    listing_title = listing.find(True, {'title': True}).attrs['title']
    listing_Description = listing.find('p', {'class': 'summary-desc'})
    # find() returns None when nothing matches, so check before reading .text
    if isinstance(listing_Description, Tag):
        listing_Description = listing_Description.text.strip()
    listing_address = listing.find('p', {'class': 'contact-info'})
    if isinstance(listing_address, Tag):
        number_text = listing_address.text.strip()
        # Keep only the digits, so the 'Address' column ends up holding the phone number
        listing_address = ''.join(filter(str.isdigit, number_text))
    full_dict = {'Title': listing_title, 'Description': listing_Description, 'Address': listing_address}
    data.append(full_dict)
df = pd.DataFrame(data)
# Save the data to a CSV file
df.to_csv("contact.csv")
print(df)
Output:
Title Description Address
0 Explore Category 'Anaheim CA Birthday Cupcakes... Delectable Anaheim, CA - Delectable check out ... 7147156086
1 Explore Category 'Costa Mesa CA Birthday Cupca... Lisa's Gourmet Snacks Costa Mesa CA check out... 7144275814
2 Explore Category 'Shorewood IL Birthday Cupcak... Acapulco Bakery Inc Shorewood, IL - Acapulco B... 8157291737
3 Explore Category 'San Francisco CA Birthday Cu... Hilda's Mart & Bake Shop San Francisco CA che... 4153333122
4 Explore Category 'Los Angeles CA Birthday Cupc... Lenny's Deli Los Angeles, CA - Lenny's Deli ch... 3104755771
5 Explore Category 'San Francisco CA Birthday Cu... Sweet Inspirations San Francisco CA check out... None
6 Explore Category 'Costa Mesa CA Birthday Cupca... The Cupcake Costa Mesa CA check out The Cupc... 9496420571
7 Explore Category 'Los Angeles CA Birthday Cupc... United Bread & Pastry Inc Los Angeles CA chec... 3236610037
8 Explore Category 'Garden Grove CA Birthday Cup... Pescadores Garden Grove CA check out Pescado... 7145395585
9 Explore Category 'Bakersfield CA Birthday Cupc... Bimbo Bakeries Usa Bakersfield CA check out ... 6613219352
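The digit filter in the solution keeps every digit in the contact-info text, so a street number or ZIP code in the same paragraph would be merged into the phone number. A possible way to keep the address and the phone number apart is sketched below. The `split_contact` helper, the `PHONE_RE` pattern, and the sample string are all hypothetical and assume a US-style phone number; the real layout of the contact-info text on the site may differ.

```python
import re

# Assumed pattern for US-style numbers such as "(714) 715-6086" or "714-715-6086".
# The lookarounds stop it from matching inside a longer run of digits.
PHONE_RE = re.compile(r"(?<!\d)\(?(\d{3})\)?[\s.-]*(\d{3})[\s.-]*(\d{4})(?!\d)")


def split_contact(text):
    """Return (address, phone) from a contact-info string.

    phone is None when no phone-like number is found.
    """
    match = PHONE_RE.search(text)
    if match is None:
        return text.strip() or None, None
    # Join the three captured groups into a plain 10-digit string.
    phone = "".join(match.groups())
    # Whatever is left around the match is treated as the address.
    rest = text[:match.start()] + " " + text[match.end():]
    address = re.sub(r"\s+", " ", rest).strip(" ,")
    return address or None, phone


# Made-up sample text, not taken from the site.
sample = "123 Main St, Anaheim, CA 92801 (714) 715-6086"
print(split_contact(sample))
print("".join(filter(str.isdigit, sample)))  # digit filter, for comparison
```

Inside the loop, this could replace the digit filter, for example `address, phone = split_contact(number_text)`, with the two values stored under separate 'Address' and 'Phone' keys in `full_dict`.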