python - How to write web scraping results from my code to a CSV file
Problem description
from bs4 import BeautifulSoup
import requests
import csv
url = "https://coingecko.com/en"
page = requests.get(url)
html_doc = page.content
soup = BeautifulSoup(html_doc,"html.parser")
coinname =soup.find_all("div",attrs={"class":"coin-content center"})
coin_sign = soup.find_all("div",attrs={"class":"coin-icon mr-2 center flex-column"})
coinvalue = soup.find_all("td",attrs={"class":"td-price price text-right "})
marketcap = soup.find_all("td",attrs={"class":"td-market_cap cap "})
Liquidity = soup.find_all("td", attrs={"class": "td-liquidity_score lit text-right "})
coin_name = []
coinsign = []
Coinvalue = []
Marketcap = []
marketliquidity = []
for div in coinname:
    coin_name.append(div.a.span.text)
for sign in coin_sign:
    coinsign.append(sign.span.text)
for Value in coinvalue:
    Coinvalue.append(Value.a.span.text)
for cap in marketcap:
    Marketcap.append(cap.div.span.text)
for liquidity in Liquidity:
    marketliquidity.append(liquidity.a.span.text)
print(coin_name)
print(coinsign)
print(Coinvalue)
print(Marketcap)
print(marketliquidity)
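I want to save the output to a CSV file with 5 columns: column 1 is "coin_name", column 2 is "coinsign", column 3 is "coinvalue", column 4 is "Marketcap", and column 5 is "Marketliquidity". How can I do this?
I also want to limit the data I receive: I only want 100 coin names, but I am getting 200.
Solution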
from bs4 import BeautifulSoup
import requests
import csv
url = "https://coingecko.com/en"
page = requests.get(url)
soup = BeautifulSoup(page.content,"html.parser")
# Instead of creating empty lists and filling them in loops, use list comprehensions.
names = [div.a.span.text for div in soup.find_all("div",attrs={"class":"coin-content center"})]
signs = [sign.span.text for sign in soup.find_all("div",attrs={"class":"coin-icon mr-2 center flex-column"})]
values = [value.a.span.text for value in soup.find_all("td",attrs={"class":"td-price price text-right "})]
caps = [cap.div.span.text for cap in soup.find_all("td",attrs={"class":"td-market_cap cap "})]
liquidities = [liquidity.a.span.text for liquidity in soup.find_all("td", attrs={"class": "td-liquidity_score lit text-right "})]
with open('coins.csv', mode='w', newline='') as coins:
    writer = csv.writer(coins, delimiter=',', quotechar='"')
    # Take only the first 100 coins
    for i in range(100):
        writer.writerow([names[i], signs[i], values[i], caps[i], liquidities[i]])
The output will be:
Bitcoin,BTC,"$6,578.62","$113,894,498,118","$1,476,855,331"
Ethereum,ETH,$224.49,"$22,995,876,618","$1,256,303,216"
EOS,EOS,$5.73,"$5,193,319,905","$708,339,006"
XRP,XRP,$0.48,"$19,249,618,341","$564,378,978"
Litecoin,LTC,$57.80,"$3,388,966,637","$486,289,650"
NEO,NEO,$18.11,"$1,177,368,159","$160,733,208"
Monero,XMR,$113.64,"$1,871,890,512","$55,235,745"
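The question also asks for named columns, and indexing with range(100) raises an IndexError if the page yields fewer than 100 entries in any list. A minimal sketch of one way to handle both follows. The helper name write_coins and the HEADER constant are hypothetical and not part of the original answer, and the header labels are simply copied from the question:

```python
import csv
from itertools import islice

# Hypothetical column labels, taken from the names used in the question.
HEADER = ["coin_name", "coinsign", "coinvalue", "Marketcap", "Marketliquidity"]


def write_coins(path, names, signs, values, caps, liquidities, limit=100):
    """Write a header row plus at most `limit` data rows.

    zip() stops at the shortest list, so a short or empty list
    does not raise IndexError. Returns the number of data rows written.
    """
    rows = islice(zip(names, signs, values, caps, liquidities), limit)
    written = 0
    with open(path, mode="w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        for row in rows:
            writer.writerow(row)
            written += 1
    return written
```

With the lists built in the answer, the call would be write_coins('coins.csv', names, signs, values, caps, liquidities). The first line of the file is then the header, and the number of data rows is capped at 100 or at the length of the shortest list, whichever is smaller.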