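python - Unable to write results to a csv file in a customized manner
Problem description
I've created a script to parse the singers and their concerning links, as well as the actors and their concerning links, out of different containers on a webpage. The script is working fine. What I can't do is write the results to a csv file accordingly.
I've tried with: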
import csv
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

base_url = 'https://www.hindigeetmala.net'
link = 'https://www.hindigeetmala.net/movie/2_states.htm'

res = requests.get(link)
soup = BeautifulSoup(res.text,"lxml")

with open("hindigeetmala.csv","w",newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['singer_records','actor_records'])
    for item in soup.select("tr[itemprop='track']"):
        try:
            singers = [i.get_text(strip=True) for i in item.select("span[itemprop='byArtist']") if i.get_text(strip=True)]
        except Exception: singers = ""
        try:
            singer_links = [urljoin(base_url,i.get("href")) for i in item.select("a:has(> span[itemprop='byArtist'])") if i.get("href")]
        except Exception: singer_links = ""
        singer_records = [i for i in zip(singers,singer_links)]

        try:
            actors = [i.get_text(strip=True) for i in item.select("a[href^='/actor/']") if i.get("href")]
        except Exception: actors = ""
        try:
            actor_links = [urljoin(base_url,i.get("href")) for i in item.select("a[href^='/actor/']") if i.get("href")]
        except Exception: actor_links = ""
        actor_records = [i for i in zip(actors,actor_links)]

        song_name = item.select_one("span[itemprop='name']").get_text(strip=True)
        writer.writerow([singer_records,actor_records,song_name])
        print(singer_records,actor_records,song_name)
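If I execute the script as is, this is the output I get.
When I tried writer.writerow([*singer_records,*actor_records,song_name]), I got this type of output. Only the first pair of tuples gets written to the first column.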
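To see why unpacking behaves this way, here is a minimal offline sketch. The names, links, and song title are hypothetical sample data (not scraped from the site), and an in-memory buffer stands in for the csv file. Each unpacked tuple becomes its own cell, so the first column only ever holds the first (name, link) pair, and the remaining pairs spill into the following columns.

```python
import csv
import io

# Hypothetical sample data standing in for one scraped table row.
sample_singers = [
    ("Singer A", "https://example.com/singer/a"),
    ("Singer B", "https://example.com/singer/b"),
]
sample_actors = [("Actor X", "https://example.com/actor/x")]
sample_song = "Song 1"

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.writer(buffer)

# Unpacking turns every tuple into a separate cell;
# csv.writer stringifies each tuple with str().
writer.writerow([*sample_singers, *sample_actors, sample_song])

# Read the row back to inspect how the cells were split.
unpacked_row = next(csv.reader(io.StringIO(buffer.getvalue())))
for index, cell in enumerate(unpacked_row):
    print(index, cell)
```

With two singers and one actor the row ends up with four cells rather than three, which is why the columns no longer line up with the header.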
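This is my expected output.
How can I write the names and their links to the csv file in the way shown in the third image?
PS For brevity, all the images of the output show only the first column of the csv file.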
Solution
Based on SIM's feedback, I think this is what you are looking for (I just added a function that formats the lists):
import csv
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

base_url = 'https://www.hindigeetmala.net'
link = 'https://www.hindigeetmala.net/movie/2_states.htm'

res = requests.get(link)
soup = BeautifulSoup(res.text, "lxml")


def merge_results(inpt):
    # Flatten a list of (name, link) tuples into a one-element list holding
    # a single comma-separated string, each value wrapped in single quotes.
    return [','.join(nested_items for nested_items in
                     [','.join("'" + tuple_item + "'" for tuple_item in item)
                      for item in inpt])]


with open("hindigeetmala.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Three header cells, because every data row below has three cells.
    writer.writerow(['singer_records', 'actor_records', 'song_name'])
    for item in soup.select("tr[itemprop='track']"):
        try:
            singers = [i.get_text(strip=True) for i in item.select(
                "span[itemprop='byArtist']") if i.get_text(strip=True)]
        except Exception:
            singers = ""
        try:
            singer_links = [urljoin(base_url, i.get("href")) for i in item.select(
                "a:has(> span[itemprop='byArtist'])") if i.get("href")]
        except Exception:
            singer_links = ""
        singer_records = [i for i in zip(singers, singer_links)]

        try:
            actors = [i.get_text(strip=True) for i in item.select(
                "a[href^='/actor/']") if i.get("href")]
        except Exception:
            actors = ""
        try:
            actor_links = [urljoin(base_url, i.get("href")) for i in item.select(
                "a[href^='/actor/']") if i.get("href")]
        except Exception:
            actor_links = ""
        actor_records = [i for i in zip(actors, actor_links)]

        song_name = item.select_one(
            "span[itemprop='name']").get_text(strip=True)
        # One cell for all singers, one for all actors, one for the song name.
        writer.writerow(merge_results(singer_records) +
                        merge_results(actor_records) + [song_name])
        print(singer_records, actor_records, song_name)
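The formatting step can be tried without hitting the website. The sketch below is only an illustration: the records are hypothetical sample data, an in-memory buffer replaces the csv file, and merge_results is rewritten as a single flattened join that should produce the same string as the nested version above.

```python
import csv
import io


def merge_results(records):
    """Flatten (name, link) tuples into a one-element list containing a
    single comma-separated string, each value wrapped in single quotes."""
    return [",".join("'" + value + "'" for pair in records for value in pair)]


# Hypothetical sample data standing in for one scraped table row.
demo_singers = [
    ("Singer A", "https://example.com/singer/a"),
    ("Singer B", "https://example.com/singer/b"),
]
demo_actors = [("Actor X", "https://example.com/actor/x")]
demo_song = "Song 1"

# Write to an in-memory buffer instead of a file on disk.
demo_buffer = io.StringIO()
demo_writer = csv.writer(demo_buffer)
demo_writer.writerow(["singer_records", "actor_records", "song_name"])
demo_writer.writerow(
    merge_results(demo_singers) + merge_results(demo_actors) + [demo_song]
)

# Read everything back to check that each data row has exactly three cells.
demo_rows = list(csv.reader(io.StringIO(demo_buffer.getvalue())))
for cell in demo_rows[1]:
    print(cell)
```

Because each merged string is a single list element, csv.writer quotes it as one field, so all singers stay in the first column and all actors in the second regardless of how many pairs a row has.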