首页 > 解决方案 > 试图将网页中的 JS 数据输出到 .html 输出文件中

问题描述

我已经成功地将 JS 中列出的网站抓取到本地 .html 文件中,但输出不足。

问题是:

非常感谢

import requests
import json
from bs4 import BeautifulSoup

JSONDATA = requests.request("GET", "https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=1000000&page=1")
JSONDATA = JSONDATA.json()

for line in JSONDATA['posts']:
    soup = BeautifulSoup(line['episodeNumber'],'lxml')
    soup = BeautifulSoup(line['title'],'lxml')
    soup = BeautifulSoup(line['image']['large'],'lxml')
    soup = BeautifulSoup(line['excerpt']['long'],'lxml')
    soup = BeautifulSoup(line['audioSource'],'lxml')
with open("output1.html", "w") as file:
    file.write(str(soup))

标签: pythonhtmljsonpython-3.xbeautifulsoup

解决方案


使用pandas库,将数据保存到CSV当前项目目录的文件中

import requests
import pandas as pd

resp = requests.get("https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=1000000&page=1").json()
df = pd.DataFrame(resp['posts'], columns=['episodeNumber', 'title', 'image','excerpt','audioSource'])
#it will save data into post csv file and stored in current project directory
df.to_csv("posts.csv")

推荐阅读