Writing python web scraped data to .csv file on a Mac

Problem description

I am brand new to Python and taking a class. I think I am close to completing the requirements but am stuck on getting my data into a csv file. The file is always empty. I have tried multiple things with the write portion of the code and still can't figure it out. Any guidance would be appreciated.

import requests
from bs4 import BeautifulSoup
import csv
import os.path

url = "https://www.census.gov/programs-surveys/popest.html"
response = requests.get(url)
# parse html
page = str(BeautifulSoup(response.content))

def getURL(page):
    start_link = page.find("a href")
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    url = page[start_quote + 1: end_quote]
    return url, end_quote

while True:
    url, n = getURL(page)
    page = page[n:]
    if url:
        print url
    else:
        break

userhome = os.path.expanduser('~')
myfile = os.path.join(userhome, 'Desktop', 'data.csv')

f=open(myfile,"w")
f.write(getURL)
f.close()

Tags: python

Solution


Are you using Python 2 or 3? I noticed that you are calling print without parentheses.

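Your main problem is that you are passing the function itself (getURL) to f.write; you need to pass the actual value you are trying to save. In your case, I assume the "url" variable you are printing is the one you want to save.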
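To illustrate the difference, here is a minimal sketch, assuming Python 3. The in-memory buffer stands in for the open file, and get_url is a hypothetical stand-in for the getURL function in the question, so only the function-versus-value distinction carries over.

```python
import io


def get_url(page):
    """Hypothetical stand-in for getURL: just echoes its argument."""
    return page


buf = io.StringIO()  # in-memory text buffer used in place of the real file
error = None

try:
    buf.write(get_url)  # passes the function object, not a string
except TypeError as exc:
    error = exc  # write() only accepts text, so nothing was written

# Passing the value the function returns works as expected.
buf.write(get_url("https://example.com") + "\n")
```

If the failing write is the only write in the script, nothing has been written by the time it fails, which would be consistent with the empty file described in the question.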

I'm not sure this is the format you want, but with the following changes I got each URL on its own line in my data.csv file:

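  1. Move this code to the lines before the while loop:

userhome = os.path.expanduser('~')
myfile = os.path.join(userhome, 'Desktop', 'data.csv')
f=open(myfile,"w")

  2. In your while loop, add this before or after your print statement (that is, inside the "if url:" branch, so it only runs when a URL was found):

    f.write(url+"\n")

  3. Leave f.close() at the end of the script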
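Put together, the three steps above might look like the following sketch. It assumes Python 3 and uses the csv module the question already imports. The names get_url, save_urls, sample_page and demo_path are illustrative rather than from the original post, and a hard-coded HTML string replaces the downloaded page so the example runs offline.

```python
import csv
import os.path
import tempfile


def get_url(page):
    """Return the next quoted href value in page and the index where it ends."""
    start_link = page.find("a href")
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    return page[start_quote + 1:end_quote], end_quote


def save_urls(page, path):
    """Write each URL found in page to path, one per row, and return them."""
    urls = []
    # Open the file before the loop; the with block closes it afterwards.
    # newline="" is the setting the csv docs recommend for csv.writer files.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        while True:
            url, n = get_url(page)
            if not url:
                break
            writer.writerow([url])  # write the value, not the function
            urls.append(url)
            page = page[n:]
    return urls


# Stand-in for str(BeautifulSoup(response.content)) from the question.
sample_page = '<a href="https://example.com/a">A</a> <a href="/b">B</a>'

# A temporary directory is used here so the sketch runs anywhere.
demo_path = os.path.join(tempfile.mkdtemp(), "data.csv")
found = save_urls(sample_page, demo_path)
```

To write to the Desktop as in the question, pass os.path.join(os.path.expanduser('~'), 'Desktop', 'data.csv') as the path instead of demo_path.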

