Exception has occurred: TypeError in Python

Problem description

I'm very new to coding, so apologies if this is a silly question. Every time I try to run this code for a Python scraper, I get an error message. Any help would be great.

Exception has occurred: TypeError
'module' object is not callable
  File "C:\Users\quawee\OneDrive\seaporn.org-scraper\seaporn.org-scraper.py", line 33, in <module>
    articles = requests(x)

From this code:

import requests
from bs4 import BeautifulSoup
import pandas as pd
import time

articlelist = []

def request(x):
    url = f'https://www.seaporn.org/category/hevc/page/{x}/'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.content, features='lxml')
    return soup.find_all('article', class_ = 'post-summary')

def parse(articles):
    for item in articles:
        link = item.find({'a': 'entry-link'})
        article = {
            'link': link['href']
        }

        articlelist.append(article)

def output():
    df = pd.DataFrame(articlelist)
    df.to_excel('articlelist.xlsx', index=False)
    print('Saved to xlsx.')

x = 5000

while True:
    print(f'Page {x}')
    articles = requests(x)
    x = x + 1
    time.sleep(3)
    if len(articles) != 0:
        parse(articles)
    else:
        break

print('Completed, total articles is', len(articlelist))
output()

Tags: python, python-3.x

Solution

The function you defined is named request(x), but inside the while loop you call requests(x). That name refers to the imported requests module, and a module object is not callable, which is exactly the TypeError you see. Your code should work with just the spelling corrected:
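As a minimal illustration of why the original call fails (using the standard-library time module here, since calling any module object behaves the same way):

```python
import time  # stands in for requests; any module object works identically

try:
    time(5000)  # calling a module object, not a function
except TypeError as e:
    print(e)  # 'module' object is not callable
```

This is the same situation as requests(x): the name resolves to the imported module, not to a callable function.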

import requests
from bs4 import BeautifulSoup
import pandas as pd
import time

articlelist = []

def request(x):
    url = f'https://www.seaporn.org/category/hevc/page/{x}/'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.content, features='lxml')
    return soup.find_all('article', class_ = 'post-summary')

def parse(articles):
    for item in articles:
        link = item.find({'a': 'entry-link'})
        article = {
            'link': link['href']
        }

        articlelist.append(article)

def output():
    df = pd.DataFrame(articlelist)
    df.to_excel('articlelist.xlsx', index=False)
    print('Saved to xlsx.')

x = 5000

while True:
    print(f'Page {x}')
    articles = request(x)
    x = x + 1
    time.sleep(3)
    if len(articles) != 0:
        parse(articles)
    else:
        break

print('Completed, total articles is', len(articlelist))
output()
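One further point worth checking: in parse(), the call item.find({'a': 'entry-link'}) passes a dict as the tag name, which is probably not what was intended and can make link['href'] fail. The conventional BeautifulSoup idiom for selecting a link by class is, as a sketch (the HTML snippet below is made up for illustration):

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking one article entry on the scraped page
html = '<article class="post-summary"><a class="entry-link" href="https://example.com/post">Post</a></article>'
soup = BeautifulSoup(html, 'html.parser')

item = soup.find('article', class_='post-summary')
link = item.find('a', class_='entry-link')  # tag name first, class as a keyword
print(link['href'])  # https://example.com/post
```

It is also worth guarding against link being None before indexing ['href'], since a page with an unexpected layout would otherwise raise another TypeError.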
