Confused by a for loop inside a for loop

Problem description

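I have defined used = [] as a global variable in my program. I also have a function, jimoti, that is called from a while loop. Inside the function I iterate over the results of web scraping (bs4) and add the title of each scraped item to the used list. When a title already exists in used, I try not to print it again, but it is still printed again and again: the regex matches the title on two or three keywords, so the same text is printed two or three times. How can I change the code so that it is printed only once?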

Here is the code:


from bs4 import BeautifulSoup
import requests
from time import sleep
from random import randint
import re
import os

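# Commas added after "win" and "Linux": without them Python silently
# concatenates adjacent string literals into "winWindows" and "Linuxlinux".
allowed = ["pc", "FUJITSU", "LIFEBOOK", "win", "Windows",
           "PC", "Linux", "linux", "HP", "hp", "notebook", "desktop",
           "raspberry", "NEC", "mac", "Mac", "Core"]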
denied = ["philips"]
used = set()

source = requests.get("https://jmty.jp/aichi/sale-pcp").text
soup = BeautifulSoup(source, 'lxml')


def jimoti(sk):
    global used
    for h2 in soup.find_all('div', class_='p-item-content-info'):
        title = h2.select_one('.p-item-title').text
        address = h2.select_one('.p-item-title').a["href"]
        price = (h2.select_one('.p-item-most-important').text).replace("円", "").replace("\n", "").replace(",", "")
        price = int(price)
        town = h2.select_one('.p-item-supplementary-info').text
        if price < 5000000 and title not in used:
            used.add(title)
            for pattern in allowed:
                print(pattern)
                if re.search(pattern, title):
                    second(sk, title, address, price, town)
                    break

def second(sk, title, address, price, town):
    sk = sk
    title = title
    address = address
    price = price
    town = town
    for prh in denied:
        print(prh)
        if re.search(prh, title):
            break
        else:
            send(sk, title, address, price, town)
                            


if __name__ == '__main__':
    while True:
        jimoti(sk)
        sleep(randint(11,20))

Tags: python, python-3.x, loops

Solution


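The original question was about a loop printing an element more than once when the condition is met. We avoid this by using break to leave the loop the first time the condition is hit.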

seen = set()
for i in range(10):
    if i not in seen:
        for x in range(10):
            seen.add(i)
            break

The inner loop has logic in the middle of it:

for prh in denied:
    print(prh)
    if re.search(prh, title):
        break
    else:
        send(sk, title, address, price, town)

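As written, this looks for each prh in title until one is found and then breaks, so send is called once for every value of prh that is not in title, up to the first match. That is probably not what you meant. How about calling send only if none of the prh values is in title?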

if all([prh not in title for prh in denied]):
    send(sk, title, address, price, town)
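
To make the difference concrete, here is a small sketch that counts how often send would be called under each version. The send stub, the calls list, and the sample denied and title values are made up for illustration and are not taken from the question.

```python
import re

denied = ["philips", "broken", "junk"]  # sample values, assumed for illustration
title = "HP notebook, slightly junk"    # sample title, assumed for illustration

calls = []  # records every call to the stand-in send()


def send(title):
    # Hypothetical stand-in for the asker's send(); it only records the call.
    calls.append(title)


# Loop as written in the question: send() runs once for every
# non-matching pattern that is checked before the first match.
for prh in denied:
    if re.search(prh, title):
        break
    else:
        send(title)
loop_calls = len(calls)

# all() version: at most one call, and none if any denied word is present.
calls.clear()
if all(prh not in title for prh in denied):
    send(title)
all_calls = len(calls)

print(loop_calls, all_calls)
```

With these sample values the loop version calls send twice before it reaches the denied word, while the all() version does not call it at all.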

Going up one level, your logic for allowed is basically the same. My guess is that the correct logic is something like this:

used.add(title)
if (
    any([word in title for word in allowed])
    and all([word not in title for word in denied])
):
    send(sk, title, address, price, town)
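
As a rough, self-contained sketch of that idea, the check can be wrapped in a helper that also handles the "only once" requirement. The helper name should_send, the shortened allowed list, and the sample titles are assumptions for illustration. It uses plain substring tests rather than re.search, which behaves the same for simple keywords like these.

```python
allowed = ["PC", "Linux", "notebook"]  # shortened sample list, assumed
denied = ["philips"]
used = set()  # titles that have already been looked at


def should_send(title):
    """Return True only the first time an acceptable title is seen."""
    if title in used:
        return False
    used.add(title)
    return (any(word in title for word in allowed)
            and all(word not in title for word in denied))


# Sample titles, made up for illustration.
titles = [
    "Linux notebook PC",   # matches three allowed words, should pass once
    "Linux notebook PC",   # duplicate, should be skipped
    "philips PC monitor",  # contains a denied word
    "old chair",           # matches no allowed word
]
sent = [t for t in titles if should_send(t)]
print(sent)
```

Because any() yields a single True or False no matter how many keywords match, a title that matches several allowed words is still sent only once.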

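I'm also not sure why you have those sleep calls scattered in there; they seem disruptive. I'm not sure Stack Overflow is the best forum for the level of support you may need. For Python beginners, Reddit might be more helpful.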

