Appending specific text to output data to build a URL

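Problem description

I need to append specific text to every output line and ultimately build a URL from each one. More explanation follows the code.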

from bs4 import BeautifulSoup
import requests

source = requests.get('https://dota2.gamepedia.com/Category:Counters').text

soup = BeautifulSoup(source, 'lxml')
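# The category listing sits in the <div class="mw-category"> element.
category = soup.find('div', class_="mw-category")

heroes_names = []

# The with-block closes file.txt even if an error occurs.
with open('file.txt', 'w') as savefile:
    # Use separate names for the container and its children
    # (the original loop was "for link in link").
    for group in category:
        heroes = group.text.split("\n")
        # Start at index 1: the first line of each block is not a hero entry.
        for entry in heroes[1:]:
            # Keep only the part before the first "/".
            heroname = entry.split("/")[0]
            print(heroname)
            heroes_names.append(heroname)
            savefile.write(heroname + "\n")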

Desired output:

Final requirement:

Tags: python

Solution


So you already have all of your hero names in heroes_names, right? Then you can build the URLs like this:

url_list = []
for hero_name in heroes_names:
    print(hero_name + "/Counters")  # prints HERO/Counters
    url = "https://dota2.gamepedia.com/%s/Counters" % hero_name
    url_list.append(url)

url_list then contains the URL for every hero in your heroes_names list.
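
The same idea can be wrapped in a small helper. This is only a sketch under a few assumptions that are not in the original answer: the function name build_counter_urls is made up for illustration, blank names (which splitting on newlines can leave behind) are skipped, and spaces in hero names are replaced with underscores on the assumption that the wiki accepts that form in page titles. The base URL is the one from the question.

```python
from urllib.parse import quote

BASE_URL = "https://dota2.gamepedia.com"  # base URL taken from the question


def build_counter_urls(hero_names):
    """Return one '<BASE_URL>/<hero>/Counters' URL per non-empty hero name.

    build_counter_urls is a hypothetical helper, not part of the original answer.
    """
    urls = []
    for name in hero_names:
        name = name.strip()
        if not name:
            continue  # skip blank entries left over from splitting on newlines
        # Assumption: the wiki accepts underscores in place of spaces.
        # quote() percent-encodes characters that are unsafe in a URL path.
        slug = quote(name.replace(" ", "_"))
        urls.append(f"{BASE_URL}/{slug}/Counters")
    return urls


if __name__ == "__main__":
    for url in build_counter_urls(["Abaddon", "Anti-Mage", "", "Arc Warden"]):
        print(url)
```

Calling build_counter_urls(heroes_names) after the scraping loop would give the same kind of list as url_list above, minus any empty entries.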

