python - Append key/values returned by a function as new columns to a DataFrame
Question
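I have a DataFrame containing a list of URLs from which I want to extract a few values. The returned keys/values should then be added to the original DataFrame, with each key becoming a new column that holds the corresponding values.
I thought this would happen magically with result_type='expand', but apparently it doesn't. When I try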
df5["data"] = df5.apply(lambda x: request_function(x['url']),axis=1, result_type='expand')
I end up with the results all in a single data column:
[{'title': ['Python Notebooks: Connect to Google Search Console API and Extract Data - Adapt'], 'description': []}]
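The output above is a list wrapping a single dict, and that wrapper is likely what keeps the keys from becoming columns. The sketch below reproduces the difference offline. It is a minimal illustration, not the original code: fake_lookup_list and fake_lookup_dict are hypothetical stand-ins for request_function, and the example.com URLs are placeholders.

```python
import pandas as pd

# Placeholder URLs; no network access is needed for this sketch.
df = pd.DataFrame({"url": ["https://example.com/a", "https://example.com/b"]})

def fake_lookup_list(url):
    # Hypothetical stand-in: returns a list wrapping one dict,
    # the same shape the question's function returns.
    return [{"title": "T " + url, "description": "D " + url}]

def fake_lookup_dict(url):
    # Hypothetical stand-in: same data, returned as a plain dict.
    return {"title": "T " + url, "description": "D " + url}

as_list = df.apply(lambda row: fake_lookup_list(row["url"]),
                   axis=1, result_type="expand")
as_dict = df.apply(lambda row: fake_lookup_dict(row["url"]),
                   axis=1, result_type="expand")

print(as_list.shape)          # a single column, each cell holding a dict
print(list(as_dict.columns))  # one column per dict key
```

With a list, 'expand' creates one column per list element, so a one-element list yields one column. With a dict, the keys become the column names.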
My goal is a DataFrame with the following 3 columns:
| URL | Title | Description |
Here is my code:
import requests
from requests_html import HTMLSession
import pandas as pd
from urllib import parse

ex_dic = {'url': ['https://www.searchenginejournal.com/reorganizing-xml-sitemaps-python/295539/', 'https://searchengineland.com/check-urls-indexed-google-using-python-259773', 'https://adaptpartners.com/technical-seo/python-notebooks-connect-to-google-search-console-api-and-extract-data/']}
df5 = pd.DataFrame(ex_dic)
df5

# Not in the original snippet, but request_function needs a session to exist.
session = HTMLSession()

def request_function(url):
    try:
        found_results = []
        r = session.get(url)
        title = r.html.xpath('//title/text()')
        description = r.html.xpath("//meta[@name='description']/@content")
        # The dict is wrapped in a list here, so the return value is a one-element list.
        found_results.append({'title': title, 'description': description})
        return found_results
    except requests.RequestException:
        print("Connectivity error")
    except KeyError:
        print("another error")

df5.apply(lambda x: request_function(x['url']), axis=1, result_type='expand')
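Solution
ex_dic['url'] should hold a list of dictionaries, so that each row's dictionary can be updated with the extracted values and returned. With result_type='expand', the keys of the returned dictionary then become columns.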
import requests
from requests_html import HTMLSession
import pandas as pd
from urllib import parse

ex_dic = {'url': ['https://www.searchenginejournal.com/reorganizing-xml-sitemaps-python/295539/', 'https://searchengineland.com/check-urls-indexed-google-using-python-259773', 'https://adaptpartners.com/technical-seo/python-notebooks-connect-to-google-search-console-api-and-extract-data/']}
# Turn each URL string into a dict so the function can add keys to it.
ex_dic['url'] = [{'url': item} for item in ex_dic['url']]
df5 = pd.DataFrame(ex_dic)

session = HTMLSession()

def request_function(url):
    # Here `url` is the dict stored in the row, e.g. {'url': 'https://...'}.
    try:
        print(url)
        r = session.get(url['url'])
        title = r.html.xpath('//title/text()')
        description = r.html.xpath("//meta[@name='description']/@content")
        url.update({'title': title, 'description': description})
        return url  # a plain dict, so its keys become columns
    except requests.RequestException:
        print("Connectivity error")
    except KeyError:
        print("another error")

df6 = df5.apply(lambda x: request_function(x['url']), axis=1, result_type='expand')
print(df6)
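An alternative that leaves the url column as plain strings is to build the extracted values into their own DataFrame and join it back onto the original. The sketch below is one possible variation under stated assumptions, not the answer's method: fake_request is a hypothetical stand-in for request_function that needs no network access, and it returns a dict with None values on failure so that every row has the same keys.

```python
import pandas as pd

# Placeholder URLs; the url column keeps plain strings.
df = pd.DataFrame({"url": ["https://example.com/a", "https://example.com/b"]})

def fake_request(url):
    # Hypothetical stand-in for a real page fetch.
    # Returning the same keys on failure keeps every row the same shape.
    if url.endswith("/b"):  # pretend this request failed
        return {"title": None, "description": None}
    return {"title": "Page A", "description": "About A"}

# One dict per row -> one DataFrame row per dict, aligned on the original index.
extracted = pd.DataFrame(df["url"].map(fake_request).tolist(), index=df.index)

# Attach the new columns next to the original url column.
result = df.join(extracted)
print(result)
```

Because the join is index-based, the original frame is not modified and rows whose lookup failed simply carry missing values in the new columns.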