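python - Pythonic way to identify names in a URL and match them against an existing set of names
Problem description
Hi, this is the problem I am trying to solve, but I am stuck.
Given a list of URLs, I want to do the following:
- Extract the name from each URL
- Match the names found in the URLs against an existing dictionary of names
- Build one dictionary of all the names found, then split the found names into two separate dictionaries: one for the names that exist in the dictionary and one for the names that do not
Example: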
INPUT:
urls = ['www.twitter.com/users/aoba-joshi/$#fsd=43r',
        'www.twitter.com/users/chrisbrown-e2/#4f=34ds',
        'www.facebook.com/celebrity/neil-degrasse-tyson',
        'www.instagram.com/actor-nelson-bigetti']
# the key is the ID associated with the names, and the values are all the potential names
existing_names = {1: ['chris brown', 'chrisbrown', 'Brown Chris', 'brownchris'],
                  2: ['nelson bigetti', 'bigetti nelson', 'nelsonbigetti', 'bigettinelson'],
                  3: ['neil degrasse tyson', 'tyson neil degreasse', 'tysonneildegrasse', 'neildegrassetyson']}
OUTPUT:
# names_found is a dictionary with the URL as the key and the found name as the value
names_found = {'www.twitter.com/users/aoba-joshi/$#fsd=43r': 'aoba joshi',
               'www.twitter.com/users/chrisbrown-e2/#4f=34ds': 'chris brown',
               'www.facebook.com/celebrity/neil-degrasse-tyson': 'neil degrasse tyson',
               'www.instagram.com/actor-nelson-bigetti': 'nelson bigetti'}
# existing_names_found is a dictionary where the keys are the found names and the values are the corresponding lists of names in the existing names dictionary
existing_names_found = {'chris brown': ['chris brown', 'chrisbrown', 'Brown Chris', 'brownchris'],
                        'neil degrasse tyson': ['neil degrasse tyson', 'tyson neil degreasse', 'tysonneildegrasse', 'neildegrassetyson'],
                        'nelson bigetti': ['nelson bigetti', 'bigetti nelson', 'nelsonbigetti', 'bigettinelson']}
# new_names_found is a dictionary with the new names found as the keys and the URL associated with each new name as the value
new_names_found = {'aoba joshi': 'www.twitter.com/users/aoba-joshi/$#fsd=43r'}
Solution
Well... if I have understood correctly what you want to do, something like this should work:
names_found = {}
for link in urls:
    link_split = link.split('/')
    # the name is usually the third segment; fall back to the last one for short URLs
    slug = link_split[2] if len(link_split) > 2 else link_split[-1]
    name = "".join(slug.split('-')).lower()  # makes from chris-brown-xx => chrisbrownxx
    for key, value in existing_names.items():  # check if the name is in the list
        for name_x in value:
            name_x = name_x.replace(" ", "").lower()  # same as for name, but removing " "
            if name_x in name:
                names_found[link] = value[0]  # record the first spelling as the found name
(Sorry, I am typing this on my phone, but I hope it helps :))
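To show how the loop above could be extended to produce all three dictionaries from the question, here is a minimal sketch. The helper names `normalise`, `slug_of` and `split_names` are made up for this illustration, and two assumptions are baked in: the name sits in the last path segment that contains letters, and the first entry of each list in `existing_names` is the spelling to report.

```python
import re

urls = ['www.twitter.com/users/aoba-joshi/$#fsd=43r',
        'www.twitter.com/users/chrisbrown-e2/#4f=34ds',
        'www.facebook.com/celebrity/neil-degrasse-tyson',
        'www.instagram.com/actor-nelson-bigetti']

existing_names = {
    1: ['chris brown', 'chrisbrown', 'Brown Chris', 'brownchris'],
    2: ['nelson bigetti', 'bigetti nelson', 'nelsonbigetti', 'bigettinelson'],
    3: ['neil degrasse tyson', 'tyson neil degreasse', 'tysonneildegrasse', 'neildegrassetyson'],
}


def normalise(text):
    """Lower-case the text and drop everything that is not a letter."""
    return re.sub(r'[^a-z]', '', text.lower())


def slug_of(url):
    """Return the last path segment containing letters, ignoring any '#fragment'."""
    path = url.split('#', 1)[0]
    segments = [s for s in path.split('/')[1:] if re.search(r'[A-Za-z]', s)]
    return segments[-1] if segments else ''


def split_names(urls, existing_names):
    names_found, existing_names_found, new_names_found = {}, {}, {}
    for url in urls:
        slug = slug_of(url)
        compact = normalise(slug)
        for aliases in existing_names.values():
            # Substring test on the normalised forms; empty aliases are skipped.
            if any(n and n in compact for n in map(normalise, aliases)):
                canonical = aliases[0]  # assumption: first alias is the display name
                names_found[url] = canonical
                existing_names_found[canonical] = aliases
                break
        else:
            # No known alias matched, so treat the slug itself as a new name.
            new_name = slug.replace('-', ' ')
            names_found[url] = new_name
            new_names_found[new_name] = url
    return names_found, existing_names_found, new_names_found


names_found, existing_names_found, new_names_found = split_names(urls, existing_names)
```

Because this is a plain substring test, a very short alias could match inside an unrelated slug, so it is only as reliable as the alias lists are specific.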
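(Alternatively, you could check whether the text contains both parts of the name... but something like that would fail, e.g. for "Luke Luk" when checked against "Luke O'Niel")... there are plenty of problematic cases.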
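One way to read the "Luke Luk" remark is that a per-part substring test accepts "Luk" because it occurs inside "Luke". The sketch below reproduces that false positive and contrasts it with a stricter whole-word comparison. Both function names are hypothetical, and the stricter variant is just one possible alternative, not something the answer prescribes.

```python
import re
from collections import Counter


def contains_all_parts(name, text):
    """Naive check: every word of `name` occurs somewhere in `text` as a substring."""
    text = text.lower()
    return all(part in text for part in name.lower().split())


def tokens(text):
    """Count the whole words (letters and apostrophes) in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def matches_whole_words(name, text):
    """Stricter check: each word of `name` must appear as a whole word of `text`."""
    need, have = tokens(name), tokens(text)
    return all(have[word] >= count for word, count in need.items())


naive = contains_all_parts("Luke Luk", "Luke O'Niel")    # True: "luk" is inside "luke"
strict = matches_whole_words("Luke Luk", "Luke O'Niel")  # False: "luk" is not a whole word
```

The whole-word check avoids this particular false positive, but it would no longer match a run-together slug such as `chrisbrown-e2` against 'chris brown', so the two approaches trade off in opposite directions.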