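python - Movie File: parsing a file into a dictionary
Problem description
1.6. Recommend a movie. Create a function that counts how many keywords a set of movie reviews have in common and recommends the movie whose keywords are most similar. The solution to this task requires a dictionary. The movie reviews and keywords are stored in a file named film_reviews.txt, separated by commas. The first field is the movie name, and the remaining fields are the movie's keyword tags (e.g. "amazing", "poetic", "scary").
Function name: similar_movie()
Parameters/arguments: the movie name
Returns: a list of movies similar to the movie passed as the argument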
film_reviews.txt -
7 Days in Entebbe,fun,foreign,sad,boring,slow,romance
12 Strong,war,violence,foreign,sad,action,romance,bloody
A Fantastic Woman,fun,foreign,sad,romance
A Wrinkle in Time,book,witty,historical,boring,slow,romance
Acts of Violence,war,violence,historical,action
Annihilation,fun,war,violence,gore,action
Armed,foreign,sad,war,violence,cgi,fancy,action,bloody
Black '47,fun,clever,witty,boring,slow,action,bloody
Black Panther,war,violence,comicbook,expensive,action,bloody
Solution
I think this will work for you:
film_data = {'films': {}}

# Parse each line into an entry: lowercased title -> list of keywords.
with open('film_reviews.txt', 'r') as f:
    for line in f:
        data = line.strip().split(',')  # strip() removes the newline character
        if data[0]:  # skip blank lines
            film_data['films'][data[0].lower()] = data[1:]


def get_similar_movie(name):
    films = film_data['films']
    if name.lower() not in films:
        return None
    original_review = set(films[name.lower()])
    # Track the best candidate directly; keying a dict by the score would
    # silently overwrite movies that happen to share the same score.
    max_similarity, movie2 = -1.0, None
    for key in films:
        if key == name.lower():
            continue
        similar_movie_review = set(films[key])
        overlap = original_review & similar_movie_review
        universe = original_review | similar_movie_review
        # % of overlap compared to the first movie = output1
        output1 = len(overlap) / len(original_review) * 100
        # % of overlap compared to the second movie = output2
        output2 = len(overlap) / len(similar_movie_review) * 100
        # % of overlap compared to universe
        output3 = len(overlap) / len(universe) * 100
        score = output1 + output2 + output3
        if score > max_similarity:  # on a tie, the first movie found is kept
            max_similarity = score
            movie2 = {'reviews': films[key], 'movie': key}
    if movie2 is None:  # no other movie to compare against
        return None
    print(name, ' reviews ', films[name.lower()])
    print('similar movie ', movie2)
    print('Similarity = {0:.2f}/100'.format(max_similarity / 3))
    return movie2['movie']
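To make the score concrete, here is a small sketch that computes the same three percentages for two rows from film_reviews.txt. The helper name similarity_score is hypothetical and only used for illustration; the third percentage is the Jaccard index of the two keyword sets, expressed as a percentage.

```python
def similarity_score(first, second):
    """Average of the three overlap percentages used in the loop above.

    Hypothetical helper for illustration; not part of the original answer.
    """
    a, b = set(first), set(second)
    overlap = a & b
    universe = a | b
    output1 = len(overlap) / len(a) * 100         # share of the first movie's keywords
    output2 = len(overlap) / len(b) * 100         # share of the second movie's keywords
    output3 = len(overlap) / len(universe) * 100  # share of all keywords of both movies
    return (output1 + output2 + output3) / 3


# Keyword lists copied from the sample file.
entebbe = ['fun', 'foreign', 'sad', 'boring', 'slow', 'romance']
fantastic_woman = ['fun', 'foreign', 'sad', 'romance']

score = similarity_score(entebbe, fantastic_woman)
print('{0:.2f}'.format(score))  # 77.78
```

The two movies share 4 keywords, so the percentages are 4/6, 4/4 and 4/6, which average to 77.78.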
The get_similar_movie function returns the most similar movie from the film_data dict. It takes the movie name as its argument.
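The task statement asks for a function named similar_movie() that returns a list, whereas the function above returns a single title. The following is a minimal sketch of one possible reading, assuming "most similar" means the largest number of shared keywords and that all movies tied for that count should be returned. The parse_films helper, the second films parameter, and the inline SAMPLE text (standing in for the real file) are assumptions for illustration.

```python
# A few rows copied from film_reviews.txt; the real file would be read instead.
SAMPLE = """7 Days in Entebbe,fun,foreign,sad,boring,slow,romance
12 Strong,war,violence,foreign,sad,action,romance,bloody
A Fantastic Woman,fun,foreign,sad,romance
A Wrinkle in Time,book,witty,historical,boring,slow,romance
Acts of Violence,war,violence,historical,action
"""


def parse_films(text):
    """Build {title: set of keywords} from comma-separated lines."""
    films = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(',')]
        if parts[0]:  # skip blank lines
            films[parts[0]] = set(parts[1:])
    return films


def similar_movie(name, films):
    """Return the titles sharing the most keywords with `name` (ties included)."""
    lookup = {title.lower(): title for title in films}  # case-insensitive lookup
    title = lookup.get(name.lower())
    if title is None:
        return []
    counts = {other: len(films[title] & keywords)
              for other, keywords in films.items() if other != title}
    best = max(counts.values(), default=0)
    if best == 0:  # nothing shares a keyword with this movie
        return []
    return sorted(t for t, c in counts.items() if c == best)


films = parse_films(SAMPLE)
print(similar_movie('7 Days in Entebbe', films))  # ['A Fantastic Woman']
```

Counting shared keywords directly is simpler than the percentage score, but it favours movies with long keyword lists; which measure is appropriate depends on how the task defines similarity.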