Movie File: parsing a file into a dictionary

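Problem description

1.6. Recommend a movie. Create a function that counts how many keywords a set of movie reviews have in common and recommends the movie whose keywords are most similar. The solution to this task requires a dictionary. The movie reviews and keywords are in a file named film_reviews.txt, separated by commas. The first field is the movie name, and the remaining fields are the movie's keyword tags (e.g. "amazing", "poetic", "scary").

Function name: similar_movie()

Parameter/argument: the movie name

Returns: a list of movies similar to the movie passed as the argument

film_reviews.txt: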

7 Days in Entebbe,fun,foreign,sad,boring,slow,romance
12 Strong,war,violence,foreign,sad,action,romance,bloody
A Fantastic Woman,fun,foreign,sad,romance
A Wrinkle in Time,book,witty,historical,boring,slow,romance
Acts of Violence,war,violence,historical,action
Annihilation,fun,war,violence,gore,action
Armed,foreign,sad,war,violence,cgi,fancy,action,bloody
Black '47,fun,clever,witty,boring,slow,action,bloody
Black Panther,war,violence,comicbook,expensive,action,bloody

Tags: python

Solution

I think this will work for you:

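film_data = {'films': {}}
with open('film_reviews.txt', 'r') as f:
    for line in f:
        # strip whitespace (including the trailing newline) from every field
        data = [field.strip() for field in line.split(',')]
        if not data[0]:
            continue  # skip blank lines
        # key: lowercased movie name; value: list of its keyword tags
        film_data['films'][data[0].lower()] = data[1:]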

def get_similar_movie(name):
    """Return the lowercased name of the film most similar to `name`, or None."""
    films = film_data['films']
    target = name.lower()
    if target not in films:
        return None
    original_review = set(films[target])
    best_score, best_movie = -1.0, None
    for key, tags in films.items():
        if key == target:
            continue
        similar_movie_review = set(tags)
        if not original_review or not similar_movie_review:
            continue  # no tags to compare; avoids dividing by zero
        overlap = original_review & similar_movie_review
        universe = original_review | similar_movie_review
        # % of overlap compared to the first movie = output1
        output1 = len(overlap) / len(original_review) * 100
        # % of overlap compared to the second movie = output2
        output2 = len(overlap) / len(similar_movie_review) * 100
        # % of overlap compared to universe
        output3 = len(overlap) / len(universe) * 100
        score = (output1 + output2 + output3) / 3
        # track the best film directly; a dict keyed by score would let
        # films with equal scores overwrite each other
        if score > best_score:
            best_score, best_movie = score, key
    if best_movie is None:
        return None
    print(name, 'reviews', films[target])
    print('similar movie', {'reviews': films[best_movie], 'movie': best_movie})
    print('Similarity = {0:.2f}/100'.format(best_score))
    return best_movie

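The get_similar_movie function looks up the most similar movie in film_data and returns its name (lowercased, since the dictionary keys are lowercased), or None if the given movie is not in the file. It takes the movie name as its argument.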
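The task statement asks for a list of similar movies rather than a single name. The following is a minimal sketch of that variant, not the answer's own method. It assumes that "most similar" means sharing the largest number of keyword tags, and it returns every movie tied for that count. The helper parse_reviews, the extra films parameter, and the inline SAMPLE string (copied from the film_reviews.txt listing above so the example runs without the file) are all illustrative assumptions.

```python
# Sample data copied from the film_reviews.txt listing; a real run would
# pass the open file object to parse_reviews instead.
SAMPLE = """\
7 Days in Entebbe,fun,foreign,sad,boring,slow,romance
12 Strong,war,violence,foreign,sad,action,romance,bloody
A Fantastic Woman,fun,foreign,sad,romance
A Wrinkle in Time,book,witty,historical,boring,slow,romance
Acts of Violence,war,violence,historical,action
Annihilation,fun,war,violence,gore,action
Armed,foreign,sad,war,violence,cgi,fancy,action,bloody
Black '47,fun,clever,witty,boring,slow,action,bloody
Black Panther,war,violence,comicbook,expensive,action,bloody
"""


def parse_reviews(lines):
    """Map each movie name to the set of its keyword tags (hypothetical helper)."""
    films = {}
    for line in lines:
        name, *tags = line.strip().split(',')
        if name:  # skip blank lines
            films[name] = set(tags)
    return films


def similar_movie(name, films):
    """Return every movie that shares the largest number of tags with `name`.

    Assumption: similarity is the count of shared tags. An unknown movie,
    or one with no shared tags at all, gives an empty list.
    """
    if name not in films:
        return []
    shared = {
        other: len(films[name] & tags)
        for other, tags in films.items()
        if other != name
    }
    best = max(shared.values(), default=0)
    if best == 0:
        return []
    return [other for other, count in shared.items() if count == best]


films = parse_reviews(SAMPLE.splitlines())
print(similar_movie("Black Panther", films))      # ['12 Strong', 'Armed']
print(similar_movie("7 Days in Entebbe", films))  # ['A Fantastic Woman']
```

Unlike get_similar_movie, this sketch keeps the original capitalization of the names and matches them case-sensitively; lowercasing the keys as the answer does would be an easy change if case-insensitive lookup is wanted.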

