Return the keys in a dictionary for each value in a list in Python

Problem description

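I have a JSON file that has been loaded into a dictionary. It has 8 different keys for storing information. I am trying to build a search engine that returns the recipes containing all the words in a search string. I turn the string into a list of "tokens" that are used for the search.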

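Here is an example of some of the information stored in the dictionary. A recipe should be returned as long as the tokens appear in its title, categories, ingredients, or directions.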

{
  "title": "101 \"Whaler\" Fish Sandwich ",
  "categories": [
   "Sandwich",
   "Cheese",
   "Dairy",
   "Fish",
   "Tomato",
   "Saut\u00e9",
   "Kid-Friendly",
   "Mayonnaise",
   "Cornmeal",
   "Lettuce",
   "Cookie"
  ],
  "ingredients": [
   "1 cup whole milk",
   "2 eggs",
   "1 1/2 cups flour",
   "1/4 cup yellow cornmeal",
   "2 tablespoons chopped parsley",
   "4 flounder fillets",
   "Salt and pepper to taste",
   "3 tablespoons canola oil",
   "4 sesame-seed hamburger buns",
   "4 leaves romaine lettuce",
   "1/2 tomato, sliced",
   "4 slices mild cheese, such as mild cheddar (optional)",
   "1/2 cup mayonnaise",
   "2 tablespoons pickle relish",
   "1 tablespoon lemon juice",
   "1 dash Tabasco sauce"
  ],
  "directions": [
   "1. In a medium-size bowl, whisk together the milk and eggs. In another medium-size bowl, mix together the flour, cornmeal, and parsley.",
   "2. Season the fish with the salt and pepper.",
   "3. Dredge the fish through the egg mixture, then coat it thoroughly with the flour mixture.",
   "4. In a large saut\u00e9 pan, immediately heat the oil over medium-high heat. When it is hot but not smoking, add the fillets to the pan. Cook on one side until the batter is light golden brown, about 4 minutes. Carefully turn the fillets and cook for 2 to 3 minutes more. Using a slotted spatula, remove them from the pan and drain on paper towels.",
   "5. Meanwhile, whisk together the tartar-sauce ingredients (if using).",
   "6. Slice the buns and spread the tartar sauce (if using) on the insides. Place a fillet on each bottom bun, then top with the lettuce, tomato, and cheese, if desired."
  ],
  "rating": 4.375,
  "calories": 819.0,
  "protein": 35.0,
  "fat": 42.0
 },

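How can I return the keys in the dictionary that contain all the tokens in the string?

For example, given the search string "pancake syrup", how would I return the recipes in the dictionary that contain both "pancake" and "syrup"?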

This code creates a dictionary from the JSON file I am reading into Python:

import json

with open('recipes.json') as file:
    recipes = json.load(file)

This code cleans the input string:

import string

def tokenisation(input_string):
    # translation tables that replace digits and punctuation with whitespace
    d_translate = str.maketrans(string.digits, ' ' * len(string.digits))
    p_translate = str.maketrans(string.punctuation, ' ' * len(string.punctuation))

    # clean the string
    new_string = input_string.translate(d_translate)
    new_string = new_string.translate(p_translate)
    new_string = new_string.lower()

    # split on any run of whitespace
    splitted_string = new_string.split()

    # keep only tokens longer than 3 characters
    tokens = []
    for token in splitted_string:
        if len(token) > 3:
            tokens.append(token)

    return tokens
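
To make the effect of this cleaning step concrete, here is a compact, self-contained sketch of the same idea. The name `tokenise` and the `min_length` parameter are illustrative assumptions, not part of the original code. Note that the length filter drops short words such as "egg", so those can never be matched by a search.

```python
import string

def tokenise(text, min_length=4):
    """Lowercase text, blank out digits and punctuation, keep words of at least min_length characters."""
    # Map every digit and punctuation character to a single space.
    table = str.maketrans({ch: " " for ch in string.digits + string.punctuation})
    words = text.translate(table).lower().split()
    return [word for word in words if len(word) >= min_length]

print(tokenise("2 tablespoons pickle relish"))  # ['tablespoons', 'pickle', 'relish']
print(tokenise("1 egg, beaten"))                # ['beaten']
```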

The actual search function (so far) looks like this:

def search(query, ordering = 'normal', count = 10):
    

    token_list = tokenisation(query)    
    values = [recipes[k] for k in token_list]

But of course, this looks up each token individually instead of checking that all the tokens are present.
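
The difference between "any token matches" and "all tokens match" can be sketched on a single recipe. The sample recipe and token lists below are hypothetical and only illustrate the check; the tokens are compared against the words of the recipe text, not used as dictionary keys.

```python
# Hypothetical sample recipe, shaped like the JSON excerpt above.
recipe = {
    "title": "Pancakes with Maple Syrup",
    "ingredients": ["2 cups flour", "maple syrup"],
    "rating": 4.0,
}

# Collect the searchable text of the recipe as a list of lowercase words.
words = " ".join([recipe["title"]] + recipe["ingredients"]).lower().split()

def matches_any(tokens):
    return any(token in words for token in tokens)

def matches_all(tokens):
    return all(token in words for token in tokens)

print(matches_any(["pancakes", "butter"]))  # True: "pancakes" is present
print(matches_all(["pancakes", "butter"]))  # False: "butter" is missing
print(matches_all(["pancakes", "syrup"]))   # True: both words are present
```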

Tags: python, json, dictionary

Solution


I think what you want is the following:

def search(query, ordering = 'normal', count = 10):
    token_list = tokenisation(query)

    matching_recipes = []
    # assumes recipes is a list of recipe dictionaries
    for recipe in recipes:
        recipe_tokens = []
        for key in recipe:
            value = recipe[key]
            if isinstance(value, str):
                # treat a plain string such as the title like a one-item list
                value = [value]
            elif not isinstance(value, list):
                # skip numeric fields such as rating or calories
                continue
            for sentence in value:
                # Clean the recipe text the same way as the query and
                # collect all of its words in one big list
                recipe_tokens.extend(tokenisation(sentence))

        if all(tl in recipe_tokens for tl in token_list):
            # all the tokens from token_list are in the tokens of the recipe
            matching_recipes.append(recipe)

    return matching_recipes
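
As a possible variant, the same check can be written with sets, where "all tokens present" becomes a subset test. This is a minimal sketch under a few assumptions: the recipes are a list of dictionaries, only the four text fields are searched, and the names `tokens_of`, `recipe_tokens`, `search_all` and the sample data are hypothetical. The `ordering` parameter is left out because the question does not say how it should behave.

```python
import string

# Translation table that turns digits and punctuation into spaces.
_STRIP = str.maketrans({ch: " " for ch in string.digits + string.punctuation})
SEARCH_FIELDS = ("title", "categories", "ingredients", "directions")

def tokens_of(text):
    """Return the set of cleaned, lowercase words longer than 3 characters."""
    return {w for w in text.translate(_STRIP).lower().split() if len(w) > 3}

def recipe_tokens(recipe):
    """Collect the tokens from every searchable field of one recipe."""
    found = set()
    for field in SEARCH_FIELDS:
        value = recipe.get(field, [])
        parts = [value] if isinstance(value, str) else value
        for part in parts:
            found |= tokens_of(part)
    return found

def search_all(query, recipes, count=10):
    """Return up to count recipes that contain every token of the query."""
    wanted = tokens_of(query)
    if not wanted:
        return []  # an empty query would otherwise match every recipe
    hits = [r for r in recipes if wanted <= recipe_tokens(r)]
    return hits[:count]

# Hypothetical sample data for illustration only.
sample = [
    {"title": "Buttermilk Pancakes", "ingredients": ["1 cup flour", "maple syrup"], "rating": 4.5},
    {"title": "Fish Sandwich", "ingredients": ["4 flounder fillets"], "rating": 4.375},
]
print([r["title"] for r in search_all("pancakes syrup", sample)])  # ['Buttermilk Pancakes']
```

Matching here is on exact words, so a query for "pancake" would not find a recipe that only mentions "pancakes".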
