Google Knowledge Graph Search API: strange results

Problem description

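I want to use this API, but the results confuse me.

  1. I search with the query string "Abs", types = "Brand", languages = "en", and I get 2 correct results and 1 incorrect one. Please check the response from the KG Search API Explorer: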
{
  "@context": {
    "detailedDescription": "goog:detailedDescription",
    "goog": "http://schema.googleapis.com/",
    "EntitySearchResult": "goog:EntitySearchResult",
    "kg": "http://g.co/kg",
    "resultScore": "goog:resultScore",
    "@vocab": "http://schema.org/"
  },
  "@type": "ItemList",
  "itemListElement": [
    {
      "@type": "EntitySearchResult",
      "resultScore": 296.41555786132812,
      "result": {
        "@id": "kg:/m/01bnqx",
        "@type": [
          "Brand",
          "Thing"
        ],
        "name": "Absolut Vodka",
        "detailedDescription": {
          "license": "https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License",
          "url": "https://en.wikipedia.org/wiki/Absolut_Vodka",
          "articleBody": "Absolut Vodka is a brand of vodka, produced near Åhus, in southern Sweden. Absolut is a part of the French group Pernod Ricard. Pernod Ricard bought Absolut for €5.63 billion in 2008 from the Swedish state. Absolut is one of the largest brands of spirits in the world and is sold in 126 countries.\n"
        },
        "url": "http://www.absolut.com"
      }
    },
    {
      "result": {
        "@id": "kg:/m/04hqw8",
        "name": "Absolute",
        "@type": [
          "Brand",
          "Thing"
        ],
        "detailedDescription": {
          "license": "https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License",
          "articleBody": "Absolute is the brand of a long-running series of compilation albums owned by the Swedish record company EVA Records. Initially, the only albums in the series were called Absolute Music, but starting in 1990 there have been other themed albums such as Absolute Dance and Absolute Rock.",
          "url": "https://en.wikipedia.org/wiki/Absolute_(record_compilation)"
        }
      },
      "resultScore": 103.74134826660161,
      "@type": "EntitySearchResult"
    },
    {
      "@type": "EntitySearchResult",
      "resultScore": 0.0041735083796083927,
      "result": {
        "detailedDescription": {
          "articleBody": "S-AWC is the brand name of an advanced full-time four-wheel drive system developed by Mitsubishi Motors. The technology, specifically developed for the new 2007 Lancer Evolution, the 2010 Outlander, the 2014 Outlander, the Outlander PHEV and the Eclipse Cross have an advanced version of Mitsubishi Motors' AWC system. ",
          "url": "https://en.wikipedia.org/wiki/Mitsubishi_S-AWC",
          "license": "https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License"
        },
        "name": "Mitsubishi S-AWC",
        "@type": [
          "Brand",
          "Thing"
        ],
        "@id": "kg:/m/02vtht5"
      }
    }
  ]
}

Absolut Vodka and Absolute are good results, but honestly I don't understand why "Mitsubishi S-AWC" appears in this result set (with such a low resultScore). Any ideas are appreciated :)

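  2. I think a feature such as a minimum resultScore set as a query parameter would be great! I did not find anything like that here: Method entities.search

  3. Also, I have not found any information about the minimum number of characters accepted as a search string (2, 3, more?)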

Thanks!

Tags: google-apis-explorer, google-knowledge-graph

Solution


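The Google Entity Search API returns full-text search results across all languages. The "languages" parameter does not affect the search itself; it only affects the language of the output.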
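As a rough illustration of the request from the question, the sketch below only builds the URL and makes no network call. The helper name `build_search_url` and the `YOUR_API_KEY` placeholder are assumptions made for this example; the endpoint and the `query`, `key`, `limit`, `types` and `languages` parameters are those of the entities.search method.

```python
from urllib.parse import urlencode

# Endpoint of the Knowledge Graph Search API (entities.search method).
ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_search_url(query, api_key, types=(), languages=(), limit=10):
    """Build an entities.search request URL (hypothetical helper)."""
    params = [("query", query), ("key", api_key), ("limit", str(limit))]
    # "types" restricts which entity types are returned.
    params += [("types", t) for t in types]
    # Per the answer, "languages" only selects the output language;
    # matching still runs against text in every language.
    params += [("languages", lang) for lang in languages]
    return f"{ENDPOINT}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder, not a real key.
url = build_search_url("Abs", "YOUR_API_KEY", types=["Brand"], languages=["en"])
print(url)
```

Setting `languages=["en"]` here would therefore not be expected to exclude an entity that matched only through its Chinese-language description.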

Specifically, when you search for "ABS" you get "Mitsubishi S-AWC" because the corresponding article in the Chinese Wikipedia contains the token ABS in its summary [1].

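For example, you can search for "S-AWC is the brand name" with the language set to Chinese and get a link to the Chinese Wikipedia [2], even though the Chinese article does not contain these words.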

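The score here is some kind of BM25 variant [3]. You are free to filter on it however you like (for example, by taking only the first result), but the response in your example is correct.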
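Since the question found no minimum-score query parameter, the filtering can be done on the client. The following is a minimal sketch under that assumption: the helper `filter_by_score` and the thresholds `min_score` and `min_ratio` are invented for illustration and are not part of the API. The sample data is the response from the question, trimmed to names and scores.

```python
def filter_by_score(response, min_score=None, min_ratio=None):
    """Keep results above an absolute score and/or a fraction of the top score.

    Hypothetical client-side helper; the API itself is not involved.
    """
    items = response.get("itemListElement", [])
    if not items:
        return []
    top = max(item.get("resultScore", 0.0) for item in items)
    kept = []
    for item in items:
        score = item.get("resultScore", 0.0)
        if min_score is not None and score < min_score:
            continue  # below the absolute threshold
        if min_ratio is not None and score < top * min_ratio:
            continue  # too weak relative to the best match
        kept.append(item)
    return kept


# Trimmed copy of the response shown in the question.
response = {
    "itemListElement": [
        {"resultScore": 296.41555786132812, "result": {"name": "Absolut Vodka"}},
        {"resultScore": 103.74134826660161, "result": {"name": "Absolute"}},
        {"resultScore": 0.0041735083796083927, "result": {"name": "Mitsubishi S-AWC"}},
    ]
}

names = [item["result"]["name"] for item in filter_by_score(response, min_score=1.0)]
print(names)  # ['Absolut Vodka', 'Absolute']
```

A relative cut-off such as `min_ratio` may be easier to carry across queries than a fixed `min_score`, because BM25-style scores are not normalized and their scale changes from one query to the next.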

[1] https://zh.wikipedia.org/zh-cn/S-AWC%E8%B6%85%E8%83%BD%E5%85%A8%E6%99%82%E5%9B%9B%E8%BC%AA%E6%8E%A7%E5%88%B6%E7%B3%BB%E7%B5%B1

[2] https://angryloki.github.io/mreid-resolver/#/search?lang=zh&q=S-AWC%20is%20the%20brand%20name&type=Brand

[3] https://en.wikipedia.org/wiki/Okapi_BM25

