Get a disambiguated list of homonyms from Wikipedia/Wikidata/Linked Data

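Question

If I search for "George Bush" manually on Wikipedia, I get this page, which lists the homonyms with short descriptions.

I would like to send my search to an API and get the following information:

I don't mind getting more, as long as I can parse it unambiguously.

My goal is to let users of a website tag public figures, but I want to restrict their choices and avoid ambiguity. The list could therefore be slightly different, and any other decent database with an API would do.

I haven't figured out how to do this with Wikipedia or Wikidata. I have only managed to run queries once I already know a specific ID or page, which is not the case here.

Tags: api, wikipedia-api, linked-data, wikidata-api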

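Answer


There are several ways to do this, depending on the kind of data you want.

For example, https://en.wikipedia.org/w/api.php?action=query&titles=George%20Bush&prop=links lists the pages linked from the "George Bush" page, which tells you whether the person's name is "ambiguous".

This returns: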

                {
                    "ns": 0,
                    "title": "Bush family"
                },
                {
                    "ns": 0,
                    "title": "George Brush (disambiguation)"
                },
                {
                    "ns": 0,
                    "title": "George Bush (biblical scholar)"
                },
                {
                    "ns": 0,
                    "title": "George Bush (footballer)"
                },
                {
                    "ns": 0,
                    "title": "George Bush (racing driver)"
                },
                {
                    "ns": 0,
                    "title": "George H. W. Bush"
                },
                {
                    "ns": 0,
                    "title": "George P. Bush"
                },
                {
                    "ns": 0,
                    "title": "George W. Bush"
                },
                {
                    "ns": 0,
                    "title": "George Washington Bush"
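
The `prop=links` query above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function names (`build_links_url`, `extract_link_titles`, `list_candidates`) and the User-Agent string are hypothetical, and it assumes `format=json` is added to the request so that the response has the default JSON layout in which `query.pages` is an object keyed by page ID.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_links_url(title, limit="max"):
    """Build the prop=links query URL for one page title."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "links",
        "pllimit": limit,   # ask for as many links per request as allowed
        "format": "json",   # assumption: JSON output is wanted
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_link_titles(response):
    """Return the titles of main-namespace (ns 0) links in a parsed response."""
    pages = response.get("query", {}).get("pages", {})
    titles = []
    for page in pages.values():
        for link in page.get("links", []):
            if link.get("ns") == 0:
                titles.append(link["title"])
    return titles


def list_candidates(title):
    """Fetch the links of `title` from the live API (performs a network call)."""
    request = urllib.request.Request(
        build_links_url(title),
        # Hypothetical User-Agent; replace with one that identifies your app.
        headers={"User-Agent": "homonym-tagger-example/0.1"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_link_titles(json.load(reply))
```

Calling `list_candidates("George Bush")` should yield titles such as those shown above. The sketch does not follow the API's continuation mechanism, so a page with very many links may come back incomplete.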

You can use https://en.wikipedia.org/w/api.php?action=query&utf8=&list=search&srsearch=George%20Bush to get more data in a single request.

This gives you:

    "search": [
        {
            "ns": 0,
            "title": "George W. Bush",
            "pageid": 3414021,
            "size": 299185,
            "wordcount": 27007,
            "snippet": "<span class=\"searchmatch\">George</span> Walker <span class=\"searchmatch\">Bush</span> (born July 6, 1946) is an American politician who served as the 43rd President of the United States from 2001 to 2009. He had previously",
            "timestamp": "2018-09-26T21:48:08Z"
        },
        {
            "ns": 0,
            "title": "George H. W. Bush",
            "pageid": 11955,
            "size": 210189,
            "wordcount": 20867,
            "snippet": "<span class=\"searchmatch\">George</span> Herbert Walker <span class=\"searchmatch\">Bush</span> (born June 12, 1924) is an American politician who served as the 41st President of the United States from 1989 to 1993. Prior",
            "timestamp": "2018-10-01T06:41:50Z"
        },
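
For the tagging use case in the question, the search response above could be reduced to a page ID, a title, and a plain-text description per hit. The following is a minimal sketch under the same assumptions as the query shown (plus `format=json`); the names `build_search_url`, `clean_snippet`, and `summarize_search` are hypothetical helpers, not part of any library.

```python
import html
import re
import urllib.parse

API_URL = "https://en.wikipedia.org/w/api.php"

# Matches any HTML tag, e.g. <span class="searchmatch"> and </span>.
TAG_RE = re.compile(r"<[^>]+>")


def build_search_url(query, limit=10):
    """Build the list=search query URL for a free-text search."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "utf8": "",
        "format": "json",  # assumption: JSON output is wanted
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def clean_snippet(snippet):
    """Strip the highlighting markup from a snippet and decode HTML entities."""
    return html.unescape(TAG_RE.sub("", snippet))


def summarize_search(response):
    """Reduce a parsed search response to pageid, title, and description."""
    results = []
    for hit in response.get("query", {}).get("search", []):
        results.append({
            "pageid": hit["pageid"],
            "title": hit["title"],
            "description": clean_snippet(hit.get("snippet", "")),
        })
    return results
```

Storing the `pageid` rather than the title is one way to keep a tag unambiguous, since the title is only a display label here.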
