首页 > 解决方案 > 针对Apache Jena Fuseki的python中的listOfDict到RDF转换

问题描述

为了从 python 中将一些数据存储在 Apache Jena 中,我希望进行从字典列表到 RDF 的通用转换,并可能返回查询。

对于 Dict 到 RDF 部分的列表,我尝试实现“insertListofDicts”(见下文)并使用“testListOfDictInsert”(见下文)对其进行测试。结果如下,当使用 Apache Jena Fuseki 服务器尝试时会导致 400: Bad Request。

简单字符串类型需要修复什么 - 并且可能是其他原始 Python 类型才能使其正常工作?

另请在以下位置找到源代码:

@prefix foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#name "Elizabeth Alexandra Mary Windsor".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#born "1926-04-21".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q9682".
foaf:Person/George+of+Cambridge foaf:Person#name "George of Cambridge".
foaf:Person/George+of+Cambridge foaf:Person#born "2013-07-22".
foaf:Person/George+of+Cambridge foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q1359041".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#name "Harry Duke of Sussex".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#born "1984-09-15".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q152316".

}

testListOfDictInsert

def testListOfDictInsert(self):
        '''
        test inserting a list of Dicts using FOAF example
        https://en.wikipedia.org/wiki/FOAF_(ontology)
        '''
        listofDicts=[
            {'name': 'Elizabeth Alexandra Mary Windsor', 'born': '1926-04-21', 'age': 94, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682' },
            {'name': 'George of Cambridge',              'born': '2013-07-22', 'age':  7, 'ofAge': False, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
            {'name': 'Harry Duke of Sussex',             'born': '1984-09-15', 'age': 36, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
        ]
        jena=self.getJena(mode='update',debug=True)
        jena.insertListOfDicts(listofDicts,'foaf:Person','name','@prefix foaf: <http://xmlns.com/foaf/0.1/>')

插入字典列表

def insertListOfDicts(self,listOfDicts,entityType,primaryKey,prefixes):
        '''
        insert the given list of dicts mapping datatypes according to
        https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
        
        mapped from 
        https://docs.python.org/3/library/stdtypes.html
        
        compare to
        https://www.w3.org/2001/sw/rdb2rdf/directGraph/
        http://www.bobdc.com/blog/json2rdf/
        https://www.w3.org/TR/json-ld11-api/#data-round-tripping
        https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
        '''
        errors=[]
        insertCommand='%s\nINSERT DATA {\n' % prefixes
        for index,record in enumerate(listOfDicts):
            if not primaryKey in record:
                errors.append["missing primary key %s in record %d",index]
            else:    
                primaryValue=record[primaryKey]
                encodedPrimaryValue=urllib.parse.quote_plus(primaryValue)
                tSubject="%s/%s" %(entityType,encodedPrimaryValue)
                for keyValue in record.items():
                    key,value=keyValue
                    valueType=type(value)
                    if self.debug:
                        print("%s(%s)=%s" % (key,valueType,value))
                    tPredicate="%s#%s" % (entityType,key)
                    tObject=value    
                    if valueType == str:   
                        insertCommand+='  %s %s "%s".\n' % (tSubject,tPredicate,tObject)
        insertCommand+="\n}"
        if self.debug:
            print (insertCommand)
        self.insert(insertCommand)
        return errors

标签: pythonsparqlrdfsparqlwrapper

解决方案


+是 HTTP 表单编码中用于空格的特殊字符,但它只能用于application/x-www-form-urlencoded.

对于 URI,使用%20或决定替换字符,例如_空格,因为它看起来有点像空格。

在所有这些情况下,URI 中都没有空格字符 - 有+, %20(三个字符)或_. 它是编码,而不是转义机制。


推荐阅读