How to remove all accents from a JSON file in Python

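Question

I am importing a JSON file in Python, but the city names in it are full of accented characters (they are Portuguese), and I need to remove those accents before I can use the data further. For example, 'São Paulo', 'Santo André' and 'Foz do Iguaçu' should become Sao Paulo, Santo Andre and Foz do Iguacu in the JSON.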

    { "type": "FeatureCollection", "features": [ 
        { "type": "Feature", "properties": {"id": "1100015", "name": "São Paulo", "description": "Alta Floresta D'Oeste"}, "geometry": { "type": "Polygon", "coordinates": [-62.1820888570, -11.8668597878] }},
        { "type": "Feature", "properties": {"id": "1100023", "name": "Santo André", "description": "Ariquemes"}, "geometry": { "type": "Polygon", "coordinates": [-62.5359497334, -9.7318235272] }},
        { "type": "Feature", "properties": {"id": "1100031", "name": "Foz do Iguaçu", "description": "Cabixi"}, "geometry": { "type": "Polygon", "coordinates": [-60.3993982597, -13.4558418276] }}
    ]}

Tags: python, json

Solution


Use unidecode :)

    import json

    import unidecode  # third-party package: pip install Unidecode

    places_json = '''
    { "type": "FeatureCollection",
      "features": [
        { "type": "Feature", "properties": {"id": "1100015", "name": "São Paulo", "description": "Alta Floresta D'Oeste"}, "geometry": { "type": "Polygon", "coordinates": [-62.1820888570, -11.8668597878] }},
        { "type": "Feature", "properties": {"id": "1100023", "name": "Santo André", "description": "Ariquemes"}, "geometry": { "type": "Polygon", "coordinates": [-62.5359497334, -9.7318235272] }},
        { "type": "Feature", "properties": {"id": "1100031", "name": "Foz do Iguaçu", "description": "Cabixi"}, "geometry": { "type": "Polygon", "coordinates": [-60.3993982597, -13.4558418276] }}
      ]
    }
    '''

    # Transliterate the raw JSON text to plain ASCII, then parse the result.
    json_dec = unidecode.unidecode(places_json)
    print(json.loads(json_dec))
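
If installing a third-party package is not an option, a similar result can probably be reached with the standard library alone. The sketch below is an alternative, not the approach given in the answer above: it parses the JSON first and then strips accents only from string values, so the JSON syntax itself is never touched. The helper names `strip_accents` and `strip_accents_deep` are made up for this illustration, and the sample is shortened from the question's data. Unlike unidecode, this only removes combining marks; letters with no Unicode decomposition (such as ø or ß) are left unchanged.

```python
import json
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_accents_deep(value):
    """Recursively apply strip_accents to every string value in parsed JSON.

    Dictionary keys are deliberately left as they are.
    """
    if isinstance(value, str):
        return strip_accents(value)
    if isinstance(value, list):
        return [strip_accents_deep(item) for item in value]
    if isinstance(value, dict):
        return {key: strip_accents_deep(item) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged


# Shortened sample based on the data in the question.
sample = '{"features": [{"properties": {"id": "1100031", "name": "Foz do Iguaçu"}}]}'
cleaned = strip_accents_deep(json.loads(sample))
print(json.dumps(cleaned, ensure_ascii=False))
```

For a file on disk, the same function could be applied to the result of `json.load` and the cleaned data written back with `json.dump`.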
