python - How to remove all accents from a JSON file in Python
Problem description
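I am importing a JSON file in Python, but the city names in it are full of accented characters (from Portuguese), and I need to remove them from the file before using it further. For example, 'São Paulo', 'Santo André' and 'Foz do Iguaçu' should become Sao Paulo, Santo Andre and Foz do Iguacu in the JSON.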
{ "type": "FeatureCollection", "features": [
{ "type": "Feature", "properties": {"id": "1100015", "name": "São Paulo", "description": "Alta Floresta D'Oeste"}, "geometry": { "type": "Polygon", "coordinates": [-62.1820888570, -11.8668597878] }},
{ "type": "Feature", "properties": {"id": "1100023", "name": "Santo André", "description": "Ariquemes"}, "geometry": { "type": "Polygon", "coordinates": [-62.5359497334, -9.7318235272] }},
{ "type": "Feature", "properties": {"id": "1100031", "name": "Foz do Iguaçu", "description": "Cabixi"}, "geometry": { "type": "Polygon", "coordinates": [-60.3993982597, -13.4558418276] }}
]
}
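Solution
Use unidecode :) It is a third-party package (pip install Unidecode) that transliterates Unicode text to plain ASCII.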
import unidecode
import json
places_json = '''
{ "type": "FeatureCollection",
"features": [
{ "type": "Feature", "properties": {"id": "1100015", "name": "São Paulo", "description": "Alta Floresta D'Oeste"}, "geometry": { "type": "Polygon", "coordinates": [-62.1820888570, -11.8668597878] }},
{ "type": "Feature", "properties": {"id": "1100023", "name": "Santo André", "description": "Ariquemes"}, "geometry": { "type": "Polygon", "coordinates": [-62.5359497334, -9.7318235272] }},
{ "type": "Feature", "properties": {"id": "1100031", "name": "Foz do Iguaçu", "description": "Cabixi"}, "geometry": { "type": "Polygon", "coordinates": [-60.3993982597, -13.4558418276] }}
]
}
'''
# Transliterate the whole JSON text to ASCII first, then parse the result.
json_dec = unidecode.unidecode(places_json)
print(json.loads(json_dec))
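The snippet above transliterates the raw JSON text before parsing it. If a value ever contains a character that transliterates to a double quote (a curly quote, for example), the resulting text may no longer be valid JSON. A possible alternative, sketched here with only the standard library, is to parse first and then strip accents from each string value. Note that this swaps in unicodedata for unidecode: it removes combining marks such as accents, tildes and cedillas, but unlike unidecode it does not transliterate letters with no decomposition (ø, ß and so on). The names strip_accents, strip_accents_deep and clean_json_file are hypothetical helpers made up for this illustration.

```python
import json
import unicodedata


def strip_accents(text):
    """Remove combining marks (accents, tildes, cedillas) from a string."""
    # NFKD splits 'ã' into 'a' plus a combining tilde; the mark is then dropped.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_accents_deep(value):
    """Recursively apply strip_accents to every string value in parsed JSON."""
    if isinstance(value, str):
        return strip_accents(value)
    if isinstance(value, list):
        return [strip_accents_deep(item) for item in value]
    if isinstance(value, dict):
        # Keys are left untouched; only the values are cleaned.
        return {key: strip_accents_deep(item) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged


def clean_json_file(src_path, dst_path):
    """Hypothetical helper: read a JSON file, strip accents, write a new file."""
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(strip_accents_deep(data), dst, ensure_ascii=False, indent=2)


sample = json.loads('{"name": "Foz do Iguaçu", "coordinates": [-60.39, -13.45]}')
cleaned = strip_accents_deep(sample)
print(json.dumps(cleaned, ensure_ascii=False))
```

For a file on disk, something like clean_json_file("places.json", "places_ascii.json") would write a cleaned copy; both file names are placeholders.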