Remove all characters except digits, Latin letters, and Cyrillic letters in Python

Problem description

Tags: python

Solution


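from string import ascii_letters, digits, whitespace

# The 33 letters of the Russian alphabet, lowercase then uppercase.
cyrillic_letters = u"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"

# Build the lookup once; a set gives constant-time membership tests.
allowed_chars = set(cyrillic_letters + ascii_letters + digits + whitespace)


def strip(text):
    """Return text with every character outside allowed_chars removed."""
    return "".join(c for c in text if c in allowed_chars)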

Edit: I am not familiar with the Cyrillic alphabet, but this is how I stripped every character from a string except the ones you specified (Cyrillic letters, Latin letters, and digits) plus whitespace, which I added myself.
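The same filter can also be written as a regular expression. This is only a sketch of an alternative, not part of the original answer: the name `strip_regex` is made up for illustration, and it assumes that "Cyrillic" means the Russian alphabet listed above, so letters from other Cyrillic-script languages (for example the Ukrainian "ї") are removed as well.

```python
import re

# Assumption: same allowed set as the strip() function above, i.e.
# digits, ASCII letters, the Russian alphabet, and string.whitespace.
# The ranges А-Я and а-я do not include Ё/ё, so those are listed explicitly.
_DISALLOWED = re.compile(r"[^0-9A-Za-zА-Яа-яЁё \t\n\r\x0b\x0c]")


def strip_regex(text):
    """Remove every character that is not in the allowed set."""
    return _DISALLOWED.sub("", text)


print(strip_regex("Привет, world! 123"))  # Привет world 123
```

The whitespace characters are spelled out rather than written as `\s`, because `\s` in a Python 3 string pattern also matches Unicode whitespace that `string.whitespace` does not contain.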

