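python - Django: Cannot upload data with Spanish characters using a command
Problem description
I have configured my Django application to upload data from a CSV file using a management command.
Note: the destination database is PostgreSQL, but for now I am only testing on my local machine with SQLite.
I read the CSV file with this line: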
tmp_data_products=pd.read_csv('static/data/products.csv',sep=',', encoding="utf-8").fillna(" ")
Data:
If I only use the first row, there is no problem:
| category | product | slug | description | image | available
|----------|-----------------------------------|-----------------------------------|---------------------------------------------------------------|-------------------------------------------------------------------------------------|-----------
| Muestras | Sobre con stickers de muestra | sobre-con-stickers-de-muestra | Sobre con 10 stickers de muestra | https://stickers-files.s3.amazonaws.com/product/Artboard_1_PNG.png | True
| Muestras | Stickers transparentes de muestra | stickers-transparentes-de-muestra | Sobre con 10 stickers de muestra con el diseño que tú envíes. | https://stickers-files.s3.amazonaws.com/product/Artboard_1_PNG.png | True
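When my CSV file does not contain any Spanish characters, the command works perfectly. However, when there is an ñ or an accented letter, as in tú, I get an encoding error.
Error: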
(stickers-gallito-app) D:\web_proyects\stickers-gallito-app>python manage.py products
DEBUG ON
Traceback (most recent call last):
File "pandas\_libs\parsers.pyx", line 1134, in pandas._libs.parsers.TextReader._convert_tokens
File "pandas\_libs\parsers.pyx", line 1240, in pandas._libs.parsers.TextReader._convert_with_dtype
File "pandas\_libs\parsers.pyx", line 1256, in pandas._libs.parsers.TextReader._string_convert
File "pandas\_libs\parsers.pyx", line 1494, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 44: invalid continuation byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "manage.py", line 19, in <module>
execute_from_command_line(sys.argv)
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\django\core\management\__init__.py", line 381, in execute_from_command_line
utility.execute()
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\django\core\management\__init__.py", line 375, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\django\core\management\__init__.py", line 224, in fetch_command
klass = load_command_class(app_name, subcommand)
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\django\core\management\__init__.py", line 36, in load_command_class
module = import_module('%s.management.commands.%s' % (app_name, name))
File "C:\Users\OGONZALES\AppData\Local\Programs\Python\Python37-32\lib\importlib\__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "D:\web_proyects\stickers-gallito-app\shop\management\commands\products.py", line 8, in <module>
tmp_data_products=pd.read_csv('static/data/products.csv',sep=',', encoding="utf-8").fillna(" ")
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\pandas\io\parsers.py", line 678, in parser_f
return _read(filepath_or_buffer, kwds)
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\pandas\io\parsers.py", line 446, in _read
data = parser.read(nrows)
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\pandas\io\parsers.py", line 1036, in read
ret = self._engine.read(nrows)
File "D:\virtual_envs\stickers-gallito-app\lib\site-packages\pandas\io\parsers.py", line 1848, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 876, in pandas._libs.parsers.TextReader.read
File "pandas\_libs\parsers.pyx", line 891, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas\_libs\parsers.pyx", line 968, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 1094, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas\_libs\parsers.pyx", line 1141, in pandas._libs.parsers.TextReader._convert_tokens
File "pandas\_libs\parsers.pyx", line 1240, in pandas._libs.parsers.TextReader._convert_with_dtype
File "pandas\_libs\parsers.pyx", line 1256, in pandas._libs.parsers.TextReader._string_convert
File "pandas\_libs\parsers.pyx", line 1494, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 44: invalid continuation byte
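I searched and found another question:
Pandas df.to_csv("file.csv" encode="utf-8") still gives trash characters for minus sign
I tried the answers there, without success. Using encoding='utf-8-sig' made no difference.
Full command to upload the data: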
import pandas as pd
import csv
from shop.models import Product, Category
from django.core.management.base import BaseCommand
tmp_data_products=pd.read_csv('static/data/products.csv',sep=',', encoding="utf-8").fillna(" ")
class Command(BaseCommand):
    def handle(self, **options):
        products = [
            Product(
                category=Category.objects.get(name=row['category']),
                name=row['product'],
                slug=row['slug'],
                description=row['description'],
                available=row['available']
            )
            for _, row in tmp_data_products.iterrows()
        ]
        Product.objects.bulk_create(products)
models.py:
class Product(models.Model):
    name = models.CharField(max_length=250, unique=False)
    slug = models.SlugField(max_length=250, unique=False)
    description = models.TextField(blank=True)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    image = models.ImageField(upload_to='product', blank=True, null=True)
    available = models.BooleanField(default=True)
    created = models.DateTimeField(auto_now_add=True)
    updated = models.DateTimeField(auto_now=True)

    class Meta:
        ordering = ('name',)
        verbose_name = 'product'
        verbose_name_plural = 'products'

    def get_url(self):
        return reverse('shop:ProdDetail', args=[self.category.slug, self.slug])

    def __str__(self):
        return '{}'.format(self.name)
settings.py:
import os
# SITE_ROOT = root()
# Build paths inside the project like this: os.path.join(BASE_DIR, ...)
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.1/howto/deployment/checklist/
# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = '^_67&#r+(c+%pu&n+a%&dmxql^i^_$0f69)mnhf@)zq-rbxe9z'
ALLOWED_HOSTS = ['127.0.0.1', 'stickers-gallito-app.herokuapp.com',
'stickersgallito.pe', 'www.stickersgallito.pe']
# Application definition
INSTALLED_APPS = [
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'shop',
'search_app',
'cart',
'stripe',
'order',
'crispy_forms',
'embed_video',
'storages',
'marketing',
'django.contrib.humanize',
]
MIDDLEWARE = [
'whitenoise.middleware.WhiteNoiseMiddleware',
'django.middleware.security.SecurityMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
'django.middleware.common.CommonMiddleware',
'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.messages.middleware.MessageMiddleware',
'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
ROOT_URLCONF = 'stickers_gallito.urls'
TEMPLATES = [
{
'BACKEND': 'django.template.backends.django.DjangoTemplates',
'DIRS': [os.path.join(BASE_DIR, 'templates'),
os.path.join(BASE_DIR, 'shop', 'templates/'),
os.path.join(BASE_DIR, 'search_app', 'templates/'),
os.path.join(BASE_DIR, 'cart', 'templates/'),
os.path.join(BASE_DIR, 'order', 'templates/'), ]
,
'APP_DIRS': True,
'OPTIONS': {
'context_processors': [
'django.template.context_processors.debug',
'django.template.context_processors.request',
'django.contrib.auth.context_processors.auth',
'django.contrib.messages.context_processors.messages',
'shop.context_processor.menu_links',
'shop.context_processor.has_shop',
# 'cart.context_processor.current_time',
'cart.context_processor.cart_items_counter'
],
},
},
]
WSGI_APPLICATION = 'stickers_gallito.wsgi.application'
# Database
# https://docs.djangoproject.com/en/2.1/ref/settings/#databases
# Redirecciona www y http a https
SECURE_SSL_REDIRECT = False
# SECURITY WARNING: don't run with debug turned on in production!
DEBUG = True
if DEBUG:
    print("DEBUG ON")
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': 'mydatabase',
        }
    }
else:
    ### HEROKU POSTGRESS ACCESS
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'xxxxxx',
            'USER': 'xxxxxx',
            'PASSWORD': 'xxxxxxxxxx1',
            'HOST': 'xxxxxxamazonaws.com',
            'PORT': 'xxxxx',
        }
    }
####
AUTH_PASSWORD_VALIDATORS = [
{
'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
},
]
LANGUAGE_CODE = 'es'
TIME_ZONE = 'UTC'
USE_I18N = False
USE_L10N = False
USE_TZ = True
# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.1/howto/static-files/
STATIC_URL = '/static/'
STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')
STATICFILES_DIRS = (
os.path.join(BASE_DIR, 'static'),
)
STATICFILES_LOCATION = 'static'
STATICFILES_STORAGE = 'custom_storages.StaticStorage'
MEDIAFILES_LOCATION = 'media'
DEFAULT_FILE_STORAGE = 'custom_storages.MediaStorage'
####
MEDIA_URL = '/media/'
MEDIA_ROOT = os.path.join(BASE_DIR, 'static', 'media')
CRISPY_TEMPLATE_PACK = 'bootstrap4'
### CULQUI ###
CULQI_PUBLISHABLE_KEY = os.environ['CULQI_PUBLISHABLE_KEY']
CULQI_SECRET_KEY = os.environ['CULQI_SECRET_KEY']
# DO NOT DO THIS!
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
MAILCHIMP_API_KEY = os.environ['MAILCHIMP_API_KEY']
MAILCHIMP_DATA_CENTER = os.environ['MAILCHIMP_DATA_CENTER']
MAILCHIMP_EMAIL_LIST_ID = os.environ['MAILCHIMP_EMAIL_LIST_ID']
### AMAZON ###
AWS_S3_OBJECT_PARAMETERS = {
'Expires': 'Thu, 31 Dec 2099 20:00:00 GMT',
'CacheControl': 'max-age=94608000',
}
AWS_STORAGE_BUCKET_NAME = os.environ['AWS_STORAGE_BUCKET_NAME']
AWS_S3_REGION_NAME = os.environ['AWS_S3_REGION_NAME']
# Tell django-storages the domain to use to refer to static files.
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
### MAILGUN - EMAIL MESSAGE SETTINGS ###
EMAIL_HOST = os.environ['EMAIL_HOST']
EMAIL_PORT = os.environ['EMAIL_PORT']
EMAIL_USE_TLS = os.environ['EMAIL_USE_TLS']
EMAIL_HOST_USER = os.environ['EMAIL_HOST_USER']
EMAIL_HOST_PASSWORD = os.environ['EMAIL_HOST_PASSWORD']
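Solution
There are several possible solutions to your problem.
First, check the Unicode section of the official Django documentation (https://docs.djangoproject.com/en/2.2/ref/unicode/) and try the django.utils.encoding module. From the "Conversion functions" section of that page:
The django.utils.encoding module contains a few functions that are handy for converting back and forth between strings and bytestrings.
Also try this tip:
import sys
sys.getfilesystemencoding()
export LANG="es_ES.iso-8859-1" (if I have found the right Spanish encoding; double-check which Spanish locale you want to use)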
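It may also help to look at the byte the traceback complains about. In ISO-8859-1 (and in Windows-1252) the single byte 0xf1 is ñ, whereas UTF-8 stores ñ as the two bytes 0xc3 0xb1, so a file saved in ISO-8859-1 cannot be decoded as UTF-8. A minimal sketch using only the standard library; the sample word is made up for illustration and is not read from your file:

```python
# Hypothetical sample: "diseño" as it would be stored in an ISO-8859-1 file.
raw = "diseño".encode("iso-8859-1")
print(raw)  # b'dise\xf1o' : the ñ is the single byte 0xf1

# The same bytes are not valid UTF-8, which raises the same kind of
# UnicodeDecodeError as in the traceback.
try:
    raw.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError as exc:
    decoded_as_utf8 = False
    error_message = str(exc)
    print(error_message)

# Decoding with the encoding the bytes were written in recovers the text.
text = raw.decode("iso-8859-1")
print(text)
```

If your CSV was exported from a spreadsheet program on Windows, this is a plausible explanation for the error, but that is an assumption worth verifying against the real file.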
Another option is to set the Spanish encoding in the view when you generate your items:
def some_view(request):
    request.encoding = 'iso-8859-1'
    ...
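Since the CSV is read by a command rather than by a view, it may be simpler to decode the file itself with a fallback. The sketch below is one possible approach, not a definitive fix: the helper names decode_with_fallback and read_rows, the candidate encodings, and the sample data are all assumptions made for illustration. It uses the standard csv module so that it runs without extra dependencies.

```python
import csv
import io
import os
import tempfile


def decode_with_fallback(raw, encodings=("utf-8-sig", "iso-8859-1")):
    """Return (text, encoding) for the first candidate that decodes the bytes."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


def read_rows(path):
    """Read a CSV file into a list of dicts, trying the encodings above in order."""
    with open(path, "rb") as handle:
        text, encoding = decode_with_fallback(handle.read())
    rows = list(csv.DictReader(io.StringIO(text, newline="")))
    return rows, encoding


# Demo with a throwaway file written in ISO-8859-1 (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "products.csv")
    with open(sample_path, "wb") as handle:
        handle.write(
            "product,description\nStickers,diseño que tú envíes\n".encode("iso-8859-1")
        )
    rows, detected = read_rows(sample_path)

print(detected, rows[0]["description"])
```

ISO-8859-1 accepts any byte sequence, so it has to come last in the list and a match only means "not UTF-8". If the fallback reports iso-8859-1 for static/data/products.csv, passing that name as the encoding argument of pd.read_csv, or re-saving the file as UTF-8, would be the next thing to try.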