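python - Uploading a word2vec model to AWS fails with an error
Problem description
--python3
I created a word2vec model with the gensim library and saved it to my local disk. I want to upload that file to my S3 bucket. Creating the model with gensim works, but uploading it to the bucket raises this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Some links suggest encoding the data to avoid this kind of error. Does that apply to a saved word2vec model? If so, what kind of encoding do I need? If not, is there another way to upload the file? Here is the code I use to upload the file to my S3 bucket: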
import os
import boto
from boto.s3.key import Key
from os.path import expanduser


def upload_file(aws_access_key_id, aws_secret_access_key, bucket_name, bucket_folders, path_to_file, file_name, job_id):
    conn = None  # stays None if the connection attempt fails
    try:
        conn = boto.connect_s3(aws_access_key_id, aws_secret_access_key)
    except Exception as error:
        #LOGGER.info("Cannot upload %s job vector to aws due to connection error in aws")
        #LOGGER.exception(error)
        print("connection error")
    if conn:
        bucket = conn.get_bucket(bucket_name)
        check_file_in_bucket = bucket_folders + file_name
        if bucket.lookup(check_file_in_bucket):
            # deleting the existing file on server
            (bucket.lookup(check_file_in_bucket)).delete()
        k = Key(bucket)
        k.key = check_file_in_bucket
        local_path = path_to_file + file_name
        try:
            if os.path.isfile(local_path):
                print("file present")
                fp = open(local_path, 'r+')
                try:
                    size = os.fstat(fp.fileno()).st_size
                except (AttributeError, OSError):
                    # Not all file objects implement fileno(),
                    # so we fall back on this
                    fp.seek(0, os.SEEK_END)
                    size = fp.tell()
                sent = k.set_contents_from_file(fp, rewind=True)
                # Rewind for later use
                fp.seek(0)
                if sent == size:
                    #LOGGER.info("jobvector model for %s has been successfully uploaded", job_id)
                    print(" It worked")
                else:
                    #LOGGER.info("job vector model for %s has not successfully uploaded", job_id)
                    print("Try again")
        except Exception as error:
            #LOGGER.info("Cannot upload %s job vector model as file not found in local disk")
            #LOGGER.exception(error)
            print("file not found in local disk")
    return 0


if __name__ == '__main__':
    MODEL_FOLDER = expanduser("~") + '/modelsdata/job_vectors/'
    BUCKET_FOLDER = 'w2v_model/jobvectors/'
    BUCKET_NAME = 'test-voip'
    # CONFIG is defined elsewhere in my project (not shown here)
    aws_access_key_id = CONFIG["aws-s3"]["key_id"]
    aws_secret_access_key = CONFIG["aws-s3"]["key_access"]
    upload_file(aws_access_key_id, aws_secret_access_key,
                BUCKET_NAME, BUCKET_FOLDER, MODEL_FOLDER, '237091_model', 6789)
I tried uploading a "wav" file to my S3 bucket and the code above worked, but I ran into the problem described above.
Solution
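A likely cause, offered as a guess rather than a confirmed diagnosis: the code in the question opens the model file in text mode ('r+'), so Python 3 tries to decode its bytes as UTF-8. A model saved by gensim is, as far as I know, a pickle file, and pickle protocol 2 and later start with the byte 0x80, which matches the error message. If that is what is happening, no extra encoding is needed; the file only has to be opened in binary mode ('rb'). The sketch below reproduces the effect without gensim or AWS. The helper names (make_fake_model_file, read_text_mode, read_binary_mode) are made up for illustration, and the pickled dict merely stands in for a real model.

```python
import os
import pickle
import tempfile


def make_fake_model_file():
    """Write a small pickle file that stands in for a saved model (assumption)."""
    fd, path = tempfile.mkstemp(suffix="_model")
    with os.fdopen(fd, "wb") as fh:
        # Protocol 2 and later begin the stream with the byte 0x80.
        pickle.dump({"vectors": [0.1, 0.2, 0.3]}, fh, protocol=2)
    return path


def read_text_mode(path):
    """Read the file as UTF-8 text; raises UnicodeDecodeError for pickle data."""
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read()


def read_binary_mode(path):
    """Read the file as raw bytes; no decoding takes place."""
    with open(path, "rb") as fh:
        return fh.read()


if __name__ == "__main__":
    demo_path = make_fake_model_file()
    try:
        print("first byte:", read_binary_mode(demo_path)[:1])
        try:
            read_text_mode(demo_path)
        except UnicodeDecodeError as error:
            print("text mode failed:", error)
    finally:
        os.remove(demo_path)
```

Applied to the question's function, the corresponding change would be to pass 'rb' instead of 'r+' when opening the file that is handed to k.set_contents_from_file, since that call reads from the file object it is given.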