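python - Python random data generator: expected str instance, numpy.datetime64 found
Question
Hello, I have been trying to write random data with random dates to a csv file, but I get the following error: expected str instance, numpy.datetime64 found
Data generator code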
import pandas as pd
import numpy as np
import string
import random

def gen_random_email():
    domains = [ "hotmail.com", "gmail.com", "aol.com", "mail.com" , "mail.kz", "yahoo.com"]
    letters = string.ascii_letters +'.'*5
    email = ''.join(np.random.choice(list(letters),10))+'@'+ np.random.choice(domains)
    email = email.replace('.@', '@')
    return email, "Email"

def gen_random_float():
    num = np.random.random()*np.random.randint(2000)
    decimal_points = np.random.randint(8)
    num = int(num*10**(decimal_points))/10**decimal_points
    return str(num), 'Float'

def gen_random_sentence():
    nouns = ["puppy", "car", "rabbit", "girl", "monkey"]
    verbs = ["runs", "hits", "jumps", "drives", "barfs"]
    adv = ["crazily", "dutifully", "foolishly", "merrily", "occasionally"]
    adj = ["adorable.", "clueless.", "dirty.", "odd.", "stupid."]
    random_entry = lambda x: x[random.randrange(len(x))]
    random_entry = " ".join([random_entry(nouns), random_entry(verbs),
                             random_entry(adv), random_entry(adj)])
    return random_entry, 'String'

def gen_random_int():
    num = np.random.randint(1000000)
    return str(num), 'Int'

def gen_random_date():
    monthly_days = np.arange(0, 30)
    base_date = np.datetime64('2020-01-01')
    random_date = base_date + np.random.choice(monthly_days)
    return random_date, 'Date'

def gen_dataset(filename, size=5000):
    randomizers = [gen_random_email, gen_random_float, gen_random_int, gen_random_sentence,gen_random_date]
    with open(filename, 'w') as file:
        file.write("Text, Type\n")
        for _ in range(size):
            file.write(",".join(random.choice(randomizers)())+"\n")

gen_dataset('dataaaa.csv')
TypeError: sequence item 0: expected str instance, numpy.datetime64 found
Solution
First, catch the error and look at the value that causes it.
def gen_dataset(filename, size=5000):
    randomizers = [gen_random_email, gen_random_float, gen_random_int, gen_random_sentence,gen_random_date]
    with open(filename, 'w') as file:
        file.write("Text, Type\n")
        for _ in range(size):
            f = random.choice(randomizers)
            result = f()
            try:
                file.write(",".join(result)+"\n")
            except TypeError:
                print(result)
                raise
>>>
(numpy.datetime64('2020-01-09'), 'Date')
Traceback (most recent call last):
  File "C:\pyProjects\tmp.py", line 80, in <module>
    gen_dataset('dataaaa.csv')
  File "C:\pyProjects\tmp.py", line 75, in gen_dataset
    file.write(",".join(result)+"\n")
TypeError: sequence item 0: expected str instance, numpy.datetime64 found
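Hmm, I wonder whether join only accepts strings as arguments?
Yes, from the documentation:
A TypeError will be raised if there are any non-string values in iterable, including bytes objects.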
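The behaviour quoted above can be reproduced without numpy. The following is a minimal sketch that uses a datetime.date as a stand-in for the numpy.datetime64 value (that substitution, and the names row, error_message and line, are assumptions made for illustration only):

```python
import datetime

# A (value, type-name) pair like the ones the generators return,
# except that the first item is not a str.
row = (datetime.date(2020, 1, 9), "Date")

try:
    line = ",".join(row)
except TypeError as exc:
    # str.join refuses any item that is not a str.
    error_message = str(exc)
    print(error_message)

# Converting every item with str() first lets the join succeed.
line = ",".join(str(item) for item in row)
print(line)
```

The same pattern should apply to any non-string item, which is why the date generator is the only one that triggers the error: the other generators already return strings.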
So how do you convert a numpy datetime64 to a string? Searching for numpy datetime64 to string
is productive: Convert numpy.datetime64 to string object in python
Both of these work:
>>> q = gen_random_date()[0]
>>> q
numpy.datetime64('2020-01-27')
>>> np.datetime_as_string(q)
'2020-01-27'
>>> q.astype(str)
'2020-01-27'
>>>
Then just modify the try/except.
def gen_dataset(filename, size=5000):
    randomizers = [gen_random_email, gen_random_float, gen_random_int, gen_random_sentence,gen_random_date]
    with open(filename, 'w') as file:
        file.write("Text, Type\n")
        for _ in range(size):
            f = random.choice(randomizers)
            a, b = f()
            try:
                # append the newline after joining to avoid a trailing comma
                q = ",".join([a, b]) + "\n"
            except TypeError:
                # a is a numpy.datetime64; convert it to a string first
                a = np.datetime_as_string(a)
                q = ",".join([a, b]) + "\n"
            file.write(q)
Or just preemptively convert the first item to a string.
def gen_dataset(filename, size=5000):
    randomizers = [gen_random_email, gen_random_float, gen_random_int, gen_random_sentence,gen_random_date]
    with open(filename, 'w') as file:
        file.write("Text, Type\n")
        for _ in range(size):
            f = random.choice(randomizers)
            a, b = f()
            # str() is a no-op for strings and converts the datetime64;
            # the newline is appended after joining to avoid a trailing comma
            q = ",".join([str(a), b]) + "\n"
            file.write(q)
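Another option, not part of the original answer, is to let the standard library's csv module build each row, since csv.writer stringifies non-string values with str() before writing them. The sketch below is only an illustration under stated assumptions: it uses stdlib stand-ins for two of the generators (datetime.date in place of numpy.datetime64), and write_rows is a hypothetical helper name.

```python
import csv
import datetime
import io
import random

def gen_random_int():
    # Returns an int on purpose: csv.writer will stringify it.
    return random.randrange(1000000), "Int"

def gen_random_date():
    # Stand-in for the numpy.datetime64 version in the question.
    base_date = datetime.date(2020, 1, 1)
    offset = datetime.timedelta(days=random.randrange(30))
    return base_date + offset, "Date"

def write_rows(file, size=5):
    """Write a header plus `size` random rows to an open text file."""
    randomizers = [gen_random_int, gen_random_date]
    writer = csv.writer(file)
    writer.writerow(["Text", "Type"])
    for _ in range(size):
        writer.writerow(random.choice(randomizers)())

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_rows(buffer)
lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```

When writing to a real file, the csv documentation recommends opening it with newline='' so that line endings are not translated twice. A writer also quotes fields that contain commas, which a plain join does not do.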