django - How to use Django's bulk_create() to add rows to the database
Problem description
I am trying to add around 40-50k rows to a PostgreSQL database from Django. The rows come from a text file dump produced by an application, for data processing.
The following is my function:
def populate_backup_db(dumpfile):
    sensordata = sensorrecords()  # This is the model
    start_time = time.time()
    file = open(dumpfile)
    filedata = file.readlines()
    endcount = len(filedata)
    i = 0
    imagecount = 0
    while i < endcount:
        lineitem = split_entry(filedata[i])
        if (lineitem[0] == "HEADER"):
            imagecount = imagecount + 1
            sensordata.Sensor = lineitem[1]
            sensordata.Date1 = lineitem[2]
            sensordata.Date2 = lineitem[3]
            sensordata.Version = lineitem[4]
            sensordata.Proxyclient = lineitem[8]
            sensordata.Triggerdate = ctodatetime(lineitem[13])
            sensordata.Compression = lineitem[16]
            sensordata.Encryption = lineitem[17]
            sensordata.Fragments = lineitem[21]
            sensordata.Pbit = lineitem[37]
            sensordata.BlockIntFT = lineitem[38]
            sensordata.OriginServer = lineitem[56]
            sensordata.save()
        i = i + 1
    elapsed_time = time.time() - start_time
    print(imagecount, 'entries saved to database from ', dumpfile, '. Time Taken is ', elapsed_time, ' seconds.')
    file.close()
This takes around 2-3 minutes to save all the data to the database. The dump file is likely to grow, and if this function is used as it is, saving all the data can take a couple of minutes.
How can I read all the data from the dump file and then save it to the database in a single go?
I see a Django method called bulk_create():
bulk_create(objs, batch_size=None, ignore_conflicts=False)
This method inserts the provided list of objects into the database in an efficient manner (generally only 1 query, no matter how many objects there are):
>>> Entry.objects.bulk_create([
...     Entry(headline='This is a test'),
...     Entry(headline='This is only a test'),
... ])
The example seems to add entries manually, whereas my function runs a loop until all entries are fetched, saving them in the process.
How do I run it in a loop? Do I replace sensordata.save()
with some_list.append(sensordata)
and, after the loop ends, call
sensordata.objects.bulk_create(some_list)
I edited my code to append the object to a list and then do a bulk create at the end, as below:
def populate_backup_db(dumpfile):
    sensordata = sensorrecords()  # This is the model
    datalist = []
    start_time = time.time()
    file = open(dumpfile)
    filedata = file.readlines()
    endcount = len(filedata)
    i = 0
    imagecount = 0
    while i < endcount:
        lineitem = split_entry(filedata[i])
        if (lineitem[0] == "HEADER"):
            imagecount = imagecount + 1
            sensordata.Sensor = lineitem[1]
            sensordata.Date1 = lineitem[2]
            sensordata.Date2 = lineitem[3]
            sensordata.Version = lineitem[4]
            sensordata.Proxyclient = lineitem[8]
            sensordata.Triggerdate = ctodatetime(lineitem[13])
            sensordata.Compression = lineitem[16]
            sensordata.Encryption = lineitem[17]
            sensordata.Fragments = lineitem[21]
            sensordata.Pbit = lineitem[37]
            sensordata.BlockIntFT = lineitem[38]
            sensordata.OriginServer = lineitem[56]
            datalist.append(sensordata)
        i = i + 1
    elapsed_time = time.time() - start_time
    print(imagecount, 'entries saved to database from ', dumpfile, '. Time Taken is ', elapsed_time, ' seconds.')
    sensordata.objects.bulk_create(datalist)
    file.close()
This throws the error below:
Traceback:
File "C:\Python\Python36\lib\site-packages\django\core\handlers\exception.py" in inner
34. response = get_response(request)
File "C:\Python\Python36\lib\site-packages\django\core\handlers\base.py" in _get_response
126. response = self.process_exception_by_middleware(e, request)
File "C:\Python\Python36\lib\site-packages\django\core\handlers\base.py" in _get_response
124. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "C:\Python\Python36\lib\site-packages\django\contrib\auth\decorators.py" in _wrapped_view
21. return view_func(request, *args, **kwargs)
File "C:\Users\va\eclipse-workspace\prod\home\views.py" in process_data
68. get_backup_data()
File "C:\Users\va\eclipse-workspace\prod\home\process.py" in get_backup_data
8. populate_backup_db('c:\\users\\va\\desktop\\vsp\\backupdata_server.txt')
File "C:\Users\va\eclipse-workspace\prod\home\process.py" in populate_backup_db
122. sensordata.objects.bulk_create(datalist)
File "C:\Python\Python36\lib\site-packages\django\db\models\manager.py" in __get__
176. raise AttributeError("Manager isn't accessible via %s instances" % cls.__name__)
Exception Type: AttributeError at /process_data/
Exception Value: Manager isn't accessible via sensorrecords instances
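The last frame of the traceback is the informative one: `objects` was looked up on `sensordata`, which is an instance of `sensorrecords`, and the manager's `__get__` refuses access through an instance. The sketch below is a simplified pure-Python stand-in for that mechanism, not Django's actual code; `FakeManager` and `SensorRecord` are hypothetical names used only for illustration.

```python
class ManagerDescriptor:
    """Simplified stand-in for the descriptor behind a model's `objects`."""

    def __init__(self, manager):
        self.manager = manager

    def __get__(self, instance, cls=None):
        # Reached as SensorRecord().objects: `instance` is the object itself.
        if instance is not None:
            raise AttributeError(
                "Manager isn't accessible via %s instances" % cls.__name__
            )
        # Reached as SensorRecord.objects: `instance` is None.
        return self.manager


class FakeManager:
    """Hypothetical manager that only records what it was asked to create."""

    def __init__(self):
        self.created = []

    def bulk_create(self, objs):
        self.created.extend(objs)
        return objs


class SensorRecord:
    """Hypothetical model-like class; no database is involved."""

    objects = ManagerDescriptor(FakeManager())


def instance_access_error():
    """Return the error message raised by looking up `objects` on an instance."""
    try:
        SensorRecord().objects
    except AttributeError as exc:
        return str(exc)
    return None
```

Under these assumptions, `SensorRecord.objects.bulk_create([...])` works because the lookup goes through the class, while `SensorRecord().objects` raises the same kind of `AttributeError` shown in the traceback.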
Solution
Update your code like this:
import time

def populate_backup_db(dumpfile):
    datalist = []
    start_time = time.time()
    file = open(dumpfile)
    filedata = file.readlines()
    endcount = len(filedata)
    i = 0
    imagecount = 0
    while i < endcount:
        lineitem = split_entry(filedata[i])
        if (lineitem[0] == "HEADER"):
            imagecount = imagecount + 1
            sensordata = sensorrecords()  # create a new object for each row here
            sensordata.Sensor = lineitem[1]
            sensordata.Date1 = lineitem[2]
            sensordata.Date2 = lineitem[3]
            sensordata.Version = lineitem[4]
            sensordata.Proxyclient = lineitem[8]
            sensordata.Triggerdate = ctodatetime(lineitem[13])
            sensordata.Compression = lineitem[16]
            sensordata.Encryption = lineitem[17]
            sensordata.Fragments = lineitem[21]
            sensordata.Pbit = lineitem[37]
            sensordata.BlockIntFT = lineitem[38]
            sensordata.OriginServer = lineitem[56]
            datalist.append(sensordata)
        i = i + 1
    # access the objects manager via the model class name, not an instance;
    # the insert runs before the timer stops so the elapsed time includes it
    sensorrecords.objects.bulk_create(datalist)
    elapsed_time = time.time() - start_time
    print(imagecount, 'entries saved to database from ', dumpfile, '. Time Taken is ', elapsed_time, ' seconds.')
    file.close()
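Creating the object inside the loop matters as much as calling the manager on the class. If a single object is created before the loop and appended on every pass, the list ends up holding many references to one object, so every entry carries the values of the last line read. The sketch below shows this in plain Python, with no Django involved; `Record`, `build_shared` and `build_fresh` are hypothetical names used only for illustration.

```python
class Record:
    """Hypothetical stand-in for a model instance."""

    def __init__(self):
        self.sensor = None


def build_shared(values):
    """One object created before the loop and reused on every pass."""
    record = Record()
    rows = []
    for value in values:
        record.sensor = value
        rows.append(record)  # appends another reference to the same object
    return rows


def build_fresh(values):
    """A new object created inside the loop for every row."""
    rows = []
    for value in values:
        record = Record()
        record.sensor = value
        rows.append(record)
    return rows


shared = build_shared(["a", "b", "c"])
fresh = build_fresh(["a", "b", "c"])

print([r.sensor for r in shared])  # ['c', 'c', 'c']
print([r.sensor for r in fresh])   # ['a', 'b', 'c']
```

If the dump file keeps growing, the `batch_size` argument in the `bulk_create` signature quoted in the question can be used to split the insert into several smaller queries instead of one very large one.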