sql - PostgreSQL type casting failure in INSERT queries on copied tables
Question
I'm trying to migrate tables in my PostgreSQL DB from one schema to another in my Django project. I removed and recreated my migration scripts to consolidate my DB changes, and that part works beautifully. However, I am running into issues copying my data.
The query
INSERT INTO table_name
SELECT * FROM other_schema.table_name
will work about 1/4 of the time. However, I often get odd type errors like the following:
ERROR: column "doc_date" is of type timestamp with time zone but expression
is of type integer LINE 2: SELECT * FROM django_apps.db_comments_dbcomment
Then I break out my INSERT statement using a CTE and type casting like so:
WITH dbComments AS (
    SELECT
        id,
        created_date,
        modified_date,
        doc_date::TIMESTAMP,
        customer_number,
        customer_name,
        db_table,
        db_table_number,
        note_processed::BOOLEAN,
        note_modified::BOOLEAN,
        comment_id,
        customer_id,
        created_by_id,
        modified_by_id
    FROM django_apps.db_comments_dbcomment
)
INSERT INTO db_comments_dbcomment
SELECT * FROM dbComments;
However, I still get the following error
ERROR: column "note_modified" is of type boolean but expression
is of type integer LINE 21: SELECT * FROM dbComments;
despite the fact that I've already cast that field to BOOLEAN (since I know conversion between BOOLEAN and INTEGER is a problem with wildcard SELECT * statements).
If anyone has any ideas how I can make my type casts stick, I would very much appreciate it.
UPDATE 7/16/2018
I followed Gordon Linoff's suggestion along with explicit type casting and that did the trick...eventually. I just want to share how I eventually got there in case it helps anyone else.
So since the purpose of this was to migrate tables to the public schema, I started by clearing my public schema using the following query (clear_public.sql):
DROP SCHEMA public CASCADE;
CREATE SCHEMA public AUTHORIZATION duser; -- Application user
GRANT ALL ON SCHEMA public TO PUBLIC;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO duser;
I then recreated all my tables in the public schema simply by running python manage.py migrate.
I tried to encapsulate the data transfer in a final Django migration but, due to dependencies, this did not work out as hoped. However, I used the function I wrote to generate the SQL queries and dumped them into an SQL file (dumping step not shown):
from django.db import connection

def copy_table_data(apps, schema_editor):
    """
    Iterate over all our apps and copy data from the django_apps
    schema to the newly created tables in the public schema.
    """
    # This approach only covers tables that map directly to our defined models.
    # IT WILL SKIP TABLES CREATED FOR MANY-TO-MANY RELATIONS AND OTHER SYSTEM TABLES!
    tables = connection.introspection.table_names()
    models = connection.introspection.installed_models(tables)
    queries = []
    for model in models:
        table = model._meta.db_table
        vals = ''
        cols = ''
        for field in model._meta.get_fields():
            if hasattr(field, 'column'):
                mytype = field.db_type(connection) if hasattr(field, 'db_type') else None
                if mytype is None:
                    pass  # fields with no actual column (e.g. the "one" side of a many-to-one relation)
                elif mytype == 'serial':
                    # "serial" is not a real PostgreSQL type, so copy this column without a cast
                    cols += f', {field.column}'
                    vals += f', {field.column}'
                else:
                    cols += f', {field.column}'
                    vals += f', {field.column}::{mytype}'
        vals = vals[2:]  # remove the leading ", "
        cols = cols[2:]
        queries.append(f"""
INSERT INTO public.{table} (
    {cols}
)
SELECT
    {vals}
FROM django_apps.{table};
""")
    return queries
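To illustrate the string assembly above, here is a minimal standalone sketch (no Django required) of the same column/cast logic; the helper name and the (column, db_type) pairs are hypothetical, made up purely for the example:

```python
def build_copy_query(table, fields):
    """Mimic the column/cast assembly from copy_table_data.

    fields: list of (column_name, db_type) tuples. A db_type of None is
    skipped (no backing column), 'serial' columns are copied without a
    cast, and everything else gets an explicit ::type cast.
    """
    cols, vals = [], []
    for column, mytype in fields:
        if mytype is None:
            continue  # no backing column
        cols.append(column)
        vals.append(column if mytype == 'serial' else f'{column}::{mytype}')
    return (f"INSERT INTO public.{table} ({', '.join(cols)})\n"
            f"SELECT {', '.join(vals)}\n"
            f"FROM django_apps.{table};")

# Hypothetical field list for illustration only
fields = [('id', 'serial'),
          ('doc_date', 'timestamp with time zone'),
          ('note_modified', 'boolean'),
          ('reverse_rel', None)]
print(build_copy_query('db_comments_dbcomment', fields))
```

Because every non-serial column carries an explicit cast, the SELECT output types match the target table regardless of how the source columns were stored.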
I used this to generate data_migrations.sql. So the final protocol, which I tested locally and then ran on my dev and production servers, was this:
1. SSH into the PostgreSQL Docker container - run clear_public.sql
2. SSH into the Django Docker container - run python manage.py migrate
3. SSH into the PostgreSQL Docker container - run data_migrations.sql
Even with hopping in and out of different Docker containers, I got all my tables ported over with all their data in ~30 seconds. I'm sure there are better ways to do it, but I'm happy with my approach. Thanks to Gordon Linoff for highlighting the initial flaw in my SQL!
UPDATE 7/18/2018
Be forewarned that the above approach will cause ID conflicts unless you follow up with a set of queries like SELECT setval('sometable_id_seq', COALESCE((SELECT MAX(id)+1 FROM sometable),1), false); to reset the ID sequences for your DB.
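A short Python sketch of how those reset statements could be generated for a list of tables; the function name is hypothetical, and the <table>_id_seq naming is Django's default sequence-name convention, but verify it against your actual DB:

```python
def sequence_reset_sql(tables):
    """Generate one setval() statement per table, assuming the
    Django-default sequence name <table>_id_seq (an assumption;
    check your schema if you have custom sequence names)."""
    template = ("SELECT setval('{t}_id_seq', "
                "COALESCE((SELECT MAX(id)+1 FROM {t}), 1), false);")
    return [template.format(t=t) for t in tables]

for stmt in sequence_reset_sql(['db_comments_dbcomment']):
    print(stmt)
```

Django can also emit equivalent statements for you via python manage.py sqlsequencereset <app_label>.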
I preserved the original ID values because I have a developer who is depending on them for a REST API. In general I would advise not including the id field if you do not have to.
Solution
List out all the columns:
INSERT INTO db_comments_dbcomment ( . . . )
SELECT . . .
FROM dbComments;
This will ensure that they line up as you expect.
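The positional matching that bites here can be demonstrated without PostgreSQL. This Python/SQLite sketch uses hypothetical tables; SQLite won't raise the type error (it stores the misplaced values silently), but the rule is the same: without a column list, INSERT ... SELECT matches columns by position, not by name:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE src (id INTEGER, flag INTEGER, name TEXT)')
# Same columns, different order:
conn.execute('CREATE TABLE dst (id INTEGER, name TEXT, flag INTEGER)')
conn.execute("INSERT INTO src VALUES (1, 0, 'alice')")

# Without a column list, values are matched by POSITION, not by name,
# so flag lands in "name" and 'alice' lands in "flag":
conn.execute('INSERT INTO dst SELECT * FROM src')
print(conn.execute('SELECT id, name, flag FROM dst').fetchone())

conn.execute('DELETE FROM dst')
# Listing the target columns makes the mapping explicit and order-proof:
conn.execute('INSERT INTO dst (id, flag, name) SELECT id, flag, name FROM src')
print(conn.execute('SELECT id, name, flag FROM dst').fetchone())
```

PostgreSQL catches the mismatch at the type level (hence the "column ... is of type ... but expression is of type ..." errors above), which is why spelling out the columns fixes it.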