PostgreSQL type casting failure in INSERT queries on copied tables

Problem Description

I'm trying to migrate tables in my PostgreSQL DB from one schema to another in my Django project. I removed and recreated my migration scripts to consolidate my DB changes, and that part works beautifully. However, I am running into issues copying my data.

The query

INSERT INTO table_name
SELECT * FROM other_schema.table_name

will work about 1/4 of the time. However, I often get odd TYPE errors like the following:

ERROR: column "doc_date" is of type timestamp with time zone but expression 
is of type integer LINE 2: SELECT * FROM django_apps.db_comments_dbcomment

Then I break out my INSERT statement using a CTE and type casting like so:

WITH dbComments AS (
SELECT
id,
created_date,
modified_date,
doc_date::TIMESTAMP,
customer_number,
customer_name,
db_table,
db_table_number,
note_processed::BOOLEAN,
note_modified::BOOLEAN,
comment_id,
customer_id,
created_by_id,
modified_by_id
FROM django_apps.db_comments_dbcomment
)

INSERT INTO db_comments_dbcomment
SELECT * FROM dbComments;

However, I still get the following error

ERROR: column "note_modified" is of type boolean but expression 
is of type integer LINE 21: SELECT * FROM dbComments;

despite the fact that I've already cast that field to boolean (since I know conversion between BOOLEAN and INTEGER is a problem with wildcard SELECT * statements).
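For what it's worth, the integer-to-boolean behavior is easy to reproduce in isolation; here's a throwaway sketch with a hypothetical temp table (PostgreSQL only converts integer to boolean via an explicit cast):

-- Hypothetical table used only to demonstrate the cast behavior.
CREATE TEMP TABLE t (flag BOOLEAN);

-- Fails: column "flag" is of type boolean but expression is of type integer
-- INSERT INTO t (flag) SELECT 1;

-- Works: the cast is explicit.
INSERT INTO t (flag) SELECT 1::BOOLEAN;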

If anyone has any ideas on how I can keep my explicit type casts from being ignored, I would very much appreciate it.

UPDATE 7/16/2018

I followed Gordon Linoff's suggestion along with explicit type casting and that did the trick...eventually. I just want to share how I eventually got there in case it helps anyone else.

So, since the purpose of this was to migrate tables to the public schema, I started by clearing my public schema with the following script (clear_public.sql):

DROP SCHEMA public CASCADE;

CREATE SCHEMA public AUTHORIZATION duser; -- Application user

GRANT ALL ON SCHEMA public TO PUBLIC;

GRANT ALL ON SCHEMA public TO postgres;

GRANT ALL ON SCHEMA public TO duser; 

I then recreated all my tables in the public schema simply by running python manage.py migrate.

I tried to encapsulate the data transfer in the final Django migration, but due to dependencies this did not work out as hoped. However, I used the function I wrote below to generate the SQL queries and dump them into an SQL file (the file-writing part is not shown):

from django.db import connection


def copy_table_data(apps, schema_editor):
    """
    Iterate over all our apps and copy data from the django_apps
    schema to the newly created tables in the public schema.
    """
    # This approach only covers tables that directly relate to our defined models.
    # IT WILL SKIP TABLES CREATED FOR MANY-TO-MANY RELATIONS AND OTHER SYSTEM TABLES!

    tables = connection.introspection.table_names()
    models = connection.introspection.installed_models(tables)
    for model in models:
        table = model._meta.db_table
        vals = ''
        cols = ''
        for field in model._meta.get_fields():
            if hasattr(field, 'column'):
                mytype = field.db_type(connection) if hasattr(field, 'db_type') else None

                if mytype is None:
                    # Fields that do not map to an actual column
                    # (e.g. the "one" side of a many-to-one relationship).
                    pass
                elif mytype == 'serial':
                    # 'serial' is a pseudo-type, so copy the column without a cast.
                    cols += f', {field.column}'
                    vals += f', {field.column}'
                else:
                    cols += f', {field.column}'
                    vals += f', {field.column}::{mytype}'
        vals = vals[2:]  # remove the leading comma and space
        cols = cols[2:]
        query = f"""
        INSERT INTO public.{table} (
            {cols}
        )
        SELECT
            {vals}
        FROM django_apps.{table};
        """
        # Each query is then written out to data_migrations.sql (not shown).

I used this to generate data_migrations.sql. So the final protocol, which I tested locally and then ran on my dev and production servers, was this:

  1. SSH into the PostgreSQL Docker container - run clear_public.sql

  2. SSH into the Django Docker container - run python manage.py migrate

  3. SSH into the PostgreSQL Docker container - run data_migrations.sql

Even with hopping in and out of different Docker containers, I got all my tables ported over with all their data in ~30 seconds. I'm sure there are better ways to do it, but I'm happy with my approach. Thanks to Gordon Linoff for highlighting the initial flaw in my SQL!
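For reference, steps 1-3 boiled down to commands roughly like these, run from inside each container (database name and user are placeholders for my actual setup):

# Inside the PostgreSQL container (steps 1 and 3):
psql -U postgres -d my_db -f clear_public.sql
psql -U postgres -d my_db -f data_migrations.sql

# Inside the Django container (step 2):
python manage.py migrate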

UPDATE 7/18/2018

Be forewarned that the above approach will cause ID conflicts unless you follow up with a set of queries like SELECT setval('sometable_id_seq', COALESCE((SELECT MAX(id)+1 FROM sometable), 1), false); to reset the ID sequences for your DB.
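For example, for the table from my original question it looks like this (assuming Django's default <table>_<column>_seq sequence naming):

-- Reset the sequence so newly inserted rows don't collide with the copied IDs.
SELECT setval(
    'db_comments_dbcomment_id_seq',
    COALESCE((SELECT MAX(id) + 1 FROM db_comments_dbcomment), 1),
    false
);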

I preserved the original ID values because I have a developer who is depending on them for a REST API. In general I would advise not including the id field if you do not have to.

Tags: sql, django, postgresql

Solution


List out all the columns:

INSERT INTO db_comments_dbcomment ( . . . )
    SELECT . . . 
    FROM dbComments;

This will ensure that they line up as you expect.
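Applied to the table in the question, that would look something like this (column list taken from the CTE in the question):

INSERT INTO db_comments_dbcomment (
    id, created_date, modified_date, doc_date,
    customer_number, customer_name, db_table, db_table_number,
    note_processed, note_modified, comment_id, customer_id,
    created_by_id, modified_by_id
)
SELECT
    id, created_date, modified_date, doc_date,
    customer_number, customer_name, db_table, db_table_number,
    note_processed, note_modified, comment_id, customer_id,
    created_by_id, modified_by_id
FROM django_apps.db_comments_dbcomment;

With both lists written out, each source column is matched to the target column you named, instead of whatever ordinal order SELECT * happens to produce, which is what caused the intermittent integer-vs-boolean and integer-vs-timestamp errors.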

