
Problem description

Hi, I am new to SQL and PostgreSQL, and I am currently using it with TimescaleDB. It is a good time-series database, but I think I have hit a wall, as my queries take longer and longer as the data grows. Currently my table looks like this:

dnsvault_timeseries_db=# \d bind_query_log
                     Table "public.bind_query_log"
  Column   |           Type           | Collation | Nullable | Default 
-----------+--------------------------+-----------+----------+---------
 time      | timestamp with time zone |           | not null | 
 node      | character varying(253)   |           | not null | 
 view      | character varying(50)    |           | not null | 
 qname     | character varying(253)   |           | not null | 
 qtype     | character varying(50)    |           | not null | 
 count     | integer                  |           | not null | 
 client_ip | inet                     |           | not null | 
 iface_ip  | inet                     |           | not null | 
 country   | character varying(50)    |           |          | 
Indexes:
    "bind_query_log_time_idx" btree ("time" DESC)
Triggers:
    ts_cagg_invalidation_trigger AFTER INSERT OR DELETE OR UPDATE ON bind_query_log FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.continuous_agg_invalidation_trigger('23')
    ts_insert_blocker BEFORE INSERT ON bind_query_log FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Number of child tables: 1 (Use \d+ to list them.)

Here is an example of what the data looks like:

dnsvault_timeseries_db=# select * from bind_query_log limit 10;
          time          |          node           |  view   |                              qname                               | qtype | count |   client_ip   |  iface_ip   | country 
------------------------+-------------------------+---------+------------------------------------------------------------------+-------+-------+---------------+-------------+---------
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | m17.ifxrtcm.com                                                  | A     |     1 | 10.39.56.74   | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | m6.asnycoq.com                                                   | A     |     1 | 10.119.18.105 | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | eventtrack.pandasecurity.com                                     | A     |     1 | 10.201.79.11  | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | m0.obxouwd.biz                                                   | A     |     1 | 10.119.18.105 | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | 53.199.87.211.in-addr.arpa                                       | PTR   |     1 | 10.36.22.35   | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | df158c7425fec09b149da59bfa0f731c.safeframe.googlesyndication.com | A     |     1 | 10.12.100.157 | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | cpp-kw.pandasecurity.com                                         | A     |     1 | 10.112.56.3   | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | m24.itwjrgn.com                                                  | A     |     1 | 10.62.72.131  | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | m33.oslkjwq.net                                                  | A     |     1 | 10.119.18.105 | 10.251.3.17 | Unknown
 2020-10-18 14:08:30+08 | dnsvgitndnsnode03.local | private | m19.xruogma.com                                                  | A     |     1 | 10.71.42.18   | 10.251.3.17 | Unknown
(10 rows)

From my research, this is not a good way to store the data, since it produces a lot of duplicate values in the table; moving some of the columns out into separate lookup tables would be better. I could insert into each table manually, but that feels inefficient and may slow down insert performance. Is there a way to use a single query that does the whole transaction in one go? e.g.

  1. insert node into node_table if the node name does not exist and return the id.
  2. insert view into view_table if the view name does not exist and return the id.
  3. insert qname into qname_table if the qname does not exist and return the id.
  4. insert client_ip into ip_table if the client_ip does not exist and return the id.
  5. insert iface_ip into ip_table if the iface_ip does not exist and return the id.
  6. insert country into country_table if the country does not exist and return the id.
  7. insert the time value and the count value, using the returned ids for the other columns, to complete the bind_query_log row.
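The seven steps above can be expressed as one statement using PostgreSQL's data-modifying CTEs (`WITH ... INSERT ... RETURNING`) together with `INSERT ... ON CONFLICT`. The sketch below is only an illustration of the idea: every table and column name here is an assumption, not something from the question, and the normalized fact table is called `bind_query_log_n` to avoid clashing with the existing hypertable.

```sql
-- Hypothetical lookup tables (names are assumptions):
CREATE TABLE node_table    (id serial PRIMARY KEY, name varchar(253) UNIQUE NOT NULL);
CREATE TABLE view_table    (id serial PRIMARY KEY, name varchar(50)  UNIQUE NOT NULL);
CREATE TABLE qname_table   (id serial PRIMARY KEY, name varchar(253) UNIQUE NOT NULL);
CREATE TABLE ip_table      (id serial PRIMARY KEY, addr inet         UNIQUE NOT NULL);
CREATE TABLE country_table (id serial PRIMARY KEY, name varchar(50)  UNIQUE NOT NULL);

-- Hypothetical normalized fact table (qtype is omitted here, as in
-- steps 1-7; it could get its own lookup table the same way):
CREATE TABLE bind_query_log_n (
    "time"       timestamptz NOT NULL,
    node_id      integer     NOT NULL REFERENCES node_table,
    view_id      integer     NOT NULL REFERENCES view_table,
    qname_id     integer     NOT NULL REFERENCES qname_table,
    client_ip_id integer     NOT NULL REFERENCES ip_table,
    iface_ip_id  integer     NOT NULL REFERENCES ip_table,
    country_id   integer              REFERENCES country_table,
    count        integer     NOT NULL
);

-- Steps 1-7 in a single statement.  The dummy "DO UPDATE SET name =
-- EXCLUDED.name" makes RETURNING yield the id even when the row already
-- exists (a plain DO NOTHING returns no row on conflict).
WITH
  n AS (INSERT INTO node_table (name) VALUES ('dnsvgitndnsnode03.local')
        ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.name RETURNING id),
  v AS (INSERT INTO view_table (name) VALUES ('private')
        ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.name RETURNING id),
  q AS (INSERT INTO qname_table (name) VALUES ('m17.ifxrtcm.com')
        ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.name RETURNING id),
  c AS (INSERT INTO ip_table (addr) VALUES ('10.39.56.74')
        ON CONFLICT (addr) DO UPDATE SET addr = EXCLUDED.addr RETURNING id),
  i AS (INSERT INTO ip_table (addr) VALUES ('10.251.3.17')
        ON CONFLICT (addr) DO UPDATE SET addr = EXCLUDED.addr RETURNING id),
  co AS (INSERT INTO country_table (name) VALUES ('Unknown')
        ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.name RETURNING id)
INSERT INTO bind_query_log_n ("time", node_id, view_id, qname_id,
                              client_ip_id, iface_ip_id, country_id, count)
SELECT '2020-10-18 14:08:30+08', n.id, v.id, q.id, c.id, i.id, co.id, 1
FROM n, v, q, c, i, co;
```

Two caveats with this pattern: if `client_ip` and `iface_ip` happen to be equal, the two `ip_table` CTEs try to update the same row twice in one statement and PostgreSQL raises a "cannot affect row a second time" error, and the dummy `DO UPDATE` writes a dead row version on every conflict, so it is best suited to low-duplication inserts.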

Extra

Can I also bulk-insert a large number of rows in a single transaction?
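For bulk loads, one common pattern is to `COPY` the raw rows into a temporary staging table, fill each lookup table with set-based `INSERT ... SELECT DISTINCT ... ON CONFLICT DO NOTHING` statements, and then insert all the facts in one pass, all inside a single transaction. This is again only a sketch under assumed names: lookup tables like `node_table(id, name)` and `ip_table(id, addr)` with `UNIQUE` constraints, and a normalized fact table `bind_query_log_n`, none of which come from the question itself.

```sql
BEGIN;

-- Hypothetical staging table holding the raw, denormalized rows:
CREATE TEMP TABLE staging (
    "time"    timestamptz,
    node      varchar(253),
    "view"    varchar(50),
    qname     varchar(253),
    qtype     varchar(50),
    count     integer,
    client_ip inet,
    iface_ip  inet,
    country   varchar(50)
) ON COMMIT DROP;

-- Load the batch, e.g. from a CSV file via psql:
-- \copy staging FROM 'batch.csv' WITH (FORMAT csv)

-- Upsert every distinct value into its lookup table:
INSERT INTO node_table (name)    SELECT DISTINCT node    FROM staging ON CONFLICT (name) DO NOTHING;
INSERT INTO view_table (name)    SELECT DISTINCT "view"  FROM staging ON CONFLICT (name) DO NOTHING;
INSERT INTO qname_table (name)   SELECT DISTINCT qname   FROM staging ON CONFLICT (name) DO NOTHING;
INSERT INTO country_table (name) SELECT DISTINCT country FROM staging ON CONFLICT (name) DO NOTHING;
INSERT INTO ip_table (addr)
  SELECT client_ip FROM staging UNION SELECT iface_ip FROM staging
  ON CONFLICT (addr) DO NOTHING;

-- Insert all facts in one pass by joining back to the lookup tables:
INSERT INTO bind_query_log_n ("time", node_id, view_id, qname_id,
                              client_ip_id, iface_ip_id, country_id, count)
SELECT s."time", n.id, v.id, q.id, c.id, i.id, co.id, s.count
FROM staging s
JOIN node_table  n  ON n.name  = s.node
JOIN view_table  v  ON v.name  = s."view"
JOIN qname_table q  ON q.name  = s.qname
JOIN ip_table    c  ON c.addr  = s.client_ip
JOIN ip_table    i  ON i.addr  = s.iface_ip
LEFT JOIN country_table co ON co.name = s.country;

COMMIT;
```

Because the final insert re-resolves the ids by joining on the natural keys, a plain `DO NOTHING` is enough here; there is no need for the `RETURNING`-friendly dummy update, and duplicates across batches are handled for free.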

Tags: postgresql
