首页 > 解决方案 > 如何在不锁定整个表的情况下强制 Postgres 返回合理的行数?

问题描述

我们最近将一个大型应用程序和数据库从 EngineYard 迁移到 Heroku。我们的新数据库比 EngineYard 上的资源少,并且暴露了一些以前可能因为大铁而只能工作的查询。

一个特别讨厌的问题是我们无法从一张大表中返回计数。该表是应用程序中最大的表,包含超过 10 亿行。有些客户会有数万行,而有些客户可能有上亿行。

该表由 asubscription_id和 a索引status

查询很简单:

select count(*)
from my_large_table
where subscription_id = 123
  and status = 'Valid'

不幸的是,此查询似乎ExclusiveLock在表上执行,导致请求排队。对于拥有 500 万行数据的客户,查询必须在搅动一个小时后终止。这个查询实际上使我们的应用程序停止了。

但是,我们需要知道该表中每个客户和状态的计数。它不一定是死的。但它不可能是一个完全虚构的count_estimate程序,就像实际上只是报告虚构数字的程序一样。

我确信有一个解决方案。我该怎么做才能得到这个计数?有什么办法让它不被锁定吗?

这张桌子很大,有很多属性。当我设计它时,我在值上过分了citext,因为我是从 MySQL 过来的,在那里我习惯于不区分大小写的搜索,并且认为它们是理所当然的。我真的只需要citext大概 4 列顶部(attribute1,attribute2)。这些实际上是字段的名称,而不是混淆。该表是不同类型数据的目标,基于standard_id.

我很感激帮助。

/*
 Navicat PostgreSQL Data Transfer

 Source Server         : Heroku myapp-production
 Source Server Version : 100600
 Source Host           : ec2-34-196-135-106.compute-1.amazonaws.com
 Source Database       : d6hrvd8r3u28t0
 Source Schema         : public

 Target Server Version : 100600
 File Encoding         : utf-8

 Date: 02/20/2019 09:37:49 AM
*/

-- ----------------------------
--  Table structure for apps
-- ----------------------------
DROP TABLE IF EXISTS "public"."apps";
CREATE TABLE "public"."apps" (
    "id" int8 NOT NULL DEFAULT nextval('apps_id_seq'::regclass),
    "attribute1" "public"."citext" COLLATE "default",
    "attribute2" "public"."citext" COLLATE "default",
    "attribute3" "public"."citext" COLLATE "default",
    "attribute4" "public"."citext" COLLATE "default",
    "attribute5" "public"."citext" COLLATE "default",
    "attribute6" "public"."citext" COLLATE "default",
    "attribute7" "public"."citext" COLLATE "default",
    "attribute8" "public"."citext" COLLATE "default",
    "attribute9" "public"."citext" COLLATE "default",
    "attribute10" "public"."citext" COLLATE "default",
    "attribute11" "public"."citext" COLLATE "default",
    "attribute12" "public"."citext" COLLATE "default",
    "attribute13" "public"."citext" COLLATE "default",
    "attribute14" "public"."citext" COLLATE "default",
    "attribute15" "public"."citext" COLLATE "default",
    "attribute16" "public"."citext" COLLATE "default",
    "attribute17" "public"."citext" COLLATE "default",
    "attribute18" "public"."citext" COLLATE "default",
    "attribute19" "public"."citext" COLLATE "default",
    "attribute20" "public"."citext" COLLATE "default",
    "attribute21" "public"."citext" COLLATE "default",
    "attribute22" "public"."citext" COLLATE "default",
    "attribute23" "public"."citext" COLLATE "default",
    "attribute24" "public"."citext" COLLATE "default",
    "attribute25" "public"."citext" COLLATE "default",
    "attribute26" "public"."citext" COLLATE "default",
    "attribute27" "public"."citext" COLLATE "default",
    "attribute28" "public"."citext" COLLATE "default",
    "attribute29" "public"."citext" COLLATE "default",
    "attribute30" "public"."citext" COLLATE "default",
    "attribute31" "public"."citext" COLLATE "default",
    "attribute32" "public"."citext" COLLATE "default",
    "attribute33" "public"."citext" COLLATE "default",
    "attribute34" "public"."citext" COLLATE "default",
    "attribute35" "public"."citext" COLLATE "default",
    "attribute36" "public"."citext" COLLATE "default",
    "attribute37" "public"."citext" COLLATE "default",
    "attribute38" "public"."citext" COLLATE "default",
    "attribute39" "public"."citext" COLLATE "default",
    "attribute40" "public"."citext" COLLATE "default",
    "attribute41" "public"."citext" COLLATE "default",
    "attribute42" "public"."citext" COLLATE "default",
    "attribute43" "public"."citext" COLLATE "default",
    "attribute44" "public"."citext" COLLATE "default",
    "attribute45" "public"."citext" COLLATE "default",
    "attribute46" "public"."citext" COLLATE "default",
    "attribute47" "public"."citext" COLLATE "default",
    "attribute48" "public"."citext" COLLATE "default",
    "attribute49" "public"."citext" COLLATE "default",
    "attribute50" "public"."citext" COLLATE "default",
    "created_at" timestamp(6) NOT NULL,
    "updated_at" timestamp(6) NOT NULL,
    "standard_id" int4 NOT NULL,
    "status" "public"."citext" COLLATE "default",
    "listing_id" int4 NOT NULL,
    "repository_id" int4 NOT NULL,
    "subscription_id" int4 NOT NULL,
    "attribute_info" "public"."hstore",
)
WITH (OIDS=FALSE);
ALTER TABLE "public"."apps" OWNER TO "ufn67drbuner1e";

-- ----------------------------
--  Primary key structure for table apps
-- ----------------------------
ALTER TABLE "public"."apps" ADD PRIMARY KEY ("id") NOT DEFERRABLE INITIALLY IMMEDIATE;

-- ----------------------------
--  Indexes structure for table apps
-- ----------------------------
CREATE INDEX  "app_listing_idx" ON "public"."apps" USING btree(listing_id "pg_catalog"."int4_ops" ASC NULLS LAST);
CREATE INDEX  "app_subscription_idx" ON "public"."apps" USING btree(subscription_id "pg_catalog"."int4_ops" ASC NULLS LAST);
CREATE UNIQUE INDEX  "apps_listing_idx" ON "public"."apps" USING btree(listing_id "pg_catalog"."int4_ops" ASC NULLS LAST, "id" "pg_catalog"."int8_ops" ASC NULLS LAST);
CREATE INDEX  "apps_repository_idx" ON "public"."apps" USING btree(repository_id "pg_catalog"."int4_ops" ASC NULLS LAST, subscription_id "pg_catalog"."int4_ops" ASC NULLS LAST);
CREATE INDEX  "listing_and_attr_idx" ON "public"."apps" USING btree(listing_id "pg_catalog"."int4_ops" ASC NULLS LAST, attribute1 COLLATE "default" "public"."citext_ops" ASC NULLS LAST, attribute2 COLLATE "default" "public"."citext_ops" ASC NULLS LAST, attribute3 COLLATE "default" "public"."citext_ops" ASC NULLS LAST);
CREATE INDEX  "listing_and_attr_idx" ON "public"."apps" USING btree(listing_id "pg_catalog"."int4_ops" ASC NULLS LAST, attribute1 COLLATE "default" "public"."citext_ops" ASC NULLS LAST, attribute2 COLLATE "default" "public"."citext_ops" ASC NULLS LAST, attribute3 COLLATE "default" "public"."citext_ops" ASC NULLS LAST);
CREATE INDEX  "listing_and_attr_idx" ON "public"."apps" USING btree(listing_id "pg_catalog"."int4_ops" ASC NULLS LAST, attribute1 COLLATE "default" "public"."citext_ops" ASC NULLS LAST, attribute2 COLLATE "default" "public"."citext_ops" ASC NULLS LAST, attribute3 COLLATE "default" "public"."citext_ops" ASC NULLS LAST);

标签: postgresqlherokucountheroku-postgres

解决方案


关于锁:它不是由SELECT您正在运行的查询引起的。

EXCLUSIVE LOCK我唯一的解释是它正在一个事务中运行,该事务已经完成了导致被占用的其他事情。

唯一好的理论是REFRESH MATERIALIZED VIEW CONCURRENTLYThan 在同一个事务中运行。诸如关系扩展锁(在事务期间不持有)或ALTER TYPE ... ADD VALUE(仅阻止其他此类语句)之类的其他事情看起来不像可能的嫌疑人。

我不知道 Heroku 在他们的 PostgreSQL 中内置了哪些讨厌的东西,但我敢打赌EXCLUSIVE LOCKSELECT.

但即使没有这种虚假锁,计算表中的行数也是缓慢且资源密集型的。

如果通常的估计 (pg_stat_get_live_tuples()pg_class.reltuples) 不够好,您可以使用触发器:

CREATE TABLE row_counter (
   reloid oid PRIMARY KEY,
   count bigint NOT NULL
);

CREATE FUNCTION count_trig() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   IF TG_OP = 'INSERT' THEN
      UPDATE row_counter
      SET count = count + 1
      WHERE reloid = TG_RELID;

      RETURN NEW;
   ELSIF TG_OP = 'DELETE' THEN
      UPDATE row_counter
      SET count = count - 1
      WHERE reloid = TG_RELID;

      RETURN OLD;
   END IF;
END;$$;

CREATE TRIGGER count_trig AFTER INSERT OR DELETE ON my_large_table
   FOR EACH ROW EXECUTE PROCEDURE count_trig();

您只需要在某个时候初始化表。

语句级触发器TRUNCATE留给读者作为练习。


推荐阅读