在大型关联上高效选择和区分

我有三个模型：Catalog、Product和Value。Value表有一个characteristic_id列，我想在一组values上获取不同characteristic_id的列表。

这些关系是：

一个catalog有很多products
一个product有很多values

这是我提出的查询：

Value.joins(:product).select(:characteristic_id).distinct.where(products: {catalog_id: catalog.id}).pluck(:characteristic_id)
=> [441, 2582, 3133]

返回正确的结果，但在包含一百万种产品(约 50 秒(的大型目录上非常慢。我找不到更有效的方法来做到这一点。

下面是查询的EXPLAIN：

=> EXPLAIN for: SELECT DISTINCT "values"."characteristic_id" FROM "values" INNER JOIN "products" ON "products"."id" = "values"."product_id" WHERE "products"."catalog_id" = $1 [["catalog_id", 1767]]
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
HashAggregate  (cost=1515106.82..1515109.15 rows=233 width=4)
Group Key: "values".characteristic_id
->  Hash Join  (cost=124703.76..1492245.65 rows=9144469 width=4)
Hash Cond: ("values".product_id = products.id)
->  Seq Scan on "values"  (cost=0.00..1002863.07 rows=34695107 width=8)
->  Hash  (cost=114002.20..114002.20 rows=652285 width=4)
->  Bitmap Heap Scan on products  (cost=12311.64..114002.20 rows=652285 width=4)
Recheck Cond: (catalog_id = 1767)
->  Bitmap Index Scan on index_products_on_catalog_id  (cost=0.00..12148.57 rows=652285 width=0)
Index Cond: (catalog_id = 1767)
(10 rows)

关于如何更快地运行此查询的任何想法？

确保两个外键都有索引：

索引"values"."product_id"
索引"products"."catalog_id"

尝试在values.characteristic_id上添加索引。
通常GROUP BY可能比DISTINCT快：

Value.joins(:p roduct(.where(products： {catalog_id： catalog.id}
(.select(：characteristic_id(.group(：characteristic_id(.pluck(：characteristic_id(

相关内容

最新更新

热门标签：