postgresql 按关联值与从左外联接进行排序会产生重复项

9wbgstp7  于 2023-06-29  发布在  PostgreSQL
关注(0)|答案(1)|浏览(174)

我得到重复的,有时无序的结果与做左外加入。
我认为问题在于有许多artwork_colors可以匹配WHERE条件。但是我只需要artwork的artwork_color具有最高的pixel_percent(也匹配WHERE子句),并使用该值对artwork进行排序。当然,每件作品在结果中只出现一次。
如何重写ActiveRecord查询以正确排序结果,同时避免重复?(这个查询必须使用ActiveRecord进行,这样我就可以将其插入到现有的分页系统中,等等)
针对此问题简化的查询:

# color_ids is an group of ids of colors that are similar to the colors being searched for
# for example if you search for FF0000 it will include similar shades of red

SELECT DISTINCT
    "artwork_colors"."pixel_percent",
    artworks.*
FROM
    "artworks"
    LEFT OUTER JOIN "artwork_colors" ON "artwork_colors"."artwork_id" = "artworks"."id"
WHERE
    "artwork_colors"."color_id" IN(106, 108, 119, 120, 128, 133, 156, 160)
ORDER BY
    "artwork_colors"."pixel_percent" DESC
LIMIT 120 OFFSET 0;

ActiveRecord中的原始查询:

artworks
  .includes(:artwork_colors)
  .where('artwork_colors.color_id': color_ids)
  .order(pixel_percent: :desc)
  .select('artworks.*', 'artwork_colors.pixel_percent')
  .distinct

相关模型和表格:

class Artwork < ApplicationRecord
  has_many :artwork_colors, dependent: :destroy
  has_many :colors, through: :artwork_colors
end

class ArtworkColor < ApplicationRecord
  belongs_to :artwork
  belongs_to :color
end

CREATE TABLE public.artwork_colors (
  id bigint NOT NULL,
  pixel_percent double precision, # this is the desired sort column
  artwork_id bigint,
  color_id bigint
);

class Color < ApplicationRecord
  # These colors were extracted from the Artwork images by Amazon Rekognition, an image analysis tool
  has_many :artwork_colors, dependent: :destroy
  has_many :artworks, through: :artwork_colors
end

# h s l are hue, saturation, lightness (the color value)
CREATE TABLE public.colors (
  id bigint NOT NULL,
  h double precision,
  s double precision,
  l double precision
);
n8ghc7c1

n8ghc7c11#

如果我对目标的理解正确的话,你正在寻找每个Artwork,按最高的artwork_colors.pixel_percent排序。
为了实现这一点,您需要一个子查询来查找每个artwork的最大pixel_percent,然后将该子查询连接到artworks表。
这可以如下实现:

subquery = ArtworkColors.select(
             :artwork_id,
             ArtworkColors.arel_table[:pixel_percent].maximum.as('highest_pixel_percent'))
          .where(color_id: color_ids)
          .group(:artwork_id)
          .arel.as('artwork_colors')

Artwork.joins(
  Arel::Nodes::OuterJoin.new(
    subquery,
    subquery.create_on(subquery[:artwork_id].eq(Artwork.arel_table[:id]))
))
.select(Artwork.arel_table[Arel.star], subquery[:highest_pixel_percent])
.order(subquery[:highest_pixel_percent].desc)

这将生成以下查询:

SELECT 
  artworks.*, 
  artwork_colors.highest_pixel_percent
FROM 
  artworks
  LEFT OUTER JOIN 
   ( SELECT 
       artwork_colors.artwork_id, 
       MAX(artwork_colors.pixel_percent) AS highest_pixel_percent 
     FROM 
       artwork_colors 
     WHERE 
       artwork_colors.color_id IN (106, 108, 119, 120, 128, 133, 156, 160)
     GROUP BY 
       artwork_colors.artwork_id
   ) artwork_colors ON artwork_colors.[artwork_id] = [artworks].[id]
ORDER BY 
  artwork_colors.highest_pixel_percent DESC

由于这是外部连接,因此“highest_pixel_percent”可能为NULL。如果您希望将其表示为0,则可以将subquery[:highest_pixel_percent]更改为:

Arel::Nodes::NamedFunction.new(
  'ISNULL',
  [subquery[:highest_pixel_percent],0]
).as('highest_pixel_percent')

这将导致ISNULL(artwork_colors.highest_pixel_percent,0) AS highest_pixel_percent

相关问题