Using annotate and distinct(field) together in Django



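My application has a lot of reviews, and users can "like" them.

I want to get the most-liked reviews. However, there are some popular users on the app, and all of their reviews have the most likes. I only want to select one review per user (preferably that user's most-liked one).

Here are my models: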

from django.contrib.auth.models import User  # assumption: the default auth User model
from django.core.validators import MaxValueValidator, MinValueValidator
from django.db import models


class Review(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='review_user', db_index=True)
    review_text = models.TextField(max_length=5000)
    rating = models.SmallIntegerField(
        validators=[
            MaxValueValidator(10),
            MinValueValidator(1),
        ],
    )
    date_added = models.DateTimeField(db_index=True)
    review_id = models.AutoField(primary_key=True, db_index=True)


class LikeReview(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='likereview_user', db_index=True)
    review = models.ForeignKey(Review, on_delete=models.CASCADE, related_name='likereview_review', db_index=True)
    date_added = models.DateTimeField()

    class Meta:
        unique_together = [['user', 'review']]

Here is what I currently use to get the most-liked reviews:

reviews = Review.objects.filter().annotate(
    num_likes=Count('likereview_review')
).order_by('-num_likes').distinct()

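As you can see, the reviews I get are ordered by number of likes, but the most-liked reviews may all come from the same user. I would like to add distinct('user') here, but then I get: annotate() + distinct(fields) is not implemented

How can I achieve this?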

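Because of your related names, this is going to be a bit hard to read. I would suggest changing Review.user.related_name to reviews, which makes things easier to understand; I cover that in detail in the second part of this answer.

With your current setup, I managed to do this entirely in the DB using subqueries: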

from django.db.models import Count, OuterRef, Subquery

# No DB queries: a lazy queryset of one user's reviews, most liked first.
# OuterRef('id') refers to the id of the User row in the outer query.
best_reviews_per_user = (
    Review.objects.all()
    .annotate(num_likes=Count('likereview_review'))
    .order_by('-num_likes')
    .filter(user=OuterRef('id'))
)
# No DB queries
review_sq = Subquery(best_reviews_per_user.values('review_id')[:1])
# No DB queries yet: this queryset is lazy and, when passed to __in below,
# it is embedded as a nested subquery instead of being run on its own
best_review_ids = (
    User.objects.all()
    .annotate(best_review_id=review_sq)
    .values_list('best_review_id', flat=True)
)
# The DB query runs when best_reviews is first iterated
best_reviews = (
    Review.objects.all()
    .annotate(num_likes=Count('likereview_review'))
    .order_by('-num_likes')
    .filter(review_id__in=best_review_ids)
    .exclude(num_likes=0)  # I assume this is the case
)

# Print it
for review in best_reviews:
    print(review, review.num_likes, review.user)
# Test it
assert len({review.user for review in best_reviews}) == len(best_reviews)
assert sorted([r.num_likes for r in best_reviews], reverse=True) == [r.num_likes for r in best_reviews]
assert all([r.num_likes for r in best_reviews])
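To see the correlated-subquery idea outside of the ORM, here is a minimal sketch using the standard-library sqlite3 module. The table names, column names and sample rows are assumptions made up for illustration, and the SQL is a hand-written approximation of the technique, not the statement Django generates for the querysets above.

```python
import sqlite3

# Hypothetical, simplified schema: these names are assumptions for
# illustration and do not match the tables Django would create.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE review (
        review_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL
    );
    CREATE TABLE likereview (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        review_id INTEGER NOT NULL REFERENCES review(review_id),
        UNIQUE (user_id, review_id)
    );
    -- User 1 wrote reviews 1 and 2, user 2 wrote review 3, user 3 wrote review 4.
    INSERT INTO review (review_id, user_id) VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    -- Review 1 has 3 likes, review 2 has 2, review 3 has 1, review 4 has none.
    INSERT INTO likereview (user_id, review_id) VALUES
        (10, 1), (11, 1), (12, 1),
        (10, 2), (11, 2),
        (10, 3);
""")

BEST_PER_USER_SQL = """
    SELECT r.review_id, r.user_id, COUNT(l.id) AS num_likes
    FROM review AS r
    -- The inner join drops reviews without likes, like exclude(num_likes=0).
    JOIN likereview AS l ON l.review_id = r.review_id
    WHERE r.review_id = (
        -- Correlated subquery: the most-liked review by the outer row's author.
        SELECT r2.review_id
        FROM review AS r2
        LEFT JOIN likereview AS l2 ON l2.review_id = r2.review_id
        WHERE r2.user_id = r.user_id
        GROUP BY r2.review_id
        ORDER BY COUNT(l2.id) DESC, r2.review_id
        LIMIT 1
    )
    GROUP BY r.review_id, r.user_id
    ORDER BY num_likes DESC
"""

best_reviews = conn.execute(BEST_PER_USER_SQL).fetchall()
for review_id, user_id, num_likes in best_reviews:
    print(review_id, user_id, num_likes)
```

With this sample data, user 1 contributes only review 1 even though review 2 has more likes than anything written by the other users, and user 3 drops out because their only review has no likes.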

Now let's try this equivalent model structure instead:

from django.contrib.auth.models import User  # assumption: the default auth User model
from django.db import models
from django.utils import timezone


class TimestampedModel(models.Model):
    """This makes your life much easier and is pretty DRY"""
    created = models.DateTimeField(default=timezone.now)

    class Meta:
        abstract = True


class Review(TimestampedModel):
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='reviews', db_index=True)
    text = models.TextField(max_length=5000)
    rating = models.SmallIntegerField()
    likes = models.ManyToManyField(User, through='ReviewLike')


class ReviewLike(TimestampedModel):
    user = models.ForeignKey(User, on_delete=models.CASCADE, db_index=True)
    review = models.ForeignKey(Review, on_delete=models.CASCADE, db_index=True)

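Likes are an explicit m2m relationship between reviews and users, with an extra timestamp column. This is the textbook use case for a through model; see the Django documentation on through models.

Everything is much easier to read now.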

from django.db.models import Count, OuterRef, Subquery

# No DB queries
best_reviews = (
    Review.objects.all()
    .annotate(like_count=Count('likes'))
    .exclude(like_count=0)
    .order_by('-like_count')
)
# No DB queries
sq = Subquery(best_reviews.filter(user=OuterRef('id')).values('id')[:1])
# No DB queries yet: lazy, and embedded as a nested subquery by __in below
user_distinct_best_review_ids = (
    User.objects.all()
    .annotate(best_review=sq)
    .values_list('best_review', flat=True)
)
# The DB query runs when best_reviews is evaluated (e.g. iterated)
best_reviews = best_reviews.filter(id__in=user_distinct_best_review_ids).all()

Here is another approach:

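  1. Get a list of (user.id, review.id) tuples, ordered by user and then by number of likes, ASCENDING.
  2. Convert the list to a dictionary to remove duplicate user ids. Later items replace earlier ones, which is why the ordering in step 1 matters.
  3. Build a list of review ids from the values of the dictionary.
  4. Use the list of review ids to get a queryset, ordered by number of likes, descending.
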
from django.db.models import Count

# Steps 1-3: one (user id, review pk) row per review; dict() keeps the last pk per user
user_review_list = (
    Review.objects
    .annotate(num_likes=Count('likereview_review'))
    .order_by('user', 'num_likes')
    .values_list('user', 'pk')
)
user_review_dict = dict(user_review_list)
review_pk_list = list(user_review_dict.values())
# Step 4: fetch the selected reviews, most liked first
reviews = (
    Review.objects
    .annotate(num_likes=Count('likereview_review'))
    .filter(pk__in=review_pk_list)
    .order_by('-num_likes')
)
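
Step 2 relies on how dict() treats repeated keys, which is easy to check without a database. The sketch below uses made-up (user id, review pk) pairs in place of the values_list() result; the ids and like counts are assumptions for illustration only.

```python
# Hypothetical (user_id, review_pk) pairs, already ordered by user and then by
# ascending like count, in the shape values_list('user', 'pk') returns.
user_review_list = [
    (1, 11),  # user 1, 0 likes
    (1, 12),  # user 1, 5 likes
    (2, 21),  # user 2, 2 likes
    (3, 31),  # user 3, 1 like
    (3, 32),  # user 3, 4 likes
]

# dict() keeps the last value seen for each key, so every user ends up mapped
# to the pk of their most-liked review.
user_review_dict = dict(user_review_list)
review_pk_list = list(user_review_dict.values())
print(review_pk_list)
```

Two caveats apply to this approach. If a user has two reviews with the same number of likes, which one wins depends on the order the database returns them in; adding 'pk' as a final order_by key would make that deterministic. Also, the first query pulls one row per review into Python, so on a very large table the subquery approach may be the better fit.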
