Django: prefetch related objects of a GenericForeignKey

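Suppose I have a model Box with a GenericForeignKey that points to either an Apple instance or a Chocolate instance. Apple and Chocolate, in turn, have ForeignKeys to Farm and Factory, respectively. I want to display a list of Boxes, for which I need to access Farm and Factory. How do I do this in as few DB queries as possible?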

Minimal example:

    class Farm(Model):
        ...

    class Apple(Model):
        farm = ForeignKey(Farm)
        ...

    class Factory(Model):
        ...

    class Chocolate(Model):
        factory = ForeignKey(Factory)
        ...

    class Box(Model):
        content_type = ForeignKey(ContentType)
        object_id = PositiveIntegerField()
        content_object = GenericForeignKey('content_type', 'object_id')
        ...

        def __unicode__(self):
            if self.content_type == ContentType.objects.get_for_model(Apple):
                apple = self.content_object
                return "Apple {} from Farm {}".format(apple, apple.farm)
            elif self.content_type == ContentType.objects.get_for_model(Chocolate):
                chocolate = self.content_object
                return "Chocolate {} from Factory {}".format(chocolate, chocolate.factory)

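Here are a few things I tried. In all these examples, N is the number of Boxes. The query count assumes that the ContentTypes for Apple and Chocolate have already been cached, so the get_for_model() calls do not hit the DB.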

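1) Naive:

    print [box for box in Box.objects.all()]

This does 1 (fetch boxes) + N (fetch Apple or Chocolate for each box) + N (fetch Farm for each Apple and Factory for each Chocolate) queries.

2) select_related doesn't help here, because Box.content_object is a GenericForeignKey.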

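3) As of Django 1.4, prefetch_related can fetch GenericForeignKeys.

    print [box for box in Box.objects.prefetch_related('content_object').all()]

This does 1 (fetch boxes) + 2 (fetch Apples and Chocolates for all boxes) + N (fetch Farm for each Apple and Factory for each Chocolate) queries.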

4) Apparently prefetch_related isn't smart enough to follow ForeignKeys of GenericForeignKeys. If I try:

    print [box for box in Box.objects.prefetch_related(
        'content_object__farm', 'content_object__factory').all()]

it rightfully complains that Chocolate objects don't have a farm field, and vice versa.

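5) I could do:

    apple_ctype = ContentType.objects.get_for_model(Apple)
    chocolate_ctype = ContentType.objects.get_for_model(Chocolate)
    boxes_with_apples = Box.objects.filter(
        content_type=apple_ctype).prefetch_related('content_object__farm')
    boxes_with_chocolates = Box.objects.filter(
        content_type=chocolate_ctype).prefetch_related('content_object__factory')

This does 1 (fetch boxes) + 2 (fetch Apples and Chocolates for all boxes) + 2 (fetch Farms for all Apples and Factories for all Chocolates) queries. The downside is that I have to merge and sort the two querysets (boxes_with_apples, boxes_with_chocolates) manually. In my real application, I'm displaying these Boxes in a paginated ModelAdmin. It's not obvious how to integrate this solution there. Maybe I could write a custom Paginator to do this caching transparently?

6) I could hack together something based on this that also does O(1) queries. But I'd rather not mess with internals (_content_object_cache) if I can avoid it.

In summary: printing a Box requires access to the ForeignKeys of a GenericForeignKey. How can I print N Boxes in O(1) queries? Is (5) the best I can do, or is there a simpler solution?

Bonus points: how would you refactor this DB schema to make such queries easier?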

You can manually implement something like prefetch_related and use Django's select_related method, which performs a join in the database query.

    apple_ctype = ContentType.objects.get_for_model(Apple)
    chocolate_ctype = ContentType.objects.get_for_model(Chocolate)
    boxes = Box.objects.all()
    content_objects = {}
    # apples
    content_objects[apple_ctype.id] = Apple.objects.select_related(
        'farm').in_bulk(
            [b.object_id for b in boxes if b.content_type == apple_ctype]
        )
    # chocolates
    content_objects[chocolate_ctype.id] = Chocolate.objects.select_related(
        'factory').in_bulk(
            [b.object_id for b in boxes if b.content_type == chocolate_ctype]
        )

This should make only 3 queries (the get_for_model queries are omitted). The in_bulk method returns a dict of the form {id: model}. So to get your content_object you need code like:

    content_obj = content_objects[box.content_type_id][box.object_id]

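However, I'm not sure this code will be quicker than your O(5) solution, because it requires an additional iteration over the boxes queryset and it also generates a query with a WHERE id IN (...) clause.

If you sort the boxes only by fields of the Box model, though, you can fill the content_objects dict after pagination. You then need to pass content_objects to __unicode__ somehow.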
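One possible way to do that last step, as a rough sketch only: once the page has been sliced, resolve each box's target from the per-type dicts and store it on the box under a custom attribute, so that __unicode__ can read it without another query. The code below uses plain-Python stand-ins instead of the real models. FakeBox, attach_content_objects, the fetchers mapping and the prefetched_object attribute are all hypothetical names, not part of Django; with real models each fetcher would wrap something like Apple.objects.select_related('farm').in_bulk.

```python
from collections import defaultdict


class FakeBox:
    """Stand-in for a Box row: only the two generic-key columns matter here."""

    def __init__(self, content_type_id, object_id):
        self.content_type_id = content_type_id
        self.object_id = object_id


def attach_content_objects(boxes, fetchers, attr="prefetched_object"):
    """Resolve the generic target of every box with one fetch per content type.

    boxes    -- an already evaluated (for example, paginated) list of boxes
    fetchers -- maps content_type_id to a callable that takes a list of ids
                and returns {id: object} (assumption: it mimics in_bulk)
    attr     -- hypothetical attribute name used to store the result
    """
    # Group the object ids by content type in a single pass over the boxes.
    ids_by_type = defaultdict(list)
    for box in boxes:
        ids_by_type[box.content_type_id].append(box.object_id)

    # One bulk fetch per content type that actually occurs on this page.
    objects_by_type = {
        ct_id: fetchers[ct_id](ids) for ct_id, ids in ids_by_type.items()
    }

    # Attach each resolved object to its box; a dangling reference gives None.
    for box in boxes:
        obj = objects_by_type[box.content_type_id].get(box.object_id)
        setattr(box, attr, obj)
    return boxes


# Usage with in-memory "tables" in place of database queries.
APPLE_CT, CHOCOLATE_CT = 1, 2
apples = {10: "Apple 10 from Farm A", 11: "Apple 11 from Farm B"}
chocolates = {10: "Chocolate 10 from Factory X"}

fetchers = {
    APPLE_CT: lambda ids: {i: apples[i] for i in ids if i in apples},
    CHOCOLATE_CT: lambda ids: {i: chocolates[i] for i in ids if i in chocolates},
}
page = [FakeBox(APPLE_CT, 10), FakeBox(CHOCOLATE_CT, 10), FakeBox(APPLE_CT, 11)]
for box in attach_content_objects(page, fetchers):
    print(box.prefetched_object)
```

The number of fetches depends on how many distinct content types appear on the page, not on the number of boxes, which is the property the answer is after.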

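How would you refactor this DB schema to make such queries easier?

We have a similar structure. We store content_type in Box, but instead of object_id and content_object we use ForeignKey(Box) in Apple and Chocolate. In Box we have a get_object method that returns the Apple or Chocolate model. In this case we can use select_related, but in most of our use cases we filter Boxes by content_type, so we have the same problem as in your fifth option. We started our project on Django 1.2, though, when there was no prefetch_related.

If you rename farm/factory to some common name, like creator, will prefetch_related work?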

About your option 6

I can't say anything against filling _content_object_cache. If you don't like to deal with internals, you can fill a custom attribute and then use:

    apple = getattr(self, 'my_custop_prop', None)
    if apple is None:
        apple = self.content_object
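A minimal sketch of that fallback, again with a plain-Python stand-in rather than the real model. The content_object property here only counts how often it is used, playing the role of the GenericForeignKey access that would hit the database on a cache miss. FakeBox, lookups and resolved_content are hypothetical names; my_custop_prop is the attribute from the snippet above.

```python
class FakeBox:
    """Stand-in for Box; content_object simulates the costly generic lookup."""

    def __init__(self, target):
        self._target = target
        self.lookups = 0  # how many times the "expensive" path was taken

    @property
    def content_object(self):
        # In Django this would be the GenericForeignKey access; here it only
        # records that the slow path was used.
        self.lookups += 1
        return self._target

    def resolved_content(self):
        # Prefer the object stored by the bulk-loading code, if there is one.
        obj = getattr(self, "my_custop_prop", None)
        if obj is None:
            obj = self.content_object
        return obj


plain = FakeBox("apple from the database")
filled = FakeBox("apple from the database")
filled.my_custop_prop = "apple loaded in bulk"

print(plain.resolved_content())   # falls back to content_object
print(filled.resolved_content())  # uses the pre-filled attribute
```

Because the fallback stays in place, boxes that were not bulk-loaded still work; they just pay for the per-object lookup.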
