This was my problem for Dynastie (a static blog generator). I have a main super class Post and a derived class Draft that directly inherit from the first one.
class Post(models.Model):
title = models.CharField(max_length=255)
category = models.ForeignKey(Category, blank=True, null=True, on_delete=models.SET_NULL)
creation_date = models.DateTimeField()
modification_date = models.DateTimeField()
author = models.ForeignKey(User, null=True, on_delete=models.SET_NULL)
description = models.TextField(max_length=255, blank=True)
...
class Draft(Post):
pass
A draft is a note that will not be published soon. When it's published, the creation date is reset. Using POO and inheritance, it's quick and easy to model this behavior. Nevertheless, there is one problem. When I do Post.objects.all() I get all Post objects + all Draft objects, which is not what I want !!
The trick to obtain only Post is a mix with Python and Django mechanisms called Managers. Managers are at the top level of QuerySet construction. To solve our problem, we'll override the models.Model attribute objects (which is a models.Manager).
Inside inheritance
To find a solution, we need to know what exactly happens when we do inheritance. The best thing to do, is to inspect the database.
CREATE TABLE "dynastie_post" (
"id" integer NOT NULL PRIMARY KEY,
"title" varchar(255) NOT NULL,
"category_id" integer REFERENCES "dynastie_category" ("id"),
"creation_date" datetime NOT NULL,
"modification_date" datetime NOT NULL,
"author_id" integer REFERENCES "auth_user" ("id"),
"description" text NOT NULL);
CREATE TABLE "dynastie_draft" (
"post_ptr_id" integer NOT NULL PRIMARY KEY REFERENCES "dynastie_post" ("id")
);
We can see that dynastie_draft has a reference to the dynastie_post table. So, doing Post.objects.all() is like writing "SELECT * from dynastie_post" that includes Post part of drafts.
Solution 1 : Without altering base class
The first solution is to create a Manager that will exclude draft id from the request. It has the advantage to keep base class as is, but it's not efficient (especially if there is a lot of child objects).
class PostOnlyManager(models.Manager):
def get_queryset(self):
query_set = super(PostOnlyManager, self).get_queryset()
drafts = Draft.objects.all().only("id")
return query_set.exclude(id__in=[draft.id for draft in drafts])
class Post(models.Model):
objects = PostOnlyManager()
class Draft(Post):
objects = models.Manager()
With this solution, we do two requests at each access. Plus, it's necessary to know every sub class we want to exclude. We have to keep the BaseManager for all subclasses. You can note the use of only method to limit the query and de serialization to minimum required.
Solution 2 : With altering base class
The solution here is to add a field called type that will be filtered in the query set. It's the recommended one in the Django documentation.
class PostOnlyManager(models.Manager):
def get_query_set(self):
return super(PostOnlyManager, self).get_queryset().filter(post_type='P')
class Post(models.Model):
objects = PostOnlyManager()
post_type = models.CharField(max_length=1, default='P')
class Draft(Post):
objects = models.Manager()
@receiver(pre_save, sender=Draft)
def pre_save_draft_signal(sender, **kwargs):
kwargs['instance'].post_type = 'D'
The problem here is the add of one field which increase database size, but filtering is easier and is done in the initial query. Plus, it's more flexible with many classes. I used a signal to setup post_type value, didn't found a better solution for now.
Conclusion
Depending on your constraints you can use the solution 1 or 2. These solutions can also be extended to a more complex filtering mechanism by dividing your class tree in different families.