Background
When you update or delete an instance of a Django model that has an image field, the existing image file is left untouched. This is actually the behavior of FileField, which ImageField inherits from, and it's a feature, not a bug. According to PEP 20:
Explicit is better than implicit.
Rather than deleting files as a side effect of the primary action on the database record, Django leaves that logic for you to implement explicitly.
Why do we care? Without manually handling file and image deletions, these objects will become orphaned when the database records they're associated with are updated or deleted. Orphaned objects serve no purpose apart from the ability to revert to a previous state. Furthermore, objects not in use will incur costs and consume resources that could otherwise be deployed productively.
Signals
The signals we'll connect to are pre_delete and pre_save. We want to use the "pre" signals because we need to be able to query the database for the object before the changes are written. This gives us easy access to the original image file so we can delete it, and it lets us compare the old image URL to the new one to see whether it has been modified.
There are other options, in theory. The pre_save signal has an update_fields argument that could be used to test whether the image field has been modified. However, this requires the caller to pass update_fields to the model's save() method explicitly, and it defaults to None if not set. Considering this could lead to unexpected behavior, I prefer to simply compare the old object to the new one.
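For comparison, here is a minimal sketch of what an update_fields check might look like. The helper name image_may_have_changed and the field name "image" are assumptions for illustration and are not part of Django. Note that it has to treat None as "possibly changed", which is exactly the ambiguity described above.

```python
def image_may_have_changed(update_fields, field_name="image"):
    """Return True unless update_fields rules out a change to field_name.

    update_fields is the value a pre_save receiver finds in its kwargs:
    None when save() was called without update_fields, otherwise a
    collection of field names.
    """
    if update_fields is None:
        # save() was called without update_fields, so we cannot tell
        return True
    return field_name in update_fields


print(image_may_have_changed(None))                  # True
print(image_may_have_changed(frozenset({"title"})))  # False
print(image_may_have_changed(frozenset({"image"})))  # True
```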
Last thing I'll mention is the Django documentation actually discourages the use of signals. The idea is that signals can introduce behavior that might not be expected since it's out of the normal flow and can make your code more difficult to maintain. You can place a signals file anywhere and therefore another developer may not even be aware it exists.
Instead, the recommendation is to use a custom manager or override model methods. That said, the situation discussed in this article is one I still often use signals for.
Receiver Functions
First, some clarification if you're new to signals. Often, we're not creating custom "signals." It's more likely that we're instead implementing logic in "receiver functions."
These functions take the form:
def my_callback(sender, **kwargs):
    print("Request finished!")
The function accepts the "sender" argument and wildcard keyword arguments. The sender is any Python object (often a model class, but not exclusively) that emits a signal, and the kwargs are specific to the signal being sent. Even if the signal isn't documented to send any extra arguments, the receiver still needs to accept **kwargs or an exception will be raised.
Connecting Receivers
We connect our receivers in one of two ways: manual connection or using a decorator.
If you're using a manual connection approach, it's as simple as importing the signal and passing the receiver function to the signal's connect method. This should be done within the ready() method of an application configuration class.
Alternatively, you can use a decorator to wrap the receiver function, which is the approach I most often take. Simply import the signals module within the ready() method and you'll be good to go.
Delete
Handling deletion is the simpler of the two signals we need to implement. We know the record is being destroyed (unless it's a soft delete), so all we need to concern ourselves with is eliminating the file.
from django.db.models.signals import pre_delete
from django.dispatch import receiver

from core.models import Article


# user deletes image
@receiver(pre_delete, sender=Article)
def delete_image(instance, **_kwargs):
    if instance.image:
        instance.image.delete(save=False)
In this logic, we catch the pre_delete signal dispatched for the Article model with the @receiver decorator. Django calls the receiver with the Article class as the sender and the object being deleted as the instance keyword argument; the remaining keyword arguments are collected with the wildcard. I use an underscore prefix on kwargs to suppress IDE warnings that the argument isn't actually used in my logic. Here, all we need is the instance.
Within the function block, we first check that the field actually has a file so we're not calling delete() on a nonexistent file. If an image is found, we delete it. Additionally, we pass save=False to delete(). This prevents the model instance from being saved after the file is removed. Remember, this instance is already on its way to being deleted since this is the pre_delete signal. We don't want an extra write to a record that is about to disappear.
Update
Next, we want to cover when an instance is updated and an image file may be affected. We'll use the pre_save signal for this case. But first, there are additional considerations for updates.
First, I don't want this to be triggered when installing fixtures. Next, I want to make sure this isn't running when new objects are being created. And last, I only want this to run if the image field on the instance is being affected, given that the signal will be triggered on any call to save().
import logging

from django.db.models.signals import pre_save
from django.dispatch import receiver

from core.models import Article

logger = logging.getLogger(__name__)


# user updates image
@receiver(pre_save, sender=Article)
def change_image(instance, **kwargs):
    if kwargs.get('raw', False):
        # don't run on fixtures
        return

    # noinspection PyProtectedMember
    if instance._state.adding:
        # don't run if we don't have an update
        return

    try:
        # fetch the record as it currently exists in the database
        old_article = Article.objects.get(id=instance.id)
        # the old file is orphaned if the image was cleared or replaced;
        # checking instance.image first avoids calling .url on an empty field
        if old_article.image and (
                not instance.image
                or old_article.image.url != instance.image.url):
            old_article.image.delete(save=False)
    except Article.DoesNotExist:
        logger.error(f"Unable to retrieve article that should exist for "
                     f"ID {instance.id}.")
A simple way to exit the function if fixtures are being loaded is to check "raw" in kwargs. If it's True, the model is being saved exactly as presented, which is what happens when a fixture is processed. We don't want our signal to run because one, it's not an update and two, related fields may not be available and therefore exceptions could be raised.
Next, I check if the instance is being created. A simple way to do this is with the _state.adding flag present on the instance. Because this is a protected member, I added # noinspection PyProtectedMember to prevent my PyCharm IDE from complaining.
Last, I run the file deletion within a try/except block. The object is queried from the database and compared to the instance containing modified fields. If the URL on the database object doesn't match the instance being saved, then I know I have an update and the old file should be deleted or else it will be orphaned.
An object should always be available from the database. But just in case, I catch the error and log a message to help debug what occurred.
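The comparison at the heart of the receiver can also be isolated as a small pure function, which makes the cases easier to reason about and to test without a database. This is only a sketch: should_delete_old_file is a hypothetical helper, and it takes plain strings (empty when no file is set) rather than Django field files.

```python
def should_delete_old_file(old_url, new_url):
    """Decide whether the previously stored file is about to be orphaned."""
    if not old_url:
        # nothing was stored before, so there is nothing to clean up
        return False
    # the old file is orphaned when the image was cleared or replaced
    return old_url != new_url


print(should_delete_old_file("", "/media/a.png"))              # False
print(should_delete_old_file("/media/a.png", "/media/a.png"))  # False
print(should_delete_old_file("/media/a.png", "/media/b.png"))  # True
print(should_delete_old_file("/media/a.png", ""))              # True
```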
Final Thoughts
Signals are a great way to add functionality to loosely coupled applications, and there are also times when I find them useful for extending Django's built-in behavior. The above demonstrates how I use signals to ensure that image files aren't orphaned so that resources are deployed most productively.