Background
I generally start off by deploying a project with a single S3-compatible storage bucket. In the beginning of a project's life cycle the volume of files you need to store is often low so there's no need to overcomplicate things. There comes a time, however, when breaking apart the handling of media files and static files into separate buckets makes sense.
Is a multi-bucket approach right for your project? Consider the tradeoffs presented in the following table.
Aspect | Single Bucket | Multi-Bucket
---|---|---
Complexity | Low (single config) | Higher
Control | Limited (shared settings) | Granular
Cost | Undetermined* | Undetermined*
CDN Integration | Possible, but uniform | Bucket-level flexibility
*Adding buckets likely won't materially increase cost since cloud providers generally bill by usage and not bucket count.
Maybe you want more control over configuration, or perhaps you're interested in serving the resources via a content delivery network (CDN). Whatever the reason, this article explores how to accomplish this in Django using the boto3 and django-storages libraries.
Dependencies
This discussion assumes Django 5.2, boto3 1.38.46, and django-storages 1.14.6.
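If you pin dependencies in a requirements file, the relevant entries would look something like this (the pinning style is up to you):

# requirements.txt
Django==5.2
boto3==1.38.46
django-storages==1.14.6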
Single Bucket Approach
First, let's take a look at how we'd approach a single-bucket setup in Django so we have a starting point. To keep things simple, I'm going to assume we're using the same backend(s) for all environments. That is, we're not going to treat file storage differently between development and production.
Next, we need to select a storage backend. I'm using the django-storages S3Boto3Storage backend in my setup. This backend needs a couple of global settings.
# settings.py
import os

AWS_ACCESS_KEY_ID = os.environ.get('STORAGE_ACCESS_KEY')
AWS_SECRET_ACCESS_KEY = os.environ.get('STORAGE_SECRET_KEY')
AWS_STORAGE_BUCKET_NAME = '<bucket_name>'
# Setting region_name avoids AuthorizationQueryParametersError errors.
AWS_S3_REGION_NAME = '<region>'
# The endpoint URL overrides region and use_ssl.
AWS_S3_ENDPOINT_URL = '<protocol>://<region>.<storage_provider>'
AWS_S3_OBJECT_PARAMETERS = {
    'CacheControl': 'max-age=86400',
    'ACL': 'public-read',
}
Rather than use the S3Boto3Storage backend class directly in the Django STORAGES setting, you'll often see people subclassing it in a separate file. This is because Django <4.2 required subclassing to override global settings. Let's say you want static and media files to be nested in their own bucket directories.
# app/backends.py
from storages.backends.s3boto3 import S3Boto3Storage


class StaticRootS3BotoStorage(S3Boto3Storage):
    location = 'static'


class MediaRootS3BotoStorage(S3Boto3Storage):
    location = 'media'
In pre-4.2 versions of Django, these backends would then be wired in with the STATICFILES_STORAGE and DEFAULT_FILE_STORAGE settings.
# settings.py
STATICFILES_STORAGE = 'app.backends.StaticRootS3BotoStorage'
DEFAULT_FILE_STORAGE = 'app.backends.MediaRootS3BotoStorage'
In Django 4.2, this approach was deprecated in favor of the new STORAGES setting, and in 5.1 support was removed entirely. Now we use the STORAGES setting instead.
# settings.py
STORAGES = {
    'default': {
        'BACKEND': 'app.backends.MediaRootS3BotoStorage',
    },
    'staticfiles': {
        'BACKEND': 'app.backends.StaticRootS3BotoStorage',
    },
}
This change gives us flexibility to extend storage backends that we didn't previously have. Specifically, we now have the OPTIONS key, which passes a dictionary of keyword arguments to the backend's __init__ method. Instead of subclassing the backend and setting separate class attributes to customize location as I did previously, we can pass this value through OPTIONS and avoid the subclassing altogether.
# settings.py
STORAGES = {
    'default': {
        'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage',
        'OPTIONS': {
            'location': 'media',
        },
    },
    'staticfiles': {
        'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage',
        'OPTIONS': {
            'location': 'static',
        },
    },
}
The result of this will be two directories in our bucket: "static" and "media."
.
├── media
└── static
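As a quick sanity check, you can save a file through the default storage from a Django shell and confirm it lands under the media/ prefix. Note that url() returns a presigned URL by default, so expect auth query parameters (elided below), and the exact host shape depends on your provider:

>>> from django.core.files.base import ContentFile
>>> from django.core.files.storage import default_storage
>>> default_storage.save('example.txt', ContentFile(b'hello'))
'example.txt'
>>> default_storage.url('example.txt')  # note the media/ prefix in the path
'<endpoint>/<bucket_name>/media/example.txt?...'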
Pretty cool. But instead, let's update this approach to use separate buckets.
Multi-Bucket Approach
Instead of having a single bucket with directories for "media" and "static" files, we're going to use independent buckets to give us greater control. First, let's update our global settings. Because we're now setting the bucket name for each backend, we no longer need a global bucket name; the AWS_STORAGE_BUCKET_NAME setting can be removed.
# settings.py
import os

AWS_ACCESS_KEY_ID = os.environ.get('STORAGE_ACCESS_KEY')
AWS_SECRET_ACCESS_KEY = os.environ.get('STORAGE_SECRET_KEY')
# AWS_STORAGE_BUCKET_NAME = '<bucket_name>'  # <-- remove
# Setting region_name avoids AuthorizationQueryParametersError errors.
AWS_S3_REGION_NAME = '<region>'
# The endpoint URL overrides region and use_ssl.
AWS_S3_ENDPOINT_URL = '<protocol>://<region>.<storage_provider>'
AWS_S3_OBJECT_PARAMETERS = {
    'CacheControl': 'max-age=86400',
    'ACL': 'public-read',
}
Next, let's discuss what the buckets should be named. If you're using a single cloud provider and a single storage location, I'd stick to names that communicate information about your project, prefixed with something specific to your organization, since bucket names must be globally unique. Consider the following structure.
<organization>-<project>-<environment>-<purpose>
For example, this could be acme-myapp-prod-static. This tells me that the bucket is used with the project "myapp," in production, and its purpose is to serve static files. And again, the bucket is prefixed with "acme," the name of my hypothetical organization, to ensure the name is globally unique.
If there are other variables that communicate useful information, include them in your naming convention. For example, if you're using multiple storage providers across multiple regions, I'd include that data in the name as well. I'm using one region with one provider, so I'll keep mine simple and the bucket name short.
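To keep the convention consistent across environments, a small helper can build the names for you. This is a hypothetical sketch; the function and its fields aren't part of django-storages:

# naming.py (hypothetical helper for the convention above)
def bucket_name(organization: str, project: str, environment: str, purpose: str) -> str:
    """Compose a bucket name following <organization>-<project>-<environment>-<purpose>."""
    return f'{organization}-{project}-{environment}-{purpose}'

# e.g. bucket_name('acme', 'myapp', 'prod', 'static') -> 'acme-myapp-prod-static'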
After these buckets have been created with our cloud provider, we can go ahead and update our STORAGES setting. We won't need location set anymore since our buckets are already purpose-specific. Additionally, we'll add bucket_name to both of our storage configurations, pointing each to its respective bucket.
# settings.py
STORAGES = {
    'default': {
        'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage',
        'OPTIONS': {
            'bucket_name': 'acme-myapp-prod-media',
        },
    },
    'staticfiles': {
        'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage',
        'OPTIONS': {
            'bucket_name': 'acme-myapp-prod-static',
        },
    },
}
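You can confirm each backend picked up the right bucket from a Django shell, using the storages handler introduced in Django 4.2:

>>> from django.core.files.storage import storages
>>> storages['default'].bucket_name
'acme-myapp-prod-media'
>>> storages['staticfiles'].bucket_name
'acme-myapp-prod-static'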
With bucket_name configured, we're good to go. For static files, make sure you run python manage.py collectstatic to re-populate your static files in the newly created bucket. For media files, you're going to need to transfer the existing objects yourself, either manually through a shell or with your cloud provider's UI if they offer a migration tool. Even if they do, I still recommend the shell approach, sketched below.
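Here's a minimal sketch of that shell-based transfer using boto3, assuming the old media files live under a media/ prefix in the original bucket. The bucket names, prefix, and environment variables are assumptions to adapt to your setup:

# migrate_media.py
import os

import boto3

s3 = boto3.client(
    's3',
    endpoint_url=os.environ.get('STORAGE_ENDPOINT_URL'),
    aws_access_key_id=os.environ.get('STORAGE_ACCESS_KEY'),
    aws_secret_access_key=os.environ.get('STORAGE_SECRET_KEY'),
)

SOURCE_BUCKET = '<bucket_name>'        # the original single bucket
DEST_BUCKET = 'acme-myapp-prod-media'  # the new media bucket
PREFIX = 'media/'                      # media files lived under this prefix

# Page through every object under the media/ prefix and copy it
# server-side into the new bucket, dropping the prefix from the key.
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=SOURCE_BUCKET, Prefix=PREFIX):
    for obj in page.get('Contents', []):
        key = obj['Key']
        s3.copy_object(
            Bucket=DEST_BUCKET,
            Key=key[len(PREFIX):],
            CopySource={'Bucket': SOURCE_BUCKET, 'Key': key},
        )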
CDN
Rather than serve static and media files from the bucket's URL, we can serve our files using a CDN and a custom domain. This means we can deliver a file like main.css from something like static.acme.com/main.css instead of <bucket-region>.<cloud-provider>.com/<bucket-name>/main.css. But the benefit isn't just aesthetic: serving files through a CDN often means quicker delivery and thus faster page loads, which in turn improve user experience. Let's see how to set up our project for a CDN.
This discussion is limited to the application-level setup of CDNs. Refer to your cloud provider's documentation on how to implement CDNs for your storage buckets.
Back in our STORAGES setting, all we need to do is set custom_domain for each bucket. If both buckets sat behind the same domain, we could set this value once globally with AWS_S3_CUSTOM_DOMAIN. However, each bucket will have its own domain, so we'll populate the OPTIONS dictionary for each bucket instead.
# settings.py
STORAGES = {
    'default': {
        'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage',
        'OPTIONS': {
            'bucket_name': 'acme-myapp-prod-media',
            'custom_domain': 'media.acme.com',
        },
    },
    'staticfiles': {
        'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage',
        'OPTIONS': {
            'bucket_name': 'acme-myapp-prod-static',
            'custom_domain': 'static.acme.com',
        },
    },
}
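With custom_domain set, the backend stops generating presigned bucket URLs and instead builds plain URLs on the custom domain. A quick check from the Django shell:

>>> from django.core.files.storage import storages
>>> storages['staticfiles'].url('main.css')
'https://static.acme.com/main.css'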
That's all we need to do at the application level, and we're now ready to serve files using a CDN.
Final Thoughts
Using multiple buckets gives you greater control over how files are served from cloud storage providers. In this article we looked at how to migrate from a single-bucket setup to a multi-bucket setup where media and static files are served from different locations. Furthermore, we structured our storage so that our files are served from a custom domain over a CDN rather than the unwieldy URLs that come with delivering files directly from a bucket. Our files will also load faster as a result.
Whether you're scaling a growing application or simply seeking more granular configuration options, this approach lays a solid foundation for efficient file management in Django. Give it a try in your next deployment, and enjoy the improved flexibility and speed!