Content Components to Consider When Designing a Blog

What types of content structure to consider, how to break it apart and considerations on indexing content.

Background

When beginning a blog project, you have few constraints on how to handle content because, well, there isn't much content yet, if any. The obvious place to start is with a model for an article/post. But you'll quickly find there are other types of content that need to be treated differently.

This article is a discussion on how I've chosen to break apart different content and how I've grouped or otherwise indexed the content to make retrieval better.

Content Types

In my experience with blogs, and in my experience building this particular project, I've found that content tends to fit into a handful of different buckets. Some of these will be obvious and others perhaps less so. Don't overthink your structure or over-engineer your project. If I don't have an initial need for a content type, I don't spend any time building features around it.

Let's dig in.

Pages

Before you begin writing blog posts, you're going to need to build out a few primary components for your website. At the very least you'll need a home page and pages for your "privacy policy" as well as for "terms and conditions." Additionally, you may have other high-level pages like an "about" page, and so on.

You'll likely want at least some of the content presented on pages like these stored in a database and not statically coded. That way, you can use whatever backend you have to make changes without needing to update source code.

If you're using Django, then you have the Flatpages app at your disposal. This is a really powerful tool and I highly recommend using it if you're building with Django. I wrote two earlier articles on past implementations with Flatpages.

If you're not using Django, you can easily build your own data model since there aren't a lot of moving parts. At the most basic level, you just need a field to store content and another to store the URL. And depending on your setup, you may not even need the URL field.

Other fields you could include in a pages model are fields for meta tag data, a title, a subtitle, etc.
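To make the shape of a pages model concrete, here is a minimal, framework-agnostic sketch. The field names, the in-memory PAGES dictionary standing in for a database table, and the helper functions are all assumptions for illustration, not how Flatpages or my project implements it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Page:
    """A database-backed page. All field names here are illustrative."""
    url: str                     # path the page is served from, e.g. "/about/"
    content: str                 # body markup, editable without a code change
    title: str = ""
    subtitle: str = ""
    meta_description: str = ""   # would feed a description meta tag


# Stand-in for a database table keyed on the URL.
PAGES: Dict[str, Page] = {}


def save_page(page: Page) -> None:
    """Insert or overwrite the page stored at page.url."""
    PAGES[page.url] = page


def get_page(url: str) -> Optional[Page]:
    """Return the stored page for this URL, or None if there is none."""
    return PAGES.get(url)


save_page(Page(url="/privacy-policy/",
               content="<p>We collect very little.</p>",
               title="Privacy Policy"))
```

Only url and content are required here, which mirrors the "most basic level" described above. Everything else is optional and can be added when you actually need it.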

Posts

Probably the most obvious data you'll need to manage for a blog is posts, or articles. This, after all, is the whole point of any blog project. If you're using Django, you could once again use Flatpages to manage your posts. However, I personally prefer to manage this particular model directly.

When building the article content model, the primary fields I'm interested in are title, subtitle, and content. I like storing title and subtitle directly in the model rather than including these values in content. That way, I have the flexibility to reuse the values elsewhere. For example, a "card" linking to the article could be populated with the title and subtitle.

Additionally, any page should only ever have just one "heading 1" tag. To enforce this, I store the title as a model field and then set up my WYSIWYG editor to not allow "heading 1" tags. When the content is rendered, only a single "heading 1" tag is created and the title is inserted.
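As a rough sketch of that rendering step: the title is stored on its own and becomes the only "heading 1" when the article is rendered. The function names and the subtitle markup are hypothetical. The demote_h1 safety net is also my own addition for this example, in case a "heading 1" slips past the editor configuration.

```python
import re
from html import escape

# Matches opening and closing h1 tags, with or without attributes.
_H1_TAG = re.compile(r"<(/?)h1(\s[^>]*)?>", re.IGNORECASE)


def demote_h1(content: str) -> str:
    """Rewrite any h1 tag in the body to h2 as a server-side safety net."""
    return _H1_TAG.sub(
        lambda m: f"<{m.group(1)}h2{m.group(2) or ''}>", content
    )


def render_article(title: str, subtitle: str, content: str) -> str:
    """Build article markup whose only h1 comes from the title field."""
    return (
        f"<h1>{escape(title)}</h1>\n"
        f'<p class="subtitle">{escape(subtitle)}</p>\n'
        f"{demote_h1(content)}"
    )
```

Because the title and subtitle live in their own fields, the same values can populate a "card" elsewhere on the site without parsing the content.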

Just as with pages, I would also have fields for storing metadata for managing SEO. Other fields that may be useful to consider here would be author, date created, date updated, and so forth.

Images

When you write blog posts, you're going to run into situations where images will complement your written content. Assuming you'll want more than one image associated with a blog post at times, you'll need a separate model to manage your images.

This model will need to include, at a minimum, a field to store the image's URL to wherever it's been uploaded (file system, S3 bucket, etc.) and a field to relate the image to a particular blog post. Strictly speaking, the foreign key isn't actually necessary since you'll be placing the image's URL directly in the image tag in a blog post's content.

Storing which article the image relates to isn't going to affect this retrieval at all. Where I think there's a benefit, however, is for cascading deletes. If you delete a blog post you'll probably want to remove any associated images so they don't persist. It's unlikely that images would be re-used across posts. Additionally, knowing what post an image should belong to will allow us to more easily detect orphaned images.
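Here is a minimal sketch of both benefits using Python's built-in sqlite3 module rather than any particular framework. The table names, column names, and the orphaned_images helper are assumptions for illustration. In this sketch, an image counts as orphaned when its URL no longer appears in its post's content.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys unless enabled
conn.executescript("""
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        content TEXT NOT NULL
    );
    CREATE TABLE image (
        id INTEGER PRIMARY KEY,
        url TEXT NOT NULL,
        post_id INTEGER REFERENCES post(id) ON DELETE CASCADE
    );
""")


def orphaned_images(connection):
    """Return URLs of images with no post, or not used in their post's content."""
    rows = connection.execute("""
        SELECT image.url FROM image
        LEFT JOIN post ON post.id = image.post_id
        WHERE post.id IS NULL OR instr(post.content, image.url) = 0
        ORDER BY image.id
    """)
    return [url for (url,) in rows]


conn.execute(
    "INSERT INTO post (id, title, content) VALUES (?, ?, ?)",
    (1, "Hello", '<img src="/media/a.png">'),
)
conn.executemany(
    "INSERT INTO image (url, post_id) VALUES (?, ?)",
    [("/media/a.png", 1), ("/media/b.png", 1)],
)
```

With this data, orphaned_images(conn) reports "/media/b.png", and deleting post 1 removes both image rows. Note that the cascade only removes database rows. The files themselves would still need to be deleted from wherever they were uploaded.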

Object storage is a direct cost to us for hosting our content. We want to minimize that cost. Thus, we don't want to store images needlessly.

An article I recently wrote goes into greater detail on my approach to images: How Do You Embed Images in a Django Blog From a WYSIWYG Editor?

Referrals

Now we're starting to get to the less obvious types of content you may wish to manage. When including an outbound link in your content, you'll likely want some way of managing resources like this. Perhaps you want to see which links your visitors engage with most. Or maybe you're tracking outbound referrals to see what content creators you may want to collaborate with. I implemented my system so I could detect when outbound links break and keep my content from pointing to dead locations. Either way, tracking outbound traffic could be useful.

There are different ways to approach managing outbound links and some implementations are more popular than others. I recommend researching this topic deeper before making design decisions because there's considerable path dependency and refactoring can be difficult.

One approach would be to store the links directly in a data model. You use an internal link to identify a resource and then hit a database to retrieve the link you want to refer out to. The benefit here is that if the link needs to be updated, you only have to make a change in one place. The downside is you now have a database hit required to process a user action.

Another approach would be to store the link directly in the content but point to an internal redirect that accepts the link as an argument. For example:

www.example.com/external-redirect/?url=google.com

With this approach, I don't need to ask the user to wait for the system to retrieve the resource because it's being directly passed as an argument and the redirect engine can simply return this argument as the new destination for the browser. To have a similar amount of control and oversight over outbound links, you can then store (preferably asynchronously) the link each time a redirect is processed.
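A minimal sketch of that flow, assuming an in-process queue and a background thread as the asynchronous piece. In a real deployment this would more likely be a task queue. The names external_redirect, outbound_queue, and outbound_counts are hypothetical, and this sketch deliberately performs no validation of the target.

```python
import queue
import threading
from collections import Counter
from typing import Optional
from urllib.parse import parse_qs, urlsplit

outbound_queue: "queue.Queue[str]" = queue.Queue()
outbound_counts: Counter = Counter()  # stand-in for a database table


def _record_outbound() -> None:
    """Worker loop: store each redirected link off the request path."""
    while True:
        url = outbound_queue.get()
        outbound_counts[url] += 1
        outbound_queue.task_done()


threading.Thread(target=_record_outbound, daemon=True).start()


def external_redirect(request_url: str) -> Optional[str]:
    """Return the destination passed as ?url=..., or None if it is missing."""
    values = parse_qs(urlsplit(request_url).query).get("url")
    if not values:
        return None
    target = values[0]
    outbound_queue.put(target)  # returns immediately; the worker does the bookkeeping
    return target
```

The request handler never waits on storage: it reads the argument, queues it, and hands the destination back to be used as the redirect location.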

I prefer the second approach to the first because it's less invasive to the user experience. However, there's a major security risk to manage with this approach. If you're not careful, external origins could pass redirects through your system (commonly called an open redirect). I recommend validating the referrer at a minimum. Even then, headers can be spoofed. Be careful with this approach.

Another idea could be to use JavaScript on the page, avoiding a redirect completely, and have the browser send the URL of an outbound link back to the server when the user clicks on an anchor tag. Of all three, this is my favorite. There are fewer security concerns to manage and you can handle the action with minimal to no user interruption.
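As a hedged sketch of that minimum check: the function below accepts a redirect only when the referrer is one of our own pages. The scheme check on the target is my own addition for this example, and OWN_HOST and is_allowed_redirect are hypothetical names. Since the referrer header can be spoofed or stripped, treat this as a first filter and not a complete defense.

```python
from typing import Optional
from urllib.parse import urlsplit

OWN_HOST = "www.example.com"  # assumption: the blog's own host name


def is_allowed_redirect(target: str, referrer: Optional[str]) -> bool:
    """Accept a redirect only if it came from our pages and targets http(s)."""
    if not referrer or urlsplit(referrer).hostname != OWN_HOST:
        return False
    parsed = urlsplit(target)
    if not parsed.scheme:
        # Bare hosts such as "google.com" carry no scheme, so assume https.
        parsed = urlsplit(f"https://{target}")
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)
```

For example, a request for google.com referred from a post on www.example.com passes, while the same request with no referrer, or with a "javascript:" target, is rejected.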

Affiliate Marketing

Links I would definitely store in a database are links for affiliate marketing. Additionally, I would want my own way of tracking how many times these resources are accessed, since this directly determines your revenue and you'll want a means of keeping your counterparty honest.
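A minimal sketch of a stored affiliate link with its own click counter, again using sqlite3 as a stand-in for whatever database you use. The table layout, the "hosting" slug, the partner URL, and follow_affiliate_link are all made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE affiliate_link (
        slug TEXT PRIMARY KEY,
        url TEXT NOT NULL,
        clicks INTEGER NOT NULL DEFAULT 0
    )
""")
db.execute(
    "INSERT INTO affiliate_link (slug, url) VALUES (?, ?)",
    ("hosting", "https://partner.example/?ref=me"),
)


def follow_affiliate_link(slug):
    """Count the click and return the destination, or None for unknown slugs."""
    cursor = db.execute(
        "UPDATE affiliate_link SET clicks = clicks + 1 WHERE slug = ?", (slug,)
    )
    if cursor.rowcount == 0:
        return None
    row = db.execute(
        "SELECT url FROM affiliate_link WHERE slug = ?", (slug,)
    ).fetchone()
    return row[0]
```

Because the content refers to the slug and not the partner URL, a changed affiliate URL only needs updating in one place, and the clicks column gives you an independent count to compare against the partner's reports.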

Short-Form Content (Experimental)

Having a separate data model for short-form content is a new idea I'm considering. I'm not convinced one way or another. Maybe having a separate model for short posts isn't necessary and is instead redundant? The question I'm trying to answer is whether I want shorter blog posts to commingle with longer posts. Does a 2,000-word article belong next to something that's 200 words? I'm not sure.

Indexing

Once you have your models defined for your content, you need to index your content to make it easier for your visitors to find related content and thus boost engagement. The obvious choice is a "category" model that you'll set up as a foreign key to your posts and other content. You could also use a many-to-many field for categories because one article may span multiple categories (some of my articles fit within business and real estate). Or perhaps, an article doesn't fit cleanly within any one category.
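To illustrate the many-to-many option, here is a small sqlite3 sketch in which a join table lets one post sit in several categories. The table names, the sample post titles, and posts_in_category are assumptions for this example.

```python
import sqlite3

idx = sqlite3.connect(":memory:")
idx.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE post_category (
        post_id INTEGER NOT NULL REFERENCES post(id),
        category_id INTEGER NOT NULL REFERENCES category(id),
        PRIMARY KEY (post_id, category_id)
    );
""")
idx.executemany(
    "INSERT INTO post (id, title) VALUES (?, ?)",
    [(1, "Financing a Rental Property"), (2, "Pricing a Side Project")],
)
idx.executemany(
    "INSERT INTO category (id, name) VALUES (?, ?)",
    [(1, "business"), (2, "real estate")],
)
# Post 1 spans both categories; post 2 belongs only to "business".
idx.executemany(
    "INSERT INTO post_category (post_id, category_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1)],
)


def posts_in_category(name):
    """Return the titles of all posts filed under the named category."""
    rows = idx.execute("""
        SELECT post.title FROM post
        JOIN post_category ON post_category.post_id = post.id
        JOIN category ON category.id = post_category.category_id
        WHERE category.name = ?
        ORDER BY post.title
    """, (name,))
    return [title for (title,) in rows]
```

A plain foreign key on the post would be simpler, but it forces every article into exactly one category. The join table also handles the article that fits none: it simply has no rows in post_category.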

I prefer to keep categories very high level; think broad topics. My goal is to keep the count as low as possible. 

In order to provide more granular indexing, I prefer using some sort of "tag" approach. If you're building with Django, I like the django-taggit package.

Categories and tags will provide a lot of dimensionality to group and index your posts. But there comes a point when your indexes themselves become content! I would consider a way to store some sort of content to preface or introduce the index. Maybe this isn't necessary for tags since there'll be a lot of them as your content volume grows. However, I would consider introducing pages for categories with some form of content (although as of this article being published I don't).

Final Thoughts

Blogs are fun projects to build and I've certainly enjoyed the process. When you get to building, you find that not everything fits into the tidy box of "blog post." So, we build other models to store and maintain our content as it develops and grows in different ways and dimensions. Furthermore, we'll need a way to keep all of this content organized with appropriate indexing. I hope the ideas shared here help you on your way to building your blog!

Details
Published
April 10, 2024