Find An Object's Page Number Within A Django Paginated Queryset

Efficiently find object page numbers in Django paginated views using Window functions, RowNumber expressions, and optimized database queries.

Background

I had a Django UpdateView that returned to a ListView on a successful form submission. The object being modified wasn't complex enough to warrant a DetailView in between the list and update views. Each item in the list had a link to a form to modify the object, and a good form submission would redirect back to this list. This worked well until enough objects accumulated for pagination to come into play.

The success_url on the UpdateView always pointed back to the start of the list since there wasn't a page number to communicate where I was in the QuerySet. That wasn't really the behavior I desired. Instead, I wanted to return to where I was on the list before leaving to work with the object.

Consider a situation where you're working on objects in a list sequentially for some sort of maintenance. You're not going to want to navigate to the page you were working on after every form submission starting from the first page. That would quickly become tedious.

I came up with a quick solution to determine the position and page of an object within an ordered QuerySet. The entirety of the logic exists within the get_success_url method and results in just a single query.

The Code

# views.py

from django.views.generic import UpdateView
from django.urls import reverse_lazy
from django.db.models.expressions import Window
from django.db.models.functions import RowNumber
from django.db.models import F

from .models import MyModel
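# MyListView is assumed to be defined earlier in this same module;
# importing it from .views inside views.py would be a circular self-import.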
from .forms import MyForm


class MyUpdateView(UpdateView):
    success_url = reverse_lazy('url-pattern-name')
    ...

    def get_success_url(self):
        # url to list view
        url_base = super().get_success_url()
        # reset_queries()

        list_view = MyListView()
        ordering = list_view.get_ordering()
        if isinstance(ordering, str):
            ordering = (ordering,)

        expressions = []
        for field in ordering:
            if field.startswith('-'):
                # descending
                expressions.append(F(field[1:]).desc())
            else:
                expressions.append(F(field))

        qset = (
            MyModel.objects.all()
            .order_by(*ordering)
            .annotate(
                row_number=Window(
                    expression=RowNumber(), order_by=expressions
                )
            )
        )
        tgt_row = None
        for item in qset:
            if item.pk == self.object.pk:
                tgt_row = item.row_number
                break

        page_num = None
        if tgt_row:
            page_num = ((tgt_row - 1) // list_view.paginate_by) + 1

        # print(len(connection.queries))
        if page_num and page_num > 1:
            return '%s?page=%s' % (url_base, page_num)
        return url_base

What's Happening

First, I retrieve the success_url by calling super() on the method. If we're on the first page or if I'm unable to determine a page, I'll just return this unmodified.

Next, I create an instance of the ListView I'm returning to so I can retrieve its ordering. Realistically you could just access the ordering attribute directly on the class without creating an instance. The get_ordering method just returns this attribute unless you've customized the class.

# django ListView get_ordering method

    def get_ordering(self):
        """Return the field or fields to use for ordering the queryset."""
        return self.ordering

Source code on GitHub.

For my purposes, I'm going to be thorough in case this logic is abstracted later. Additionally, I check what's returned from this method and determine whether it's a string or a tuple. If it's a string, I convert it to a tuple. This mirrors Django's approach in ListView.get_queryset().

Next, I compile the expressions for the query. Because a field used for ordering might be marked for descending order, I check whether each field is prefixed with a dash ("-"). If it is, I call desc() on the F() object, which returns a descending ordering expression. I also slice the string with field[1:] so the dash doesn't wind up in the expression.
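The normalization and dash handling can be tried outside Django. The sketch below is a simplified stand-in rather than the code from the view: the helper name parse_ordering is hypothetical, and it returns plain (field, descending) tuples instead of F() expressions.

```python
def parse_ordering(ordering):
    """Split Django-style ordering strings into (field_name, descending) pairs.

    Hypothetical helper for illustration; the view above builds F() expressions
    instead of tuples.
    """
    if ordering is None:
        return []
    # A single string is treated as a one-item tuple.
    if isinstance(ordering, str):
        ordering = (ordering,)

    pairs = []
    for field in ordering:
        if field.startswith('-'):
            # Leading dash means descending; strip it from the field name.
            pairs.append((field[1:], True))
        else:
            pairs.append((field, False))
    return pairs


print(parse_ordering('-created'))
print(parse_ordering(('name', '-created')))
```

Returning an empty list for None also covers a view that never set an ordering.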

Next, I build the actual query. The goal is to annotate each record with a row number that corresponds to its position in the ordered queryset. I use Django's Window function with the RowNumber() expression to assign sequential numbers (1, 2, 3...) to each record based on the same ordering used by the ListView. The order_by=expressions parameter ensures the row numbers follow the exact same sort order as the paginated list.
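To see what the row-number annotation computes, here is a rough equivalent in raw SQL run through Python's built-in sqlite3 module. The table, column names, and data are made up for the example, the SQL is not necessarily what Django generates, and it assumes the bundled SQLite is version 3.25 or newer (the first release with window functions).

```python
import sqlite3


def numbered_rows():
    """Return (id, name, row_num) tuples ordered by name, descending."""
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)')
    conn.executemany(
        'INSERT INTO item (id, name) VALUES (?, ?)',
        [(1, 'pear'), (2, 'apple'), (3, 'fig'), (4, 'kiwi')],
    )
    # The window's ORDER BY matches the query's ORDER BY, so row_num is
    # each row's 1-based position in the sorted list.
    rows = conn.execute(
        'SELECT id, name, ROW_NUMBER() OVER (ORDER BY name DESC) AS row_num '
        'FROM item ORDER BY name DESC'
    ).fetchall()
    conn.close()
    return rows


for row in numbered_rows():
    print(row)
```

The important detail is that both ORDER BY clauses agree. If they differed, the row numbers would no longer line up with the positions shown in the paginated list.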

Next, I loop through the resulting QuerySet and try to find my object within it. If I'm able to find it (which I should always be able to - might be worth logging if you don't because something funky is happening), then I retrieve the row number.

Next, I perform a little arithmetic to work out the page number from the row number and the paginate_by attribute on the ListView. The row number is the count of objects up to and including my target, so I don't need to count anything separately. Subtracting one before the integer division keeps the last row of a page from spilling onto the next page.
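The arithmetic is easy to check in isolation. This is a small sketch with a hypothetical function name, page_for_row, using the same formula as the view.

```python
def page_for_row(row_number, per_page):
    """Return the 1-based page that holds the given 1-based row number."""
    if row_number < 1 or per_page < 1:
        raise ValueError('row_number and per_page must be positive')
    # Subtract 1 so the last row of a page stays on that page.
    return (row_number - 1) // per_page + 1


print(page_for_row(1, 10))   # 1
print(page_for_row(10, 10))  # 1
print(page_for_row(11, 10))  # 2
print(page_for_row(25, 10))  # 3
```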

Last, if all of the above is successful and I have a resulting page number, I format the original success_url string to include the page number as a query parameter. If one isn't produced, we just return the original URL.
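String formatting is fine when the base URL has no query string of its own. If it might, a slightly more defensive variation could look like the sketch below. The helper name with_page is hypothetical, and it uses only the standard library's urllib.parse.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url, page_num):
    """Return url with ?page=<page_num> set, leaving page 1 (or None) bare."""
    if not page_num or page_num <= 1:
        return url
    parts = urlsplit(url)
    # Keep existing parameters, but drop any stale page value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != 'page'
    ]
    query.append(('page', str(page_num)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_page('/items/', 3))
print(with_page('/items/?q=tea', 2))
```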

Other Considerations

This method could be improved by checking whether the ordering and paginate_by attributes are set on the view class. It's possible they're not. I would place these checks at the beginning of the block since there's no point to any of this logic if these attributes aren't available.
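A guard along these lines could sit at the top of get_success_url. It's only a sketch: pagination_settings is a hypothetical helper, and the two stand-in classes exist just so the example runs without Django.

```python
def pagination_settings(list_view):
    """Return (ordering_tuple, paginate_by), or None if either is missing."""
    ordering = getattr(list_view, 'ordering', None)
    paginate_by = getattr(list_view, 'paginate_by', None)
    if not ordering or not paginate_by:
        # Without both values there is no stable page to compute.
        return None
    if isinstance(ordering, str):
        ordering = (ordering,)
    return tuple(ordering), paginate_by


class FakeListView:
    # Stand-in for a real ListView, for illustration only.
    ordering = '-created'
    paginate_by = 25


class UnpaginatedView:
    ordering = 'name'
    paginate_by = None


print(pagination_settings(FakeListView()))
print(pagination_settings(UnpaginatedView()))
```

When the helper returns None, the view can return the unmodified success URL right away and skip the query entirely.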

Furthermore, there's a potential memory issue with the following:

        tgt_row = None
        for item in qset:
            if item.pk == self.object.pk:
                tgt_row = item.row_number
                break

Iterating over the QuerySet directly evaluates it in full and caches every row, even though the loop breaks as soon as the object is found. If the table is large, we're going to run into problems. In my case the data is and will remain small, so I'm not too worried.

To be on the safe side, we can call the iterator() method on the QuerySet, which reads results in chunks rather than loading the entire result set into memory. If you only need to access the objects once, this will likely be more performant.

Keep in mind that no caching occurs at the QuerySet level when you use iterator(), should you need to work with the dataset downstream. Also, calling iterator() on a QuerySet that has already been evaluated forces it to be evaluated again, repeating the query. In our case, the call to iterator() is the first and only evaluation of the query, so that's not a concern.

The chunk size for the iterator is determined by the argument "chunk_size" and has a default value of 2,000.

        tgt_row = None
        for item in qset.iterator():
            if item.pk == self.object.pk:
                tgt_row = item.row_number
                break

Memory usage could be further improved by fetching only the fields needed for this operation with the QuerySet method only(), or at least by excluding large fields (think content fields for blogs) with defer(), should there be any. We're only trying to determine a row's position within a QuerySet relative to pagination breaks. Anything besides the primary key and the fields used for ordering adds no benefit to this logic and consumes additional resources.
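The lookup itself only needs (pk, row_number) pairs read lazily. The sketch below shows that shape with a plain generator standing in for the database. In Django the pairs might come from something like qset.values_list('pk', 'row_number').iterator(), but that pairing is an assumption on my part and isn't what the view above does. The names find_row_number and fake_rows are hypothetical.

```python
def find_row_number(pairs, target_pk):
    """Scan (pk, row_number) pairs lazily and stop at the first match."""
    for pk, row_number in pairs:
        if pk == target_pk:
            return row_number
    return None


def fake_rows(consumed):
    """Stand-in for a lazily evaluated query; records which rows were read."""
    for row_number, pk in enumerate([42, 7, 19, 3, 88], start=1):
        consumed.append(pk)
        yield pk, row_number


consumed = []
result = find_row_number(fake_rows(consumed), 19)
print(result)    # 3
print(consumed)  # [42, 7, 19]
```

Only the rows up to the match are pulled from the generator, which is the behavior we're after with a chunked iterator.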

The last thing I'll mention: you could simply remember the page number instead of determining an object's position within a QuerySet. There are a couple of options for implementing this. You could retain the query parameter in the URL at each step. Alternatively, you could save the page number to a cache, either browser or server-side, and retrieve it when you return to the list. However, maintaining state may become challenging if operations don't occur in the order you expect them to (e.g. the user navigates away). For that reason alone, I'd go with the URL query parameter approach.
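As a rough idea of the URL approach, the page value can be copied from the URL you're on to the URL you're heading to. The helper name carry_page and the example paths are hypothetical, and wiring it into templates or views is left out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def carry_page(current_url, next_url):
    """Append the current URL's page parameter, if any, to next_url."""
    page = parse_qs(urlsplit(current_url).query).get('page')
    if not page:
        return next_url
    separator = '&' if '?' in next_url else '?'
    return f"{next_url}{separator}{urlencode({'page': page[0]})}"


print(carry_page('/items/?page=4', '/items/12/edit/'))
print(carry_page('/items/', '/items/12/edit/'))
```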

Final Thoughts

The approach discussed here provides an efficient solution to a common UX problem in Django applications. By leveraging window functions and row numbering, we can maintain the user's context in paginated lists with minimal database overhead. While there are memory considerations for very large datasets, the iterator optimization and graceful fallback behavior make this technique suitable for most real-world applications. The code is straightforward to implement and understand, making it easy to adapt for different models and ordering requirements. Most importantly, it significantly improves the user experience by eliminating the frustration of losing your place in long lists after performing operations.
