Efficient bulk deletions in Django

I needed to bulk-delete a large number of objects today. Django deletions are relatively inefficient by default, because Django emulates cascading deletions in Python (it collects the related objects itself) and fires pre_delete and post_delete signals for each deleted object.

I knew that I wanted to avoid both of these and run a bulk DELETE SQL operation.

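Django has an undocumented queryset._raw_delete(using) method that can do this. The argument is a database alias (queryset.db supplies it), and the method sends no signals and does no cascade handling: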

reports_qs = Report.objects.filter(public_id__in=report_ids)
reports_qs._raw_delete(reports_qs.db)

But this failed for me, because my Report model has a many-to-many relationship with another table, and the records in the many-to-many table that pointed at those reports were not deleted.
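As a rough illustration of why a raw DELETE can fail in this situation, here is a sketch using Python's built-in sqlite3 module rather than Django and PostgreSQL. The table and column names are made up for the example, and it assumes the failure is a foreign key violation caused by rows in the many-to-many table that still reference the reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE report (id INTEGER PRIMARY KEY, public_id TEXT UNIQUE);
    -- Hypothetical many-to-many table linking reports to tags.
    CREATE TABLE report_availability_tags (
        id INTEGER PRIMARY KEY,
        report_id INTEGER NOT NULL REFERENCES report (id),
        tag_id INTEGER NOT NULL
    );
    INSERT INTO report (id, public_id) VALUES (1, 'r1'), (2, 'r2');
    INSERT INTO report_availability_tags (report_id, tag_id) VALUES (1, 10);
    """
)


def raw_delete_reports(public_ids):
    """Run one bulk DELETE; return the error text if the database rejects it."""
    placeholders = ", ".join("?" for _ in public_ids)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"DELETE FROM report WHERE public_id IN ({placeholders})",
                public_ids,
            )
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None


error = raw_delete_reports(["r1", "r2"])
remaining = conn.execute("SELECT COUNT(*) FROM report").fetchone()[0]
print(error, remaining)
```

Because report r1 is still referenced from the many-to-many table, the whole DELETE is rejected and both reports stay in place.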

I could have hand-crafted a PostgreSQL cascading delete here, but I instead decided to manually delete those many-to-many records first. Here's what that looked like:

report_availability_tag_qs = (
    Report.availability_tags.through.objects.filter(
        report__public_id__in=report_ids
    )
)
report_availability_tag_qs._raw_delete(report_availability_tag_qs.db)

This didn't quite work either, because I have another model, Location, with a nullable latest_report foreign key that references those reports. So I cleared those references before running the deletes:

Location.objects.filter(latest_report__public_id__in=report_ids).update(
    latest_report=None
)

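That combination worked! The order matters: first the UPDATE that clears the Location references, then the delete of the many-to-many records, then the delete of the reports themselves. The Django debug toolbar confirmed that this executed one UPDATE followed by two efficient bulk DELETE operations.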

Created 2021-04-09T10:58:37-07:00, updated 2022-03-20T21:56:38-07:00