django-pgtrigger-tutorial
This is a tutorial repository that you can use to configure and run various Postgres Triggers inside a Django app using django-pgtrigger.
This tutorial was part of the talk given by Wes Kendall at the SF Django Meetup on July 23rd, 2020.
Contents
Setup
The tutorial assumes you have Docker installed on your computer.
Run the following command to set up the entire project:
git clone https://github.com/wesleykendall/django-pgtrigger-tutorial.git && \
cd django-pgtrigger-tutorial && \
docker-compose build
Wait a few seconds for the build to start and then run this to finish setup:
docker-compose run --rm app python manage.py migrate
Tutorial
The models and associated triggers in this tutorial are all located under tutorial/models.py in the main Github repo. The models have several triggers defined, many of which we will go through in this tutorial.
Although not required, it is helpful to have a basic understanding of
django-pgtrigger
before going through this tutorial. Check out the
docs here
Note Every example in the tutorial that uses docker-compose
can
be executed by you locally, assuming that you followed the setup instructions
above.
Protecting model deletions
pgtrigger.Protect
is a trigger definition that raises an exception,
i.e. “protecting” an operation from happening. The trigger can be configured
for whatever conditions and events that need to be protected.
For example, the CannotDelete
model registers a pgtrigger.Protect
trigger to protect against deletions:
class CannotDelete(models.Model):
"""This model cannot be deleted.
The ``pgtrigger.Protect`` trigger protects the deletion operation
from happening
"""
class Meta:
triggers = [
pgtrigger.Protect(name='protect_deletes', operation=pgtrigger.Delete)
]
Run this code locally with the following code that tries to delete the model. It will fail loudly:
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
# This object cannot be deleted
cannot_delete = CannotDelete.objects.create()
cannot_delete.delete()
";
The error will look something like:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.RaiseException: pgtrigger: Cannot delete rows from tutorial_cannotdelete table
CONTEXT: PL/pgSQL function pgtrigger_protect_c1608bb7cb9f8647() line 5 at RAISE
Instead of trying to protect deletions of a model in the application layer,
a method that can be easily circumvented or ignored, a pgtrigger.Protect
trigger can reliably protect against operations you don’t want to happen.
Append-only models
The pgtrigger.Protect
trigger is useful for many more things than just protecting
deletions. Depending on the combinations of operations and conditions, one
can express many types of protections on models and fields.
For example, this trigger creates an “append-only” model:
class AppendOnly(models.Model):
"""This model can only be appended.
The ``pgtrigger.Protect`` trigger protects the update or delete operations
from happening, making this an "append-only" model.
"""
int_field = models.IntegerField()
class Meta:
triggers = [
pgtrigger.Protect(
name='append_only',
operation=pgtrigger.Update | pgtrigger.Delete
)
]
Running this code shows how we can create the model but not update or delete it (i.e. an “append-only” model).
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
from django.db import InternalError
# This model does not allow updates or deletes
append_only = AppendOnly.objects.create(int_field=0)
try:
# Saving anything will result in an error
append_only.save()
except InternalError:
print('Cannot update!')
try:
append_only.delete()
except InternalError:
print('Cannot delete!')
";
Since we’re catching the errors in this example, you’ll see output that looks like this:
Cannot update!
Cannot delete!
Read-only fields
The pgtrigger.Protect
trigger operates at a table level, so one might
ask themselves if it can be used to protect column-level operations.
This is where conditions come into play.
A condition on a row-level trigger is a WHERE
clause that can operate
on the OLD
and NEW
rows that are part of the database operation
that’s happening. Although
django-pgtrigger
allows one to write raw SQL for conditions, the
pgtrigger.Q
and pgtrigger.F
objects allow you to construct clauses
on the OLD
and NEW
rows similar to how Django’s Q
and F
objects work.
In this example, we’ve made the “created_at” field be read-only by protecting updates only when this field changes.
class ReadOnlyField(models.Model):
"""
The "created_at" field cannot be updated (i.e. a read-only field).
Updates to other fields will pass, but any updates to created_at will
result in an error
"""
created_at = models.DateTimeField(auto_now_add=True)
int_field = models.IntegerField()
class Meta:
triggers = [
pgtrigger.Protect(
name='read_only',
operation=pgtrigger.Update,
condition=pgtrigger.Q(old__created_at__df=pgtrigger.F('new__created_at'))
)
]
Running this code shows how we are not allowed to update the “created_at” field after the model is created.
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
from django.db import InternalError
from django.utils import timezone
# This model does not allow updates or deletes
read_only_field = ReadOnlyField.objects.create(int_field=0)
# An update on int_field is fine
read_only_field.int_field = 1
read_only_field.save()
try:
# An update on created_at will result in an error
read_only_field.created_at = timezone.now()
read_only_field.save()
except InternalError:
print('Cannot update field!')
";
Since we’re catching the errors in this example, you’ll see output that looks like this:
Cannot update field!
The error is thrown from the above code because we configured this protection
trigger to only fire whenever the created_at
of the OLD
row
(old__created_at
)
is distinct from (old__created_at__df
) the
created_at
in the NEW
row of the
update. When this condition doesn’t happen, the protection trigger doesn’t
fire and we are allowed to proceed with the operation.
Note If you want to make multiple read-only fields, extend your condition
to support it. For example, this condition will make created_at
and
int_field
both read-only fields.
class ReadOnlyFields(models.Model):
created_at = models.DateTimeField(auto_now_add=True)
int_field = models.IntegerField()
class Meta:
triggers = [
pgtrigger.Protect(
name='multiple_read_only',
operation=pgtrigger.Update,
condition=(
pgtrigger.Q(old__created_at__df=pgtrigger.F('new__created_at')) |
pgtrigger.Q(old__int_field__df=pgtrigger.F('new__int_field'))
)
)
]
Soft-deleting models
A model is “soft” deleted whenever a field is updated to indicate that
the object is deleted. For example, oftentimes one will set an is_active
flag to False
to indicate a model is deleted without actually deleting it
from the database.
Triggers, specifically pgtrigger.Before
triggers that run before an
operation, can dynamically ignore the operation being executed and
apply a different operation. The pgtrigger.SoftDelete
trigger
behaves in this very way, setting a field on a model and saving it
whenever a deletion happens on the model.
Here’s an example of how to configure the pgtrigger.SoftDelete
trigger on a model.
class SoftDelete(models.Model):
"""
This model cannot be deleted. When a user tries to delete it, the
model will be "soft" deleted instead and have the ``is_active``
boolean set to ``False``
"""
is_active = models.BooleanField(default=True)
class Meta:
triggers = [
pgtrigger.SoftDelete(
name='soft_delete',
field='is_active',
value=False
)
]
Running this code shows how we can delete the model using standard Django ORM calls without it actually being deleted.
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
# This model does not allow updates or deletes
soft_delete = SoftDelete.objects.create(is_active=True)
soft_delete_id = soft_delete.id
# Let's try to delete the model. Django will still behave as it normally does
# and will flush the object's contents, but the object will still be in the
# database in an inactive state
soft_delete.delete()
soft_delete = SoftDelete.objects.get(id=soft_delete_id)
print('is_active =', soft_delete.is_active)
";
The following will print:
is_active = False
As mentioned in the comments of the example, Django still behaves the same
way, but the database sets the is_active
flag to False
under the hood.
It is important to keep in mind that Django will still try to cascade
delete models that reference soft deleted models. One has the option to
make these models soft delete-able as well, and one can also fully delete
these with on_delete=models.CASCADE
or ignore the delete altogether
with on_delete=models.DO_NOTHING
in their foreign key definition. It
is up to the engineer to decide how cascading model deletion should happen
on soft-delete models in their application.
Versioning a model
The previous soft-delete example showed an example of a pgtrigger.Before
trigger, which is a trigger that modifies rows before the intended operation
takes place. We’re going to use a base pgtrigger.Trigger
class here to
write a trigger that will increment an integer field before an update is applied.
In this example, we’re going to write some raw SQL that will be embedded
directly into the trigger function that fires. When writing raw SQL for
a pgtrigger.Trigger
class, one only needs to write the SQL that executes
inside the trigger function and nothing else.
Below is an example of a model with a “version” field that is updated any time the model is updated. We’ve also made sure that this field is a read-only field so that nobody tries to update the version of the model.
class Versioned(models.Model):
"""
This model is versioned. The "version" field is incremented on every
update, and users cannot directly update the "version" field.
"""
version = models.IntegerField(default=0)
char_field = models.CharField(max_length=32)
class Meta:
triggers = [
# Protect anyone editing the version field directly
pgtrigger.Protect(
name='protect_version_edits',
operation=pgtrigger.Update,
condition=pgtrigger.Q(old__version__df=pgtrigger.F('new__version'))
),
# Increment the version field on changes
pgtrigger.Trigger(
name='versioned',
when=pgtrigger.Before,
operation=pgtrigger.Update,
func='NEW.version = NEW.version + 1; RETURN NEW;',
# Don't increment version on redundant updates.
condition=pgtrigger.Condition('OLD.* IS DISTINCT FROM NEW.*')
)
]
Given our versioned model, the following code shows how the version field is automatically updated:
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
from django.db import InternalError
versioned = Versioned.objects.create(version=0, char_field='hello')
print('initial version', versioned.version)
# Updating the model results in a new version
versioned.char_field = 'hi'
versioned.save()
versioned.refresh_from_db()
print('version after first update', versioned.version)
# Doing a redundant update does not bump the version
versioned.save()
versioned.refresh_from_db()
print('version after redundant update', versioned.version)
# Bump the version one more time
versioned.char_field = 'new field'
versioned.save()
versioned.refresh_from_db()
print('version after second update', versioned.version)
# Try to edit the version. It will result in an error
versioned.version = 1
try:
versioned.save()
except InternalError:
print('Cannot update version!')
";
The above example will output this:
initial version 0
version after first update 1
version after redundant update 1
version after second update 2
Cannot update version!
Note In our example trigger, we are incrementing the NEW
row
and returning it. There is no need to do any database operations. Postgres
will take this modified NEW
row and use it when performing the final
UPDATE
operation.
Note The return value of a BEFORE
trigger is very important since
it tells Postgres what row to use in the database operation. If NULL
is returned, Postgres
will block the operation from being performed. This is how pgtrigger.SoftDelete
works.
Creating “official” interfaces
Have you ever wanted to force engineers to use exactly one interface for
performing a specific database operation? For example, ensuring that
User.objects.create_user()
is always used instead of calling
User.objects.create()
?
The pgtrigger.Protect
trigger used in combination with
the pgtrigger.ignore()
context manager / decorator can be used to
accomplish this.
For example, imagine we have an OfficialInterface
model, and we want to ensure
that the OfficialInterface.objects.official_create()
interface is the only
method that can be used to create our example model:
class OfficialInterfaceManager(models.Manager):
@pgtrigger.ignore('tutorial.OfficialInterface:protect_inserts')
def official_create(self):
return self.create()
class OfficialInterface(models.Model):
"""
This model has inserts protected and can only be created by
using OfficialInterface.objects.official_create()
"""
objects = OfficialInterfaceManager()
class Meta:
triggers = [
pgtrigger.Protect(
name='protect_inserts',
operation=pgtrigger.Insert,
)
]
Let’s try to create this model using the standard OfficialInterface.objects.create()
method. It will result in an error, but we can create the object with
our official OfficialInterface.objects.official_create()
interface:
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
from django.db import InternalError
num_objects = OfficialInterface.objects.count()
try:
OfficialInterface.objects.create()
except InternalError:
print('No no no... Cannot use that interface!')
assert OfficialInterface.objects.count() == num_objects
# We can create the object using the official interface
obj = OfficialInterface.objects.official_create()
print('created', obj)
assert OfficialInterface.objects.count() == num_objects + 1
";
You should see something similar to the following output (the ID of the object might be different on your computer):
No no no... Cannot use that interface!
created OfficialInterface object (2)
In the example, we’ve used pgtrigger.ignore()
to dynamically ignore
our insertion protection trigger. pgtrigger.ignore()
will ignore
the execution of that trigger for a single thread of execution, thus
allowing us to define an “official” interface for a protected operation.
Note This example means that the Django admin and any other tools
that call the default objects.create()
won’t work. While this may
be desired behavior, keep in mind that a protection trigger will protect
the operation from anywhere (unless it is specifically ignored).
Restricting fields that transition
Oftentimes a status field on a model has particular states and transitions
among those states. The pgtrigger.FSM
trigger allows one to ensure
valid transitions (i.e. enforcing a finite state machine) of a field.
Consider the following model that has the following possible transitions of its status:
- “unpublished” to “published”
- “unpublished” to “inactive”
- “published” to “inactive”
We can list all of these valid transitions in a pgtrigger.FSM
trigger
like so, and it will raise an exception any time there is an invalid transition:
class FSM(models.Model):
"""The "status" field can only perform configured transitions during
updates. Any invalid transitions will result in an error.
"""
class Status(models.TextChoices):
UNPUBLISHED = 'unpublished'
PUBLISHED = 'published'
INACTIVE = 'inactive'
status = models.CharField(
choices=Status.choices,
default=Status.UNPUBLISHED,
max_length=16
)
class Meta:
triggers = [
pgtrigger.FSM(
name='validate_status_transitions',
field='status',
transitions=(
('unpublished', 'published'),
('unpublished', 'inactive'),
('published', 'inactive'),
)
)
]
The following code shows how we can perform valid transitions and how an invalid transition will result in an error:
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
from django.db import InternalError
# The "status" defaults to "unpublished"
fsm = FSM.objects.create()
# We are allowed to transition to "published"
fsm.status = FSM.Status.PUBLISHED
fsm.save()
# We can go to "inactive"
fsm.status = FSM.Status.INACTIVE
fsm.save()
# We cant go from "inactive" to "published"
fsm.status = FSM.Status.PUBLISHED
try:
fsm.save()
except InternalError:
print('Cannot make that transition!')
";
You’ll see the following output:
Cannot make that transition!
Note All triggers in django-pgtrigger
can have conditions applied to them.
For example, we can also define a pgtrigger.FSM
trigger that only enforces
transitions under certain conditions.
Tracking model history and changes
django-pgtrigger
can be used to snapshot all model changes, snapshot
changes whenever a particular condition happens, and even attach context from
your application (e.g. the authenticated user) to the triggered event.
Historical tracking and auditing is a problem that is going to be different
for every organization’s needs. Because of the scope of this problem, we
have created an entire history tracking library called
django-pghistory
that solves common needs for doing history tracking. It is implemented
using django-pgtrigger
, and we show an example of it here.
The following example configures a model so that the following changes are tracked:
- All model changes. Whenever an insert or update happens, a “snapshot” event will be created that snapshots every field of the model by default. This is a basic way of how one can track all history for a model.
- Creations of the model. When the model is created, a special “create” event is fired that snapshots the fields of the model at time of creation.
- When a field matches a condition. We fire the “low_int” event whenever the “int_field” is under 0 for any model update.
@pghistory.track(
# Create a "snapshot" event on every insert/update
pghistory.Snapshot('snapshot'),
# Create a "create" event whenever a model is created
pghistory.AfterInsert('create'),
# Create a "low_int" event on every update where int_field < 0
pghistory.AfterUpdate('low_int', condition=pgtrigger.Q(new__int_field__lt=0)),
)
class Tracked(models.Model):
"""
This model uses django-pghistory to track and snapshot field
changes and events on this model. django-pghistory is a library on
top of django-pgtrigger that helps make history tracking on Django
models easy
"""
int_field = models.IntegerField()
char_field = models.CharField(max_length=64)
Snapshot triggers from django-pghistory
will create event tables that
hold all events. Run this code to create events and access them. We’ll also
attach some context to the events:
docker-compose run --rm app python manage.py shell_plus --quiet-load -c "
import pghistory.models
import pprint
# All changes to this model will be tracked, and we'll also track
# other configured events
tracked = Tracked.objects.create(int_field=1, char_field='hello')
# Make 4 different changes
tracked.char_field = 'hi'
tracked.int_field = 0
tracked.save()
tracked.char_field = 'bye'
tracked.save()
# Let's add some context to these last events
with pghistory.context(misc='context'):
# Since int_field < 0, we'll start firing a "low_int" event here
tracked.int_field = -1
tracked.char_field = 'foo'
tracked.save()
tracked.char_field = 'bar'
tracked.save()
# Now that we've done our updates, let's explore the events. First,
# let's print all of values of the tracked snapshots along with any
# context
print('snapshots')
pprint.pprint(list(
tracked.event
.filter(pgh_label='snapshot')
.values('int_field', 'char_field', 'pgh_context__metadata')
.order_by('pgh_created_at')
))
print()
# Use the special AggregateEvent proxy model in pghistory to render
# snapshot diffs
print('diffs')
pprint.pprint(list(
pghistory.models.AggregateEvent.objects.target(tracked)
.filter(pgh_label='snapshot')
.values('pgh_diff')
.order_by('pgh_created_at')
))
print()
create_events = tracked.event.filter(pgh_label='create').count()
low_int_events = tracked.event.filter(pgh_label='low_int').count()
print('create events', create_events)
print('low_int events', low_int_events)
";
You’ll see output that looks like this:
snapshots
[{'char_field': 'hello', 'int_field': 1, 'pgh_context__metadata': None},
{'char_field': 'hi', 'int_field': 0, 'pgh_context__metadata': None},
{'char_field': 'bye', 'int_field': 0, 'pgh_context__metadata': None},
{'char_field': 'foo',
'int_field': -1,
'pgh_context__metadata': {'misc': 'context'}},
{'char_field': 'bar',
'int_field': -1,
'pgh_context__metadata': {'misc': 'context'}}]
diffs
[{'pgh_diff': None},
{'pgh_diff': {'char_field': ['hello', 'hi'], 'int_field': [1, 0]}},
{'pgh_diff': {'char_field': ['hi', 'bye']}},
{'pgh_diff': {'char_field': ['bye', 'foo'], 'int_field': [0, -1]}},
{'pgh_diff': {'char_field': ['foo', 'bar']}}]
create events 1
low_int events 2
In the “snapshots” section of the output, we’ve rendered all of our snapshots
in order. The last two snapshots have context associated with them that we
attached using pghistory.context()
.
The “diffs” section shows how to use pghistory.models.AggregateEvent
to aggregate events for the target model and render diffs of events. These
diffs show a “before” and “after” of the fields that were changed in the
filtered events.
The last two prints show counts of the specific events that we configured to track.
This example only touches the surface of what is offered by django-pghistory. Be sure to read the docs to learn more about how to configure it for your use case.