This write-up was inspired by Michael Beardsley’s excellent intro to the Row Level Security (RLS) feature in Postgres and how it could be leveraged to implement a multi-tenant data isolation solution. It provides a lot of context that I will not repeat below, so I recommend reading it first.
Here is the TL;DR:

- Isolating tenant data is a fundamental responsibility for most Software as a Service (SaaS) providers.
- A robust implementation protects your brand and your business.
- Common partitioning architectures include:
  - Silo (e.g. separate instances)
  - Bridge (e.g. separate schemas)
  - Pool (e.g. co-located data with partition keys, such as a `tenant_id` column)
- The pool model is the most cost efficient. Not only does it require fewer resources, which means lower infrastructure costs, but it also has a significantly lower maintenance burden. Unfortunately, it is commonly implemented by hoping the correct `WHERE` clause is included in every SQL statement.
- The solution presented in this post takes advantage of the benefits of the pool model and reduces the risk of cross-tenant data leaks (CTDL) by centralizing isolation enforcement within the database management system (DBMS) itself.
If you need any more motivation for why this topic deserves attention, a friend of mine presented at BlueHat earlier this year about CTDL in the wild. Most of that talk hints at issues in the authorization layer, but nonetheless, CTDL vulnerabilities are evidently present at some of the largest IT shops in the world (G-Suite, Azure).
Introduction
Early SaaS teams are often focused on the same types of goals: the ability to ship quickly, and the ability to win over customers by establishing trust in a delightful product. Unfortunately, the need for speed can lead teams to squander the unique opportunity they have early on to think critically about how the architecture of their system can help them achieve their goals.
Architecture is a property of a system. You will have an architecture. It doesn’t matter if you want one or not. It’s just a matter of whether it’s intentional or accidental.
— Stefan Tilkov
The pool storage model allows teams to live the dream of continuous delivery, agility, scalability, cost optimization, and a great developer experience - all without sacrificing security! This should be the default choice for most teams until requirements demand something else. With this approach, teams also do not need to worry about painting themselves into a corner. The pool model does not prohibit provisioning isolated silos for exceptional tenants. As long as the storage provider sits behind a centralized data access layer, developers can continue to build with limited awareness of how tenant identity ultimately ends up in data access requests.
Let’s pretend we are a SaaS provider that chooses the pool model and needs to defend its security posture in some sort of PR-FAQ. We want to claim the following:
Although tenants of our platform share data storage resources, our DBMS is implemented with strong guarantees that a given session may only read or mutate data values belonging to one tenant. In other words, from the perspective of any DBMS client session, this shared database appears to only contain data belonging to a single tenant (the one who initiated the session). This is made possible by a centralized isolation enforcement mechanism native to the DBMS itself.
The rest of this post works towards this stated goal by configuring RLS in Postgres and implementing a basic multi-tenant web app to exercise the RLS capabilities.
Postgres user starter pack
It’s a good idea to have at least a few different Postgres users scoped with different responsibilities. This is generally useful, but it is also required in order to use RLS. This is because the superuser is not subject to RLS, and by default, table owners are also not subject to the security policies on any of their tables.
| Name | Description | Bypasses RLS? |
| --- | --- | --- |
| `postgres` | The superuser responsible for managing other users. | Yes |
| `db_admin` | The user responsible for performing schema migrations and owning the schema objects. | Yes |
| `db_user` | The user that apps and services use to connect to the database. | No |
The `postgres` user should run the following script to configure the other users:
-- assumes the db_admin and db_user roles already exist,
-- e.g. created via the companion repo's init.sql

-- explicitly excluded from RLS
ALTER ROLE postgres BYPASSRLS;
ALTER ROLE db_admin BYPASSRLS;

-- use conservative default privileges
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON DATABASE postgres FROM PUBLIC;

-- selectively grant privileges for db_user
-- this user can read and write data values, but it cannot alter the schema
GRANT CONNECT ON DATABASE postgres TO db_user;
GRANT USAGE ON SCHEMA public TO db_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO db_user;
-- FOR ROLE db_admin makes the grant apply to tables that db_admin creates later;
-- without it, the default privileges would only cover tables created by postgres
ALTER DEFAULT PRIVILEGES FOR ROLE db_admin IN SCHEMA public GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO db_user;

-- selectively grant privileges for db_admin
-- this user can do everything db_user can, plus it can alter the schema
GRANT db_user TO db_admin;
GRANT CREATE ON DATABASE postgres TO db_admin;
GRANT CREATE ON SCHEMA public TO db_admin;
It may seem strange that we can `GRANT` one User to another User. This is possible because in Postgres, Users are Roles. The two are almost identical; the only difference is that a User is a Role with the `LOGIN` attribute (and typically a password), so it can authenticate and start sessions.
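For example, these two statements are equivalent, since `CREATE USER` is just `CREATE ROLE` with `LOGIN` enabled by default (the password value here is a placeholder):

CREATE USER db_user PASSWORD 'changeme';
CREATE ROLE db_user LOGIN PASSWORD 'changeme';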
Configuring users and role privileges can be incredibly finicky, especially if you are using a managed service with its own set of restrictions, but it's a one-time upfront cost that pays off over time. There are lots of other settings, some of which are used in the companion repo `init.sql`, but this script shows the general idea.
Table partitioning
There are at least two tables needed to demonstrate RLS: first, a partitioning table, and second, a table containing records partitioned by the partitioning table via a foreign key. Let's connect as the `db_admin` user to create these tables.
The `tenants` table holds a set of tenant records, and the `id` value will be used as the partition key throughout the database.
CREATE TABLE tenants (
id UUID PRIMARY KEY,
name TEXT NOT NULL
);
This `items` table holds a set of item records belonging to multiple tenants. The `tenant_id` column partitions the set. This is a very simple example, but it represents any number of other tables that would also be partitioned by `tenant_id`.
CREATE TABLE items (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
title TEXT NOT NULL,
tenant_id UUID NOT NULL REFERENCES tenants(id)
);
Certain constraints are applied using composite keys, e.g. "an item must have a unique `title` within the context of a tenant":
CREATE UNIQUE INDEX items_unique_title ON items (tenant_id, title);
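While still connected as `db_admin` (which bypasses RLS), we can seed a couple of tenants and items to query later. The UUIDs and names here are hypothetical sample data for the demo:

INSERT INTO tenants (id, name) VALUES
    ('ebdba44d-ad48-4e73-9bd5-339e3c3fc590', 'acme'),
    ('137cdb79-0207-4b55-921e-a6df04ee06ff', 'globex');

INSERT INTO items (title, tenant_id) VALUES
    ('apple', 'ebdba44d-ad48-4e73-9bd5-339e3c3fc590'),
    ('banana', 'ebdba44d-ad48-4e73-9bd5-339e3c3fc590'),
    ('cherry', '137cdb79-0207-4b55-921e-a6df04ee06ff');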
Table RLS policies
For this demo, configuring the row security policies on each table requires just two steps:

- Enable RLS for the table.
- Configure a `POLICY` with a `USING` predicate.

When the predicate evaluates to `true`, the row is made visible to the user. Otherwise, it is silently hidden. (For `INSERT` and `UPDATE`, a policy without an explicit `WITH CHECK` clause reuses the `USING` predicate, and rows that fail the check raise an error rather than being hidden.)
These changes are applied to the schema itself and thus should be performed by the `db_admin` user.
ALTER TABLE tenants ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_policy ON tenants
USING (current_setting('app.current_tenant_id')::UUID = id);
ALTER TABLE items ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_policy ON items
USING (current_setting('app.current_tenant_id')::UUID = tenant_id);
By default, the policy applies to all roles performing all `SELECT`, `INSERT`, `UPDATE`, and `DELETE` operations. This is the most common use case, but it is possible to configure policies for individual roles and specific operations. There are some other options that provide even more control, which are described in the `CREATE POLICY` documentation, but I will not cover them here.
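As a quick illustration (not used in this demo), a policy can be narrowed to a single command and role; the `readonly_user` role here is hypothetical:

CREATE POLICY tenant_read_only_policy ON items
    FOR SELECT
    TO readonly_user
    USING (current_setting('app.current_tenant_id')::UUID = tenant_id);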
The `current_setting` function
`current_setting` gets the value of the run-time configuration parameter of the given name on the relevant database session. In this case, the parameter is named `'app.current_tenant_id'`. The `'app.'` prefix is a sort of namespace. It can be anything, but it cannot be blank.
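`current_setting` also accepts an optional second argument, `missing_ok`, which returns `NULL` instead of raising an error when the parameter has not been set:

-- raises "unrecognized configuration parameter" if it was never set
SELECT current_setting('app.current_tenant_id');

-- returns NULL instead of raising
SELECT current_setting('app.current_tenant_id', true);

Some implementations use the two-argument form inside the policy predicate itself, so that an unset parameter silently yields zero rows rather than an error.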
At this point, the database is all set! If you try connecting to the database as `db_user` and run `select * from items;`, you will see an error indicating that `app.current_tenant_id` has not been set. This means RLS is working as intended!
unrecognized configuration parameter "app.current_tenant_id"
The `SET` statement
If `current_setting` is the getter, then `SET` is the setter. The syntax is as simple as it gets:
SET app.current_tenant_id = 'ebdba44d-ad48-4e73-9bd5-339e3c3fc590';
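Note that a plain `SET` persists for the rest of the session. If connections are shared across tenants (e.g. via a connection pool), the transaction-scoped variants may be safer:

-- transaction-scoped: reverts automatically at COMMIT or ROLLBACK
SET LOCAL app.current_tenant_id = 'ebdba44d-ad48-4e73-9bd5-339e3c3fc590';

-- function-call equivalent; the third argument true makes it transaction-local
SELECT set_config('app.current_tenant_id', 'ebdba44d-ad48-4e73-9bd5-339e3c3fc590', true);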
Now if you run `select * from items;`, you will see only item records with a matching `tenant_id` value. This is the same mechanism that our web app will use, but it will be done programmatically, of course :).
The web app
With the database configured, it is now up to the web server to:

1. Determine the tenant ID for each request, typically from an auth token issued by an identity provider.
2. Connect to the database as `db_user`.
3. Ensure each database session is using the correct tenant ID value.
The solution for part 1 will vary widely, so that is left as an exercise for the reader. In this demo, the value is read as plaintext from a request header just to keep it simple.
The solution for part 2 is straightforward but important!
The solution for part 3 consists of a `SET` statement and a mechanism for attaching this statement to every unique session. The latter may vary depending on the capabilities of the database connection manager being used. This example uses SQLAlchemy, which comes with a set of event listeners out of the box. The `before_cursor_execute` event is one of the lowest-level events, meaning that it's just about the last opportunity one has to modify the SQL before it is issued to Postgres. Other events may be more appropriate depending on the app's connection management strategy in relation to tenants. For example, the connection pool `checkin`/`checkout` events could be used to set/unset the parameter value (a sketch of that alternative follows the listener code below). The important thing is to design this mechanism in a way that ensures the tenant ID value being `SET` matches the tenant ID value of the original request.
from typing import Any
from uuid import UUID

from sqlalchemy import event
from starlette_context import context as ctx


class Database:
    # engine setup elided; self.engine is a SQLAlchemy Engine connected as db_user

    def _get_session_params_sql(self, tenant_id: UUID) -> str:
        """
        Generate the SQL to SET the session parameters for Postgres RLS.
        """
        # mogrify (psycopg2) safely escapes the tenant ID into the statement
        cursor = self.engine.raw_connection().cursor()
        result = cursor.mogrify(
            "SET app.current_tenant_id = %s;",
            (str(tenant_id),),
        )
        return result.decode("utf-8")

    def _register_rls_listener(self) -> None:
        # called once during app startup (method name hypothetical)
        @event.listens_for(self.engine, "before_cursor_execute", retval=True)
        def receive_before_cursor_execute(
            _conn,
            _cur,
            statement,
            parameters,
            _ctx,
            _executemany,
        ) -> tuple[str, Any]:
            """
            Set RLS session parameters by packing them into each SQL statement.
            """
            # prepend the SET so it travels with every statement the app issues
            rls_statement = self._get_session_params_sql(
                tenant_id=UUID(ctx["tenant_id"])
            )
            statement = rls_statement + statement
            return statement, parameters
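And here is a minimal sketch of the `checkin`/`checkout` alternative mentioned above, assuming a psycopg2 driver, that `engine` is the same SQLAlchemy Engine as above, and that the request context is populated at checkout time (the listener names are hypothetical):

from sqlalchemy import event
from starlette_context import context as ctx

@event.listens_for(engine, "checkout")
def set_tenant_on_checkout(dbapi_conn, _record, _proxy):
    # runs each time the pool hands a connection to the application
    with dbapi_conn.cursor() as cur:
        cur.execute("SET app.current_tenant_id = %s;", (ctx["tenant_id"],))
    dbapi_conn.commit()  # SET is transactional; commit so it sticks to the session

@event.listens_for(engine, "checkin")
def reset_tenant_on_checkin(dbapi_conn, _record):
    # clear the parameter so a recycled connection never leaks tenant context
    with dbapi_conn.cursor() as cur:
        cur.execute("RESET app.current_tenant_id;")
    dbapi_conn.commit()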
The API
Finally, we can provide a basic GET REST API endpoint to demonstrate the tenant isolation in action.
@app.get("/items/")
def get_items(session=Depends(get_session)):
return session.query(models.Item).all()
Note how the ORM code here does not need to `.filter()` by `tenant_id`. It can depend on the RLS policy on the `items` table to take care of that. It also does not need to `SET` the tenant ID session parameter; that is taken care of by the SQLAlchemy event listener in the data access layer. This is a big win for developer experience and productivity!
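With the listener in place, the statement that actually reaches Postgres for this endpoint looks roughly like this (column list abbreviated):

SET app.current_tenant_id = 'ebdba44d-ad48-4e73-9bd5-339e3c3fc590';
SELECT items.id, items.title, items.tenant_id FROM items;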
The result
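Connecting as `db_user` and switching the session parameter between the two seeded tenants shows the isolation in action:

SET app.current_tenant_id = 'ebdba44d-ad48-4e73-9bd5-339e3c3fc590';
SELECT title FROM items;
-- apple
-- banana

SET app.current_tenant_id = '137cdb79-0207-4b55-921e-a6df04ee06ff';
SELECT title FROM items;
-- cherry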
Avoiding Pitfalls
Unfortunately, there are some pitfalls which are easy to fall into if proper precautions are not taken. A simple integration test suite can help prevent most of these issues.
- If the application connects as a privileged user such as `db_admin`, this would sidestep RLS altogether.
def test_app_pg_connection(session, with_request_context):
    res = session.execute("""
        select
            current_user,
            current_setting('app.current_tenant_id');
    """).fetchone()
    assert res[0] == "db_user", "this user must be subject to RLS"
    assert res[1] is not None
- If a new table is added without RLS being enabled, then the RLS policy will not be applied. This test ensures that any table with a `tenant_id` column has RLS enabled.
def test_rls_per_table(session):
sql = """
WITH tenant_partitioned_tables AS (
SELECT t.table_name
FROM information_schema.tables t
JOIN information_schema.columns c ON c.table_name = t.table_name
AND c.table_schema = t.table_schema
WHERE c.column_name = 'tenant_id' AND t.table_schema = 'public'
),
rls_enabled_tables AS (
SELECT relname
FROM pg_class
JOIN pg_catalog.pg_namespace n ON n.oid = pg_class.relnamespace
WHERE n.nspname = 'public' AND relkind = 'r' AND relrowsecurity = TRUE
)
SELECT * FROM tenant_partitioned_tables
EXCEPT
SELECT * FROM rls_enabled_tables;
"""
res = session.execute(sql)
rows = res.cursor.fetchall()
assert list(rows) == []
There are more subtle things to watch out for, some of which are described in the `CREATE POLICY` notes section, including:

- The order of execution for DB triggers and policy predicate evaluation, e.g. if a `BEFORE ROW` trigger modifies the row values, does this influence the result of the policy check? (In some cases, yes.)
- Integrity constraint errors are not suppressed by RLS. This opens up the possibility for one tenant to discover something about the values belonging to the set of all other tenants, such as the existence of a record with a value bound by a unique constraint (see the sketch after this list).
- How are `VIEW` and `MATERIALIZED VIEW` handled?
- How do shared indexes play into this?
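To make the integrity constraint point concrete, here is a hypothetical sketch. Unlike the tenant-scoped `items_unique_title` index above, a global unique index would let one tenant probe for values held by others:

-- hypothetical: a unique constraint NOT scoped by tenant_id
CREATE UNIQUE INDEX items_global_title ON items (title);

-- tenant A inserts a title that some other tenant already uses; RLS hides
-- the other tenant's row, but the error still reveals that the title exists:
INSERT INTO items (title, tenant_id)
VALUES ('taken-title', current_setting('app.current_tenant_id')::UUID);
-- ERROR: duplicate key value violates unique constraint "items_global_title"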
If this post has a sequel some day, these are the types of questions I would like to dig into.
Conclusion
As with any solution, RLS has its tradeoffs. However, one of my favorite things about this approach is that it scales well for customers and engineers. When I say it scales for customers, I mean that the performance of the queries is largely unaffected. And when I say it scales for engineers, I mean that it requires very little maintenance once the initial configuration is complete. The declarative configuration as part of a schema migration workflow also makes it self-documenting. This is in stark contrast to the overhead of provisioning a new service, new schema objects, or new database users whenever a new tenant is onboarded. Of course there are teams who will (and rightfully should) decide that RLS is not a good fit for their requirements, but for most teams, this is an option worth exploring and one that will help them sleep better at night.
Also see
- AWS Multi-tenant Storage Whitepaper
- Almost anything from Tod Golding