Building the optimal user database model for your application

user-schema

In the past, I’ve often wrestled with designing an optimal relational database model to represent users in a multi-tenant system. I’ve wanted to get the model just right—enough structure to accomodate future enhancements without bloating the schema with excessive JOINs, overly-wide tables, or any other number of typical relational-database “sins”.

Over four years into our current iteration of DoneDone, I can happily say that the database model we’ve built has been resilient to all sorts of feature updates we’ve made over the years.* I wanted to share our thought process behind how it’s designed, starting with a simple database model and evolving it into the model we currently use.

A simple model

Let’s start with the most bare-bones relational model to represent users in a system like ours. Naturally, we’ll start with a table called Users. The attributes within Users fall within one of three general groups of data.

  • Personal information: Name, email address, phone, avatar, Twitter handle, etc.
  • Login credentials: Username, password salt and hash, and any 3rd-party auth tokens
  • Foreign keys: Most likely, one to an Accounts table to link a user to an account.

So, our starting model looks something like this, with unique constraints on items like Username and EmailAddress:

A simple starting point for database modeling users in a system

A simple starting point for database modeling users in a system

I’ve used many systems whose model follows closely to this two-table architecture. But, there’s one very large drawback to this system: The foreign key from Users to Accounts restricts a user’s access to only one account in the system. If someone wants to have access to multiple accounts, they are forced to create a new user record with another email address. That quickly becomes a nuisance to manage for the user.

Associating users to multiple accounts

So, unless I am 100% sure that the system will never allow a user access to multiple accounts (and, sometimes, even if I am 100% sure), I prefer breaking this relationship out as a many-to-many by introducing an association table in between. In DoneDone, we call this a membership.

Adding a Memberships table to allow users access to multiple accounts

The Memberships table holds foreign keys to both Users and Accounts, so that we can both associate any number of users to an account (e.g. employees in a company) as well as any number of accounts to a user (e.g. a user’s freelance account along with their company account).

We can also add membership-specific columns to Memberships. In DoneDone, for instance, the Memberships table can hold columns like an office phone number or fax, as well as an account-specific email address. This way, we can send profile information and password reset links to the user’s global email address, stored in Users, while sending email updates on specific DoneDone issues to the account-level email address, stored in Memberships.

Additionally, the Memberships table can hold information like roles—a user might be an administrator in one account, a regular user in another account, and an account owner in a third account.

Finally, the Memberships table helps clarify the intent of certain database relationships involving a user. For instance, in DoneDone, we tie system alerts to a membership record. This lets us create a specific alert to a person’s specific account (say, a billing notice to an account owner). However, we tie issue relationships to a user record. If a person creates an issue, and then leaves a company, we can delete their Memberships record while still preserving their relationship to that issue.

As I mentioned earlier, even if I’m fairly certain that users will not belong to multiple accounts, I still prefer the additional table in the design. If you avoid it now, should the business one day decide to support multiple accounts for a user, you not only need to massage your data into a new data model, you’ll need to provide some mechanism for a user to link once-disparate accounts together.

Improving performance and database integrity

This three-table architecture provides a lot of flexibility, but there are still other ways we can improve. In DoneDone, we segment off the authentication information away from the Users table into a separate Logins table. We then add a required foreign key from Logins to Users (a login must be associated to an existing user). Since a user can only have one set of credentials, we could further tighten the data integrity by placing a unique constraint against the foreign key.

Extracting credentials into its own table has several relational modeling benefits

Extracting credentials into its own table has several relational modeling benefits

There are a few different reasons why I prefer this separation.

First, it provides a cleaner way of representing users in a system who do not have credentials to login. If we don’t extract credentials to a separate table, we’re left with only the unsavory option of allowing NULL values for usernames and password salts/hashes. Even further, many database systems (Sql Server 2008 and prior) do not support an out-of-the-box “unique if not null” constraint. This means that you can’t place a unique constraint on, say, username, if you had more than one unregistered user (i.e. two or more records with username equal to NULL). As a workaround, you might be left with managing uniqueness on usernames solely outside of the database—a game I’d rather not play.

You may be asking why you’d ever want users in your system who can’t login. Where would this come into play? Here are two examples:

  • A closed system that pre-populates the database with all allowable users. This would be common for a company intranet, or any kind of in-house system. When a user registers, they must match on personal information tied to an unregistered user already in the database.
  • A system that allows partial access to features without requiring an explicit login. This, in fact, is how we manage Public Issues in DoneDone. Anyone who sends a public issue email into DoneDone will have a user record logged in our database, but they won’t necessarily have credentials tied to them.

Secondly, creating a separate Logins table lets you keep credential information in as thin a table as possible. This will only help with indexing and any lazy selects (select * from Logins). Since the Logins table will usually have significantly more reads than writes, the more efficient those reads are, the better.

Finally, while you could argue placing the foreign key on Users (since this is a one-to-one relationship), I find more benefits placing it on Logins. This way, we can make the relationship required, which again avoids adding any nullable columns or “unique if not null” issues in your data model.

Adding access to groups of users

The last step we’ve made in DoneDone is to add a Companies table between Memberships and Accounts.  This lets us further differentiate access levels between groups of users in an account. For example, in DoneDone, we can mark a company as an admin company, and only allow users associated to the admin company to be administrators or account owners. A Companies table also would allow you to easily make certain messages private amongst a user’s respective company.

With this in place, here is the current user data model we’ve used in DoneDone.

Our full-blown database architecture for modeling users

Our full-blown database architecture for modeling users

We’ve been really happy with this level of architecture in DoneDone thus far, and I hope this helps your product as well.

*–Much credit goes to former DoneDone colleague Mustafa Shabib for his influence on the data model design that’s largely withstood significant changes since it was released.

Ka Wai Cheung is the original creator of DoneDone and author of The Developer’s Code. Follow him personally on Twitter via @developerscode and read more at Life Imitates Code.