Delete Your Foreign Key Constraints
By Scott White
I learned the same rules about doing relational database schemas like everyone else. When you create a table with a column that joins to another table column, you put a foreign key constraint so that link will always stay valid. A long time ago I started a new job and when I created a pull request with foriegn key constraints included I was politely asked to remove them. I was shocked - “It’s best practice!” Everyone knows you need to add the foreign key constraints or your database will surely fall into chaos. When I asked why, I found out they had FK constraints in the past but, when they ran into database scaling issues, they deleted them all.
From then on, when I had the choice to use foreign key constraints, I just skipped them. When new developers started I would typically get two reactions. The most common is the same as my original one: “Don’t you know you should have FK constraints?” The second, which happens more often than you’d think, is “My last company didn’t use them either.” For the skeptical I’ll go over a little bit the pros and cons of the FK constraint.
Pros
The purpose of a constraint is to keep referential integrity. This means no dangling IDs and a super clean schema. If a record is supposed to be there it will be there. When they fail it’s a signal to the developer that they’ve made a programming mistake that would cause database corruption. At least this is what we all were told.
Cons
The way FK constraints work is that on every write (inserts, updates or deletes) all the applicable constraints need to be checked. If any of them are violated, the write will fail. This usually happens in two ways, first as mentioned is a programmer error. This can happen at development time, test time or escape all the way to production. In theory this is a good thing, the databse integrity is being upheld, but the reality is often data loss. Every failure is some data the application was trying to save and was prevented from doing so. I’ve seen a bug in a minor feature prevent critical updates from happening due to a FK constraint. In my career I’ve seen FK constraints cause data loss far more often than they have prevented any major DB corruption. If you have constraint errors, usually it’s fast to track down and fix the issue so the violation won’t occur again… which leads to the second point.
If your application code writing to the database is correct, then the constraint adds no additional value. The FK constraint is a double check of your business logic that run on every write. At another company, we were having major DB scaling problems and I wrote a script that deleted all the FK constraints… and the app behavior didn’t change at all. That’s expected because prior to the deletion we didn’t have any errors caused by the FK checks failing. But it did have an effect; removing all the FK checks caused the DB CPU under heavy load to drop by 10%. You will almost certianly not see that big of an improvement in your DB but this particular one had a ton of FK checks on hot write paths. The tables involved were large as well. The checks use indexes so they’re typically very fast, but they’re not free. Lots of checks, like writes to tables that join a lot of others and especially big tables with big indexes are going to add up.
Performance is generally the reason people remove or stop adding FK constraints. Just be sure to add the appropriate indexes manually if you’re doing joins. If you hire a DB performance consultant, it’s one of the first things they’ll do if you need more juice because it’s almost always safe to remove them.
Living the Constraint Free Life
If you don’t use them at all you will get dangling records. Someone writing a delete method usually forgets about some child table. In over ten years of no FKs, this has never once been a major issue. And if it really bothers someone, it’s generally pretty easy to find and clear up the records.
One side benefit is it often makes some updates much easier, usually deletes. It can be hard to figure out how to delete a record when it has a ton of constraints. I was asked to do some manual deletes on that same DB before I had removed the FKs, and it took half a day for me to unravel the dependency tree to kill the records we needed to remove. No one cared about some dangling records or schema integrity, but they did care that the customer had to wait a half a day for me to solve their issue.
Digression and Conclusion
If you’re a DBA, you’re likely flaming me on Bluesky by now. But if you’re still reading, it’s about to get worse. The history of RDBMS’s goes back a loooong time. SQL is from the 70’s and in that era your company typically had just one mainframe computer. The RDBMS was your application server and your DBA was the application developer. The customer interface was often a SQL prompt delivered through a dumb terminal. If you were a DBA in that era, you had to protect the integrity of the data from your users writing SQL using things like strict schema permissions and FK constraints. In this environment, it was imperative that you protect the data as best you can and give hints to the user about he schema and table relationships. The “MS” in RDBMS stands for “Management System”; It’s not just a database, it’s a whole relational database management system allowing a ton of business logic to run in the DB. But times have changed.
Today, application servers are separate horizontally scalable units typically serving up HTTP of some variety. SQL is programmed into the application code reviewed instead of ad-hoc at the SQL prompt. We don’t need all the features of the RDBMS, we just need a DB now. And it’s ok to move some of the functionality that used to live in the DB out to the application servers. This leaves the RDB to do what’s it’s really good at: querying and saving data. In this era, the DB is often both the most expensive part of the stack and the biggest bottleneck. We need to save every DB CPU cycle and offload as much as possible because it improves performance and saves money. So go ahead and delete those Foreign Key Constraints, they’re a leftover from another era and you won’t miss them.
Postscript
You might be saying “Dude you’re such hypocrit, I’ve seen you use FK constrainsts” and you’d be right because only my coworkers read my blog. I do sometimes use them. Fundamentally, FK constraints are a double-check for your application logic, and sometimes I want a double check. Typically when I’m working with money or other value transactions. Accountants are persnickity when there are dangling records so it’s a good way to do double checking. I will tend to save records first into a unconstrained table so there’s always a record, like a journal. Then process it into the other final tables. If the constraint ever fails, I’ll have the inital records to look at to be able to figure out what went wrong and fix it. Sometimes you want both belt and suspenders.