The case against ORMs

2017-11-02

•

I still see many developers singing praises to ORMs, but I have stopped using ORMs several years ago. I have accumulated enough experience on both sides now, so I’d like to lay out my case against using ORMs.

UPDATE: based on the interesting Reddit discussion of this post which helped me clarify my thinking, I should make it clear that I’m specifically against ORMs which attempt to reimplement SQL in their target language. Lightweight helper libraries to facilitate working with SQL (like Dapper, for example) are a much more useful proposition. I’ve also added an extra point about complacent developers which is another risk of ORMs.

CORRECTION: The first quote was incorrectly attributed to Jeremy Evans (the current maintainer of Sequel), instead of Sharon Rosner.

I’d like to start with a couple of quotes:

I’m the original author of Sequel [1], an ORM for Ruby. Lately I’ve been finding that ORM’s actually get in the way of accomplishing stuff. I think there’s a case to be made for less abstraction in programming in general, and access to data stores is a major part of that.

For an in-production system I’ve been maintaining for the last 10 years, I’ve recently ripped out the ORM code, replacing it with raw SQL queries, and a bit of DRY glue code. Results: less code, better performing queries, and less dependencies.

- Sharon Rosner, Sequel author

I’ve seen it from the outside, I’ve seen it from the inside; and I don’t want to deal with any ORM headaches anymore. Enough is enough. I ditched my ORM.

- Piotr Solnica, Data Mapper core team

This mirrors my experience. I’ve used ORMs, and I’ve written a lot of plain SQL. For four years, I worked on a product in a company I co-founded, and I made a choice not to use an ORM in that product, relying on plain SQL instead. This choice absolutely paid off: we leveraged Postgres to the max by writing big complex queries (see example in this post: Debugging complex PostgreSQL queries with pgdebug), and I’m certain we ended up with less code overall. When I think about what would have happened if we tried to implement all the same functionality with an ORM, I’m certain that it would have crippled the product.

ORMs only work for simple stuff, and become a hindrance for any non-trivial scenarios (e.g. using PostGIS immediately creates problems with ORMs and migrations). Let’s take a look at the specific reasons for why this happens.

ORM is an extremely leaky abstraction

Here’s the deal: no matter how good your ORM, you won’t be able to avoid having to know SQL. Except with an ORM, on top of learning SQL, you’ll have to learn all sorts of contrived syntax that tries to reproduce SQL in your programming language (and inevitably fails, because the model is fundamentally different).

ORMs are acceptable for CRUD operations, but those operations don’t actually represent the majority of the interaction with the database for products of any non-trivial complexity. The majority of what I have seen in my 15 year career was anything but CRUD. Besides, the SQL for CRUD operations is really simple too, so what does an ORM really buy you?

On the flip side, like any abstraction, it has costs too. It’s another dependency for your project, it’s going to have bugs, it will need to be upgraded, it might prevent you from upgrading other dependencies, it might have incompatibilities with other dependencies and so on.

Lowest common denominator features

An ORM usually forces you to use the lowest common denominator subset of features between all the databases it supports. You miss out on a lot of powerful database features. For example, the issues I struck with Postgres when I tried to use an ORM were that it wouldn’t support any PostGIS functionality unless I bolted on some poorly supported extension to the ORM. And then I’d have to support this precarious pile of code! Range types, JSON types, arrays etc. would be other examples of Postgres features that I would have to forego for the privilege of using an ORM. It doesn’t stack up!

Considering multi-database support in an ORM to be an advantage is actually a fallacy. Migrating to another database on a whim is simply not something that happens in real life. If there’s a business case for migration, then it won’t be a huge deal to incorporate the necessary changes to the SQL as part of that process (and a lot of your SQL may not need changes!).

Adverse effects on schema and migrations

An ORM potentially affects the schema of your database due to its limitations. Maybe you’ll choose not to normalise the data because it’s easier that way, or maybe you’ll choose different types for your columns. Going through a proxy layer often constrains your choices.

Your ORM may or may not support migrations, and if it does, migrations will be more complicated if you write them in code instead of SQL. Once again, you’ll have to learn new syntactic constructs that duplicate SQL operations, and you will be constrained by what’s implemented in the ORM.

Weakened data validation

Instead of relying on the database features such as check or uniqueness constraints, the ORM might try to take on the validation. Famously, ActiveRecord in Rails tried (still tries?) to do uniqueness checks in the application code. Obviously, this fails if there’s any degree of concurrency. Once again, it comes down to replacing something that the database is very good with an inferior version at a higher level of abstraction.

Performance problems

ORMs can cause performance problems, like the well known (n + 1) queries issue (where individual queries are issued for each child record linked to the parent record).

Another problem caused by ORMs is multiple roundtrips to the database. When data processing is pushed up into the application layer, you’ll end up doing something like:

Get a list of projects
Get a list of users
Do something with them
Write the result back into the database

This means multiple queries and multiple roundtrips to the database. If you were writing plain SQL, you could plausibly do it all in just one query (e.g. join projects and users in a select, do the necessary filtering in the where clause and some processsing in the select expressions, then use the result of the select as the values for insert in the same query).

In addition, there are other performance issues caused by suboptimal SQL generated by the ORM (e.g. issues with locking).

Poor mapping

A lot has been written about the impedance mismatch in object-relational mapping. SQL result sets don’t necessarily map well to objects (if you’re using an object oriented language). An ORM also encourages a one-to-one mapping between tables and “business objects” on the application side. But in reality, it’s very rarely the case that there is a one-to-one correspondence.

Preventing SQL injection attacks doesn’t require an ORM

SQL injection attacks are effectively mitigated by using prepared statements and positional parameters, which don’t actually require an ORM.

Conversely, an ORM doesn’t guarantee the absence of SQL injection sites. As soon as somebody added plain SQL somewhere (which is more or less inevitable), we’re back to square one.

ORMs can make developers complacent

When developers are only working with the database through the insulating layer of ORM, they can become complacent about what goes under the hood, or may never even learn the workings of a database if they join the project without prior experience. This will eventually lead to problems which could have been prevented if the developers had a more direct interaction with the database and were forced to learn more about it.

Conclusion

The conclusion I’ve drawn is that the benefits of ORMs (avoiding some of the SQL boilerplate, sweeping the relational model under the rug) are much too low to justify the costs, and plain SQL gives you access to the full power of a database with less code. So I’m going to stick with plain SQL, and avoid ORMs.

Comments or questions? I’m @alexkorban on Twitter.

Is the time dimension of your data giving you headaches?

“PostgreSQL: Time and Temporal Data” is a short book that will help you get it under control. Master Postgres time tools, get a handle on time zones, learn how to represent and query complex temporal data.

Pop in your email to receive a sample chapter.

You will also get notifications of new posts along with other mailing list only freebies.

|> Created with Elmstatic