Tuesday, February 17, 2009

ORM: The Leaky Abstraction

I strongly dislike ORM.

Object-Relational Mapping, that is. I quite like the other kind.

The main reason for this dislike is that ORM is one of the worst cases of leaky abstractions I've ever encountered. Again and again I find myself having to jump out of the object world, identify the particular query I want to run in the relational world, and then figure out how to convince my JPA persistence layer to do exactly that. Instead of just formulating a query in some relational query language, I now have to understand not only the query but also how my JPA provider of choice maps objects and their annotations into the relational world. Life certainly didn't get easier this way.

My current problem is the way Hibernate does eager fetching.

All I want is a fully initialized object which I can pass out of my JPA session and have it still work. This object has a few one-to-many relationships to small objects, all of which should be available. Some of these sets can be reasonably large, but not large enough to be a concern for in-memory storage. Unfortunately Hibernate tries to fetch them all in a single query, which means that instead of fetching first N1 entries, then N2 entries, ... then Nn entries, it creates a single query for a cross product that has N1*N2*...*Nn rows. That is enough to run out of half a gigabyte of heap space with a database that's less than half a megabyte of plaintext SQL.
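To put numbers on that, here is a small sketch of the arithmetic in plain Java. It does not touch JPA or Hibernate at all: the class and method names are made up for illustration, the collection sizes are invented, and the treatment of empty collections assumes the join is an outer join.

```java
import java.util.Arrays;

// Hypothetical helper, not part of JPA or Hibernate: compares result-set sizes.
class FetchRowCount {

    // Rows returned when all collections are joined into one query:
    // every combination of child rows appears once, so the sizes multiply.
    static long joinedRows(long... collectionSizes) {
        long rows = 1;
        for (long n : collectionSizes) {
            // Assumption: with an outer join an empty collection still yields one row.
            rows = Math.multiplyExact(rows, Math.max(n, 1));
        }
        return rows;
    }

    // Rows returned when each collection is fetched by its own query:
    // the sizes simply add up.
    static long separateRows(long... collectionSizes) {
        return Arrays.stream(collectionSizes).sum();
    }

    public static void main(String[] args) {
        long[] sizes = {200, 150, 100, 50}; // made-up collection sizes
        System.out.println("single joined query:      " + joinedRows(sizes) + " rows");
        System.out.println("one query per collection: " + separateRows(sizes) + " rows");
    }
}
```

With these invented sizes the joined query yields 150,000,000 rows against 500 for separate queries, which is the kind of gap that turns a tiny database into an out-of-memory error.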

I could try another JPA provider, but I suspect quite strongly that it is not going to help, and thanks to some omissions in the JPA spec I'm kind of committed to Hibernate already. The JPA spec doesn't actually define what "eager fetching" or "eager loading" means: both terms are used quite a bit but never defined, or at least I didn't find a definition searching through the document.

I suspect the JPA crowd is going to tell me not to use eager fetching then. If my session lived at least as long as the object, that would be OK, but that is not the case. So now I'll have to write code myself that traverses everything I need to fetch, maybe even invent my own annotation so I'll be able to maintain that with reasonable effort across multiple entry points. What a pain.
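A rough sketch of what such a traversal could look like, using only reflection from the standard library. Everything here is hypothetical: `@FetchBeforeDetach`, `GraphInitializer`, `Author` and `Book` are names invented for this example, and plain lists stand in for lazily loaded collections. The underlying assumption is that iterating a lazy collection while the session is still open forces it to load; proxied single-valued associations would likely need extra handling that is not shown.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical marker annotation; neither JPA nor Hibernate defines it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface FetchBeforeDetach {}

// Plain classes standing in for entities; a real mapping would add JPA annotations.
class Book {
    String title;
    Book(String title) { this.title = title; }
}

class Author {
    @FetchBeforeDetach List<Book> books = new ArrayList<Book>();
    @FetchBeforeDetach Author mentor;
    String name; // not marked, so never traversed
}

// Hypothetical helper: walks the marked fields, meant to run while the session is open.
class GraphInitializer {

    // Returns the number of distinct objects visited, mainly useful for logging.
    static int initialize(Object root) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        visit(root, seen);
        return seen.size();
    }

    private static void visit(Object node, Set<Object> seen) {
        if (node == null || !seen.add(node)) {
            return; // nothing to do, or already visited (guards against cycles)
        }
        for (Class<?> c = node.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (!field.isAnnotationPresent(FetchBeforeDetach.class)) {
                    continue;
                }
                Object value;
                try {
                    field.setAccessible(true);
                    value = field.get(node);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("cannot read " + field, e);
                }
                if (value instanceof Collection) {
                    // Iterating is what would force a lazy collection to load its elements.
                    for (Object element : (Collection<?>) value) {
                        visit(element, seen);
                    }
                } else {
                    visit(value, seen);
                }
            }
        }
    }

    public static void main(String[] args) {
        Author author = new Author();
        author.books.add(new Book("First"));
        author.books.add(new Book("Second"));
        author.mentor = new Author();
        System.out.println("objects visited: " + initialize(author));
    }
}
```

The point of the annotation is that the list of things to fetch lives next to the fields themselves, so every place that hands an object out of the session can call the same `initialize` method instead of repeating its own list of getters.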

Maybe it is time for me to try using some object database technology. There should be some way out of the ORM pain.

1 comment:

Pietu Pohjalainen said...

Hi Peter,
I happened to browse to your site when searching for info about FCA tools. By accident, I found your rant about ORM technologies, especially Hibernate.

I've had a similar experience with JPA. My solution was to use the source code's properties to determine which objects to fetch. There's a paper explaining the technique:
P. Pohjalainen and J. Taina: Self-configuring object-to-relational mapping queries. PPPJ '08: Proceedings of the 6th International Symposium on Principles and Practice of Programming in Java, 2008.
http://dl.acm.org/citation.cfm?id=1411740

Maybe you're no longer struggling with JPA, but at least that's something to look at.