Persistence Blues

Almost two years into my current project, not a week has gone by without a hiccup involving JPA and Hibernate. I’m not claiming that Hibernate or JPA are bad tools, nor would I drop them for every project, but let’s face it: sometimes they can be the cause of “frameworkitis”, as a colleague once put it.

ORM tools sound like a very good idea at the architectural level: no JDBC mapping code to write, relationship management with cascading and all, a powerful query language, all for free. Well, that’s the problem: it’s not free, at least not outside the sweet spot.

I’ve found that, for JPA (and classic Hibernate), the sweet spot lies in small web applications that use neither EJBs nor any Java-based remoting, such as AMF or GWT RPC. It might also work for pure Java desktop applications, such as ALL.com’s.

When you introduce JPA to the Java EE stack though, things start to crumble. First of all, you can’t cheat by following the “Session per Request” pattern suggested by the Hibernate folks: THERE IS NO GLOBAL REQUEST CONTEXT in an EJB. So you get the EntityManager injected by the container into your EJBs, but when an entity crosses a @Remote interface it is serialized and leaves its persistence context, and the first lazy property the caller touches… Whoops, LazyInitializationException! The same goes for entities used by your Java web application, be it JSP, JSF, servlets or anything else.

You can fix that by using only @Local interfaces for intra-EJB communication, but then you lose the nice scalability benefits of EJBs, which are the main reason why I use them (or maybe second to free transaction management). Let me elaborate a bit on this.

Suppose your OrderService invokes your CustomerService to retrieve billing details. If you use a @Local interface, the CustomerService NEEDS to reside on the same VM as your OrderService. If you use a @Remote interface, it can live anywhere in your cluster. Doesn’t make sense? Think of it this way: in an enterprise, the CRM, the e-commerce and the back-office systems could all share the same CustomerService, so it would make sense to give it more resources, or host it on faster hardware, than the OrderService. Making more sense now?

Now the second problem is client/server communication. If things can go wrong inside the Java EE cluster, think of how wrong they can go once you cross the network. Sending an entity over the wire and then receiving it back from the client leaves it detached from your persistence context. Your collections, entities and everything related to persistence can go cuckoo over that. I’ve had all sorts of errors from this, from LazyInitializationExceptions to insane “transient object found” errors on relationships, to not even being able to call em.remove on entities sent back by the client. This holds true for RMI, BlazeDS / AMF and GWT RPC, and also over XML with JAXB. I’m quite sure the same would happen over JSON or Message Bindings or any other communication protocol. It’s just the nature of the beast.

Now if all that hasn’t scared you yet, there are relationships and their impact on performance. “Lazy load it”, you’ll say. Well, that’s easy if you’re in the sweet spot, but outside… BAM, LazyInitializationException, or no values being sent to the client.
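
To make that failure mode concrete, here is a toy model in plain Java. It is not how Hibernate is implemented; `ToySession`, `LazyList` and `ToyCustomer` are made-up names, and an `IllegalStateException` stands in for `LazyInitializationException`. It only illustrates the rule that a lazy collection can be loaded while its session is open and not afterwards.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy stand-in for a persistence context: it is either open or closed.
final class ToySession {
    private boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

// Toy stand-in for a lazily loaded collection: it fetches its contents on
// first access, and only while the owning session is still open.
final class LazyList<T> extends AbstractList<T> {
    private final ToySession session;
    private final Supplier<List<T>> loader;
    private List<T> loaded; // stays null until the first access

    LazyList(ToySession session, Supplier<List<T>> loader) {
        this.session = session;
        this.loader = loader;
    }

    private List<T> target() {
        if (loaded == null) {
            if (!session.isOpen()) {
                // This is the point where Hibernate would throw
                // LazyInitializationException.
                throw new IllegalStateException(
                        "could not initialize collection: session is closed");
            }
            loaded = new ArrayList<T>(loader.get());
        }
        return loaded;
    }

    @Override public T get(int index) { return target().get(index); }
    @Override public int size() { return target().size(); }
}

// Hypothetical entity holding a lazy relationship.
final class ToyCustomer {
    final String name;
    final List<String> orders;

    ToyCustomer(String name, List<String> orders) {
        this.name = name;
        this.orders = orders;
    }
}
```

In this model, a `ToyCustomer` whose `orders` were touched before `close()` keeps working afterwards, while one whose `orders` were never touched blows up on the first access after the session is gone. That second case is what a remote client or a view layer typically runs into.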

Even with lazy loading, performance suffers due to proxying and the like. I recently rewrote a query that took 5 minutes to load through Hibernate using plain JDBC and manual mapping to objects, and the same query now runs in 0.6 seconds.
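
For readers who have not done it in a while, manual mapping with plain JDBC looks roughly like the sketch below. This is not the actual query mentioned above: the `customer` table, its columns, and the `CustomerRow` and `CustomerJdbcReader` classes are all made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical flat result type: plain fields, no proxies, no relationships.
final class CustomerRow {
    final long id;
    final String name;

    CustomerRow(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class CustomerJdbcReader {
    // Hypothetical table and column names.
    static final String SQL = "SELECT id, name FROM customer WHERE country = ?";

    // Maps every remaining row of the result set to a CustomerRow by hand.
    static List<CustomerRow> mapAll(ResultSet rs) throws SQLException {
        List<CustomerRow> rows = new ArrayList<CustomerRow>();
        while (rs.next()) {
            rows.add(new CustomerRow(rs.getLong("id"), rs.getString("name")));
        }
        return rows;
    }

    // Runs the query; try-with-resources closes the statement and result set.
    static List<CustomerRow> findByCountry(Connection con, String country)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            ps.setString(1, country);
            try (ResultSet rs = ps.executeQuery()) {
                return mapAll(rs);
            }
        }
    }
}
```

The objects that come out are ordinary Java objects, so they can be serialized or handed to a view without any cleanup. The price is that every column and every relationship you want has to be spelled out by hand.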

Having to walk over the objects generated by Hibernate to remove the proxied collections and variables before sending them over RMI, AMF or GWT RPC can double your processing time, as I experienced when using Gilead, and later on when using beanlib in my own custom adapter for AMF.

The only solution to this is having a DTO for every domain object you have, so instead of writing Object / Relational mapping code, you write Domain / DTO mapping code, yay! That’s how my last GWT project ended up.
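
Here is a minimal sketch of what that Domain / DTO mapping code tends to look like. The `Customer`, `Order`, DTO and mapper classes are hypothetical; in a real project the domain classes would be JPA entities and `orders` might be a lazy Hibernate collection.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes (JPA annotations omitted).
final class Order {
    final long id;
    final long totalCents;

    Order(long id, long totalCents) {
        this.id = id;
        this.totalCents = totalCents;
    }
}

final class Customer {
    final long id;
    final String name;
    final List<Order> orders;

    Customer(long id, String name, List<Order> orders) {
        this.id = id;
        this.name = name;
        this.orders = orders;
    }
}

// Plain serializable DTOs that hold no reference to any ORM type.
final class OrderDto implements Serializable {
    final long id;
    final long totalCents;

    OrderDto(long id, long totalCents) {
        this.id = id;
        this.totalCents = totalCents;
    }
}

final class CustomerDto implements Serializable {
    final long id;
    final String name;
    final List<OrderDto> orders;

    CustomerDto(long id, String name, List<OrderDto> orders) {
        this.id = id;
        this.name = name;
        this.orders = orders;
    }
}

final class CustomerMapper {
    // Must run while the persistence context is still open: iterating
    // c.orders is what would trigger the lazy load in a real entity.
    static CustomerDto toDto(Customer c) {
        List<OrderDto> orders = new ArrayList<OrderDto>();
        for (Order o : c.orders) {
            orders.add(new OrderDto(o.id, o.totalCents));
        }
        return new CustomerDto(c.id, c.name, orders);
    }
}
```

Copying into a fresh `ArrayList` is deliberate: the DTO ends up holding a standard collection instead of whatever collection type the ORM put on the entity, so the client never needs the ORM classes on its side.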

So here’s my word of warning: skip JPA unless you’re going to use it in its sweet spot. If you’re building a Java web application (be it using Seam, JSPs, JSF, Tiles, Wicket, Struts, or whatever other technology) using Spring, Guice or POJO backend components, then by all means do use it. Everyone else, think twice before jumping into the fire.

On the positive side, I will follow up with some ideas on how to beat the JPA blues in the next post this weekend.

Disagree? COMMENT!

4 thoughts on “Persistence Blues”

  1. I’ve found myself in the exact same situation you describe as “so instead of writing Object / Relational mapping code, you write Domain / DTO mapping code”.

    Wondering, how do you think these pathologies can be broken?

  2. Hey Pablo, thanks for chiming in!

    Dynamic languages such as JavaScript and Ruby have flexible data structures that can be bent to work with ORM without the need for proxies and/or bytecode manipulation.

    If you’re stuck with Java like I am though, you will have to decide between dropping your ORM and going with something simpler, or sticking with converting between the persistent objects and the “API objects”.

    I’ve been using Apache Commons DbUtils: it can automagically map rows you retrieve from the database to (shallow) objects without relationships. Another approach would be trying to automate the entity stripdown in the transport layer, like Gilead does for AMF and GWT remoting, for instance, but it’s only half a solution.

    I think that in order to do this well, you would need a library that can dynamically generate “free-form objects” from your domain classes, removing all the ORM proxying and decorating done to them, and then pass them over to JAXB, a JSON library, Protobuf or what have you for serialization. The problem is that, since most remoting libraries expect precisely the object type you have in the interface, you would have to somehow fake the “free-form” object to be what they expect, maybe by using a different classloader?

    In the Flex world, the guys from GraniteDS went in kind of the opposite direction and tried extending the sweet spot to cover the client as well: if you hit any lazily loaded property on the client, it goes to the server to fetch it. I’m not particularly fond of their approach because it’s Flex-specific and breaks your server API’s compatibility with pretty much anything else, but it’s yet another path one could walk.

    Wish you luck in your quest for persistence!

    Alex

  3. Hi Alex!
    Late to the party, but this is one I cannot miss. I was for a (long) time in this “sweet spot” you seem to describe. Basically, Spring + Hibernate works really nicely (I never cared much about JPA; when it was finally out, we were used to having more features than that…). We had a client/server architecture (rich client/Eclipse RCP), and were using the same objects across all layers without major problems. Of course, trying to “lazy load” a property on the remote client will logically result in a LazyInitializationException, but that is quite controllable by good design (basically, being clear about how “far” the client can go navigating a single object and/or having some trick to go back to the server when needed, without abusing it).

    I never had many Hibernate performance problems, as long as you use it wisely (aggressive caching on frequently requested objects and queries, batch-size everywhere to avoid N+1 queries). In my experience, the true value of Hibernate was not really the query generation, but the facilities for caching, session management, batch loading, etc.

    Ah, and I forgot: do not try to use it for batch updates/creates. It is just not appropriate (by the book itself).

    Martin

  4. From experience, it solves more issues than it introduces. But these technical blues do exist…

    I’ve summarized my experience (and Martin’s) in this article: http://mestachs.wordpress.com/2012/03/24/dont-get-caught-hibernating/

    But it’s a bit like Maven… most of the time you can’t live without it… then you touch something and everything goes wrong, you lose days debugging and want to throw it away. But if you start a new project, you write your pom ;)
