DNR, generics, and O/R mapping

I recently discovered the radio shows over at DotNetRocks, hosted by Carl
Franklin. These shows are pretty good, and I love that you can download each
full show as an mp3. I’m downloading all of them and putting them on my Dell DJ
to listen to in the car; it makes the drive to work much more appealing.

I’m in the middle of the show with Mark Pollack, Ted Neward, and Don Box, which
is especially interesting because these guys are not afraid to push each
other’s buttons. They talked about generics in Java, and how Sun just put them
into JDK 1.5 because of all the hype that C# was getting for putting generics
into Whidbey. While Java supports generics now, you don’t get the full benefit
that you do in C#, because the Java compiler erases the generic type arguments
and treats the elements as plain objects. You still pay for a cast on every
read, and for boxing whenever the elements are primitives such as int, whereas
in C# if you define a List<string>, it really is a List of strings.
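
To make the erasure point concrete, here is a small Java sketch of my own (it
is not from the show). The class and method names are made up for the example.
It checks that a List<String> and a List<Integer> end up as the same class at
run time, and it shows where the boxing happens when you store ints.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // After erasure both lists are plain ArrayLists, so they share one runtime class.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<String>();
        List<Integer> counts = new ArrayList<Integer>();
        return names.getClass() == counts.getClass();
    }

    // A primitive cannot be a type argument, so each int is boxed into an
    // Integer on the way in and unboxed again on the way out.
    static int sumBoxed(int upTo) {
        List<Integer> values = new ArrayList<Integer>();
        for (int i = 1; i <= upTo; i++) {
            values.add(i);          // autoboxing: int -> Integer
        }
        int total = 0;
        for (int value : values) {  // unboxing: Integer -> int
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(sumBoxed(4));
    }
}
```

Running it prints true and then 10. A list of strings involves no boxing at
all, only a compiler-inserted cast when you read an element back.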

The part of the conversation that I’m at right now is where they are all
discussing object-relational mapping, and the problems with current attempts to
map relational data into language objects. This topic is particularly
interesting to me, because we use some custom O/R mapping objects at work, in
C#.

They used a Person object as an example. Suppose you have a person table in
your database, and a corresponding Person class in your application. To access
a Person instance, you supply the primary key, say 1, of the person that you
want. Your application then needs to go to the database, retrieve the fields of
the person table, and populate your Person instance. The problem is: which
fields should the application retrieve? Do you grab all the fields, even if you
just want that person’s email address? That’s wasted database work. But you
cannot just load every field on demand either, because the situation will arise
where you need all or most of the fields, and it’s silly to query the database
separately for each field. So what to do?

In our implementation at work, I load all the fields at once. I find that
performance is not an issue, even for one table that has more than 100 fields.
So IMO, loading everything at once is not a problem. But Carl made a suggestion
that I thought was really clever. He suggested using custom attributes to
specify which fields are loaded by default, and which fields are “lazy loaded.”
So for our Person class, maybe FirstName, LastName, Email, and Address are
always loaded, and then Phone, Company, Fax, City, State, Zip, and Country are
only loaded when we explicitly access those fields. This scheme would work very
well with that table I mentioned with many fields, and I think the benefits
would certainly be seen if you had a table that stored blobs: you could load
all the fields besides the blob, and just grab the blob when you actually
needed the binary data.
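
Here is roughly how I picture Carl’s suggestion, as a minimal sketch and not
anyone’s real mapper. I wrote it in Java, with an annotation standing in for a
.NET custom attribute. Everything in it is hypothetical: the LazyLoaded marker,
the FakePersonTable class that pretends to be the database, and the three
sample columns. The constructor fetches every unmarked field in one query, and
the marked field costs a second query the first time it is read.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical marker: the initial fetch skips any field that carries it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface LazyLoaded {}

// Stand-in for the person table. It counts queries so the round trips are visible.
class FakePersonTable {
    int queryCount = 0;
    private final Map<String, String> row = new HashMap<String, String>();

    FakePersonTable() {
        row.put("firstName", "Ada");
        row.put("email", "ada@example.com");
        row.put("fax", "555-0100");
    }

    // Plays the role of "SELECT <columns> FROM person WHERE id = ?".
    Map<String, String> select(int id, List<String> columns) {
        queryCount++;
        Map<String, String> result = new HashMap<String, String>();
        for (String column : columns) {
            result.put(column, row.get(column));
        }
        return result;
    }
}

class Person {
    String firstName;
    String email;
    @LazyLoaded String fax;

    private final FakePersonTable table;
    private final int id;
    private boolean faxLoaded = false;

    Person(FakePersonTable table, int id) {
        this.table = table;
        this.id = id;
        // Collect every String field that is not marked lazy.
        List<Field> eager = new ArrayList<Field>();
        List<String> columns = new ArrayList<String>();
        for (Field field : Person.class.getDeclaredFields()) {
            if (field.getType() == String.class && !field.isAnnotationPresent(LazyLoaded.class)) {
                eager.add(field);
                columns.add(field.getName());
            }
        }
        // One query fills all of the eager fields.
        Map<String, String> values = table.select(id, columns);
        try {
            for (Field field : eager) {
                field.set(this, values.get(field.getName()));
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // The lazy field costs a second query, but only on first access.
    String getFax() {
        if (!faxLoaded) {
            fax = table.select(id, Collections.singletonList("fax")).get("fax");
            faxLoaded = true;
        }
        return fax;
    }
}

class LazyLoadDemo {
    public static void main(String[] args) {
        FakePersonTable table = new FakePersonTable();
        Person person = new Person(table, 1);
        System.out.println(person.email + " after " + table.queryCount + " query");
        System.out.println(person.getFax() + " after " + table.queryCount + " queries");
    }
}
```

The trade-off shows up in the query counter: reading only the email costs one
query, while touching the fax adds a second round trip. That extra trip is only
worth it when the lazy column is expensive to fetch, like a blob.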

This show is compelling so far, and I’m looking forward to the rest of it on my
way to work Monday.

6 Comments so far »

  1. Anonymous said,

    Wrote on December 4, 2004 @ 8:55 pm

    ANY statically defined lazy loading scheme is stupid. The reason for that is that it can’t scale to situations in which you want to change the statically defined settings.

    So instead of using statically defined lazy loading (with attributes, which is also cumbersome to implement, i.e. what if your system has 500 entities?), implement a lazy loading scheme which allows the developer (i.e. the user of the data at that point in time) to define what scheme to use as only that developer can tell what’s best for that particular situation.

    So the obvious choice is passing a construct to the fetch logic which describes which fields to load. Obviously, the system has to keep track of which fields are already loaded. (but perhaps the developer wants to re-load the field at every access, also a reason not to use statically defined lazy loading).

    Lazy loading of blobs is the only real world scenario which will increase performance. However, in general, these scenarios are pretty rare. Lazy loading has a disadvantage: your tier in which the lazy loading is triggered has to have a context which is connected to a database. Especially with the larger applications, this is not what you want in, for example, a GUI tier.

    I can only conclude that the show you’ve listened to contained people who heard about O/R mapping but clearly don’t understand what it really is nor what it means for enterprise applications to use O/R mapping.

  2. Anonymous said,

    Wrote on December 5, 2004 @ 4:32 am

    You make an excellent point about the static nature of the attributes, and the inability to change them to fit different scenarios; I hadn’t considered that. You are right about needing to pass information in order to load the proper fields. I’m just wondering if there is an elegant way to accomplish this without having to write anything that is similar to SQL code; I would rather do it in an OO fashion.

  3. Anonymous said,

    Wrote on December 5, 2004 @ 5:33 am

    The only way I can think of is that you specify a set of fields for a given type T you’re fetching objects of, which have to be excluded for this particular fetch. The engine then has to offer a way to fetch these fields ‘when necessary’, later on, either by a lazy loading mechanism or by a manual fetch.

  4. Anonymous said,

    Wrote on December 5, 2004 @ 7:14 am

    I checked out your LLBLGen Pro demo, and I think it’s really cool. I like how you attach an event to every field. Judging from your previous comment, I would expect that your product does not use lazy loading; is this correct?

  5. Anonymous said,

    Wrote on December 6, 2004 @ 6:44 am

    I am very sorry to say, but the proposal is utterly stupid in 99% of the occasions. Basically the transfer time for object retrieval is much lower than the round trip to the database, and a smart in-memory caching scheme may allow retrieving only the pk and the lock fields, then reusing the object if it is unchanged. MUCH better.

    The one time when this is not the case is basically when the table is getting larger than let’s say 8k (page size in SQL Server). Once you store large documents or even binary data (images), lazy loading does start to really make a show out of this.

    And this coming from the maintainer of one of the most powerful O/R mappers for .NET on the market. I am totally in agreement with Frans here. Simply makes no difference, makes the object slower. It also kills caching layers and produces way more database round trips.

  6. Anonymous said,

    Wrote on December 6, 2004 @ 1:16 pm

    I agree that choosing not to load a few int or varchar fields is pointless. I just thought that the idea of using attributes was interesting. Whenever I hear about people using attributes it usually relates to logging or security, so hearing about another application of attributes caught my attention. I’m by no means an expert on O/R mapping; I didn’t even know it was called that until hearing the show! Thanks for the comments!
