May 29, 2005 16:10
Programming, Java, Design Patterns, ATG Dynamo
For some years, I've been using a pattern to abstract away access to a repository and make it more manageable. Here's how I do it. I understand that this looks like (and probably is) DAO, but not everyone has heard of it, and I think that a concrete example is more helpful than a three-letter acronym.

So. You have some custom information that you need to store in a repository. You know that you're going to need at least one new item-descriptor, and you're going to have to expose that data to the rest of the system.

The item descriptor looks like this:

<item-descriptor name="car">
  <property name="displayName" column-name="name" data-type="string"/>
  <property name="color" column-name="color" data-type="string"/>
  <property name="price" column-name="price" data-type="double"/>
</item-descriptor>

And you want to be able to search for cars, create new cars, modify existing ones, etc.

First thing to do is to create a JavaBean for the item descriptor:

public class Car {
  protected String mId;
  protected String mDisplayName;
  protected String mColor;
  protected Double mPrice;

  public Car() { 

  }

  // get and set methods
}

Note that the variables are protected rather than private. I do this because someone after me may need access to those variables in a subclass. I don't advocate it, but it's up to them to make the call if it's necessary. There is also no link directly to the repository item.

And then create a manager which interacts with these Car objects, and throws exceptions on the same level.

public interface CarManager {
   public Car getCar(String pId) throws CarException;
   public void deleteCar(String pId) throws CarException;
   public void updateCar(String pId, Car pCar) throws CarException;

   // Known searches 
   public Car[] getCarsByColor(String pColor) throws CarException;
   public Car[] getCarsByPrice(Double pPrice) throws CarException;
}


One advantage of explicitly defining an interface is that clients can write against the interface even when the implementation is still incomplete. This means that you can write a mock object that implements that interface and hook your business logic to it for unit tests.
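
As a rough sketch of that idea, here is an in-memory mock of the interface. Everything in it is a stand-in: the trimmed-down Car and CarException exist only so the block compiles on its own, MockCarManager is a hypothetical name, and treating updateCar as "create or replace" is an assumption made for test convenience.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-ins so the sketch compiles on its own.
class CarException extends Exception {
    CarException(String pMessage) { super(pMessage); }
}

class Car {
    protected String mId;
    protected String mColor;
    protected Double mPrice;

    public String getId() { return mId; }
    public void setId(String pId) { mId = pId; }
    public String getColor() { return mColor; }
    public void setColor(String pColor) { mColor = pColor; }
    public Double getPrice() { return mPrice; }
    public void setPrice(Double pPrice) { mPrice = pPrice; }
}

interface CarManager {
    Car getCar(String pId) throws CarException;
    void deleteCar(String pId) throws CarException;
    void updateCar(String pId, Car pCar) throws CarException;
    Car[] getCarsByColor(String pColor) throws CarException;
    Car[] getCarsByPrice(Double pPrice) throws CarException;
}

// Hypothetical mock: keeps cars in a map instead of a repository.
class MockCarManager implements CarManager {
    private final Map<String, Car> mCars = new LinkedHashMap<>();

    public Car getCar(String pId) throws CarException {
        if (pId == null || pId.isEmpty()) {
            throw new CarException("null pId");
        }
        return mCars.get(pId); // null when no such car exists
    }

    public void deleteCar(String pId) {
        mCars.remove(pId);
    }

    // Assumption: the mock treats an update of an unknown id as a create.
    public void updateCar(String pId, Car pCar) throws CarException {
        if (pId == null || pCar == null) {
            throw new CarException("null argument");
        }
        pCar.setId(pId);
        mCars.put(pId, pCar);
    }

    public Car[] getCarsByColor(String pColor) {
        List<Car> matches = new ArrayList<>();
        for (Car car : mCars.values()) {
            if (pColor != null && pColor.equals(car.getColor())) {
                matches.add(car);
            }
        }
        return matches.toArray(new Car[0]); // never null
    }

    public Car[] getCarsByPrice(Double pPrice) {
        List<Car> matches = new ArrayList<>();
        for (Car car : mCars.values()) {
            if (pPrice != null && pPrice.equals(car.getPrice())) {
                matches.add(car);
            }
        }
        return matches.toArray(new Car[0]); // never null
    }
}
```

Business logic written against CarManager can be handed a MockCarManager in a unit test without a repository being configured at all.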

After defining the interface, you create the manager to translate between repository items and domain objects:

public class CarManagerImpl implements CarManager 
{
    public Car getCar(String pId) throws CarException 
    {                                       
        if (isLoggingDebug()) 
        {
            String msg = "getCar: pId = {0}";
            Object[] params = { pId };
            msg = MessageFormat.format(msg, params);
            logDebug(msg);
        }
        
        if (StringUtils.isEmpty(pId)) 
        {
            throw new CarException("null pId");
        }
      
        String repositoryId = pId;
        try
        {
            boolean rollback = true;
            TransactionManager tm = getTransactionManager();
            TransactionDemarcation td = new TransactionDemarcation();
            // Make sure that a transaction exists before we do anything.
            td.begin(tm, TransactionDemarcation.MANDATORY);
            try
            {
                Repository r = getRepository();
                RepositoryItem item = r.getItem(repositoryId, CAR_ITEM_DESC_NAME);
                if (item == null) 
                {
                    return null;
                }
                
                Car car = new Car();
                car.setId(repositoryId);
                copyProperties(item, car);

                rollback = false;
                return car;
            } finally 
            {
                td.end(rollback);
            }
        } catch (RepositoryException re)
        {
            throw new CarException(re);
        } catch (TransactionDemarcationException tde) 
        {
            throw new CarException(tde);
        }
   }
}

There are a number of things going on in the code above. First, the manager takes responsibility for logging debug information. Yes, you can turn on debugging in an item-descriptor directly, but that will spit out debug output for any access of that item-descriptor. Logging at the manager level allows for more direct and customized control.

Second, the manager handles transaction management. It may be the case that a transaction already exists before this method is called, but that's not the important bit. The important bit is that an existing transaction gets rolled back if this method fails. I can't stress this enough: almost nothing is as frustrating as a method that throws an exception and still commits bad data to the repository.
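
The rollback flag is the part that does the work. Here is a minimal sketch of just that idiom with ATG taken out of the picture: RecordingDemarcation is a hypothetical stand-in that only records how each call ended, not a real transaction API, and load is an invented method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transaction demarcation. It only records
// whether each demarcated call asked for a commit or a rollback.
class RecordingDemarcation {
    final List<String> mOutcomes = new ArrayList<>();

    void begin() {
        // A real demarcation would join or require a transaction here.
    }

    void end(boolean pRollback) {
        mOutcomes.add(pRollback ? "rollback" : "commit");
    }
}

class RollbackFlagSketch {
    static String load(RecordingDemarcation pTd, String pId) {
        // Assume failure until the work has finished.
        boolean rollback = true;
        pTd.begin();
        try {
            if (pId.isEmpty()) {
                throw new IllegalArgumentException("empty id");
            }
            String result = "car:" + pId;
            // Only reached when nothing above threw.
            rollback = false;
            return result;
        } finally {
            // Runs on both the normal and the exceptional path.
            pTd.end(rollback);
        }
    }

    public static void main(String[] args) {
        RecordingDemarcation td = new RecordingDemarcation();
        load(td, "42");
        try {
            load(td, "");
        } catch (IllegalArgumentException expected) {
            // The failure still reached end(true) through the finally block.
        }
        System.out.println(td.mOutcomes);
    }
}
```

Because the flag starts as true and is cleared only on the last line before the return, any exception, including an unchecked one, leaves the call marked for rollback.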

Third, the manager does not expose RepositoryException to its clients. Instead, exceptions are nested and thrown so that the client may determine how to deal with the error. I used to explicitly log errors in the manager, but that got tedious as the applications scaled up and every single component logged the same error at different levels. So now I have a simple rule: if you catch an exception and don't rethrow it, you are responsible for logging it. This typically means that the form handler or droplet at the UI end of the chain catches the exception, logs it to console and adds a form exception for user display.
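
A small sketch of that rule, under stated assumptions: CarLookupSketch and CarFormHandlerSketch are hypothetical names, the IllegalStateException stands in for whatever the storage layer throws, and the two lists stand in for the console log and the form exceptions shown to the user.

```java
import java.util.ArrayList;
import java.util.List;

class CarException extends Exception {
    CarException(String pMessage, Throwable pCause) {
        super(pMessage, pCause);
    }
}

// Hypothetical middle layer: nests the low-level failure and rethrows.
// It does not log, because it does not swallow the exception.
class CarLookupSketch {
    static String colorOf(String pId) throws CarException {
        try {
            if (!"car1".equals(pId)) {
                // Stand-in for a failure raised by the storage layer.
                throw new IllegalStateException("no such row: " + pId);
            }
            return "red";
        } catch (IllegalStateException ise) {
            throw new CarException("could not load car " + pId, ise);
        }
    }
}

// Hypothetical component at the UI end of the chain: it catches without
// rethrowing, so it is the one place that logs and reports to the user.
class CarFormHandlerSketch {
    final List<String> mLog = new ArrayList<>();
    final List<String> mFormExceptions = new ArrayList<>();

    boolean handleShowColor(String pId) {
        try {
            CarLookupSketch.colorOf(pId);
            return true;
        } catch (CarException ce) {
            mLog.add(ce.getMessage() + " (cause: " + ce.getCause().getMessage() + ")");
            mFormExceptions.add("Sorry, that car could not be loaded.");
            return false;
        }
    }
}
```

The error is logged exactly once, with its original cause still attached, no matter how many layers it passed through on the way up.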

So this handles the basic case. Let's see what happens with modification:

public void updateCar(String pId, Car pCar) throws CarException {
    // logging code
    // Check input for nulls
    // transaction code wrapper
        
    MutableRepository mr = (MutableRepository) getRepository();
    MutableRepositoryItem mutItem = mr.getItemForUpdate(pId, CAR_ITEM_DESC_NAME);
        
    // Copies all the public properties from pCar to the mutable repository item
    copyProperties(pCar, mutItem);
        
    mr.updateItem(mutItem);
}

Much like you'd expect, except for a couple of points: I explicitly use an id for the update. I could put the id in the Car object, but that would get confusing, because here the Car is a bag of data to apply to the item, not a key to look it up with. The "copyProperties" code is actually not hard: you can leverage the DynamicBeans API with an array of public properties to get and set property values from one to the other. However, this is a blind copy. You can't selectively change a property value with this method. Usually this isn't a problem (in forms which display all properties at once), but it's always possible to use a key-value approach or allow for more selective updates.
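
To make the blind copy concrete, here is a sketch that uses plain java.lang.reflect rather than DynamicBeans, with a Map standing in for the mutable repository item. CarBean, PropertyCopySketch and the property list are hypothetical names invented for the example.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class CarBean {
    private String mColor;
    private Double mPrice;

    public String getColor() { return mColor; }
    public void setColor(String pColor) { mColor = pColor; }
    public Double getPrice() { return mPrice; }
    public void setPrice(Double pPrice) { mPrice = pPrice; }
}

class PropertyCopySketch {
    // The "array of public properties" that drives the copy.
    static final String[] CAR_PROPERTIES = { "color", "price" };

    // Copies each named property from the bean into the target map by
    // calling the matching getter. Nulls are copied too: a blind copy.
    static void copyProperties(Object pBean, Map<String, Object> pTarget, String[] pNames)
            throws ReflectiveOperationException {
        for (String name : pNames) {
            String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            Method method = pBean.getClass().getMethod(getter);
            pTarget.put(name, method.invoke(pBean));
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Map<String, Object> item = new HashMap<>();
        item.put("color", "red");
        item.put("price", Double.valueOf(100.0));

        CarBean car = new CarBean();
        car.setColor("blue"); // price deliberately left unset

        copyProperties(car, item, CAR_PROPERTIES);
        System.out.println(item.get("color"));
        System.out.println(item.get("price"));
    }
}
```

In the main method the stored price is overwritten with null because the bean never set it, which is the blind-copy behaviour described above. Passing a shorter array such as { "color" } is one simple way to get a more selective update.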

The advantage of updateCar(String pId, Car pCar) is that the Car can contain as much data as needed. If you do updateCar(String pId, String pColor, Double pPrice), then you have to change the interface every time you add a property to the item-descriptor.

Finally, there's the searching. RQL statements are very useful in this context, as they can be defined in the component properties, and it's much less work to get data in and out of them.

public Car[] getCarsByColor(String pColor) throws CarException {
    // logging debug code
    // Check input for nulls
    // assume this happens in a transaction code wrapper..

    // this is set in the properties file as "carsByColor=color = ?0"
    RqlStatement carsByColor = getCarsByColor();
    Object[] params = { pColor };
    Repository r = getRepository();
    RepositoryView view = r.getView(CAR_ITEM_DESC_NAME);
    RepositoryItem[] items = carsByColor.executeQuery(view, params);

    // always return a zero-length array, as this makes client access less fiddly...
    if (items == null) {
        if (isLoggingDebug()) { ... }
        return new Car[0];
    }

    Car[] cars = convertItemsToCars(items);
    return cars;
}
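
The body of convertItemsToCars isn't shown above, so what follows is only a guess at its shape. ItemSketch is a hypothetical stand-in exposing the two accessors the conversion needs, MapItem is a map-backed test double, and CarData is a cut-down bean; none of these are ATG classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a read-only repository item.
interface ItemSketch {
    String getRepositoryId();
    Object getPropertyValue(String pName);
}

// Map-backed test double for the interface above.
class MapItem implements ItemSketch {
    private final String mId;
    private final Map<String, Object> mValues = new HashMap<>();

    MapItem(String pId) { mId = pId; }

    MapItem with(String pName, Object pValue) {
        mValues.put(pName, pValue);
        return this;
    }

    public String getRepositoryId() { return mId; }
    public Object getPropertyValue(String pName) { return mValues.get(pName); }
}

class CarData {
    String mId;
    String mColor;
    Double mPrice;
}

class CarConversionSketch {
    // Turns items into throwaway beans. A null input becomes an empty
    // array so callers can loop without a null check.
    static CarData[] convertItemsToCars(ItemSketch[] pItems) {
        if (pItems == null) {
            return new CarData[0];
        }
        CarData[] cars = new CarData[pItems.length];
        for (int i = 0; i < pItems.length; i++) {
            CarData car = new CarData();
            car.mId = pItems[i].getRepositoryId();
            car.mColor = (String) pItems[i].getPropertyValue("color");
            car.mPrice = (Double) pItems[i].getPropertyValue("price");
            cars[i] = car;
        }
        return cars;
    }
}
```

Putting the null check inside the conversion as well as in the manager method is a design choice, not a requirement; it just means no caller can ever receive a null array.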

There are a number of complexities that can arise with this pattern. I don't cover what happens when you have items that reference other repository items. I don't cover the memory bloat that this pattern can cause with large repositories. And I don't cover the 'stale data' problem that happens if you keep references to the Car object. All of these problems are solvable, but the best solution depends on the circumstances.


Will, I can't disagree with you more on the value of what you are doing. I try my best to never wrap repository items. All you gain is some type safety, which is not even needed if you manipulate your items via the RepositoryFormHandler and dsptaglib. Nor can I agree with the manner in which you wrap your items. You copy your properties into member variables. This both prevents these objects from being cached, and your copied objects no longer reflect the current state of the object in the event of a rollback.

If you absolutely must wrap your objects, don't copy them. Use delegation:

String getProperty(){
return (String) item.getPropertyValue("property");
}

Not only are you consuming less memory, but you are reflecting the current state of the item, which will change in the event of a rollback or cache invalidation. You can even cache these wrapped objects.
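
A minimal sketch of the difference between the two styles, using a map-backed stand-in for the repository item. ItemStandIn, CopiedCar and DelegatingCar are hypothetical names, not ATG classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a repository item whose state can change.
class ItemStandIn {
    private final Map<String, Object> mValues = new HashMap<>();

    Object getPropertyValue(String pName) { return mValues.get(pName); }
    void setPropertyValue(String pName, Object pValue) { mValues.put(pName, pValue); }
}

// Copying wrapper: takes a snapshot of the value at construction time.
class CopiedCar {
    private final String mColor;

    CopiedCar(ItemStandIn pItem) {
        mColor = (String) pItem.getPropertyValue("color");
    }

    String getColor() { return mColor; }
}

// Delegating wrapper: holds the item and reads through on every call.
class DelegatingCar {
    private final ItemStandIn mItem;

    DelegatingCar(ItemStandIn pItem) { mItem = pItem; }

    String getColor() { return (String) mItem.getPropertyValue("color"); }
}

class WrapperComparison {
    public static void main(String[] args) {
        ItemStandIn item = new ItemStandIn();
        item.setPropertyValue("color", "red");

        CopiedCar copied = new CopiedCar(item);
        DelegatingCar delegating = new DelegatingCar(item);

        // Simulates the item changing underneath the wrappers.
        item.setPropertyValue("color", "blue");

        System.out.println(copied.getColor());     // still the snapshot
        System.out.println(delegating.getColor()); // the current value
    }
}
```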

ATG does ship with a wrapper generator tool in atg.repository.tojava. While not documented it should be usable, and generates the wrappers that delegate.

But even with automated tools such as this, the amount of code required to get a little type safety makes this a lose-lose situation. Don't forget that all this wrapper code is not typesafe, only the code that uses it will gain from the typesafe wrappers.

Dealing with transaction at the manager level is also of little value. Very few times does an operation only want some modified items to be committed rather than others. The whole operation should be in an existing transaction. This means creating and committing the transaction at the form handler level. Anything else would leave the DB in an inconsistent state.

> Will, I can't disagree with you more on the value of what you are doing. I try my best to never wrap repository items. All you gain is some type safety, which is not even needed if you manipulate your items via the RepositoryFormHandler and dsptaglib.

Type safety isn't really what I'm aiming at. Yes, it's an effect, but the intent is to abstract the business domain logic from the repository so that there's a clear interface in front of the implementation.

> Nor can I agree with the manner in which you wrap your items. You copy your properties into member variables. This both prevents these objects from being cached, and your copied objects no longer reflect the current state of the object in the event of a rollback.

I don't see this happening: the objects are only returned if there is no exception. The objects aren't cached because they're single use: you pull these beans out of the manager for a single operation, use them, and then throw them away. Ideally, you never keep a reference to an item retrieved from the manager.

For modification, having a simple databean to pass into a manager means that modifications to a MutableRepositoryItem are limited to a single call, rather than being spread out over the lifetime of the repository item. It frankly gives me the willies when I see raw repository items bounced around and modified through several methods, and then finally updated without a clear idea of what's been changed.

If you keep a long running data object around which is supposed to reflect the state of a repository item, you'll be better off using the ChangedProperties interface and delegating access through the repository item, as you said.

Memory consumption (new objects, plus the garbage collection) is an issue. But it depends on scale and usage. With a large enough dataset, even raw RepositoryItems won't help. A billion items and the repository architecture will tap out at the JVM limit: you're effectively limited to 2GB of data no matter what (although the 1.4 VM may have improved matters). For most situations, this approach is sufficient to the task and has the advantage of being straightforward.

> ATG does ship with a wrapper generator tool in atg.repository.tojava. While not documented it should be usable, and generates the wrappers that delegate.

True, but as it's not documented I don't think I can really exploit that codebase. The ATG codebase is vast, and even the documented bits can be incomprehensible to the newbie.

> Dealing with transaction at the manager level is also of little value. Very few times does an operation only want some modified items to be committed rather than others. The whole operation should be in an existing transaction. This means creating and committing the transaction at the form handler level. Anything else would leave the DB in an inconsistent state.

This is true -- most transactions should happen at the form handler level and should roll back everything if possible. Putting the XA code at the manager is more of a failsafe than a robust transaction management strategy: it ensures that a transaction exists and can be rolled back during that call, that a transaction exists AFTER that call, and it also ensures that any FOLLOWING calls that use that transaction will roll back if a previous method has failed. This is not an ideal situation, but unfortunately I can't guarantee that formhandler code is going to be written correctly. And yes, I know that's a lame excuse -- I could wrap a transaction servlet around the dropleteventservlet, but this seems less like magic.

Well, crap.

You are correct: the transaction demarcation I am using is wrong. I've changed the code from td.REQUIRED to td.MANDATORY. The demarcation should be outside of the manager.

I like reading your code snippets since your cleverness usually leads to elegant solutions. But I fear this is an exception. Here are some objections:

1) Possibility of "Car" objects getting out of sync with their corresponding RI: if two clients are modifying a RI of descriptor "car," they will be pulled out into different "Car" objects, so even a big sync block won't prevent concurrent modification update races. Ben's pass-through calls to a member RI would sidestep this a bit, while retaining type safety and abstraction from Repository. The better I understand the types of problems that ChangedProperties prevents, the more it seems its pattern is the cleanest way to avoid concurrency voodoo in a secondary persistence layer.

2) When you add a property to "car," you've got four more modification points: a member, getter, & setter in the "Car" object, and an addition to your proposed key-value map behind copyProperties. (This is in addition to FH/Manager manipulation of the new property, schema changes, descriptor XML changes, & NamedProperties getter/setter addition, which will be necessary in any case.)

3) Lack of XA transparency: would the caller want the root XA marked rollback-only if CarManager's operations fail? Probably, but are you sure it will always be that way? It might be better to allow passing XA mode as an argument, leaving your current "getCar" signature as an overload with default value TX_REQUIRED. I notice that you have not wrapped "updateCar" or "getCarsByColor" in XAs, and you've left them as separate operations. In other words, you've still left it to the caller to implement the root XA that bridges retrieval & update operations, so I don't see what the Manager XA really buys you. If the problem is that you "can't guarantee that formhandler code is going to be written correctly," you have not solved it since your Manager does not force the FH to bridge the two operations. You could put a marker in Car that specifies its retrieval XA, then have updateCar verify it's still in or under the same scope. Perhaps this is a problem best solved with code review?

4) I agree with Ben's recommendation of tojava. Sure, the fact that it's not documented is less than ideal, but after a few minutes of playing with test.xml and whatever tojava XMLs & generation scripts you've left behind, any reasonable programmer will grasp it. And what is more standard: your Repository wrapper system, or the one that ships with ATG? If you are still iffy about tojava, you could use Castor (a pain, but a well-documented one).

5) DAO is a pattern, not a standard. If your true goal is Repository agnosticism & you'd like others to be able to port the data layer, you might do better creating a JDO Repository adapter SPI. Or back off a bit from the Manager pattern (largely ATG-specific) & move to something more like the example you've linked.

6) I don't agree that Repository abstraction is a practical goal. If an ATG app needs porting, it's easier to rewrite than abstract. Experts on the new platform will have difficulty understanding ATG idioms. Plus, have you ever done a port with no new requirements? So it'll be easier to do a requirement-based rewrite than a true port. Or if they stay with ATG, subsequent ATG experts would work with Repository more easily than with your abstracted layer.

You note you don't like "...repository items...updated without a clear idea of what's been changed." I would hold that the important thing is realizing _whether_ something's changed--my own rule is that concerns across MVC layers should be scope-based rather than functional. And I agree that given the sort of folk who end up writing FHs, it's unfortunate that the XAs are demarcated there. But this seems to be an inevitable consequence of optimum MVC design in a stateless (HTTP) context.

I agree that memory consumption is a very low-priority concern. The size of your wrappers is tiny compared to the creation of the web of Observers & sync markers triggered by RI retrieval.

If you like the simplicity of updateCar's signature, the same effect can be achieved with a public-member struct used for passing only (I am fond of using inner classes for this).

You have a good point about testing with mock objects. But perhaps you'd get more mileage out of a MockRepository. Is anyone working on mockobjects-atg-dynamo now? And your wrappers could do some really interesting things with calculated/derived properties. I also like how you've avoided forcing callers to repeat logging patterns. I usually end up doing this somewhat inelegantly through reflection. Also, the specialized exceptions are a good feature, but note that you have a few unhandled & undeclared REs above.

Thanks for the tips, and I hope you'll follow up on how well this pattern of wrappers works out for you!

Object beans and DAOs to manage repositories? I feel a flashback to relational views! I disagree with Ben's comments above... a repository's flexibility is sometimes best managed with objects that can provide an encapsulated interface to the business concepts of the data rather than the data itself. Sure, delegation is a pass-thru to gain type safety. No real magic there. But DAOs also solve the problem of providing a convenient way to mingle data and logic without the complexity of derived properties and custom property descriptors. If all you want is a displayName property that is specific to your object, it's vastly easier to manage that in object form. Also, having context sensitive exception management is almost always better than a generic RepositoryException. On top of all that, another benefit of DAO is that it can benefit from object concepts like inheritance (in a true form, unlike item descriptor inheritance) and with interfaces as you've described can provide abstraction of data with specific access control at a property level to give you package or protected controls over data access (security benefits, anyone?).

Either way, it's good to see old ATG concepts still lingering around, even with the new snazzy techs like repositories. The trick is knowing when to use the right tool for the job.

Let me reply to both of you at once. Just because my comments are terse doesn't mean I'm being rude.


Will writes:

" abstract the business domain logic from the repository"

Out of the box, the repository only presents data. My position is that there is no such thing as different kinds of "logic," business domain or otherwise. But to use your language: there is no such logic here to abstract away.


"having a simple databean to pass into a manager means that modifications to a MutableRepositoryItem are limited to a single call,"

Like repository.updateItem? You are really only providing a typesafe wrapper over the repository api.


Travis writes:

"DAO's also solve the problem of providing a convenient way to mingle data and logic without the complexity of derived properties and custom property descriptors"

Yes, it is more complex to write a custom property descriptor, but by doing so, you are guaranteed that your code is executed when this item is modified by other ATG mechanisms such as the BCC, ACC, and template definition scripts.

But you convinced me of one thing: I'm all for defining these wrappers and putting all code in them, thus getting rid of these silly manager and tools components that grow like the homeless in summer.

"context sensitive exception management is almost always better than a generic RepositoryException"

No, in all cases you will handle this at the form handler level and show an error message to the user. It makes no difference how many layers the database exception is wrapped in. It's all the same.

" it can benefit from object concepts like inheritance"

Yes, you are correct, you gain some type safety, which includes type inheritance. But there is no way you can convince me this is worth the trouble of wrapping RI.

"and with interfaces as you've described can provide abstraction of data with specific access control at a property level to give you package or protected controls over data access (security benefits, anyone?)"

There is no security benefit to declaring a method package-private, private or protected. Anybody who can evaluate byte code in your JVM can call any method they want. And you are gaining no more data abstraction than is already afforded you by the repository API.

" ATG concepts still lingering around, even with the new snazzy tech"

Even ATG is very bad at wrapping RI; they do it completely differently in all their major products: DCS 5, DCS 6, Portal, DPS, and Publishing. Not to sound bitter, but I for one had enough of this shit.

....................


Don't forget that your simple posted example neglects complex parts of a wrapper framework such as item and item collection references. How will you also determine which properties have changed? Recall order objects from DCS 5.0 and you will understand where all of this leads.

