Automatic Value Objects
By Ben Teese

Introduction

The Value Object pattern (also known as Transfer Objects

In this article I will discuss such a system in more detail, and then demonstrate a prototype for a tool that uses Dynamic Proxies

In a Past Life

In a past life I worked on a project involving a large and complex Swing client. It talked to an equally large and complex home-built application server. Because it was using Swing, the sorts of things that the client was able to do to business objects were fine-grained and sophisticated - more so than for a web application that had a coarse-grained request-response cycle. Furthermore, the interactions between these objects (on the server-side) would also be quite complex, and in some cases needed to be made immediately visible on the client. Sometimes a change to a field in one window would require that the value of another field in another window be updated immediately. Even worse, sometimes the processing in-between these updates would require some kind of server interaction - for example, a call to a central clock.

Value objects were used in this system for client-server communication. The reason for this was performance-related - the system operated primarily over a network and latencies were of concern. The problem was that the value objects pervaded the entire architecture. They defined the basic building blocks for the entire system. It was all well and good to be subject to such constraints when adding functionality that needed to operate over the remote interface. However, much of the time, little or none of the functionality actually pertained to the remote layer. Consequently, you'd be stuck with using these simple data structures when it would be much more desirable to have a richer object model.

In some ways this was an extreme case: the value objects could have been contained to a layer of their own instead of infecting the whole architecture.
However, even if they had been contained to a layer of their own, there still would have been maintenance overhead. If I wanted to add a new field, I'd need to add it to both my domain model and the value objects. This is something I've had to do numerous times on other systems.

The problem is that value objects are essentially just dumb data holders. Furthermore, the data that they store often has to be quite simple - it can't be a complex tree of objects because it's considered undesirable to pass this tree over a network. But sometimes you want them to contain state information that only exists on the server-side. Sometimes you want something that is half-serializable and half-remote.

Automatic Value Objects

So what I needed was a value object that also retained a reference to a remote object on the server. Furthermore, the internals of this object had to be transparent to the client, so that in the first instance the client didn't have to worry about whether properties were being accessed locally or remotely.

Another way of putting this was that I wanted to isolate the issue of efficient remote access to a layer of its own as much as possible. I didn't want client programmers to have to worry in the first instance about changing value object code. Nor did I want this concern to leach out and affect the architecture of the entire application. In short, what I needed was some sort of Automatic Value Object.
Inspired by dynamic proxies and the subsequent arrival of frameworks that allow you to transparently make objects remote (for example TRMI

Here is an excerpt of the core of the test program:

    LocateRegistry.createRegistry(Registry.REGISTRY_PORT);
    String name = "CustomerFetcher";
    CustomerFetcher customerFetcher = new CustomerFetcher();
    Naming.bind(name, customerFetcher);
    ICustomer clientCustomer = ((ICustomerFetcher) Naming.lookup(name)).getCustomer();
    clientCustomer.getName();
    IAddress address = clientCustomer.getAddress();
    address.getPostcode();
    address.getStreet();
    address.getSuburb();
    clientCustomer.setName("testName");

This test implements both the client- and server-side of a series of calls. Firstly, it creates a remotely-accessible ICustomerFetcher. It then looks up this ICustomerFetcher remotely and fetches other remote objects from it. Finally, it sets a property on one of the remote objects.

Because these calls are being made on an object that was looked up via the RMI registry, they will all go through the RMI transport layer. To demonstrate this, we can enable the java.rmi.server.logCalls

    Sep 30, 2005 3:18:52 PM sun.rmi.server.UnicastServerRef logCall

The key parts are the calls to Customer, CustomerFetcher and Address - I've marked them in bold (there are also calls to the bind, lookup and lease methods - these are unavoidable and are thus discounted). We see that for each call to these objects there is a call over the RMI transport layer - totalling seven remote calls. In a real system each of these calls would amount to a call over a network and thus would experience network latency.
Now let's tweak the example slightly by modifying CustomerFetcher.getCustomer so that it now uses AVOProxy.newProxyInstance():

    public ICustomer getCustomer() throws RemoteException {
        // return new Customer();
        return (ICustomer) AVOProxy.newProxyInstance(new Customer());
    }

Rerunning the example, we get the following:

    Sep 30, 2005 3:21:41 PM sun.rmi.server.UnicastServerRef logCall

Disregarding the calls to bind, lookup and dirty, we now see that only two calls have been made. So what makes AVOProxy.newProxyInstance() so special?

The Gory Details

When AVOProxy.newProxyInstance() is passed the Customer, it gets the values of any properties that it has - i.e., it invokes those methods on ICustomer that take no arguments but return a value. It then stores these property values in a HashMap. In the example above, this occurred for the methods ICustomer.getName(), ICustomer.getAddress(), IAddress.getStreet(), IAddress.getPostcode() and IAddress.getSuburb(). The important thing to note here is that all of these calls occur on the server-side.

AVOProxy.newProxyInstance() then creates a Proxy that implements ICustomer and contains both the HashMap as well as a Remote reference to the original Customer, and returns it. This means that when CustomerFetcher.getCustomer() is called remotely, it ends up returning this proxy. Furthermore, because the proxy is Serializable (as opposed to the original Customer, which was Remote), it goes across the wire to the client, taking with it the HashMap of stored values, as well as a Remote reference to the original Customer.

When the client invokes a method on this proxy, the first thing that the proxy does is check whether its HashMap contains a result for this method. If it does, it returns the result immediately and thus no remote call is made. If it doesn't, then it just delegates the call to its Remote object reference.
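The cache-then-delegate behaviour described above can be illustrated with a small, local-only sketch. This is not the AVOProxy implementation: there is no RMI or serialization here, and the names CachingProxySketch, Customer, CustomerImpl, snapshot and newProxy are all assumptions made for illustration. A plain object with a call counter stands in for the remote Customer so that delegated calls can be observed.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not the article's AVOProxy, and no RMI is involved.
class CachingProxySketch {

    // Illustrative business interface (name assumed for this sketch).
    interface Customer {
        String getName();
        void setName(String name);
    }

    // Stand-in for the "remote" object; counts the calls that reach it.
    static class CustomerImpl implements Customer {
        private String name;
        int calls = 0;

        CustomerImpl(String name) { this.name = name; }

        public String getName() { calls++; return name; }
        public void setName(String name) { calls++; this.name = name; }
    }

    // Invoke every no-argument, value-returning method once and keep the results.
    static Map<String, Object> snapshot(Object target, Class<?> iface) {
        Map<String, Object> values = new HashMap<>();
        for (Method m : iface.getMethods()) {
            if (m.getParameterCount() == 0 && m.getReturnType() != void.class) {
                try {
                    values.put(m.getName(), m.invoke(target));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("could not read " + m.getName(), e);
                }
            }
        }
        return values;
    }

    // Build a proxy that answers from the snapshot and delegates everything else.
    @SuppressWarnings("unchecked")
    static <T> T newProxy(T target, Class<T> iface) {
        Map<String, Object> cached = snapshot(target, iface);
        InvocationHandler handler = (proxy, method, args) -> {
            if (args == null && cached.containsKey(method.getName())) {
                return cached.get(method.getName());   // served from the local map
            }
            return method.invoke(target, args);        // would be the remote call
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }

    public static void main(String[] args) {
        CustomerImpl real = new CustomerImpl("Alice");
        Customer proxy = newProxy(real, Customer.class);
        int afterSnapshot = real.calls;   // the snapshot itself read getName() once
        proxy.getName();                  // answered from the cache
        proxy.setName("testName");        // delegated to the real object
        System.out.println(real.calls - afterSnapshot);   // prints 1
    }
}
```

Note that in this sketch the cached name is not refreshed after setName(), so a later getName() on the proxy still returns the snapshot value. A real tool would have to decide how to invalidate or update cached entries.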
Consequently, when the client calls ICustomer.getName(), ICustomer.getAddress(), IAddress.getStreet(), IAddress.getPostcode() and IAddress.getSuburb(), no remote call is made. However, when ICustomer.setName() is called, a remote call does take place.

One important property of AVOProxy.newProxyInstance() is that it's recursive. This recursion occurs in three ways:
An important implication of these last two points is that the concepts of client and server become interchangeable.

Correctness Vs. Performance

Of course, even in this simple example, there is a trade-off between absolute correctness and performance. Why? Well, because the various property values have been cached and sent across, there's no guarantee that they'll be up-to-date when they're fetched by the client side. If the nature of the environment is such that those values could change on the server-side at any time, our current solution would result in the client being unaware of those changes. That's a basic trade-off that we have to be aware of: if the value of a property needs to always be up-to-date, we can't cache it and thus can't optimize access to it.

Generalizing on this, we could say that the business requirements of an object will affect how efficiently it can be accessed remotely. In many ways this can be seen as an extension of basic data compression theory: the more that we know about the model of the data that we are compressing, the better we are able to compress that data. In this case, we can say that the more the optimization layer knows about the sort of conversation that the objects are having, the better it is able to optimize that conversation.

This observation becomes especially true when considering sets of objects working together. For example, consider a set of Hibernate domain objects loaded from a Hibernate session. A naive approach to distributing such objects might be to apply a simple wrapper as has been demonstrated above. However, whilst that's fine when it comes to reading the properties of the domain objects, consider what happens when the client starts to set properties on those objects. For each property modification, a remote call will be made. Yet we know that it's not until the transaction is committed that we really need to set the property values on the remote objects.
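To make the idea of deferring writes until commit concrete, here is a minimal sketch. It is not part of the prototype and does not use Hibernate or RMI. The names BatchedWritesSketch, RemoteCustomer, PendingChanges and applyAll are hypothetical, and the sketch assumes the server-side object can accept a whole map of property changes in one call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of queuing property writes until commit; names are assumptions.
class BatchedWritesSketch {

    // Stand-in for a remote object that accepts many property changes in one call.
    static class RemoteCustomer {
        final Map<String, Object> properties = new LinkedHashMap<>();
        int remoteCalls = 0;

        void applyAll(Map<String, Object> changes) {
            remoteCalls++;                  // one round trip for the whole batch
            properties.putAll(changes);
        }
    }

    // Client-side holder that records writes instead of sending each one.
    static class PendingChanges {
        private final RemoteCustomer remote;
        private final Map<String, Object> pending = new LinkedHashMap<>();

        PendingChanges(RemoteCustomer remote) { this.remote = remote; }

        void set(String property, Object value) {
            pending.put(property, value);   // kept locally, no remote call yet
        }

        void commit() {
            if (!pending.isEmpty()) {
                remote.applyAll(new LinkedHashMap<>(pending));
                pending.clear();
            }
        }
    }

    public static void main(String[] args) {
        RemoteCustomer remote = new RemoteCustomer();
        PendingChanges tx = new PendingChanges(remote);
        tx.set("name", "testName");
        tx.set("suburb", "Richmond");
        tx.commit();
        System.out.println(remote.remoteCalls);   // prints 1
    }
}
```

Two property changes reach the stand-in remote object in a single call. The cost is that nothing is visible on the server until commit() runs, which is the same correctness-versus-performance trade-off discussed for cached reads.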
Consequently, we could probably get away with only bundling all the changes together at the commit point and making a single remote call. Unfortunately, at this stage the framework I've proposed isn't smart enough to figure that sort of thing out - we'll talk a little more about this later.

My key point is that whilst I accept that the remote optimization layer does need to be aware of what the business-logic layer is doing, I don't see why the converse should be true; why should the business-logic layer be aware of what the remoting layer is doing? Should it have to worry about such things - at least in the first instance? If it becomes apparent that there are performance issues, then perhaps the remoting layer will have to become aware of some more business rules, or even business rules will have to be shifted into the remoting layer.

Where to from here?

The intention of this demonstration has not been to provide a comprehensive solution to the problem of transparent optimized distributed object communication. The purpose has instead been to demonstrate that these things are possible, and to serve as a starting point for further discussion as to where this can be taken.

Neither is it believed that the implementation is as simple as it could be. Spring's Remoting framework could probably be used to reduce the amount of RMI-related code, and some AOP framework (for example Spring AOP

Here are some of the more obvious improvements that I can see, listed in rough order of importance:
Hibernate

In previous sections, I've referred to Hibernate or concepts that Hibernate supports. This isn't coincidental. Hibernate has already dealt with many of the issues related to transparent caching that I've come across (although, to be fair, Hibernate actually delegates most second-level caching issues to a configurable caching framework). For example, in presenting my ideas to a colleague, he said it reminded him very much of the concept of look-ahead caching in databases - a concept that Hibernate supports.

Furthermore, the very notion of transactions could be a very useful one when it comes to optimizing distributed object interactions. As I mentioned earlier, if we know that a group of operations is occurring within a particular transaction, doesn't that give us some scope to bundle that unit-of-work together in a single remote call? I think there would be some benefit in trying to take this concept and apply it to the problem of efficient remote object access.

Conclusion

The problem of efficient distributed object access is extremely old. Thus I was a little surprised that in a brief survey I couldn't find anything like what I've just presented. It is widely accepted that blindly treating remote objects in the same manner as local objects is a recipe for disaster - and not just in terms of performance (refer to http://research.sun.com/techrep/1994/smli_tr-94-29.pdf

For client-server applications with complex clients and servers that don't need to be clustered, is there a better way? In this article I've presented a prototype of an alternative approach that could take a lot of the effort out of efficient remote access to objects. I've also outlined a number of directions in which I think it could go. I'm interested in what you think of it and whether you've seen anything like it before. Feel free to email me at