
omino code blog

We need code. Lots of code.
entries for category "software architecture"
/\/\/\/\/\/\/\/\
David Van Brink // Sat 2011.07.30 13:04 // {code software architecture}

HTTP “Comet” Realtime Messages

What you Want

For some web applications, you want to send realtime messages between the browser and the server. That is, the browser can send a message to the server at any time (this is typical), and the server can send a message to your session at any time, too (this is not what HTTP was designed for!)

What you want: to send a free immediately-delivered message any time you want.

What you get: they’re not free, and they may be far from immediate.

This note will describe a cargo trucking analogy for HTTP requests, and expand on it for interactive (“Comet”) style use.

Let’s look at a couple of typical browser use models.

You see a link, you click on it, and the page comes up.

==> You send an empty truck to the factory, and it comes back with a load of standardized cargo.

You fill out a small form, press “Submit”, and some page comes up.

==> You send a lightly-loaded truck out. It has some instructions on a clipboard. At the factory, they read your clipboard, load up the truck, and send it home.

You upload a photo using a web-based form. A page comes up saying, “Your photo is up, click HERE to see it.”

==> You send out a loaded truck, it comes back empty with a note saying, “We got your photo.”

These are all one-time events, initiated by you at the browser. The HTTP request/response model works pretty well for this. Let’s look at one more typical use case.

You’re very concerned about, let’s just say, the temperature in Fahrenheit at the Town Hall Weather Station. So every 5 seconds you click the refresh button. Refresh, refresh, refresh. And every 5 seconds, the empty truck rolls out, and returns with more or less the same cargo. Every few minutes, perhaps, the temperature changes and a different cargo comes back.


Interactive Communication

Let’s call the browser “you” and the server “the factory”. Here’s what we have so far:

  • You have an infinite supply of trucks.
  • You can send a truck to the factory any time you want.
  • You can choose what to put in the truck on the outbound trip.
  • The factory chooses what to load into the truck for the return trip.
  • The factory has no trucks unless you send one.

From here out, we’ll mix the metaphors and pretend not to notice.

Consider a chat session between you and a server-based robot. Let’s assume that it’s quite thoughtful and conversational, and doesn’t merely immediately reply to each thing you type. Rather, it might consider your words for a time, or even have something to blurt out on its own.

Here are several possible implementations. Here, the truck is a Mail Truck, but that’s not important.

Single Truck Stays Home

Every time you type a message and press return, the truck goes out with your message, drops it off and immediately comes home. If there were any messages waiting for you, it picks them up.

Good: Acts just like a web request, nothing happens unless you hit return.

Bad: The robot never gets to tell you anything, unless you speak first. You end up typing “hello?” a lot.

Single Truck Waits For Reply

When you press return, the truck heads to the factory with your message. Then, it waits there until the robot has something to say. When it does, it comes back with the robot’s message. Meanwhile, you couldn’t say anything. You didn’t have the truck.

Good: Sometimes the robot can speak immediately, and sometimes you can.

Bad: Sometimes you cannot say anything, because you have no truck, and sometimes the robot can’t, for the same reason.

Single Truck Goes Back And Forth Always All The Time

Let’s say that every 5 seconds, the truck heads out with anything you’ve said in that interval, and comes back immediately with anything the robot has said. Now we are talking!

Good: Looks just like a regular web request. Messages are delivered more or less regularly. Things are looking up.

Bad: The truck is always making the trip, even when there’s no cargo. And the deliveries are never immediate.

Intermission


You noticed we introduced a new concept: That of the truck waiting at the factory. That’s allowed! (Up to a point; if it takes too long you must assume the truck has been lost.)

This idea of the server holding on to the request for a little while is referred to as “Comet”, a play on “Ajax”, which comes from “asynchronous javascript and xml”.
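In code, the waiting truck is just a blocking HTTP request that the client re-issues as soon as it returns. Here’s a minimal sketch of that loop; the transport is abstracted behind a Supplier so we don’t commit to any particular HTTP library, and the names (pollLoop, null-means-timeout) are my inventions, not any standard API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LongPollClient {
    // One "truck": a blocking request that waits at the server until there's
    // cargo, or until a timeout. Here the transport is a Supplier; a null
    // return models a truck that came home empty (timed out).
    public static List<String> pollLoop(Supplier<String> fetch, int maxRequests) {
        List<String> received = new ArrayList<>();
        for (int i = 0; i < maxRequests; i++) {
            String message = fetch.get(); // blocks until reply or timeout
            if (message != null) {
                received.add(message); // cargo! hand it to the application
            }
            // Either way, the loop immediately sends the next truck out.
        }
        return received;
    }
}
```

A real client would run this loop on its own thread and dispatch each message as it arrives, rather than collecting them in a list.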

Anyway, now we’re getting somewhere. A few more items:

  • Every time the truck rolls, it has a cost. Sending a truck home empty is wasteful.
  • Leaving trucks at the factory for a time has some cost.
  • It’s legal for a browser to open a TCP connection, send a request and get the response, and then close the TCP connection. Alternatively, it can keep the TCP connection open and use it again. Requests can be pipelined, and responses will arrive in order. Oops, trucks, right…
  • Sometimes you destroy a truck as soon as it returns, and sometimes you keep it around. Which is more expensive depends on how long you’re keeping it parked.

Multitruck Solutions

Forget about the chatty robot. You get the idea by now.

Expanding Fleet of Trucks

You start by sending a truck to the factory, just in case. Then if you need to send something, you send another truck to the factory. It stays there. Sending a truck home empty, you see, is wasteful. If trucks are free, we can leave them for use at the factory as needed.

When the factory needs to send something home, it’s got at least one truck ready. If it ever runs down to none, we’ll send it an empty one again.

Good: We’ve minimized the road-time of empty trucks.

Bad: The factory has limited parking, and we’re actually not the only customer. Trucks may get old and rusty sitting at the factory disused. (Or rather, they’ll get towed, slapped with a Timeout, and we may have to send out a fresh one.)

Fleet of Two Trucks, Variation One

We keep one empty truck at home, and one at the factory. We can send our truck over at any time, and it gets sent home immediately, empty. If the factory has something to send, it has a truck and uses it. We immediately send it back to the factory, empty.

Good: Now, we’ve both got trucks, except for very brief times right after we’ve sent something.

Bad: The factory-based truck runs a big risk of parking tickets, while the home-based truck doesn’t. Also, half the truck trips are empty, after all, alas!

Fleet of Two Trucks, Variation Two

We start by sending an empty truck to the factory. If either we, or the factory, need to send something, we use the truck that’s there. Whenever a truck arrives, we send out the other one. (If we both sent at the same time, then, hooray, we just use the truck that rolls in.)

If a parking ticket is imminent at the factory, we send it home empty, and the other truck arrives in its place.

Good: We can both send any time. If either of us sends, we reset the parking-ticket time.

Bad: On average, half the truck trips are empty.

Observations

To recap, the algorithms described were:

  • Single Truck Stays Home
  • Single Truck Waits For Reply
  • Polling: Single Truck Goes Back And Forth Always All The Time
  • Many Requests: Expanding Fleet
  • A Slow Request and a Fast Request: Fleet of Two Trucks, Variation One
  • Ping-pong Requests: Fleet of Two Trucks, Variation Two

The first two are just broken. They don’t let communication readily occur.

The other four are all viable, with different kinds of costs.

Polling is nice and easy to understand. It puts a fixed upper bound on the latency of your messages, and an average latency of half that. But it sends a lot of empty trucks; lowering the rate of trucks increases the latency.

Using Many Requests keeps a lot of open connections, but minimizes empty trucks. It also contradicts the HTTP 1.1 RFC, section 8.1.4, which says you SHOULD NOT have more than two trucks out.

A Slow Request and a Fast Request is ok. (If you’re accustomed to web queries, it may “feel” nice because the home-based truck seems to associate requests and responses, but this is fallacious. If we’re really passing asynchronous messages, then the message protocol defines the request/response associations, not the HTTP protocol.)

Ping Pong Requests seems to be the nicest of the bunch. It’s a slight improvement over the Slow and Fast Request method, in that the server may have more chances to avoid a timeout on the waiting request. Its symmetry is perhaps slightly appealing as well.

Caveats and Conclusions

Me? I’ve never done any of these. Working through it now, I have some prototypes up and running, based on a JavaScript client and a Restlet-based Java server. But the truck analogy has proven useful in contemplating these algorithms. I’m leaning towards Ping Pong and plain old Polling as the two most viable modes.


References

HTTP 1.1 specification RFC 2616, see Chapter 8 on “Connections”.

oh, i dont know. what do you think?


David Van Brink // Tue 2008.05.13 18:34 // {broad generalities software architecture}

Deltas

I don’t know what this means, really, but it seemed really profound when I wrote it on my whiteboard. But now it’s time to erase the whiteboard, so, here it is.

Something about the plugin manager and some recent “improvements” to it…



David Van Brink // Wed 2007.08.8 23:04 // {java software architecture}

Basics: The Name of the Assertion

The Java JUnit framework is an incredibly useful tool for maintaining code integrity. This note focuses on the tiny but important matter of what messages to insert into your assertion statements.


Why Assert?

The philosophy behind unit testing is simple: If you make the appropriate small assertions about your software, you can believe that the system will operate correctly in the aggregate.

The Java JUnit framework provides a convenient way to structure a collection of these assertions. In JUnit, when an assertion is correct, or “true”, nothing happens. When an assertion is incorrect, an error is displayed, and the particular test is ended.

JUnit provides a handful of Assert methods. Each one has an optional first parameter: the failure message. This note shall offer guidance on maximizing the value of those failure messages.

JUnit In One Paragraph

I’ll assume you either know or can easily find out the simplest mechanics of writing and using JUnit. For the purpose of this note, we need merely recall that:

  • assertTrue(boolean b) fails if b is false
  • assertNotNull(Object o) fails if o is null
  • assertEquals(Object a,Object b) fails if a does not equal b
  • …and that each takes an optional first argument with a String failure message.

Evolution of a Test

Let us now examine specimens of a test from various levels of the evolutionary ladder.


Homo Habilis

A test can’t be much simpler than this… or less informative.

  public void testTwoFish() 
  {
    assertTrue(getListOfTwoFish().size() == 2);
  }
junit.framework.AssertionFailedError at dvb.JunitExamples.testTwoFish(JunitExamples.java:12)

Well, it failed. It says right there, “AssertionFailedError”!


Homo Erectus

This version of the test shows the beginning of structure.

  public void testTwoFish() 
  {
    // "expected" goes before "actual"
    assertEquals(2,getListOfTwoFish().size());
  }
junit.framework.AssertionFailedError: expected:<2> but was:<3> at dvb.JunitExamples.testTwoFish(JunitExamples.java:15)

Better already! Our distant ancestor Homo erectus knew that assertEquals was a more powerful tool than assertTrue. It tells us that something was 3 but should have been 2. But what? Well, the test name gives a hint. But we can do better…


Homo Neandertalensis

We begin here to see a nearly modern JUnit test. Language begins to play a broader role in survival.

  public void testTwoFish() 
  {
    assertEquals("getListOfTwoFish()", 2, getListOfTwoFish().size());
  }
junit.framework.AssertionFailedError: getListOfTwoFish() expected:<2> but was:<3> at dvb.JunitExamples.testTwoFish(JunitExamples.java:20)

Now the message is somewhat helpful… and naturally raises the question, What did it return, if not two fish?


Homo Sapiens

We now see a specimen of a fully functional JUnit test.

  public void testTwoFish() 
  {
    List<String> x = getListOfTwoFish();
    assertEquals("getListOfTwoFish() " + x, 2, x.size());
  }
junit.framework.AssertionFailedError: getListOfTwoFish() [nemo, mr limpet, flipper] expected:<2> but was:<3> at dvb.JunitExamples.testTwoFish(JunitExamples.java:26)

Do you see it? We’ve made the actual list part of the error message. Handily, Java collections have a built-in toString() method which lists each element separated by commas (for plain arrays, Arrays.toString() does the same). Now, when it fails, the problem is obvious.


Homo Superior

That last specimen may be satisfactory for human programmers, but my standards, and I hope yours, are higher still.

  public void testTwoFish() 
  {
    List<String> x = getListOfTwoFish();
    assertNotNull("getListOfTwoFish() was null",x);
    int size = x.size();
    assertEquals("getListOfTwoFish() " + x, 2, size);
  }

Two final flourishes round out our exercise. First, we check for null. It could happen to anyone, it could happen to you. And secondly, we pull out the size assignment to a local variable. This is a courtesy to those who come after us, stepping through our code.
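If you find yourself typing this pattern a lot, it can be bottled up. Here’s a hypothetical helper (the name assertSize is mine, and it uses no JUnit internals) that does the null check, shows the actual contents, and formats its message in JUnit’s style:

```java
import java.util.List;

public class ListAsserts {
    // Check a list's size; on failure, name the call, show the contents,
    // and report expected-vs-actual the way JUnit does.
    public static void assertSize(String what, int expected, List<?> actual) {
        if (actual == null) {
            throw new AssertionError(what + " was null");
        }
        if (actual.size() != expected) {
            throw new AssertionError(what + " " + actual
                + " expected:<" + expected + "> but was:<" + actual.size() + ">");
        }
    }
}
```

Then the whole Homo Superior test collapses to one line: assertSize("getListOfTwoFish()", 2, getListOfTwoFish());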

The Common Theme

There is a common theme in all the above examples: augment the information that JUnit already returns. The assertEquals is wonderful, as it shows what you want and what you got. Our message augments it with a name, a list… a hint. The science fiction writer Theodore Sturgeon invented a symbol for “Ask the next question.”


The idea applies here. What went wrong? Oh, we expected one integer but got a different one. What were they? Ah, we got “3” but it should have been “2”. So what was the third fish?

But Why All That Effort?

In practice we usually write a test once, and, assuming it passes, never touch it again. The above “evolution” may happen as a result of debugging a failure. But: the effort is small to do it up front, when the test is first written. Like most learned behaviors, it may take a few negative experiences before it becomes truly second nature. We all burn ourselves once, and then learn to use potholders. And wasn’t there something about seatbelts?

All images taken from Wikipedia, except Magneto.



An application typically stores its state as a file. As an application user, you expect this file to work, even if you don’t touch it for a while. This note offers some primal advice on implementing a file format in your application.

Basics: Know Your Document Format

Your Application’s Document

Be in control of your file format — it’s your most persistent API. Application versions will come and go, but you and your users expect the documents to last forever.

One lesson I’ve learned is simple: there are no shortcuts for mastering your own file format. You have to design it and implement it. By all means leverage existing parsers, such as for XML or RIFF or what not. But I honestly believe that there is no appropriate automation for translating your in-memory-model to a file format.

Tools like JAXB which generate XML serializing and deserializing code based on a schema can be quite useful, but I always add an additional translation layer between these generated classes and the “real” API.
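To make that “additional translation layer” concrete, here’s a sketch. There’s no real JAXB here; GeneratedPerson just stands in for a machine-generated class (dumb public fields, no behavior), and the translator is the only code that knows both shapes:

```java
public class TranslationLayerSketch {
    // Stand-in for a machine-generated (e.g. JAXB-style) class:
    // bare fields, no invariants, shaped like the file.
    public static class GeneratedPerson {
        public String name;
        public int age;
    }

    // The "real" domain object, with invariants the generated class
    // cannot enforce.
    public static class Person {
        private final String name;
        private final int age;
        public Person(String name, int age) {
            if (name == null || name.isEmpty())
                throw new IllegalArgumentException("name");
            this.name = name;
            this.age = age;
        }
        public String getName() { return name; }
        public int getAge() { return age; }
    }

    // The translation layer: one well-defined, very testable hop
    // between the file's shape and the model's shape.
    public static Person fromGenerated(GeneratedPerson g) {
        return new Person(g.name, g.age);
    }
}
```

When the schema changes, this one class absorbs the change; the “real” API stays put.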

Too Much Model

I’ve seen bad results from “built-in serializers”; one application developer I worked with relied entirely on the built-in serializer of Apple’s “Yellow Box” (this was essentially “Cocoa” running on Windows, after NeXT, before Mac OS X). Every build of the application produced a different file format, and some daisy chain of machine generated updaters was able to continue to read it in. The real problem was that we ourselves couldn’t reliably answer the question, “What’s in the file?”. We couldn’t easily write other tools to manipulate or debug our own documents, or stimulate test cases. (Staccato Systems’ “Synth Builder”.)

Too Much File

At the other extreme, I’ve seen bad results from reading a hierarchical file structure into memory, and attempting at all times to keep it identical to its on-disk representation. This required a lot of work, and bugs manifested as extra data in the file which other readers just needed to ignore. But not all readers ignored it, and so these stray file elements became de-facto parts of the file format. (Altera’s “SOPC Builder”, and its PTF file format, before our total overhaul.)

Just Right

It’s safest to define your file format separately from your in-memory model. They may be similar, but it’s valuable to implement the file reader and writer as a separate entity from your model. This will involve a translation layer where you read in your file format, perhaps to an in-memory tree structure of some sort, and then populate your in-memory objects. There are opportunities to make this more or less automatic, as appropriate to your particular development effort. But, importantly, this translation is very well-defined and therefore very testable.

Generally…

Hierarchy.

A file format will usually be hierarchical. That’s just the common design pattern that seems to cover the bases. As with your object design, it’s handy to make the hierarchy something that maps intuitively to the problem domain. This is easier said than done, but is an opportunity for artfulness… to say the least. As to the low level file format itself, in general in this day and age you can probably just skip any tedious meetings or debates and create an XML-based structure. If you need to store lots of binary data, and file size is a concern, your XML document can refer to neighbor files. And under no circumstances let the argument that XML is too user-unfriendly or heavyweight influence the decision; these remarks come from those who don’t know from Koolaid. Really.

No Redundancy.

The persistent document should capture each datum once and only once. As a thought experiment, imagine someone manually editing the file, perhaps even in a text editor. Ideally, they should be able to modify an entry in the file and have a predictable feature of the document change when loaded.

Human Readable.

And speaking of the above thought experiment… This isn’t always possible, and in rare cases is undesirable (for reasonable or less-reasonable reasons). But usually it’s a great benefit to be able to examine your document in a text editor. It’s great for debugging, that much I can promise. Another advantage of a text-based file format is it removes any questions of byte ordering or floating point format. What number is hex 01 02? You have to define whether it reads as 258 or 513. But if your file says “x = 258;” then it’s perfectly clear.
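The arithmetic, for the record, as two tiny functions reading the same two bytes both ways:

```java
public class ByteOrderExample {
    // The two-byte sequence 0x01 0x02 read as an unsigned 16-bit integer.
    // Big-endian: first byte is most significant, so 0x0102 = 258.
    public static int bigEndian(int b0, int b1) {
        return (b0 << 8) | b1;
    }
    // Little-endian: first byte is least significant, so 0x0201 = 513.
    public static int littleEndian(int b0, int b1) {
        return (b1 << 8) | b0;
    }
}
```

A text-based file sidesteps the whole question; “x = 258;” reads the same everywhere.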

Easy In, Rigorous Out.

When reading in your document, be generous. If something is missing, insert a reasonable default value. This strategy can be used with cautiousness and, dare I utter it, cleverness to extend the file format and still be backwards compatible. Conversely, however: Always write out everything! It is not safe to skip an entry because its value happens to be “the default”. Someday you’ll change the default, but old documents mustn’t change.
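A sketch of both halves of the rule, using a plain map as a stand-in for the parsed file (DocumentFormat and pageCount are made-up names for illustration):

```java
import java.util.Map;

public class DocumentFormat {
    // Today's default. Someday it may change; old files must not.
    static final int DEFAULT_PAGE_COUNT = 1;

    // Easy in: a missing entry gets a reasonable default,
    // so old documents load under a newer format.
    public static int readPageCount(Map<String, String> entries) {
        String v = entries.get("pageCount");
        return (v == null) ? DEFAULT_PAGE_COUNT : Integer.parseInt(v);
    }

    // Rigorous out: always write the value, even when it happens to
    // equal the default, pinning the document's meaning forever.
    public static void writePageCount(Map<String, String> entries, int pageCount) {
        entries.put("pageCount", Integer.toString(pageCount));
    }
}
```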

Be Careful With Scripts.

It may be tempting to use a scripting language as a file format. For example, TCL is very easy to embed in C or Java programs, but it’s not ideal as a document file format. Some will argue that philosophically code and data are the same thing. My counterargument used to be something about predictable computation time and the halting problem: if you read in a script, it might run briefly or it might run forever. But I have, I hope, an even more compelling distinction: code can erase your hard disk but data can’t. It’s that simple.

When I’ve seen scripts used as a file format, invariably, eventually, there exists some need to “parse” the script without “executing” it, which is of course absurd. If you depend on being able to parse a subset of the language then you’ve reduced it to data. Use a data format, not a scripting language.

Closing Remarks

One day in the far future (but sooner than you expect) it will be time to rewrite your application from the ground up, or create a new application which interacts in some way with the old one. In either case, a well-defined and knowable file format will render these tasks possible.



David Van Brink // Sun 2007.07.29 20:12 // {code java software architecture}

Basics: Where To Put A New Method

In “Where to Put A New Method” we’ll consider the question: “Where should I put a new method?” We’ll be considering several different collaborative topologies. Although I’ll describe this in terms of Java methods, the dynamics — both technical and social — are, perhaps, of general interest.


I Wish The Platform Had [Feature X]

Here’s a perfectly reasonable idea for a bit of functionality: “Please write out this file unless the file already exists and has the same contents.” This is so you can write out a file, but not bump the timestamp if it’s going to be identical to the existing one.

In Java, you might reasonably wish that the File object had that method. That would be neat.

  /**
   * I wish we had this!
   * Write the file unless the existing one is the same.
   * @param newContents new contents for the file.
   */
  public void writeIfDifferent(String newContents) {
    //...
  }

Sure, you think, everyone would love that. But the fortunate fact is that you can’t add it to File. That class is part of the platform and because you’re not one of the authors of Java, you just don’t have access to alter it. And this is probably a good thing. Someone owns the design of that class, and they have a vision for it, better considered than yours or mine. This proposed new method should only be added if they accept it. So put your request on a postcard, mail it to the North Pole, and wait.

Meanwhile, you need this functionality today. It’s really quite simple to implement. Create a class like so:

import java.io.File;
public class ThingsIWishJavaHad {
  /**
   * Write the file unless it would stay the same.
   * @param file
   * @param newContents
   */
  public static void writeIfDifferent(File file,String newContents) {
    //...
  }
}

This is an easy choice because it is no choice at all. Altering File simply isn’t an option. In the next section there is a more challenging moral dilemma.

(Hopefully, it’s clear that the static method is preferable to extending File. If we created our own class, BetterFile, we would have to use it throughout our code base to get these small additional features. The static method can be used directly. If you gave that static method to your friends, they could use it without changing their variable types.)
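For completeness, here’s one hedged way to fill in that static method. It uses java.nio’s Path and Files rather than raw File streams; treat it as a sketch with minimal error handling, not the one true implementation:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ThingsIWishJavaHad {
    /**
     * Write the file unless it would stay the same, so an unchanged
     * file keeps its timestamp.
     * @return true if the file was actually written.
     */
    public static boolean writeIfDifferent(Path file, String newContents) {
        try {
            if (Files.exists(file)) {
                String old = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                if (old.equals(newContents)) {
                    return false; // identical: leave it, and its timestamp, alone
                }
            }
            Files.write(file, newContents.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```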

I Wish Your Library Had [Feature X]

Now let’s turn our attention to typical corporate software development. If you’re developing some software it’s quite likely that you rely on some in-house software libraries. Just because you can call up the owner doesn’t mean you can get everything you want! I have it on good authority that most employees at Sun can’t just go and add a method to File.

(Corporate cultures are wildly diverse. I’ve seen environments where anyone was allowed to change anything, and there was a presumption that, well, I guess everyone’s a professional, and if they typed commit it must be OK. In small organizations this may even work, but generally it is dysfunctional.)

Let us concoct an example. Let’s say you’re working at SmogCo, maker of SmogBooks, a publishing system. You are developing a processing tool for SmogBook documents, for a special Florida-based client. Naturally you’re building up your specialized tool using the SmogBook Java library from the main development team. The special client needs a feature to add oranges to every page. (Look, it’s just a f’rinstance, OK?) Naturally, you realize that smogbook.addOrangesToEveryPage() is too limited. What you really want, of course, is:

  public enum Fruit { ORANGE, APPLE, CHERRY } // we'll add more later!
  public void addFruitToEveryPage(Fruit pageFruit)
  {
    //...
  }

Brilliant, you think! Every SmogBook developer can leverage this; we do deal with agricultural clients a lot, come to think of it.

Now comes the moral dilemma. In many corporate settings, the revision control system is wide open. Furthermore, in many corporate settings the boundaries of code-ownership are not well-specified. So what do you do?

(Truly, in matters like this, there is great opportunity to witness, and participate in, classic social dominance games. And indeed, we should be grateful that our species has, largely, found an outlet for these tendencies in such harmless pursuits. But that is a topic for a different essay.)

You could just check the method in and hope that either everyone likes it or nobody notices. If your change is good, and you pull it off a few times, you could just wind up on the core development team as a result. But it’s far more likely that, even though it consumes your working hours, this “add fruit to every page” feature just isn’t of general interest. It’s human nature to see our immediate goals as globally important, but it’s good architectural sense to realize that they aren’t.

So I’ll have to advise that, instead, you create a tiny local implementation of the fruit feature, like so:

public class SmogBookFruitUtils {
  public enum Fruit { ORANGE, APPLE, CHERRY } // we'll add more later!
  public static void addFruitToEveryPage(SmogBook book,Fruit pageFruit) {
    //...
  }
}

By all means, show it to the core development team. They’ll be pleased to see their API and classes understood and used. Naturally, they don’t want to add just any old thing into their precious libraries: adding something in is a permanent support burden! But now they’ll know that at least one client needed the fruit feature; and if it’s a truly necessary feature, they’ll want to add it to the core library. And you might yet end up on their team.

I Wish My Library Had [Feature X]

And now we come to the third and penultimate challenge. We’ve covered Man Against Nature, and Man Against Man. Now we’ll examine Man Against Self.

If you yourself are maintaining a library, and you need a new feature, where do you add it? You have to ask yourself very carefully: Is this feature consistent with the vision of the library? If you add strangely asymmetric features to your library, it will become messier and harder to use. It may even feel less professional to others who are evaluating and using it.

Maintaining architectural clarity while developing and using your own libraries requires you to play both roles, that of invader and defender. As with libraries written by strangers and by peers, the safest place to add a new feature is somewhere else. If it proves out, promote it into the library.

They Wish My Library Had [Feature X]

And eventually, inevitably, you yourself will be providing core libraries, and you will have fans of your library, which I’ll here call users. And now they want features. Oh, how they do go on about how they wish things were different. What can you do? I can offer several bits of advice.

First, accept that these users, as troublesome as they are, are your friends. When you hear complaints you know you’re on to something useful. Trust me.

Keep an eye on the revision control system. Your users mean well, but that doesn’t mean they know what they’re doing. If they change your code, look at what they’ve done, and decide if that’s how you would do it. If not, you have a professional obligation to back out their changes and advise them of a preferred solution. If the change is good, put a little gold star next to their name on the secret score sheet. Bring them into your team when you can.

Listen carefully to your users. They’ll ask for many things. What you should give them is almost never what they ask for. It’s your job to tastefully refashion their requests into beautiful elegant solutions.

To do this, you must know where to put things.



David Van Brink // Sat 2006.09.30 11:24 // {code java software architecture}

A Philosophical Snapshot

Introduction

This introduction has been written last.

The intent of this post is to give a snapshot of some coding strategies which I think are good ideas, today. This is of course reflective on the most recent round of development I’ve been involved in. Reading it over, I see that the notions are drawn nearly equally from things we got right, and things that (I believe) we got wrong.

This is my blog and my essay, so I’ll just make these statements as if they are indisputable truths from on high. Sometimes I’ll even try to support them. But I certainly encourage dispute and counterexamples. Comment away!

These are all, broadly, under the heading of API design. After all, what isn’t?

These are also all referring to Java code, though the principles are certainly true in any programming language. Says me.

Offer Power, Not Rules

Make it easy to do the “right” thing. Do not try to make it “impossible to do the wrong thing”. Building your API around “restrictions” rather than “capabilities” invariably results in a cockeyed underlying implementation, infusing anachronistic domain-specifics into inappropriate places. It can also lead towards untestability. For testing, you by definition need to exercise the functionality “out of context”.

Provisions For The Journey

Consider the entire development flow, not just the final result. (Sometime in the 1970s Japanese cars all included easily-located jacklift- and tow-points on the frame. American cars still sucked.) For example, if generated code is part of your product, it is an obstacle to quick debugging turnaround times in Eclipse. Consider that as part of the cost. It may be the right solution, still, but is worth considering. Similarly, if your code relies on certain files in certain places, or certain parts which must exist as Jar files, that will have implications to your clients’ interactive development.

Obligations Versus Offerings

  • Obligations are Abstract

Any kind of “plug-in” that your product supports will necessarily have multiple implementations. The thing which unifies the different implementations is that they each satisfy some fixed minimal set of “obligations”.

In Java, a useful way to manage these obligations is by an interface.

If you change the interface, you are changing the obligations. (You can also change the obligations by merely reinterpreting the interface, but let’s gloss over that for now.) Changing the obligations of a plug-in is a serious gesture. Managing that interface carefully is important because any change can have broad effects. A committee of interested parties can be beneficial in restricting unnecessary change.

Once an interface is “out the door”, that is, used by a broad base of implementers, the cost of changing it increases enormously. In general, changing an existing interface (once “out the door”) is impossible, and a better choice is to create a new one, while still supporting the old.
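One common way to “create a new one, while still supporting the old” is to extend the frozen interface and let clients opt in at runtime. A sketch, with invented names:

```java
public class InterfaceEvolution {
    // The original obligation, already "out the door": we never touch it again.
    public interface Renderer {
        String render(String page);
    }

    // The new capability lives in a new interface; every old plug-in
    // remains a valid Renderer, untouched.
    public interface PreviewRenderer extends Renderer {
        String renderPreview(String page);
    }

    // The client checks for the new capability and degrades gracefully.
    public static String preview(Renderer r, String page) {
        if (r instanceof PreviewRenderer) {
            return ((PreviewRenderer) r).renderPreview(page);
        }
        return r.render(page); // classic plug-in: a full render is the best we can do
    }
}
```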

  • Offerings are Concrete

Conversely, a utility library provides a set of features, or “offerings”. Like a plug-in interface, many developers depend on it. However, it is perfectly safe to increase its feature set, as long as the existing features remain unperturbed. Also, in contrast to plug-ins, there will be just one implementation.

In Java, the best way to express offerings is with a concrete class.

The maintainer of a utility library can certainly benefit from client-feedback, but restricting change should not be the primary goal of that feedback. Rather, the client-base should make clear how they are using it and what additional features are desired. The maintainer can then factor that all together to satisfy his clients appropriately.

While it is true that one way to maintain compatibility is to prohibit change of the library, a better way is to ensure that existing features remain operational. This can be done by unit tests. Clients should be free to add unit tests to guarantee that their particular usages remain supported.

If the maintainer inadvertently alters existing behavior such that a client expectation is broken, they can either a) restore the existing behavior or b) negotiate the change with the client.

As with plug-ins, once a utility library is “out the door” the cost of changing it increases significantly. In fact, we can then state clearly that “changing it” is “breaking it”, and one must consider carefully before doing that. (Although occasionally it is still the right thing to do.)

But, unlike “obligations”, it is always safe to add more “offerings”. The next release of the utility library may do more, but never less.
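A minimal sketch of an offering growing safely (the class and method names are mine, invented for illustration): release 2 adds an overload, and release 1’s behavior is left untouched, so old callers keep working.

```java
// Hypothetical utility class sketch: offerings as a concrete class.
public class StringKit {
    // Offered since release 1; its behavior is now frozen by its clients.
    public static String pad(String s, int width) {
        return pad(s, width, ' ');
    }

    // Added in release 2: more offerings, never fewer. The old
    // two-argument call still produces exactly what it always did.
    public static String pad(String s, int width, char fill) {
        StringBuilder b = new StringBuilder(s);
        while (b.length() < width) b.append(fill);
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + pad("ab", 5) + "]");      // old call: prints [ab   ]
        System.out.println("[" + pad("ab", 5, '.') + "]"); // new feature: prints [ab...]
    }
}
```

A unit test pinning down the two-argument behavior is what lets the maintainer add the third argument with confidence.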

On Plug-Ins and Privacy

This next one is subtle.

A “plug-in” is something that will have multiple or numerous implementations. A “plug-in client” is something that uses a particular kind of plug-in.

A plug-in client obviously must depend on a particular interface for the plug-ins it uses. A plug-in may depend on a certain usage-model or flow. That is, it may expect to be invoked in certain ways. Ideally, this is well-documented, but if not, then the first plug-in client more or less defines it. But, a plug-in must not depend on a particular plug-in client. The classic symptom of this is an API like:

public interface IFooPlugIn
{
    public void setX(int x);
    public int getX();
    public FooRecord generateKung();
    /**
     * To invoke sub-plug-ins, we have to give each plug-in a
     * reference to the application.
     */
    public void setApplicationHandle(JoesFirstApplication applicationHandle);
    // Whoops!
}

OK. Right now, you’re all about Joe’s Application. It’s the biggest thing in the world, and the whole company is behind it. Joe’s Application is going to increase shareholder value, and that’s great.

But this API locks you into emulating JoesFirstApplication when you someday (I know you can’t imagine it now) want to write a different application that leverages all those existing plug-ins. Half the value of plug-ins is lost if you’re not decoupled from the client. If there’s some aspect of the client that your plug-in needs, it’s better to create the smallest possible interface which provides it, and pass that instead:

public interface IFooPlugIn
{
    public void setX(int x);
    public int getX();
    public FooRecord generateKung();
    /**
     * Minimal access to discover sub-plug-ins.
     * Clients: your implementation
     * of IPlugInFinder determines the available scope.
     * Plug-in authors: use this to find legally 
     * accessible sub-plug-ins
     */
    public void setPlugInFinder(IPlugInFinder plugInFinder);
}

There. Now we’ve left the door open to create a completely different client that can leverage a body of existing plug-ins. We’re not locked into Joe’s vision forever. You know… just in case JoesFirstApplication isn’t the final manifestation of your product.
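For completeness, here is one shape IPlugInFinder might take. The post never defines it, so this is purely a hypothetical sketch of “the smallest possible interface”: the finder offers discovery and nothing else, and each client decides what scope it exposes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the smallest interface the plug-ins actually need.
interface IPlugInFinder {
    List<Object> findPlugIns(String kind);
}

// Any client -- Joe's application or the next one -- can supply its own
// finder in a few lines, controlling which sub-plug-ins are visible.
class JoesFinder implements IPlugInFinder {
    public List<Object> findPlugIns(String kind) {
        List<Object> found = new ArrayList<>();
        if ("kung".equals(kind)) found.add("a kung sub-plug-in");
        return found;
    }
}

public class FinderDemo {
    public static void main(String[] args) {
        IPlugInFinder finder = new JoesFinder();
        System.out.println(finder.findPlugIns("kung").size()); // prints 1
    }
}
```

The plug-in sees only the finder, never JoesFirstApplication, so a second application reuses the plug-ins by implementing one small interface.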

Conclusion

I hope, in the future, to look back at this post and think, Good heavens, that’s so basic that you shouldn’t even have to say it! Equally likely, future-dvb will say, What a bunch of misguided nonsense. Only time will tell.

oh, i dont know. what do you think?


Read part one first, please!

Bad Code
The experience of Bad Code comes from the following dissonance:

  • You want to use this code, either as a library or perhaps to improve or fix it. (In some settings it may be that you have to use it.)
  • It resists being used.

In extreme cases, a body of source code doesn’t build. But even if it does, if you can’t discover a way to use it then it has become Bad Code.

I believe there are two things which contribute most vigorously to the badness of Bad Code. I’ll enumerate them, and then offer a speculation on how this comes about.

Badness 1: No Examples
Examples can come in many forms. For something downloaded from the interwebs it might be a folder named “Examples”. Nothing wrong with that.

But if it’s a library from the guy in the next cube, it might just be an email with some stuff pasted in, or even some whiteboard scribblings. Remember, I said Good Code lets you get something done in 20 minutes. Talking to the author isn’t “cheating”.

Also, as I’ve previously babbled, unit tests are great. This is another place where an example might be found.

But the key point here is that there are examples of use. And you can find them.

Badness 2: External Dependencies
Here is a recurring property of Bad Code: it depends on everything else being “just right.” That may mean that lots of other libraries have to be in place in some nonobvious way, or that files have to be in certain spots, or that environment or property variables have to be just so. And the perp will innocently say: “But why would you ever run it without all the other stuff?”

Because, my friend, that’s what makes Good Code good.

How Does This Happen?
In my recent experience, this comes about just one way: programmers following a spec.

I know it sounds crazy… but I’ll attempt to explain it in abstract pretty pictures. Here’s a spec, derived from some whiteboard discussions and some anecdotal customer feedback:

And here is the Bad Code implementation of that spec:

It looks just like the spec! Right down to every little kink and error. Of course you can’t use it for something in 20 minutes. Your product took years to develop, and this implementation knows every nook and cranny of that work and depends upon it.

And here’s my vague and unsupportable visualization of the Good Code implementation:

The spec has been deconstructed into a rational architecture. The implementation includes the spec but is not damaged by the spec. Ah, if only pretty pictures could actually be true…

Make It Look Easy
In part one I presented the Mozilla Rhino interpreter as an example of Good Code. It was self-contained and had examples, and was relatively easy to set up and run. And one could say, “Oh but that was a particularly easy case. Of course it can run standalone!” The trick is to make it look easy.

There’s an artist/mathematician named Scott Kim who is known for creating “inversions,” text art like this:


It has 180 degree rotational symmetry. A wise person once pointed out to me that, “Each of Kim’s inversions looks like it was a particularly easy one, don’t you think?” There’s always some little thing that makes it trivial. But it’s not. The trick is to make it look easy.

If it doesn’t look easy, you haven’t worked hard enough.

That’s the art.

In part three I’ll offer a couple of tips that might, just might make your code usable by someone, for example, me, in twenty minutes.

2 comments
Douglas Jones // Thu 2006.09.7 7:45 am

That reminds me of a quote I heard about 10 years ago. “If you can’t explain it clearly then you don’t understand it well enough.”

David Van Brink // Thu 2006.09.7 2:52 pm

I agree with that!

Though… the implicit full claim would be: “If you can’t explain it clearly then you don’t understand it well enough for me to understand it.” Some people are very comfortable with complexity and transverbal software… but I don’t want to inherit their code… if their comfort level exceeds mine.

Yet another humorous retelling of approximately the same thing: http://www.gnu.org/fun/jokes/pasta.code.html.

Nearly every software professional has heard the term spaghetti code as a pejorative description for complicated, difficult to understand, and impossible to maintain, software. However, many people may not know the other two elements of the complete Pasta Theory of Software.
Lasagna code is used to describe software that has a simple, understandable, and layered structure. Lasagna code, although structured, is unfortunately monolithic and not easy to modify. An attempt to change one layer, though conceptually simple, is often very difficult in actual practice.
The ideal software structure is one having components that are small and loosely coupled; this ideal structure is called ravioli code. In ravioli code, each of the components, or objects, is a package containing some meat or other nourishment for the system; any component can be modified or replaced without significantly affecting other components.
We need to go beyond the condemnation of spaghetti code to the active encouragement of ravioli code.
– Raymond J. Rubey

The problem is that each of us can read these sentiments and think, Oh yes, of course I do that!



Well, I’ve been holed up in my dank apartment these last 3 weeks, canvas and plastic stapled over all the windows. I’ve been writing furiously. Red-pencil scribbles cover tablet after yellow Big Chief writing tablet with my profound wisdom and observations. I feel myself following in a great tradition… Dear reader, allow me to share with you some morsels which you might find entertaining or even informative. This is to be the first of a three part epic on the subject of Bad Code.

Introduction
What makes “Bad Code”? I feel like I’m tormented and dogged by Bad Code. Some of it is my own. Naturally I’m less bothered by my own Bad Code than by other people’s Bad Code, because I have at least some insight into the perverse intent of the code when it’s my own. But it’s still Bad Code.

What makes Bad Code?

Oh, there’s endless modern kid-stuff about good practices and code smell and all that nonsense. It’s my experience that even good kids who learn all about test-driven development and favoring composition over inheritance and all that rubbish and follow it to the letter still write Bad Code.

What makes Bad Code bad?

First of all, it has to be code. You wouldn’t call it Bad Code if you experience it as an application or tool or whatever. You’d call it a bad app, or a bad web page. I’m talking about Bad Code. You typically experience code in the form of a library that you want to use or source code that you’re called upon to modify.

Here’s my definition of Bad Code: If you can’t load it into your IDE and make it do something interesting under your control in twenty minutes, it’s Bad Code.

Let me justify this first with a counterexample of some Good Code. Oh yes. Cranky am I, but praise is possible.

Some Good Code
One of Mozilla’s open source projects is Rhino, a JavaScript interpreter. (Actually it is officially called ECMAScript now, but whatever.) Why is this Good Code? Let me count the ways:

  1. It’s easy to find: http://mozilla.org/rhino/.
  2. It was easy to download. A full source drop was 1.7M.
  3. Although it wasn’t an Eclipse project, it imported cleanly into Eclipse as a “Java Project from Existing Source” with almost no errors. A quick look revealed that all the errors were in one package implementing XML stuff, referring to org.apache.xmlbeans. On a whim, I removed that entire subpackage, and apparently nothing else depended on it.
  4. There is a clearly marked “examples” directory…
  5. The examples are well named and have a few comments, enough to get the lay of the land.
  6. I create a blank .java file and start copying bits and pieces from the example into my new file. Soon I have a tiny piece of code which I think I understand pretty well, and which does something explicable.

Here’s some code. I know it’s a bother, but give it a read. It’s short. It would mean a lot to me. And remember — it took me less than 20 minutes to get here.

import org.mozilla.javascript.Context;
import org.mozilla.javascript.Scriptable;

public class DvbScript {
    public static void main(String args[])
    {
        Context cx = Context.enter();
        try {
            Scriptable scope = cx.initStandardObjects();
            String s;

            s = "var abc = 13/7;";
            cx.evaluateString(scope,s,null,0,null);

            s = "var xyz = Math.sin(1.33) + \"*****\";";
            cx.evaluateString(scope,s,null,0,null);

            Object abc = scope.get("abc",null);
            Object xyz = scope.get("xyz",null);
            System.out.println("abc is " + abc.toString() + " and is a " + abc.getClass().getName());
            System.out.println("xyz is " + xyz.toString() + " and is a " + xyz.getClass().getName());
        } finally {
            Context.exit();
        }
    }
}
It produced this output:

abc is 1.8571428571428572 and is a java.lang.Double
xyz is 0.9711483779210446***** and is a java.lang.String

In that brief bit of code, I managed to successfully exercise the library and begin to understand the internal scheme of the interpreter.

Why It Was Good
It wasn’t a completely bump-free ride. I had to slice out part of the source code as downloaded… I deleted a whole lobe of the source tree that appeared to be all about XMLBeans or some such. But it was painless. Their one dubious external dependency on org.apache.* was isolated and optional.

The examples were easy to find, and they ran. At first they produced exceptions, but then I read the source code comments and knew what to pass on the command-line and all was well.

Fundamentally, as a user of this library, this Good Code, I felt like I was the target audience. Someone had considered that I would be sitting here today trying to run their stuff.

Now, here’s the thing of it: Bad Code feels the same way. It feels like someone has consciously considered that I, David Van Brink, would one day try to use their stuff, and has premeditated ingenious ways to thwart that goal.

In part two I’ll be cheerfully exposing some of the strategies that Bad Code takes to torment me.




(c) 2003-2011 omino.com / contact poly@omino.com