nuthole.com "steadily enriched with an almost fabulous daring and originality"
contact:email
subscribe:feed
2006 04
Powered by Blosxom
Get Firefox!
geourl
subgenius
spampoison hosting rails

The Horror of Compile-Time Binding

posted by jack at 15:16 CET in / compute / programming feed

I'm working on this C++ project, and suddenly realize how much we're using the visitor pattern. The point of the visitor pattern is to extract non-core functionality out of data-bearing classes, instead gathering that functionality into a visitor class that knows how to operate on all the data-bearing classes. You do this by making each "visitable" class implement a method that accepts a visitor object, which in turn calls a method in the visitor to do its work, along these lines:


void MyClass::accept(Visitor &v)
{
    v.visitMyClass(this);
}

That way, new visitors can be created to implement new functionality involving visitable classes, without the visitable classes needing to be changed or even recompiled. Here's where the fun kicks in. For each new "visitable" class I add, I need to add a member function for each existing visitor, which means editing dozens of .h and .cpp files, adding a small "visitXXX" member function to each.

The kicker is that most of the functionality we're implementing with visitors consists of either trivial one-liners or empty implementations for most of the visitable types. So we have visitor classes containing dozens of identical member functions. Nothing quite like the joy of editing dozens of files to insert meaningless boilerplate code!

At first I wondered what my co-workers were thinking when they set out on this path, but eventually I came to realize, much to my chagrin, that this is an inescapable reality of C++ programming in many situtations, thanks to the compile-time binding that C++ does. The compile-time binding actually precludes you from doing certain things without this sort of indirection.

To see why this is, first consider an example that demonstrates how things can work in a more full-featured OO language, such as python. Consider the following python program:


class A:
    def emit(self):
        print "Class A"

class B (A):
    def emit(self):
        print "Class B"

a = A()
b = B()
a.emit()
b.emit()
print "-----"
objects = [a, b]
objects[0].emit()
objects[1].emit()
This code snippet defines and manipulates a simple class hierarchy, with B inheriting from A. Common sense tells you what the output of this program may be:
Class A
Class B
-----
Class A
Class B

And your common sense would be right. Whe calling the emit method on the a and b objects, the correct method is triggered and the corresponding class names are output.

Now then, considering the following C++ program:


#include <iostream>
class A
{
public:
    virtual void emit() {
        std::cout << "Class A\n";
    }
};
class B : public A
{
public:
    virtual void emit() {
        std::cout << "Class B\n";
    }
};
int main (int argc, char * const argv[]) {
    A a;
    B b;
    a.emit();
    b.emit();
    std::cout << "-----\n";
    A objects[] = {a, b};
    objects[0].emit();
    objects[1].emit();
    
    return 0;
}

Does your common sense tell you this program should have the same output as the earlier python version? Think again:

Class A
Class B
-----
Class A
Class A

Say what? "But, but, but, I put an instance of B in that array, wtf?" Well, here's where C++'s compile-time binding came and shot you in the foot. Note that C++ doesn't offer any sort of generic array of objects. Whether using a C array as I've done, or a std::vector object, you must indicate the type of the object, in this case the root of our hierarchy, A. C++ then makes use of this type information when your program is compiled to match up the call to emit with the implementation in the named class, rather than the implementation in the derived class you're actually instantiating! Doh!

This has dire consequences for any situation where you've got a collection of objects of varying types. Say you've got a bunch of model classes, and you'd like each class to emit an xml string containing its values. In Objective-C, Python, Ruby, Smalltalk, and probably even Perl and Javascript, this is pretty straightforward; You just give each class a method that emits a string (and, assuming you've got an acyclic directed graph, calls the emitting method in each of its "child" objects). Then you just call the emitting method in the root object, and away you go. For example, once again consider some python:


class RootElement:
    def emitXml(self):
        print "<root>"
        for child in self.children:
            child.emitXml()
        print "</root>"
class SomethingElement:
    def emitXml(self):
        print '<something value="', self.value, '"/>'
class SomeoneElement:
    def emitXml(self):
        print '<someone value="', self.value, '"/>'

r = RootElement()
c1 = SomethingElement()
c2 = SomeoneElement()
c1.value = "first child"
c2.value = "second child"
r.children = [c1, c2]
r.emitXml()
(Note that this code breaks encapsulation quite a bit in setting attributes on the objects, and is not what I'd do in production code. But python lets you do it, and it makes examples short and sweet, so there you have it.)

Now then, that code outputs about what you might expect:

<root>
<something value=" first child "/>
<someone value=" second child "/>
</root>

Simple, right? Well, if you try to do something similar in C++ (or Java for that matter), you'll run into trouble. You can start by giving your element classes a common parent declaring emitXml (don't need to bother with such a thing in python, where method calls aren't bound to implementations until runtime), but as soon as you have something like RootElement maintaining a list of child elements, you'll hit the same snag in our earlier example; the relevant implementations of emitXml will be passed over in favor of the generic implementation in the parent class.

Which is where the visitor pattern comes in. It's basically a kludge to work around the compile-time binding limitation of C++. Even when there are times you'd want to use it in order to extract responsibilities out of a data-bearing class, you still end up with this ridiculous double-dispatch syntax (compare with Objective-C, where you can actually add methods to an existing class by using a category, so you can accomplish the same design feat of grouping functionally-related methods together, yet outside the classes they operate on, without the artifices of the visitor pattern).

The moral of this story is that the visitor pattern is a necessary evil for statically-bound languages like C++. And that if you want to avoid this evil thing, start by avoiding C++. Really, pick up python or ruby or Objective-C or whatever. Really!

[update] fixed some typos, thanks weinholt

permalink | digg this | slashdot this | add to del.icio.us |
Comments
Re: The Horror of Compile-Time Binding
hidden to protect the guilty wrote on Wed, 19 Apr 2006 04:56

Hmm interesting,

Personally I think that you are confusing C++ in general with the C++ your C++-correctness-pattern-nazi friends are writing.

Unlearn what you have learned, follow your feelings etc etc :)


[ reply to this ]
 
Re: Re: The Horror of Compile-Time Binding
Jack wrote on Wed, 19 Apr 2006 07:00

Well you've got a point there. Of course, if my co-workers didn't have an interest in making use of patterns, they'd have done it the old-fashioned way: each visitor would instead be implemented with a massive switch/case or if/else construct to vary behavior according to an object's type (which you can get with RTTI). But that's not a very pretty solution either.

I think that deep down, every OO programmer has a gut feeling that object-oriented languages are "supposed to be" polymorphic, and that you really shouldn't have to query an object about it's type. When you then encounter a situation like I described above, and find that your choice of language has hobbled your ability to use polymorphism, the visitor pattern seems like a good solution. And in some situations I'm sure it is. But in some situations, like when many of the visitable objects don't have any meaningful actions in the visitors, you end up with lots of empty methods (or methods that call some default method that acts on a generic type). In the long run you can end up with lots of completely meaningless code.


[ reply to this ]
Re: The Horror of Compile-Time Binding
hidden to protect the guilty wrote on Wed, 19 Apr 2006 08:49

I think a decent architecture for your visitors would be something like:

ElementVisitor - doesnt care about element type and just has a visitElement function

TypedElementVisitor - derives from above, checks the type and calls the typed function

You derive from the first if you don't want to do type specific stuff and from the second if you do. Another thing that would make it better would be to have non-pure virtual functions for the typed callbacks - if you don't care about a certain type, just don't add the function.

Downside to this is things will go wrong when you add a new type and don't handle it in a visitor that should handle it. But there again you're a programmer not a child, you should be able to cope with it.

As for the language thing: different languages have different properties just like different cars do. Personally I prefer to be able to drive whatever car I need to rather than stick to just the specific make and model of my car.


[ reply to this ]
Re: The Horror of Compile-Time Binding
Peter Marklund wrote on Wed, 19 Apr 2006 13:10

Very interesting article Jack!

I don't quite have the time to verify this with a Java program right now but I think polymorphism would work for your example in Java. I can't remember having run into a situation where Java is not behaving polymorphically when it should.


[ reply to this ]
 
Java OK
Jack wrote on Thu, 20 Apr 2006 16:45

Right you are Peter, I just whipped up a java example and it works fine. This leaves C++ as the only mainstream OO language that suffers from this compile-time crippling.


[ reply to this ]
Re: The Horror of Compile-Time Binding
C++ NAZI wrote on Thu, 20 Apr 2006 06:03

BLASPHEMY! DEATH TO YOU ALL. >D


[ reply to this ]
The A, B example
Jonas Gauffin wrote on Fri, 04 May 2007 00:47

Well. That's why you are using pointers in vectors/lists =) Try changing to ptrs, and you'll see that the example works.

Use the shared_ptr if you do not to manage the pointers yourself.

---

I've switched to C#/Rails from C++/PHP5 =)


[ reply to this ]

Here's Where I Tell You What's Wrong With The Music Industry

posted by jack at 23:48 CET in / thinking feed

I've been thinking about this for a while, been meaning to blog about it, and some recent conversations have brought it to the forefront of my mind. Here goes.

Surely no one has missed the fact that the music industry has gone through some changes in the past few years. Things like the iTunes music store have dragged the record labels, kicking and screaming, into the 21st century. Most of them are onboard by now, but there's still an undertone of terror visible in their actions; More and more of the CDs they sell have been whammied so they sometimes won't work on a computer at all. Or, if they do work on a computer, they may install deadly evil mojo on your machine in the process.

The reason for these paranoid practices on the part of the record labels is pretty easy to suss out: For decades, they've built their businesses around selling bits of plastic or vinyl. They've taken for granted that they'll be able to continue doing this forever, and that this will always be their most important source of revenue.

So naturally they're worried about mp3 and similar technologies robbing them of some of their profits. I think it may be instructive to consider what happened around the birth of the recording industry. At that time, many live performers became concerned for their livelihoods, worrying that if people could hear music on demand at any time, they'd no longer bother going to performances, and these live performers would be out of work. Of course, this turned out not to be the case; If anything, recorded music has served to propel any number of performers to a level of fame and fortune that would have been impossible without recordings. I believe that the music industry's concerns about losing revenues due to illegal digital distribution are similarly unfounded.

For one thing, there is a notion that the people who download lots of music illegally are probably the people who also buy some of the largest amounts of legitimate copies of music. Downloading doesn't replace their music-purchasing, but rather enhances it by letting them sample music from unfamiliar artists (potentially leading to future purchases) or downloading unavailable-for-purchase bootleg recordings of familiar artists (which increases the "bonding" between a listener and a favorite artist even more).

Perhaps more importantly, the music industry is sitting on the keys to unlocking its own future potential. For instance: Propelling public awareness of artists through cross-promotion. Let's say you're a big Nine Inch Nails fan, having heard their music on the radio, maybe gone to a show or two, etc. At some point you point your browser at http://www.nin.com, look around a bit at some lyrics and whatnot, and then you end up backtracking and surfing on somewhere else; Your interest in NIN didn't lead to any info about anything non-NIN.

How could this be improved? Well, NIN is currently on Interscope. What if Interscope stipulated that all of its artists needed to provide linkage (in the form of a side-bar, or a dismissable slide-over graphic) to other, similar Interscope artists? Then you might find out about Helmet or Audioslave or, perhaps, some up-and-coming band that you've really never heard of. Some would complain that Interscope infringing on their artists websites would be a terrible, terrible thing, but considering all the other shit they pull on the artists in their stables, this would be pretty minor.

A bigger problem is that relationships between artists come and go. As of this writing, for instance, the above-mentioned Helmet is still listed on Interscopes Artists page, despite the fact that one of the latest entries on Helmet's page mentions that they've left Interscope. Oops! So Interscope's (hypothetical) forced linkage from NIN to Helmet now isn't helping Interscope any more. But... what if it wasn't just Interscope? What if the industry as a whole were presenting interesting ways for listeners to find similar artists? That way Interscope would still benefit from a NIN<->Helmet linkage since in the future, visitors to Helmet's site would still get back to NIN.

Apart from just a plain old list of links between similar bands, things could be user-tailored so that sites would bring to the user's attention news about upcoming local concerts with relevant artists, eventually including (gasp) unsigned artists as well (which the industry as a whole benefits from promoting since one day, presumably, any decent unsigned artist will be under contract with one of the labels).

Now all someone has to do is convince the music industry that I'm right, and implement a solution for them, giving them an offer that's so good they can't say no.

permalink | digg this | slashdot this | add to del.icio.us |
Comments

Your Comment

 
Name:
URL/Email: [http://... or mailto:you@wherever] (optional)
Title: (optional)
Comment:
Save my Name and URL/Email for next time

Looking for programming talent that doesn't make you say "WTF!"? Try the hidden network.