• perspectives 26.10.2008 1 Comment

    Most programmers follow a progression as their skills improve.

    Initially, we have a “just make it work” mentality. In this phase, there is no structure or attempt at creating any abstractions. If functionality needs to be duplicated in a slightly different fashion, the lines of code get duplicated. We happily trudge along in this phase until we are met with a project of significant complexity. Here we start to break down, and quickly learn that code written this way becomes difficult to maintain.

    In the next phase, which I like to call “abstraction envy”, we start to learn the different ways we can architect our applications. We begin to learn the different designs, structures, and patterns that we can use to ease the burden of maintaining applications. As we learn more and more patterns, however, we try to apply our newly found knowledge as widely as possible. Given that this is still a learning process, and much of programming is learned by doing, we often pick the wrong tool for the job. To use the popular metaphor, we use a hammer, but we’re not hitting a nail. We slowly start to learn that there are some downsides to these patterns as well, and just because we understand them doesn’t mean we should use them. If a little is good, that doesn’t mean that a lot is better.

    This next phase I’m about to describe is the middle ground between the first two. This ideal phase is essentially a zen state of programming. Abstractions are only used when they are necessary. And the abstractions introduced are not just for code reuse, but they create simple, yet powerful ways of thinking about our problems. These types of abstractions are more useful because they more closely match the conceptual, or real problems. You’ll know when you see one of these because it’ll seem like the problem was created for the abstraction instead of the other way around. Besides creating perfect abstractions for the parts of the application that need them, the flip side is just as important, if not more so. Abstractions are not created for the parts of the application that don’t need them. Simple things are left simple, and complicated things are possible.

    I think most programmers are in the middle state. If you ask a programmer his advice on a particular problem, you’ll probably get an answer, and you’ll probably get a lot more information about other problems that get created along the way. Sometimes these problems themselves require complicated solutions, and then these problems cascade.

    That brings us to the heart of the problem really. Simple things should be easy to change. If they’re not, something is wrong. But none of us writes code with the idea that we’re making certain things harder. We always believe that we’re improving things. We see some boilerplate, and we think to ourselves, gee, wouldn’t it be great if we didn’t have to do that all the time? Let’s find a way to eliminate it. We move along happily, proud of how much code we’ve reduced. The problem lies in the future, when we’re thrown a curve ball. It doesn’t fit in our strike zone, but we’ve got to hit a home run anyway. We look at the code, and we realize that it would have been a whole lot easier to do if we didn’t have to deal with those abstractions we added in earlier. We’ll have to modify them significantly to make the new problem fit. In other words, the abstraction leaks, because we now have to understand the implementation.

    And now we’re faced with a tough decision. Do we make these modifications - modifications is a good word for this, it’s usually a hack - or do we scrap them and go for a simpler approach? The hacks are usually easier and we feel more confident that they’ll work, but they increase the entropy of the system, which will make it harder for us to maintain. A bigger cleanup will take more time and is riskier, but is probably a good long term investment in some cases. Unit tests will help in this case, but like good abstractions, I find that it’s just as hard to write or find tests at the right abstraction level to pull this off.

    So how can we avoid getting ourselves into a mess like this in the future? I think the real answer is that we have to get ourselves tangled into a few webs of anarchy before we can learn how to avoid them. But in an attempt to answer the question, instead of erring on the side of introducing abstractions, we can err on the side of leaving them out. Simpler code is usually easier to modify, although that’s not true for large pieces of spaghetti where a few well placed abstractions trim out a lot of the complexity. But my personal opinion is that it’s usually easier - and more fun - to add in additional layers of abstractions rather than break free from existing ones. It also feels safer.

    But perhaps what should guide us the most in these ambiguous cases is the principle of least astonishment. If the presence of a particular abstraction is surprising, then it probably shouldn’t be in there. Another good question to ask ourselves is if the new abstraction actually makes things simpler. If the code is easier to understand with the abstraction, awesome. If it’s a puzzle to figure out what is going on, and it’s just there to save on lines of code, change it.

    It’s easy to make things. It’s hard to make things simple.

  • approaches 14.10.2008 No Comments

    Always two there are, a master and an apprentice

    When learning a trade, the master/apprentice relationship can be tremendously useful. It is a slow path, but an effective one. It’s a great way to pass on the lessons from the past.

    And not only is the knowledge of skills passed on to future generations, so is the knowledge of common pitfalls. As an apprentice begins to make mistakes, the master can correct them immediately, potentially avoiding much of the headache associated with recovering from the error. And we all know that the earlier an error can be recovered from, the smaller the consequences.

    However, you don’t see very much of this in the programming world. I think there are a few reasons for this.

    1. Being self taught is encouraged. If you can teach yourself just by reading the manual (RTFM), then you haven’t wasted anybody else’s time.
    2. Programmers all think that they can solve the problem better themselves.
    3. There tends to be a high turnover in the programming industry.

    But then, each generation is doomed to repeat the same mistakes. And with programming, it’s difficult to know when you’re making a mistake. But a master can point this out right away, and correct it before it becomes a major problem. This can save an enormous amount of time, and greatly speed up the learning process.

    Programmers, and those expecting programming based solutions, are usually impatient. There’s enormous pressure to release the next best thing yesterday. The programming field also changes rapidly, and it will probably continue to change at an even faster rate in the future. It should be no surprise then that programmers are in a rush.

    But the flip side is that it takes a while to become a great programmer. We really need to experience many failures before we can identify how to architect successes. And contrary to popular belief, it takes years to become proficient.

    If there are no silver bullets, then I think that encouraging master/apprentice relationships can help propel us as a whole in the right direction faster.

  • It’s interesting to hear so many different opinions on the command line. Many see it as archaic. Others are scared of it. And then there’s a few who prefer it to all other interfaces. But regardless of how you feel about it, there’s no denying that it’s here to stay, at least for the near future.

    I’ve mentioned earlier that interacting with a computer can be likened to communicating with it. We usually “speak” to it in the physical sense through a keyboard or mouse. On the virtual or software side of things, the most popular interfaces are either through the command line, or a gui (wimp interface).

    Command line interfaces allow for more direct communication with the computer. You type words that you want the computer to execute, and it returns the response. Going through a gui is like talking to an interpreter first, and then having the interpreter relay the information. This can be more useful if the interpreter can figure out what you mean and make a more informed request to the computer. But for informed users, interpreters just get in the way.

    Where I think the command line really shines is its flexibility. At a moment’s notice, you have access to virtually anything, all through a single interface. You have access to a large, powerful set of tools that can be widely used with one another. For example, what if I wanted to count the number of files in a directory that had an odd number of lines in them? That were edited in the last week. And have those sent out in an email. Biweekly. Granted, that’s a contrived example off the top of my head, but one that is virtually effortless to accomplish from the command line, yet difficult with a gui that wasn’t designed specifically for that.
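    To make the contrived example a little more concrete, here is a minimal sketch of its counting step, written in Python rather than as a shell pipeline. The function name `count_odd_recent_files` and its parameters are made up for illustration, and the emailing and biweekly scheduling parts are deliberately left out (a scheduler and a mail tool would normally handle those).

```python
import time
from pathlib import Path


def count_odd_recent_files(directory, days=7, now=None):
    """Count regular files in `directory` that were modified within the
    last `days` days and that contain an odd number of lines."""
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    count = 0
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue  # skip subdirectories and other non-files
        if path.stat().st_mtime < cutoff:
            continue  # not edited recently enough
        with path.open("rb") as handle:
            line_count = sum(1 for _ in handle)
        if line_count % 2 == 1:
            count += 1
    return count
```

    Calling `count_odd_recent_files(".")` would count the matching files in the current directory. On the command line the same idea would likely be a short chain of existing tools, which is the point of the paragraph above.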

    Another useful feature of the command line is its inherent repeatability. Once a command has been run, it can be recalled, executed as is, or executed with slight variations, all with little effort. This is true for sequences of commands as well. And, if the sequences themselves start repeating in sequence, then they can be moved to a shellscript or function and run with a single command. In this way, the command line allows for the user to create new language that is better suited for the problem, in a bottom up approach.

    Guis have their place too though. The knowledge required is usually much lower to start being effective with an application. And guis are a better fit to solve visual problems. It’s much easier to work with a wysiwyg type app when you’re producing a new design, rather than having to perform transforms with commands and then redisplaying.

    But I think the audience is what makes the biggest difference. Programmers, and “expert users” have a tendency to prefer command line tools. Novice users are usually afraid of having something go wrong, and find comfort with gui applications. And for the die hard gui guys, all you have to do is tell them:

    Smith and Wesson was the original point and click interface

  • perspectives 19.09.2008 1 Comment

    Programming is really just another form of communication. We type in some code, tell the computer that we’ve got some stuff for it to read, and then it gives us back a response when we run it. The write, compile, run feedback loop is like having a conversation with the computer.

    Generalizing a little bit, using a computer itself is a form of communication. We put some input into the computer, whether we type something in, or use a mouse or some other device. Then the computer responds back to us. It can also work the other way too. The computer can prompt us for action, usually after a particular event occurs. For example, we can have a msgbox alert us of new mail as it arrives.

    Thinking about it from a conversational perspective, our interactions with the computer become more effective if we can better articulate what we are trying to do. If we can phrase a question, or a particular action in a more effective manner, then it’s more likely that we’ll get a good response back from our computers. Usually, the whole trick comes down to saying the right thing.

    And as programmers, the choice of programming language is really just that. It’s how we choose to describe our application to the computer. Seen from this light, arguments have been made that if you’re able to express yourself in fewer words, then you’re using a better language for the job. It also follows the language parallel in that certain languages have many more words for a particular concept, each with a slightly different meaning. These languages are adept at expressing the subtle variations of this concept. And if a language doesn’t have a word for a concept, then chances are it doesn’t know how to express it effectively either.

    Programmers generally get a bad rap about being poor communicators. But, if programming is just a form of communication, can this be true? I’d argue no. The best programmers turn out to be effective communicators too. They really are able to express their points in a clear, concise way. After all, they’re used to writing clear, concise code.

  • When discussing whether quality is more important than quantity in programming circles, quality will often be cited as the clear winner. The argument is that focusing on quantity only ends up hurting us in the long run. Sacrificing quality usually means taking so called “shortcuts”, which can lead to headaches in the future. When the shortcuts turn into dead ends, we end up having to take detours to get around them.

    But does focusing completely on quality necessarily improve it? We try to justify spending more time improving quality by saying that once it’s done the right way, the problem is less likely to resurface in other ways. In other words, we’re spending more time on it now so that we don’t have to in the future. But what does it actually mean to concentrate on the quality of the application? Just because I try to anticipate future uses of the code, or remove all duplication, or try to document what I’m doing doesn’t necessarily mean the code is of higher quality. If there are known bugs and I eliminate those, I haven’t necessarily raised the bar quality-wise. I could have introduced other defects as side effects, or some security or performance issues. I can spend more time writing tests, but again, I haven’t improved the quality of the code here.

    I’m not going to make an attempt to define software quality myself. I only wanted to point out that it’s not completely obvious, and refactoring has its own risks. Instead, I’ll summarize an interesting story I came across here.

    In a ceramics class, half of the students were graded completely on quantity of work produced. The other half was graded completely on quality. You’d think that the quantity students would produce tons of sub par work, while the quality students would produce one amazing work. I did anyway.

    I was wrong. The quantity group in fact ended up producing works of higher quality. They were able to learn from their own mistakes. The quality group, while coming up with sound theoretical works, failed to deliver.

    This sounds very similar to programming. And unless you’re this guy, it usually takes a few iterations to get something right.

    We tend to learn more effectively through our own mistakes. Just like parents will let their children make their own mistakes, programmers learn to avoid pitfalls by first falling into them. The better programmers are the ones that write more code.

  • You come to nature with all her theories, and she knocks them all flat. -Renoir

    Computer science is a misnomer. There is no series of steps to follow that will lead to a great application. There is no methodology that guarantees success. The science that we do have is what I would call “low level”. We can prove that the running time of quicksort is O(n log n) in the average case. We can find the maximum of a set of n numbers in O(n) time. But I can’t prove that my web application is not going to crash.
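    As a small illustration of the kind of “low level” result we can actually prove, here is a sketch of a single-pass maximum. The function `find_max` and its comparison counter are illustrative only; the counter is there to show that n numbers need exactly n - 1 comparisons, which is where the linear bound comes from.

```python
def find_max(numbers):
    """Return (maximum, comparisons) using a single pass over `numbers`."""
    iterator = iter(numbers)
    try:
        best = next(iterator)
    except StopIteration:
        raise ValueError("find_max() needs at least one number") from None
    comparisons = 0
    for value in iterator:
        comparisons += 1  # one comparison per remaining element
        if value > best:
            best = value
    return best, comparisons
```

    For example, `find_max([3, 9, 2, 7])` returns `(9, 3)`: four numbers, three comparisons. Nothing comparable exists for proving that a whole web application will not crash.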

    What makes the software engineering discipline so different from the other engineering disciplines is that absolutely all of the work happens in the design phase. When designing a building, coming up with the blueprints is just the first step. But with programming, the blueprints are everything. Naturally, I mean extremely detailed blueprints. That’s really what a program is though, right? It’s just a set of detailed instructions that the computer executes for us. When we’re creating a building, the building is done when it’s built. When creating an application, it’s done when we’ve come up with all of the instructions needed. It’s one more step removed. That’s why when we’ve created one copy of an application, it’s trivial to create n copies.

    Anyway, we slowly learn that there is no magic formula to write an effective application. There are guidelines, principles, best practices, but in the end, following all the rules doesn’t guarantee a masterpiece.

    This sounds a lot like the difficulties faced when trying to create a bestselling novel. There are guidelines, and patterns for plot development that have worked well. But in the end, you can follow all the best practices that are out there, and still not come up with a masterpiece.

    I know I’m making it sound like both of these are very hit and miss. This is not true. Great authors consistently produce great works. Great programmers also consistently produce great applications. While it’s hard to define what makes the works great, it’s usually much easier to recognize. What’s more is that most programmers can agree on who the great programmers are, yet when asked to quantify why, it’s not easy to come up with an answer. We can all recognize it, but measuring it is difficult.

    Universities however, teach programming from a much more scientific point of view. We’re taught the fundamentals, big O notation, operating system concepts, basic software engineering and process, and things along those lines. Yet it’s rare to find a university (at least I haven’t come across one) that studies the great masterpieces of programming. Or one that takes the worst of programming, and criticizes it.

    I think that this scientific approach to teaching programming is fundamentally wrong. We should be taking the hint from the liberal arts schools. They all study the great works of the past, which are endlessly discussed, analyzed, and criticized to great detail. Various writing techniques are dissected, and emulated. Students are encouraged to stray from the path, to explore.

    Maybe the reason why universities take this approach is that most past software has been closed source. It’s only fairly recently that open source has exploded in a very big way. But that’s slowly changing. More and more open source software is getting written, and at a faster and faster pace.

    But what I’m seeing happen is that we don’t have to wait for the universities. We as programmers are coming up with our own ways to learn more effectively. After all, we control what software gets written, so a lot of it is geared towards making our own lives easier. This also includes learning from others, and coming up with better ways of sharing and connecting information. It’s this trend, which seems to indicate that we’re getting better faster, that leaves me hopeful for the future.

  • approaches 08.09.2008 1 Comment

    We’ve all had our frustrating moments with computers. We bang our heads against the walls for quite some time, and no matter what we try, the computer responds with a clever “I thought you might try that, here’s your error.” And then, we try talking to someone else about it, and they usually have a brilliant idea that solves everything elegantly. Then we’re left scratching our heads, wondering why we didn’t think of that before.

    Lots of problems get solved like this simply because they are looked at from a fresh perspective. It’s easy to get lost in the details of a problem. When we keep our heads down, it’s hard to realize that we were just approaching the problem from the wrong angle.

    Programming is a game of insight.

    I think that sums up the essence of programming. The most significant gains are often those that shed the problem in a new light.

    But not all reevaluations of a problem lead to successes. In fact, I would argue that most of them don’t. But the ones that do work, usually do so in a big way. Given that programming is this give and take process, progress often isn’t linear. There isn’t a lot of progress, or it looks like things are getting worse, and then suddenly, there’s a big jump.

    This makes progress especially difficult to measure, assuming it’s even possible to measure at all. Any formal attempt at measurement results in programmers optimizing for a local maximum. This ends up detracting from productivity.

    But regardless of whether we can keep track of it or not, it’s important to foster an environment that encourages creativity. A single idea can change everything.

  • approaches 07.09.2008 1 Comment

    We all go through programming disasters. It’s hard, if not impossible, to always make the right decisions. And after we’ve steered the ship back on course and we’re out of the storm, we try to review if there was any way we could have prevented the storm in the first place. This reflection is crucial, and is one of the best ways we can improve ourselves.

    What becomes dangerous however, is when these reviews turn into policies, or mandates. To explain why, I’ll summarize an anecdote which I’ve come across from the Extreme Programming book.

    A mother was baking a ham with her daughter, and the daughter noticed that the ends were cut off. She asked her mother why, to which the mom responded: “I don’t know. That’s the way my mother always did it. I’ll ask her.” So the mother asked the grandmother why the ends of the ham were cut off, and she said: “I don’t know. That’s the way my mother always did it. I’ll ask my mother.” And the great grandmother’s response was: “My oven was too small, so I had to cut off the ends to have it fit.”

    Blindly following policies can lead to extra steps that can work against us. Rather than come up with new policies, it’s better to come up with principles that can inform future decisions. It’s more important to understand the reasons behind what those policies would have been.

    On that note, dogma itself is dangerous. When we start blindly following rules, we start becoming simple machines. It tends to stifle creativity, which is the worst thing that can happen to programmers. Programming itself, after all, is a creative process.

    I’ve seen a trend by programmers to be completely against any form of duplication. After all, there are extremely compelling motivations for this idea. Programmers have all copy/pasted code to get something working, with the reason being that it usually ends up working much faster in the short term. But then when all that new code needs to get updated, we’ve all forgotten to make the change to all the pieces that required it. And bingo, we have a new bug.

    Then we look back on it and what’s to blame? Not that we forgot to make the change everywhere, which was just the symptom. The problem was that it was possible to change one of the parts, and forget to update the other. It should have been refactored to avoid the duplication, so that a change in one location would naturally affect all the other paths. This would have prevented the bug from even being possible.
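    A minimal sketch of that kind of refactoring, using a made-up tax calculation (the names and the 8% rate are hypothetical, not taken from any real project):

```python
TAX_RATE = 0.08  # hypothetical rate, kept in exactly one place


# Before: the same rule copy/pasted into two places. Changing the rate
# means remembering to edit both functions.
def invoice_total_duplicated(subtotal):
    return round(subtotal * 1.08, 2)


def receipt_total_duplicated(subtotal):
    return round(subtotal * 1.08, 2)


# After: one shared helper, so a change in one location reaches every caller.
def total_with_tax(subtotal, rate=TAX_RATE):
    return round(subtotal + subtotal * rate, 2)


def invoice_total(subtotal):
    return total_with_tax(subtotal)


def receipt_total(subtotal):
    return total_with_tax(subtotal)
```

    With the shared helper, it is no longer possible to update the invoice rule and forget the receipt rule, because there is only one rule left to update.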

    But like most dogmas taken too far, this can get you into trouble. Here’s a very contrived example, and granted most of us don’t think this way, but I’ve experienced somewhat similar arguments for avoiding duplication where I thought it was just silly.


    a = 7
    b = 3 + 12
    c = 18 - 2
    d = 9 * 6

    Try to pretend that these are real calculations, and not just hardcoded numbers. Can you spot the duplication? Normally you’d say there isn’t any, but I can argue that there is. You have four assignments, and three mathematical operations. Isn’t that duplication? Here’s the “reduced” version:


    import operator

    vars = ['a', 'b', 'c', 'd']
    arguments = [(), (3, 12), (18, 2), (9, 6)]
    # the first "operation" ignores its arguments and just returns 7
    ops = [lambda *x: 7, operator.add, operator.sub, operator.mul]
    for var, args, op in zip(vars, arguments, ops):
        globals()[var] = op(*args)

    Notice how I’ve eliminated all the duplication? And wasn’t it clever of me to fit the simple assignment in the first example to a function that ignores its arguments? The logic is all now in a single line compared to the four up above. I can argue that although it’s actually more lines of code, I can effectively move all but the loop into configuration. Now people can add more variables to the global namespace, with an arbitrary operation performed on any number of arguments, all through configuration! What a wonderful and extensible system I’ve created!

    The perceptive reader will have discovered that I was being a tad bit sarcastic. (My co-workers will all tell you that I’m subtle). So which is easier to understand? Which would you rather maintain? Readers with no python experience will probably understand the first code snippet. But you probably need to know python to even attempt to understand what’s going on in the second.

    So if you’re going to follow dogmas, policies, or rules, then follow this one:

    Always use your brain.

  • approaches 04.09.2008 5 Comments

    We all have different ideas on when we should clean things up. It could be a dirty room, cluttered desk, messy closet, or just too much stuff lying around. We also prefer to keep things a certain way, which can be very individualistic. From the outside a desk with tons of paper lying around could look like it could use some rearranging, but the desk’s owner might be able to find anything at a moment’s notice.

    Given how different we all are with tolerating visual clutter, it’s not too surprising that there’s a wide range of opinions on when we should clean up our code. Many books have been written completely focused on this very issue, and in the end, I think it’s still much more of an art than a science. In fact, I believe programming itself is much more of an art than a science, but that’s the topic for a whole other discussion.

    Then there’s a time for spring cleaning too. And we’re always surprised when we find some really dirty stuff in the nooks and crannies. But in the real world, we usually do end up cleaning it up. (That is, if you’re not as lazy as I am). In the virtual world however, some cleanup tasks are truly daunting, and it may be easier to leave the dirty things the way they are. After all, it might not look pretty, but it works.

    And similar to how the owner of a cluttered desk can find something for us quickly, old dirty code tends to work predictably as well. But if we change it a bit, we can no longer guarantee that it’ll work the same way. Unit tests can help here, but we still can’t make any guarantees. And when you move an old couch from one corner of the room to the other, you realize that there was a whole lot more dust under there than you thought. That dust also has to get cleaned up. And after some time cleaning up, we ask ourselves, was it really worth it? That couch really wasn’t all that bad where it was.

    But, the advantages gained by cleaning up can outweigh its costs and risks. After all, if something becomes simpler and easier to understand, we stand to gain every single time someone works with it. Add up the time for all these future interactions, and the refactoring can more than pay for itself.

    However, as weathermen and traders already know, it’s hard to predict the future. (They’ll never admit it though. What I think these guys really excel at is coming up with excuses. :) ) But it’s very easy to predict the past. And although we may pay a small price each time we end up working around the dirt, we start learning where all the dirty areas are. This puts us in a better position to clean up more effectively in the future.

    Refactoring usually alters the abstractions used. A new layer could get added that simplifies complex interactions by handling those details. Or extra layers that get in the way are removed to produce simpler code. But if a new problem comes along that doesn’t fit into these new abstractions, then the game is up. Chances are we’ll put in a hack to work around that “edge case”, and those can start to pile up, especially if other programmers start putting their hands in the mix. And before you know it, we’ll be talking about a new refactoring.

    I’m not arguing that we should never clean up code though. I’m just pointing out that there are often more factors involved when thinking about cleaning up than just making the code prettier. What cleaning up does seem to do is improve morale. Most programmers would much rather write 1000 new lines of clean, sparkling code than try to figure out the ten lines in a 10000 line messy codebase that need to be modified. And when bold, noble undertakings like these are launched, I start to feel like Braveheart just gave me a speech before an impossible battle.

    And what usually ends up happening is that the small splinter cell team beats the odds. But why? There’s too much work and not enough time. Well, I think that the answer is simple: the programmers work harder. And the reason for that is that they are a whole lot more motivated with the prospect of a fresh start instead of trying to keep a big ball of mud together.

    In the end, major refactorings are a tough call either way. And the technical issues might not be the whole story. Social issues can also play a role in the decision. Programmers tend to take attacks on their code personally, and arguments can turn into personal vendettas quickly.

    But just like we’re always able to find ways around a messy desk, we find ways around our technical problems too. And regardless of the decision, thinking about the problem gives us more insight into it. So ultimately, we’re better off anyway after the exploration step. To use a cliche:

    The journey is more important than the destination.

  • approaches 02.09.2008 No Comments

    When faced with a new programming challenge, it can usually be approached in one of two ways: from the front end or from the back end. Each has its merits, and completely focusing on one or the other is a recipe for disaster. But I think that most programmers tend to be a little “backend heavy”. I’d almost go so far as to say that it’s difficult to call yourself a programmer without being biased this way. After all, programmers enjoy solving complex problems by creating new powerful abstractions. If not, you’d be a little nuts to go through all the frustrations of programming without enjoying watching something work.

    What I find surprising is that if a problem is too simple, it’s the programmers themselves who create more complexity. I find myself doing this all the time. “Well what if the user wanted to do foo and not bar? Instead of hard coding this action, I can add a new <insert cool pattern/framework/abstraction here> and the application will then support any action. Then we can make a user preference and new configuration management system that users can tailor to their needs”. This cascades of course, and before you know it, you’ve created a new framework. It’s just so easy to get lost in it, that programmers sometimes forget to ask the obvious question: “hold on a sec, why are we doing this again? Let’s just cross that bridge when we get to it”.

    Focusing completely on the front end can be just as problematic though. Security holes, performance bottlenecks, and general lack of flexibility can hold you back too, perhaps even more so. Any one of these issues can be devastating. Then it’ll leave you wishing you had spent just a wee bit more on prevention, instead of having to pay so much for the cure.

    Right about now it sounds like my point is that everybody should just do things right from the beginning, without wasting time building anything unnecessary. But realistically, I think the most important thing is to “use your brain”. The applications that programmers build are supposed to be helping users in some way. Refactoring is supposed to help programmers maintain their code. Writing tests is supposed to help maintain confidence in the code base. User testing is supposed to help inform usability improvements to the application itself. If something isn’t working like it should, maybe it needs to change, or be rethought. It’s always healthy to take a step back and review how things are going. I think Einstein said it best:

    Everything should be made as simple as possible, but not one bit simpler.