PHP - Language of the Web

November 26, 2008

Currently, PHP is the most popular language on the web - popular here meaning the sheer number of sites running it. There are a number of reasons for this, some of them obvious in retrospect. But then again, a lot of things are obvious in retrospect.

PHP is easy to get started with. I think this, more than anything else, is the biggest reason for PHP's popularity. In its simplest form, PHP is just a powerful templating language. We can take some HTML, sprinkle in a couple of PHP tags here and there, and presto, we have ourselves a dynamic page.
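
Just to make that concrete, here's a minimal sketch of what "sprinkling in a couple of PHP tags" looks like. The file name and the "name" query parameter are made up for the example; drop something like this into a PHP-enabled server's document root and the page is already dynamic.

    <?php
    // greeting.php - ordinary HTML with a couple of PHP tags mixed in.
    // The "name" parameter is hypothetical; escape it before echoing.
    $visitor = isset($_GET['name']) ? htmlspecialchars($_GET['name']) : 'world';
    ?>
    <html>
      <body>
        <h1>Hello, <?php echo $visitor; ?>!</h1>
        <p>The time on the server is <?php echo date('g:i a'); ?>.</p>
      </body>
    </html>

No build step, no separate template engine - the HTML itself is the template.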

PHP is easy to deploy. This almost goes hand in hand with PHP being easy to get started with. Most web servers are already PHP-enabled, so chances are good that things will "just work" for us out of the box. If they don't, PHP's popularity virtually guarantees that a solution is easy to find online.

Updates are easy to see. All we have to do is save a file and reload our browser. As a programmer, I can really appreciate this one, since I'm used to having to restart a process and wait for it to come back up before I get any feedback.

It's easy to find somebody who understands PHP. This is true of anything that's popular at the moment, so it's not specific to PHP. What PHP has going for it here is the low barrier to entry, which makes it easy for a programmer to pick up PHP quickly without any previous experience.

If I were to sum up PHP's downside, then, it would have to be how poorly it scales in terms of complexity. Most changes are made in an ad-hoc manner, with little thought for the whole. With respect to systems, this is the wrong approach. We want the overall system to have a strong foundation, and to have the individual pieces be the ones that take care of their own messes. PHP emphasizes getting the individual pieces working, and they often end up having to know about each other's internals.

For simple systems though, this doesn't matter. If we're just building a doghouse, we don't need to spend time thinking about a strong core, because we won't need one. If we just slap things together, it'll hold. We only run into problems when we use this approach to build a skyscraper.

That being said, I think most sites on the web are simple. There's a way to put new data in, a way to control what gets displayed (perhaps through a search), and maybe a custom workflow for interacting with the data. It's often faster to build these small pieces from scratch than to try to fit them into the context of a larger system.

Even for more complex systems, there is a benefit to prototyping a naive solution, and seeing where it breaks down. Sometimes it's not where we expect, and this information becomes tremendously valuable when trying to design a more robust solution.

There are attempts to overcome PHP's shortcomings too. There are many PHP-based frameworks and content management systems out there. Each one makes certain separations for you, like MVC, to help ease the burden of maintenance.

There's no denying that PHP does some things well. After all, millions of web sites can't be wrong, right?

The Cost of Abstraction

October 26, 2008

Most programmers follow a progression as their skills improve.

Initially, we have a "just make it work" mentality. In this phase, there is no structure or attempt at creating any abstractions. If functionality needs to be duplicated in a slightly different fashion, the lines of code get duplicated. We happily trudge along in this phase until we are met with a project of significant complexity. Here the approach starts to break down, and we quickly learn how difficult the code is to maintain.

In the next phase, which I like to call "abstraction envy", we start to learn the different ways we can architect our applications. We begin to learn the designs, structures, and patterns that can ease the burden of maintaining applications. As we learn more and more patterns, however, we try to apply our newly found knowledge as widely as possible. Since this is still a learning process, and much of programming is learned by doing, we often pick the wrong tool for the job. To use the popular metaphor, we swing a hammer, but we're not hitting a nail. We slowly start to learn that there are downsides to these patterns as well, and that just because we understand them doesn't mean we should use them. If a little is good, it doesn't follow that a lot is better.

The next phase is the middle ground between the first two. This ideal phase is essentially a zen state of programming. Abstractions are only used where they are necessary. And the abstractions introduced aren't there just for code reuse; they create simple yet powerful ways of thinking about our problems. These abstractions are more useful because they more closely match the conceptual, or real, problems. You'll know one when you see it, because it'll seem like the problem was created for the abstraction instead of the other way around. Besides creating the right abstractions for the parts of the application that need them, the flip side is just as important, if not more so: abstractions are not created for the parts of the application that don't need them. Simple things are left simple, and complicated things are possible.

I think most programmers are in that middle state, abstraction envy. If you ask a programmer for advice on a particular problem, you'll probably get an answer, and you'll probably also get a lot more information about the other problems that get created along the way. Sometimes those problems require complicated solutions of their own, and then the problems cascade.

That brings us to the heart of the problem, really. Simple things should be easy to change. If they're not, something is wrong. But none of us writes code with the idea that we're making certain things harder. We always believe we're improving things. We see some boilerplate and think to ourselves, gee, wouldn't it be great if we didn't have to do that all the time? Let's find a way to eliminate it. We move along happily, proud of how much smaller we've made the code. The problem lies in the future, when we're thrown a curve ball. It doesn't fit in our strike zone, but we've got to hit a home run anyway. We look at the code and realize that the change would have been a whole lot easier if we didn't have to deal with those abstractions we added earlier. We'll have to modify them significantly to make the new problem fit. In other words, the abstraction leaks, because we now have to understand its implementation.

And now we're faced with a tough decision. Do we make these modifications - modification is a generous word for what is usually a hack - or do we scrap the abstractions and go for a simpler approach? The hacks are usually easier, and we feel more confident that they'll work, but they increase the entropy of the system, which makes it harder to maintain. A bigger cleanup takes more time and is riskier, but in some cases it's probably a good long-term investment. Unit tests help here, but, as with good abstractions, I find it just as hard to write or find tests at the right level of abstraction to pull this off.

So how can we avoid getting ourselves into a mess like this in the future? I think the real answer is that we have to get tangled in a few webs of anarchy before we can learn how to avoid them. But in an attempt to answer the question: instead of erring on the side of introducing abstractions, we can err on the side of leaving them out. Simpler code is usually easier to modify, although that's not true for large pieces of spaghetti, where a few well-placed abstractions trim out a lot of the complexity. But my personal opinion is that it's usually easier - and more fun - to add additional layers of abstraction than to break free from existing ones. It also feels safer.

But perhaps what should guide us the most in these ambiguous cases is the principle of least astonishment. If the presence of a particular abstraction is surprising, then it probably shouldn't be there. Another good question to ask ourselves is whether the new abstraction actually makes things simpler. If the code is easier to understand with the abstraction, awesome. If it's a puzzle to figure out what's going on, and the abstraction is just there to save on lines of code, change it.

It's easy to make things. It's hard to make things simple.

Always two there are, a master and an apprentice

When learning a trade, the master/apprentice relationship can be tremendously useful. It is a slow path, but an effective one. It's a great way to pass on the lessons from the past.

And not only is the knowledge of skills passed on to future generations, so is the knowledge of common pitfalls. As an apprentice begins to make mistakes, the master can correct them immediately, potentially avoiding much of the headache of recovering from them later. And we all know that the earlier an error is caught, the smaller the consequences.

However, you don't see very much of this in the programming world. I think there are a few reasons for this.

  1. Being self-taught is encouraged. If you can teach yourself just by RTFM, then you haven't wasted anybody else's time.
  2. Programmers all think that they can solve the problem better themselves.
  3. There tends to be a high turnover in the programming industry.

But then each generation is doomed to repeat the same mistakes. And with programming, it's difficult to know when you're making one. A master, though, can point it out right away and correct it before it becomes a major problem. This can save an enormous amount of time and greatly speed up the learning process.

Programmers, and those expecting programming-based solutions, are usually impatient. There's enormous pressure to release the next big thing yesterday. The programming field also changes rapidly, and it will probably continue to change at an even faster rate in the future. It should be no surprise, then, that programmers are in a rush.

But the flip side is that it takes a while to become a great programmer. We really need to experience many failures before we can identify how to architect successes. And contrary to popular belief, it takes years to become proficient.

If there are no silver bullets, then I think that encouraging master/apprentice relationships can help propel us as a whole in the right direction faster.

The Command Line

September 28, 2008

It's interesting to hear so many different opinions on the command line. Many see it as archaic. Others are scared of it. And then there are a few who prefer it to all other interfaces. But regardless of how you feel about it, there's no denying that it's here to stay, at least for the near future.

I mentioned earlier that interacting with a computer can be likened to communicating with it. We usually "speak" to it in the physical sense through a keyboard or mouse. On the virtual, or software, side of things, the most popular interfaces are the command line and the GUI (the WIMP interface: windows, icons, menus, pointer).

Command line interfaces allow for more direct communication with the computer. You type the words you want the computer to execute, and it returns the response. Going through a GUI is like talking to an interpreter first and having the interpreter relay the information. This can be helpful if the interpreter can figure out what you mean and make a more informed request to the computer. But for informed users, interpreters just get in the way.

Where I think the command line really shines is in its flexibility. At a moment's notice, you have access to virtually anything, all through a single interface. You have a large, powerful set of tools that can be freely combined with one another. For example, what if I wanted to count the number of files in a directory that had an odd number of lines in them? That were edited in the last week. And have those sent out in an email. Biweekly. Granted, that's a contrived example off the top of my head, but one that is virtually effortless to accomplish from the command line, yet difficult with a GUI that wasn't designed specifically for it.
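
To show it really is only a few keystrokes, here's one way that contrived example might look as a pipeline. It's a sketch, not a polished script: the mail address, the report script path, and the cron schedule are all made up, and it assumes the usual find/wc/awk/mail tools are on hand.

    # Files in the current directory edited in the last week whose line
    # counts are odd, listed with a count at the end, mailed to a made-up
    # address.
    find . -maxdepth 1 -type f -mtime -7 -exec wc -l {} + \
      | awk '$2 != "total" && $1 % 2 == 1 { print $2; n++ } END { print n+0, "file(s)" }' \
      | mail -s "odd-line-count files this week" you@example.com

    # And for "biweekly", a crontab entry pointing at a script that holds
    # the pipeline - here the 1st and 15th of the month, close enough:
    # 0 8 1,15 * * $HOME/bin/odd-line-report.sh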

Another useful feature of the command line is its inherent repeatability. Once a command has been run, it can be recalled and executed as is, or with slight variations, all with little effort. The same is true for sequences of commands. And if the sequences themselves start to repeat, they can be moved into a shell script or function and run as a single command. In this way, the command line lets the user build up a new language, one better suited to the problem at hand, in a bottom-up fashion.
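
For instance, if the pipeline above started getting typed regularly, it could be given a name of its own. The function name and default directory here are made up for the sketch:

    # Hypothetical: wrap the recurring pipeline in a function (say, in
    # ~/.bashrc) so "oddcount" becomes a new word in the shell's vocabulary.
    oddcount() {
        find "${1:-.}" -maxdepth 1 -type f -mtime -7 -exec wc -l {} + \
          | awk '$2 != "total" && $1 % 2 == 1 { n++ } END { print n+0 }'
    }

    # Usage: oddcount ~/projects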

GUIs have their place too, though. The knowledge required to start being effective with an application is usually much lower. And GUIs are a better fit for visual problems. It's much easier to work in a WYSIWYG-type app when you're producing a new design than to perform transformations with commands and then redisplay the result.

But I think the audience is what makes the biggest difference. Programmers and "expert users" tend to prefer command line tools. Novice users are usually afraid of having something go wrong, and find comfort in GUI applications. And for the die-hard GUI guys, all you have to do is tell them:

Smith and Wesson was the original point and click interface

Programming is Communication

September 19, 2008

Programming is really just another form of communication. We type in some code, tell the computer that we've got some stuff for it to read, and then it gives us back a response when we run it. The write, compile, run feedback loop is like having a conversation with the computer.

Generalizing a little bit, using a computer at all is a form of communication. We put some input into the computer, whether we type something in or use a mouse or some other device. Then the computer responds back to us. It can work the other way too: the computer can prompt us for action, usually after a particular event occurs. For example, we can have a message box alert us of new mail as it arrives.

Thinking about it from a conversational perspective, our interactions with the computer become more effective if we can better articulate what we're trying to do. If we can phrase a question or a request more effectively, then it's more likely we'll get a good response back from our computers. Usually, the whole trick comes down to saying the right thing.

And as programmers, the choice of programming language is really just that: how we choose to describe our application to the computer. Seen in this light, arguments have been made that if you're able to express yourself in fewer words, then you're using a better language for the job. The parallel with natural language holds in other ways too. Certain languages have many more words for a particular concept, each with a slightly different meaning, and those languages are adept at expressing the subtle variations of that concept. And if a language doesn't have a word for a concept, then chances are it can't express it effectively either.

Programmers generally get a bad rap for being poor communicators. But if programming is just a form of communication, can this be true? I'd argue no. The best programmers turn out to be effective communicators too. They're able to express their points in a clear, concise way. After all, they're used to writing clear, concise code.