
Metrics: they simply don’t work.

February 7, 2010 - Programming

Companies and businesses want to prosper. At the core, being a successful business means making sizable revenue. But when revenue starts to falter, how do businesses determine the cause? They generally measure something: for a programming firm, that might be lines of code written; for a restaurant, perhaps the time it takes for a customer to be seated; and so forth. On the surface, it makes sense.

The problem is, it only makes sense in theory. Why? Because both approaches try to tack a quantitative measure onto something that cannot be measured that way, and both forget the human factor.

Let’s look at a hypothetical situation: Programmer Joe. Programmer Joe has worked at a software startup since its early days, and enjoys his job. One day, however, the CEO calls everyone into the office to discuss declining revenue. Realistically, the revenue is declining because a competitor has just released a product that directly competes with the startup’s flagship software package. Management, however, believes that the loose leash they’ve been keeping the programmers on is to blame, so one of the idea men decides they should implement a management technique known as metrics. They start with something basic: each programmer’s performance will be judged directly by the number of lines of code they write.

Programmer Joe continues his work as normal; he notices some of his co-workers checking in code that re-rolls standard library routines, and trims out excess code like that wherever he finds it. One day he’s called into the Project Manager’s office.

“Joe… I don’t know quite how to explain this. Maybe you can. Somehow, you’ve managed to accumulate a *negative* lines-per-day metric… any idea how that is possible?”

“Well, I’ve noticed a lot of redundant code being checked in, and trimmed it down. The code still works the same way, it’s just faster; also, I’ve managed to fix half the bugs on the list just by doing that.”

The Project Manager takes this particular bit of information to the higher levels. They decide that the metric is flawed; what they need to measure is lines of code written minus the number of bugs introduced.

A feature request arrives from a client near the end of the day, at 4:30 PM. The Project Manager asks Joe to look over the request, and to maybe stay late and try to get some of those features implemented. So Joe spends the next three hours creating a prototype for the new feature, and once he has the main functionality down he checks the code in and goes home. The next day, Joe continues his work on the new feature. However, he is once again called into the Project Manager’s office.

The Project Manager hands Joe a large stack of papers.

“You know what those are, Joe?”

“No…”

“Those are the bug reports filed on code you’ve checked in recently.”

“All the code I’ve been working on recently has been a prototype… I haven’t yet gotten it fully integrated into the rest of the system.”

I think you can see where I was going there. Basically, a metric that measures small foibles within the overall process is doomed to cause hiccups in that process, and can even bring business to a standstill. In this case, rather than worry about the number of lines of code written or the number of bugs introduced, it might be better to focus on fixing the bugs and adding the features; the number of lines of code in a project does not directly correspond to the quality of that project as a whole, and in some cases the relationship is even inverse.
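To put that in concrete terms, here is a small, entirely made-up sketch (in Python, purely for illustration; nothing like it appears in Joe’s fictional codebase) of the kind of change that earns a programmer a negative lines-per-day score:

```python
# Before: a co-worker's hand-rolled version of something the standard
# library already provides. More lines, and it quietly returns 0 for
# a list of all-negative numbers (and for an empty list).
def largest(values):
    result = 0
    for value in values:
        if value > result:
            result = value
    return result

# After: the kind of replacement Joe checks in. Fewer lines, the
# negative-number bug disappears for free, and yet the lines-of-code
# metric records the change as negative productivity.
def largest_fixed(values):
    return max(values) if values else None
```

By the lines-of-code metric that change is worth several lines of negative productivity, even though it removes a bug and leaves the code easier to read.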

In a strange coincidence, a local coffee shop that Programmer Joe visits on his way to work had noticeably changed. When Joe used to go through their drive-thru to grab his morning coffee and a box of donuts for his colleagues, the staff were friendly and often asked him how things were going. Recently, however, Joe has felt like a piece of scrap metal on an assembly line. Once he got to the pay window, he would often hear audible grumbles of discontent as he simply reached into his pocket for his wallet. His donuts were often quite obviously just thrown into the box with little care, too. Joe couldn’t understand it, since these were the very same people who used to serve him before. Eventually, Joe decided to go elsewhere for his morning coffee.

The previous paragraph is not an entire fabrication; in fact, I work at a location that does just this. They time every single drive-thru customer’s time at the window, and it’s treated as the single most important measurement of performance in the entire store. I’d even go so far as to say it seems to be used as a direct reflection of customer satisfaction. However, this simply is not the case. The franchise places a recommended limit of 42 seconds of waiting at the window, which is a reasonable time frame depending on the volume of customers. At my location, however, it has now been decreed that we shall not take longer than 30 seconds or, from what I hear from others, we will be verbally “abused” about it.

In any case, a little backstory is probably in order. Lately, revenue appears to have been dwindling; there are far fewer customers than I remember coming in, at all times of the day. Additionally, various seemingly pedantic rules have been placed on our handing out of such seemingly trivial things as butter and napkins. It’s fair to assume that they want to bring business back up again, and that is certainly something anybody in that position would try to do.

However, with that said, they are going about it the wrong way. In the retail and customer service industry, where almost all revenue comes from consumers who purchase your product by coming to your establishment, the single most important thing for the business is customer satisfaction. No exceptions. Interestingly enough, since this started I’ve gotten dozens of complaints from customers who visit regularly about the shoddy way they were treated while going through the drive-thru at some other time during the day. That certainly is not the fault of the workers on shift at the time, since, as I said, they are essentially being “forced” to push times under 30 seconds. But when your customers, the main source of revenue for the store, complain about something that is the direct result of a policy introduced to increase revenue, that is pretty solid evidence that the technique has failed miserably.

The problem here is not that those in charge fail to understand that customer satisfaction is the single most important thing to the store’s success; it’s that they have tried to assign a single metric to measure it, and the two do not correspond. I certainly won’t deny that customers prefer fast service; being out of the drive-thru as quickly as possible is on their list of hopes. But having their order (literally) thrown into their car for the sake of speed of service does not please the customer. They aren’t going to think, “Well, golly, they threw the sandwich right into the passenger seat and refused to carry on a quick little bit of small talk while I waited, but damn, it took only 30 seconds, so I am going to say I am satisfied.” I’m sorry, but this does not happen. As long as a customer is not waiting an exceedingly long time for their purchase, they tend not to notice the wait at all.

Think of it this way: there is the old adage that you can have fast service, good service, or right service, but you can only choose two. I put forth that, for many consumers, the flip side also holds: when one of these is omitted but the other two exceed their expectations, they are generally still satisfied. For example, if a customer is greeted with courteous fervor, a friendly conversation ensues, and their order is made perfectly, they are less likely to notice that they had to wait in line for a few minutes. If they had to wait that same amount of time and were treated brashly, they are far more likely to take offense. In fact, the single most important thing to almost all customers is not the time they wait, or even that their order is made perfectly to their specifications, but to be treated as people rather than as mindless consumers whose particular interests and concerns are of no importance. The sad fact is that this particular metric encourages the workers to do the latter.

The thing is, there are even further flaws in the mechanism. First, the device doesn’t even work half the time, so times come out skewed, and sometimes two vehicles get counted in the same interval. Additionally, since the time measured is the entire time the customer is at the window, it is not simply a measure of how efficiently the workers get the product to the customer, but also of how quickly the customer can pay for it. If a customer digs for change for 10 extra seconds after the employees have finished the entire order and are simply awaiting payment, why are those 10 seconds held against the employees? “Because we have no way of knowing,” the people who read the recorded times may say. The issue is that if there is one thing you don’t know about what the measurement indicates, there are certainly more, and one could even argue that the measurement is completely meaningless.

A customer could be at the window for 10 seconds and still not be pleased, whether because of the shoddy service received while being squeezed through a tiny window of time, or because their order got mixed up with another, or because of some other error caused by the inhuman demands made of the people there. Conversely, a person could wait an entire minute at the window and still be perfectly pleased with their service. So using window time as some sort of barometer by which to judge the performance of a business is downright ridiculous, and it only gets worse when one considers that no such time-based constraints are placed on the customers who decide to come inside for their service.
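As a back-of-the-envelope illustration (the numbers below are invented, not pulled from any store’s actual timer), here is how window time lumps together things the crew controls and things they don’t:

```python
# Invented numbers: how long the crew took to hand over the order,
# versus how long the customer spent paying (digging for change, etc.).
visits = [
    {"service_s": 22, "payment_s": 3},   # quick, friendly exchange
    {"service_s": 20, "payment_s": 25},  # customer hunts for coins
    {"service_s": 35, "payment_s": 5},   # big order, made correctly
]

for v in visits:
    window_time = v["service_s"] + v["payment_s"]  # all the timer records
    print(f"crew work: {v['service_s']:2d}s  "
          f"window time: {window_time:2d}s  "
          f"over 30s? {window_time > 30}")
```

Two of the three visits blow a 30-second target, and only one of those overruns has anything to do with how fast the crew actually worked.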

And while the latter example has certainly received far more of my attention for personal reasons, it is no more or less ludicrous than the software company’s implementation; performance metrics have been tried in nearly every industry, and in every industry they have failed to provide the hoped-for results. The key here is that what actually needs measuring is not strictly customer satisfaction, or even revenue, but, more directly, the value of the business.

Value is a perception, and must be communicated and measured to be perceived as value. Value measurement is the process by which management decides on the operational performance measures that will enable them to secure the owners’ return on investment. Value measures must be aligned with the strategy positioning the business. Measurement includes both lead factors and lag factors. Lead factors are performance measures that proactively indicate whether objectives will be achieved; they give management warning signs early. Lag factors measure how successful management actually was at creating value; financial reports are the ultimate lag measures of success. When a business’s lead measures indicate that it is performing well but its lag factors show the opposite, it is time to review the lead measures.

The value measurement process determines the behaviour of the business and aligns the behaviour of its human capital with the value expectations of all stakeholders: owners, management, customers, employees and partners. Value should be measured and reported on from every stakeholder’s perspective; a true balanced scorecard will drive performance improvements across all five stakeholder dimensions. This applies to ANY business, regardless of industry.

Flipping quickly back to software, an excellent overview of the problems with performance metrics as used in that industry can be found here: http://discuss.joelonsoftware.com/default.asp?biz.5.304155.19
