Semantic data types

Back in 2005, Joel Spolsky wrote a blog post called Making Wrong Code Look Wrong.

Joel’s basic argument is that programmers should look for conventions that make incorrect code stand out, or look wrong. He goes on to argue that Applications Hungarian notation is a good thing, as it conveys semantic information about the variable.

As an example he used the Microsoft Word code base, where the developers had added prefixes to variables, such as xl meaning a horizontal coordinate relative to the layout and cb meaning a count of bytes. This trivial example makes a lot of sense, as it is easy to see that there is no way that xl = cb should be allowed to happen.

Take a more complex expression, though, and it starts to involve more thought, becomes less clear, and requires memorising an increasing number of prefixes. Seeing code such as (pcKesselRun/msFoo)/msBar could quite conceivably lead to a thought process such as: looks right, but why is the Kessel Run in parsecs? Is that a typo? I can’t remember what ms is; I assume it is milliseconds, but I am sure someone used it as meters/second somewhere.

With more modern languages shouldn’t there be a better way?

Types to the rescue

The concept of Applications Hungarian notation surfaced (according to Wikipedia) in around 1978, when fewer languages allowed user-defined types. Modern languages with better type systems allow concise definitions of types and operators. Code then doesn’t just look wrong; it fails either to compile or to run.

Once the type system is used to represent the semantic type of a variable, it removes a lot of the ambiguity and the chance of making mistakes. When everything is treated as a real number, type conversions are almost always valid:

(pcKesselRun/msFoo)/msBar => (real/real)/real => real/real => real

Whereas using data types to represent the semantic type, with limited possibilities for type conversion, provides automatic checking of assignments and conversions. The result of the expression would then be an acceleration:

(pcKesselRun/msFoo)/msBar => (parsec/millisecond)/millisecond => velocity/millisecond => acceleration

This means that rather than relying on humans to see the wrongness in code, the compiler does it for us.

Too much hard work?

For small projects I would agree that this is in no way required: the entire project fits in the programmer’s head, and there is probably only one programmer working on it.

In languages with adequate type systems, the amount of code to define a new type is minimal (excuse my appalling Haskell)

newtype Velocity = Velocity Double deriving (Eq, Ord, Read, Show)
newtype Time = Time Double deriving (Eq, Ord, Read, Show)
newtype Acceleration = Acceleration Double deriving (Eq, Ord, Read, Show)

-- a fresh operator avoids clashing with the Prelude's numeric (/)
(./) :: Velocity -> Time -> Acceleration
Velocity v ./ Time t = Acceleration (v / t)

At the point where a project starts to require interfaces to allow different people’s code to interact, the added complexity becomes worth it. It would allow different teams to use different systems of measurement. Using this sort of system could have avoided the Mars Climate Orbiter crash, which was caused by one team using metric units and the other imperial.
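As a minimal sketch of the idea (the Impulse class and its method names here are purely illustrative, not taken from any real flight software), each team constructs the same semantic type from its own units, so the stored value is never ambiguous:

import math

class Impulse(object):
    """A change in momentum, always stored internally in newton-seconds."""

    LBF_S_TO_N_S = 4.4482216152605  # one pound-force second in newton-seconds

    def __init__(self, newton_seconds):
        self.newton_seconds = float(newton_seconds)

    @classmethod
    def from_newton_seconds(cls, value):
        return cls(value)

    @classmethod
    def from_pound_force_seconds(cls, value):
        return cls(value * cls.LBF_S_TO_N_S)

# Both teams end up with directly comparable values
metric_team = Impulse.from_newton_seconds(100.0)
imperial_team = Impulse.from_pound_force_seconds(22.5)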

Even if it wouldn’t have helped with the Mars Climate Orbiter, it would have helped with SDP, where at one point half the system worked in radians and the other half in degrees. Ogre3D solves this problem by having two angle types and taking advantage of C++’s implicit type conversions.
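Ogre3D’s approach relies on C++ implicit conversions, but the same idea can be sketched in Python with explicit conversions (the Radians, Degrees and rotate names below are hypothetical, not Ogre3D’s actual API):

import math

class Radians(object):
    """An angle stored in radians."""
    def __init__(self, value):
        self.value = float(value)

class Degrees(object):
    """An angle stored in degrees, convertible to radians on demand."""
    def __init__(self, value):
        self.value = float(value)

    def to_radians(self):
        return Radians(math.radians(self.value))

def rotate(angle):
    # Accept either angle type, but never a bare number whose unit is unknown
    if isinstance(angle, Degrees):
        angle = angle.to_radians()
    if not isinstance(angle, Radians):
        raise TypeError("rotate() needs Radians or Degrees, got %r" % (angle,))
    return angle.value  # a real implementation would apply the rotation here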

But dynamic languages…

It doesn’t matter whether the type system is dynamic or static; it just tends to fail at a different point. Whereas a static language would fail at compile time, a dynamic language would fail at runtime. In the following hypothetical Python example, the division would fail at the point it is executed if the time object doesn’t have a to_seconds method defined.

class Distance(object):
    def __init__(self, meters):
        self.meters = meters

    def to_meters(self):
        return self.meters

    # __truediv__ would be needed on Python 3; Velocity.from_meters_per_second
    # is assumed to be a constructor on a similarly defined Velocity class
    def __div__(self, time):
        # Raises AttributeError at runtime if `time` has no to_seconds method
        return Velocity.from_meters_per_second(self.to_meters() / time.to_seconds())

With a dynamic language, types are only checked at runtime when the code is executed, as opposed to a static language in which they are checked at compile time. A dynamic system that raises an exception, rather than giving the wrong result, is still better: it is far clearer to the developer that there is a problem, and it highlights exactly where the error is. This happening in production isn’t ideal, where the exception could still lead to the loss of a $193 million spacecraft. The probability of such errors can be reduced by unit tests that exercise the relevant code paths, as in the sketch below.
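A minimal sketch of such a test using Python’s built-in unittest module (assuming Python 2’s __div__ hook as in the class above, plus a hypothetical Time type with a to_seconds method and a Velocity type with a to_meters_per_second accessor):

import unittest

# Distance, Time and Velocity are the hypothetical classes sketched above

class DistanceDivisionTest(unittest.TestCase):
    def test_distance_over_time_gives_velocity(self):
        # Exercises the division path, so a missing to_seconds method
        # surfaces here as an AttributeError rather than in production
        velocity = Distance(10.0) / Time(2.0)
        self.assertAlmostEqual(velocity.to_meters_per_second(), 5.0)

if __name__ == "__main__":
    unittest.main()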

Was Joel wrong?

In Joel’s defence, in a language restricted in some way, such as one unable to define new data types, Applications Hungarian notation may prove an improvement. In a language that does support user-defined types, Applications Hungarian should be avoided. Using the type system to its fullest extent shouldn’t just make wrong code look wrong; it should make wrong code noticeably fail, either by not compiling or by raising an exception.

A/B Testing – Use with care

In the blog post A/B Testing: More than Just Sandpaper, the author suggests that A/B testing is one of the key ways to guide the design of products. It sets out to dispute Jeff Atwood’s claim that A/B testing is like sandpaper: you can use it to smooth out details, but you can’t actually create anything with it. While A/B testing is a useful tool, I think the author overstates its usefulness.

Local minima

The author says that:

[A/B testing can] quantify how good the final result is

This assertion doesn’t hold. A/B testing can only quantify how good a design is according to some metric, relative to another version of the product.

This makes it possible to get stuck in a local minimum. Using A/B testing to improve iteratively may make it impossible to escape from that local minimum, as larger, non-iterative changes may be needed. Of course, there is no reason why distinctly different sites can’t be tested against each other using A/B testing. However, developing such a different design would require a UX designer, user testing and so on, not just A/B testing.

As it only tests one design against another, it often isn’t able to address fundamental problems with developer perception. A key example of this is eBay. According to the Wired article that the blog post linked to, eBay is also a company heavily into A/B testing. One of their notable failures was their attempt to expand into China. Why? Their entire site was designed from a western perspective and so failed to take into account the extent of the Chinese culture of haggling. They were bound by their preconceptions about how the site should be. By not having a deeper understanding of the market, they allowed TaoBao to capitalize on eBay’s mistake.

No amount of A/B testing allowed them to discover their error in time: A/B testing couldn’t quantify how bad their design was, because they never escaped from the local minimum and came up with a better design to test against.

Users aren’t the focus

The article proposes that:

[A/B] testing democratises the process of software development and brings better outcomes for both the developers and the users

I feel that this isn’t adequately explored, and isn’t true, so I offer some counterpoints.

Skyscanner is a site that has a vested interest in getting users to click through to airlines’ sites. Through A/B testing they found that green buttons helped, along with many other minor improvements. As they continued to iterate on their design, they realised that by degrading the site experience they were increasing their primary metric but causing fewer people to return to the site. There was a worry that if they kept following the path A/B testing suggested, it would hurt their bottom line in the long term.

On a more personal level, this can be seen with the Wikipedia ads. The heavily A/B tested Wikipedia fundraising banner was highly noticeable, and hence highly distracting, and was one of the few times I have manually added rules to AdBlock to remove an ad. Having the highly noticeable banner was clearly useful to Wikipedia, but for me it was mostly just annoying to have my eye drawn to it every time I used the site.

Admittedly these examples are just anecdotes and lack any firm basis in studies, but they hopefully present the idea that without due care A/B testing can be taken too far. Fixating on maximising a few simple variables at the expense of other, unmeasured areas of the design can be detrimental both to the users and to the overall goals of the company.

Better designs?

The blog post says that:

A/B testing has shown time and time again that in some cases the solutions that violate the rules of visual composition or could even be perceived as vulgar are the most appealing.

I am unsure that this conclusion can adequately be drawn from the linked articles. While it is true to say that such designs increase some metric, I see it as a large step to say that improving the metric means the designs are visually appealing. Putting yellow highlighting on the text in an email looks horrific from a visual standpoint; it does, however, draw the user’s attention to what the designer wants them to see.

Just another tool…

I agree with the premise of the blog post that A/B testing is a good tool, but it should be viewed as just that: another tool. It isn’t a panacea for deciding what a design should look like, but an aid in guiding UI design decisions. It shouldn’t replace other design measures, such as user testing, which provide more feedback as to *why* part of a design works.

I understand where Jeff Atwood is coming from. A/B testing can be used to create and to guide, but it is one tool among many. Without other tools it is, as he said, just sandpaper.

Why “Could you build a rabbit hutch?” is asking the wrong question

Writing software is very much still viewed as the implementation of a design. This is exemplified by the question asked in lectures: “Could you build a rabbit hutch?” The building of the rabbit hutch is often the stage that needs the least thought; the design of the hutch, and the consultation with the rabbits over what they want, are much harder problems. It is the same with software: fully specifying the solution (i.e. writing code) is highly complex, but the actual building is a mostly solved and automated problem.

It is evident that the waterfall model draws some inspiration from the construction industry. The model is based around the concept that design changes after the completion of the build are costly, if not impossible. The parallel drawn in this model is that the design is the stage of software development done away from the code, such as requirements gathering, while actual coding (or what is considered to be the implementation phase) is analogous to the construction of the rabbit hutch. This comparison is drawn even though the actual building of the software product (for a compiled language, compiling it into machine code) has become so cheap that it is almost free.

If we consider a design to be made up only of design documents and requirements, then the majority of software designs are woefully underspecified with regard to their internal workings. They often fail to take edge cases into account and don’t fully specify the system, mostly due to a lack of understanding of it. The acid test is to give the same design to two separate teams: you will get different results, as each makes different assumptions about the underspecified areas of the design. A fully specified design would, in all likelihood, be executable, and so no different from what the waterfall model considers to be the implementation.

When an architect is designing a building, the client is regularly consulted and shown mock-ups of designs, varying from simple floor plans to physical models of the proposed building. Agile is in many ways a methodology that recognises this problem, with a workflow more in line with the idea of treating writing software as part of the design phase. The client is involved throughout the coding phase, but rather than being shown mock-ups they are shown actual versions of the software. This cannot be done with physical buildings, as the cost would be prohibitive.

Changing the assumption as to what the coding phase of a project actually is would fix a lot of the misconceptions made about software project management.

The central theme of The Mythical Man-Month is that adding programmers to a late project will just make it later. This may not seem obvious because if coding is treated as merely the implementation of a design then adding programmers should make the project progress faster, just as adding builders to a construction project can often make it faster.

To the layperson it makes sense that adding more architects won’t improve the rate at which a building is designed, as they clearly associate what an architect does with a creative process that has no single solution. If the perception of programming changed to being equivalent to design, the layperson’s intuition about the process would also carry over to software: it would become more obvious that doubling the number of programmers wouldn’t halve development time.

The supposedly high failure rate of software projects seems to become less of an issue when compared with other industries. Very rarely do you see stalled or unfinished construction projects. Harder to notice are the skyscrapers that never get built and the products that are never manufactured. In other words, those failures happen in the design phase.

In designing a product, be that a rabbit hutch, a skyscraper or a piece of software, the designer is attempting to quantify the unknowns. So could I build a rabbit hutch? Yes. Could I build the Sheth Tower? Yes. If I have the design, a workforce and the required money, the majority of the unknowns have been dealt with, leaving me a far simpler problem. So the better question is “Could you design a rabbit hutch?” The distinction is critical.