A Lever for the Mind
Contents
Introduction
Give me a lever long enough, and a fulcrum on which to place it, and I shall move the world.
Humans are the cleverest animals in the known universe. But our brains, forged over millions of years in the unforgiving furnace of natural selection, are still primitive. They evolved to be good at hunting animals, making tools, recognising facial expressions and maintaining social bonds, but they’re just not very good at working with several difficult ideas simultaneously, and they’re particularly bad at reasoning about the counterintuitive emergent behaviour of complex non-human systems.
That’s a problem, because complex non-human systems now dominate our world.
Abstraction is our solution to this problem. Abstraction works as an adapter between the fearsome complexity of the universe and our simple primate minds; when the real world is too detailed, too confusing or too counterintuitive for us to work with directly, abstraction gives us big friendly levers to pull on instead. This one powerful idea underlies computers, design, culture, language, and everything else we rely on in our daily work.
This is an article about abstraction: where it comes from, what it’s for, and how we can use it to make our programs better.
Numbers
Let’s begin by talking about numbers. Numbers seem obvious and natural to us, but they have no independent existence. They are a human invention, a human technology, designed by us to solve problems we have.
There was a point in our cultural history where we had no concept of numbers, but then we created them, and gave ourselves a new and powerful tool for categorising, understanding and predicting the real world.
Imagine you’re living in a prehistoric time and the concept of numbers hasn’t been invented yet. You’ve grown some apples, which is great, but you’ve been eating apples every winter for years and you’re kind of tired of them now.
Fortunately a visitor arrives from the next village, and they have some bananas:
Now, you feel that swapping an apple for a banana is a fair trade, and you would like as many bananas as possible.
So are you prepared to swap your entire collection of apples for the visitor’s collection of bananas? Is that fair to both of you? Well yeah, it is fair, because your collection of apples has something in common with their collection of bananas. There’s a special property that both collections share, even though they contain different kinds of fruit.
And that’s a property that your apple collection doesn’t share with this collection of bananas:
Or this one:
Or this collection of apples (even though they’re the same kind of fruit!):
Or this one:
The special property shared by your apples and the visitor’s bananas means that they can be paired up. Each apple goes with a banana, with no apples or bananas left over:
(It doesn’t matter how they’re paired up, only that there’s some way of doing it.)
We can come up with other collections that have this property too. This collection of cherries has the same property, because we can pair them up with the apples:
Every apple has a cherry and vice versa; nothing gets left over.
In fact, it’s fairly obvious that the kind of fruit is irrelevant. Being a fruit at all is irrelevant. This collection of dots has the property as well:
So if it’s irrelevant whether the things in the collection are fruit or dots or whatever, then what else is irrelevant? Well, the size and arrangement of the things in the collection doesn’t matter either. Each of these groups of dots has the same property:
What do we call this property? How do we identify the property that those apples, and those bananas, and those cherries, and all those different groups of dots have? Our usual name for it is “three”.
Now, technically speaking, what you’re seeing here isn’t actually the number three. It’s just a pattern of light, made of pixels. That arrangement of pixels represents a specific glyph from a typeface.
That glyph represents a character, in this case a numeral, also called a numerical digit. And that numerical digit denotes the number three in the base ten positional numeral system, which is the usual system used in Western culture.
The number three itself is this property, this relationship, this abstract structure:
If you’ve got a collection with three things in it, you can put it on one side of this diagram and connect each thing in the collection to a dot on the other side. “Three” is our name for all the things that look like this. That’s what it means to be “three”.
Unlike our imaginary prehistoric person, you learned about three when you were very young, so you already have a picture in your head of what three looks like, and it’s probably like one of these arrangements of dots. You can recognise other collections of three things by pairing them up with the dots in your mental image.
Once we have the idea of “a number”, we can build other concepts from it. For example, we can identify the particular relationship between these collections:
Once the individual fruit have been paired up, the right-hand collection has an apple left over.
When this happens we say that the number represented by the right-hand collection is the successor of the number represented by the left-hand collection. In other words, four is the successor of three.
That gives us a way to generate new numbers. The successor of three is four, and the successor of four is five, and the successor of five is six:
And that suggests another relationship between numbers: when two numbers are connected by an unbroken sequence of successors like this, we say that the first number is less than the second one. Three is less than six.
Once we have the idea of successor, we can also use it to build a new operation on two collections. For every item in the first collection, we generate the successor of the second collection:
That operation is called addition: we’ve added three to two to get five.
The point of all this is that numbers, and operations on numbers, allow us to predict the outcome of events in the real world. You don’t need to physically pair up your fruit with a visitor’s fruit to decide if it’s a fair trade. You recognise that they have three bananas and you have three apples. You don’t even need to see their bananas! They can just tell you they have three bananas and you’ll be able to decide.
If you want to know whether any of your sheep have escaped when you go to round them up on the hillside, you don’t actually have to search the whole mountain. Just count them and you’ll know.
Likewise for addition. If you want to know what the combined contents of two buckets of apples will look like, you could physically combine them. Or, instead of working in the real world, you could step into the abstract world: count each bucket to get two numbers, then add them to get a result.
That result is valuable because it’s the same as what you get if you actually do the work in the real world and then turn the result into a number. The abstract operations in the world of numbers accurately predict what happens in reality, by focusing only on those aspects of reality which affect the outcome.
This technology is so groundbreakingly useful that it just becomes second nature, and we get used to working in this abstract world. Numbers feel very simple and obvious to us. But they only exist because people observed the real world, spotted patterns in it, and built abstractions to represent those patterns.
Sums
Of course, once we have numbers, we need to develop other layers of abstract ideas to make them easier to work with. Here’s an example: taking the sum of lots of numbers.
What’s the sum of all the numbers between 1 and 10? That’s not too hard to work out by hand: 1 plus 2 is 3, plus 3 is 6, plus 4 is 10, plus 5 is 15, plus 6 is 21, plus 7 is 28, plus 8 is 36, plus 9 is 45, plus 10 is 55.
So the answer is that 1 + 2 + … + 10 = 55.
But here’s a harder version: what’s the sum of all the numbers between 1 and 100? It would take us at least ten times as long to work that out by hand.
There’s a story about a child prodigy called Carl Friedrich Gauss, who was born in the 18th century. The story goes that his primary school teacher told him to do this sum as a punishment, but he came up with the answer almost immediately. The story may not be true, but it could be, because there’s an easy way of working out the answer.
Let’s go back to the 1 + 2 + … + 10 example. One property of adding is that it doesn’t matter what order we list the numbers in; we still get the same result. (The fancy way of saying that is “addition is commutative”.) So we can add the numbers from 1 to 10 in a different order and it won’t affect the answer. Let’s rearrange them:
Now they’re listed slightly differently — it goes “1, 2, 3, 4, 5” down the left side, then “6, 7, 8, 9, 10” up the right side. But all the same numbers are still there, so the answer will be the same.
Another property of addition is that if we’re adding many numbers it doesn’t matter what order we do the individual additions in. (The fancy name for that property is “associativity”.) So we can choose to add together the first two numbers to get 11. Then we can pick any other numbers, say the 2 and the 9, and decide to add them next, and then the 3 and the 8, and so on:
So once we rearranged the numbers into those pairs, we found that each pair added up to eleven, and now it’s easy to see that the answer is 55.
Instead of blindly summing the numbers in order, this time we got to the answer by arranging them into five pairs — there are five because five is half of ten — and each pair adds up to eleven, which is one more than ten.
So that’s the secret: when you’re adding up the numbers from 1 to 10, there are five pairs of them, and each pair adds up to 11. This is what Gauss ostensibly realised.
Or, in particular, he realised that we can do the same thing for the numbers up to 100. We can rewrite the sum 1 + 2 + … + 100 as 1 + 100, plus 2 + 99, all the way up to 49 + 52, plus 50 + 51. And again, by choosing to add each pair first, we can see that each adds up to 101:
There were a hundred numbers to start with, so there must have been 50 pairs here. And, although this is slightly harder, 50 times 101 is 5,050.
So our secret is that when you add up the numbers from 1 to 100, that’s the same as adding 50 pairs of numbers, and each pair adds up to 101.
And this pattern generalises. 10 and 100 aren’t special. The sum of all the numbers up to any whole number n is half-of-n multiplied by one-greater-than-n. (If n is odd then you get something-and-a-half when you halve it, but then you’ll be multiplying by the even number n + 1, so the half goes away again.)
Another way of thinking about this is by looking at it visually. When we represent numbers with dots instead of numerals, 1 + 2 + … + 10 looks like this:
That gives us a triangle! But by left-justifying the rows of dots, we can rearrange that triangle to look like half of a rectangle. If we fill in the other half, we can see that the rectangle is 10 dots tall and 11 dots wide:
So there are 110 dots overall, and our original dots make up half of that. Half of 110 is 55, so our original triangle has 55 dots in it, which is the same answer as we got by adding up the pairs.
Here’s what we did this time: the sum of all the numbers up to any whole number n is one-greater-than-n, multiplied by n, then divided by two.
And that’s essentially the same formula as we got by rearranging and adding pairs, just with the “one greater than n” and “n divided by 2” the other way around, which doesn’t change anything because multiplication is commutative as well.
Now, can you remember all the details of that argument, and why it must be true? If not then it doesn’t matter, because you don’t need to remember the details, you just need to know the formula. It’s a pattern, an abstract tool that you can use even if you don’t know why it works.
Quick, what’s 1 + 2 + … + 1000? Try doing it in your head.
We now know it must be half-of-1000 multiplied by one-more-than-1000, which is 500 times 1001, which is (after a moment’s thought) 500,500. Pretty impressive.
Graphs
Let’s have a look at an example of an abstraction that isn’t quite so concerned with numbers.
Here’s a map of what used to be Königsberg in Prussia (now Kaliningrad in Russia):
There’s a river running through the city, with two large islands in it, and seven bridges connecting the islands to the north and south banks as well as to each other.
Now, is it possible to walk around Königsberg in a way that crosses each bridge exactly once? Let’s try it. Here’s one possible walking route:
We can start on the south bank, cross all the way up to the north bank, then all the way back down to the south bank, then all the way north again. That crosses six of the seven bridges, but it misses out the bridge connecting the two islands.
If we try to detour across that bridge, we get stuck on the small island and can’t reach the last bridge:
Here’s a different approach, spiralling inwards, but again we miss out one of the northern bridges:
If we detour across that bridge, we get stuck on the north bank before we finish:
This way doesn’t work either:
Nor does this way, where we start on the small island:
And nor does this way, where we start on the large one:
If we tried this for long enough we’d become pretty convinced that it can’t be done. But why can’t it be done? What is it about Königsberg that makes it impossible? That question was analysed by Leonhard Euler in the 18th century. He proved it was impossible, and explained exactly why.
To see why, let’s concentrate on the small island. When you arrive on this island by a bridge, you must leave by another bridge, unless you’re at the end of your journey, and the next time you arrive by a different bridge, you must leave by yet another bridge:
The number of times you arrive on this island equals the number of times you leave, unless you start or finish there. So, unless you want to start or finish your walk on this island, it must have an even number of bridges connecting it to the other land masses — half to arrive by, half to leave by.
And that’s the case for all the other land masses too. We can start or finish on a land mass with an odd number of bridges, but all the other ones must have an even number.
But all of the land masses in Königsburg have an odd number of bridges!
The small island has five bridges connected to it, and all the other land masses have three. They’re all suitable places for starting or finishing a walk, but a walk can only have two end points, so some of them have to be visited partway through the walk.
Euler noticed some other important things about this problem. The first thing was that the route you take on the land doesn’t matter, just the order in which you cross the bridges. So we can disregard everything except the existence of each land mass and the bridges that connect them.
The other thing he noticed was that the relative positions of the land masses and bridges don’t matter; all we care about is how they’re connected. So for our purposes, these two arrangements are the same:
And so are these two:
This gives us a structure called a graph. A graph is made of two things: a set of nodes — the dots here — which have no structure at all, just identity; and a set of edges — the lines connecting the dots. Edges have no direction; an edge is just an assertion that two particular nodes are connected.
A graph is incredibly simple, but it’s actually a very rich structure, and we can identify lots of interesting properties of graphs by thinking about them more. The first property we’re interested in is called degree, which counts how many edges are connected to a node:
One of the nodes above, for example, has two edges connected to it, so its degree is 2. Another is connected to three edges, so its degree is 3.
The next idea is a walk, which is a sequence of connected nodes and edges:
Walks give us a way to decide when a graph is connected. A connected graph is one where you can find a walk between any two nodes you pick:
If we pick the two yellow nodes above, there’s at least one walk that connects them.
If we pick two different nodes, there’s at least one walk between them too:
And if we pick a third pair of nodes, there’s also a walk between them:
And we can do that for any two nodes, so this graph is connected.
If we take away a couple of edges, the graph isn’t connected any more, because now there’s a way to pick two nodes that have no walk between them:
The next idea is a trail, which is just a walk where all the edges are distinct; you can’t use an edge more than once:
And finally, there’s an idea called an Eulerian trail, which is a trail that visits every edge in a graph. That may or may not be possible on a given graph, but in this case it is:
That’s an Eulerian trail, because every edge got visited exactly once.
The point of naming all those ideas about graphs is that they give us a precise way of stating the answer to the Königsberg bridges problem. Crossing all the bridges means finding an Eulerian trail in the graph of land masses and bridges, and Euler showed that’s only possible when the graph is connected and has at most 2 nodes of odd degree, because your Eulerian trail can start and finish at nodes with odd degree, but the trail only has 2 ends.
Now, in a week’s time, will you remember all the details of that argument, and why it must be true? If not then it doesn’t matter, because you don’t need to remember the details, you just need to know this. You can use this result even if you don’t know why it works.
For example, is it possible to drive around the United States interstate system in a way which covers each stretch of interstate exactly once (even if we ignore the interstates that lead out of the country or are only joined to the rest of the system on one end)?
To decide this by trial and error would take much more work than the Königsberg problem. But because of Euler’s result, we can just look for nodes of odd degree. And look, I already see one in Portland, one in Denver, and one in Buffalo:
And that’s it, we can stop. There are more than two nodes of odd degree, so there can be no Eulerian trail, so that incredibly boring road trip can’t happen.
Triangles
Here’s one last example.
When you draw a right-angled triangle you can control the lengths of two of its sides, and then you have to measure the third side to find out how long it is. It’s pretty clear that these three lengths are related. If you make one side longer, another side gets longer; make one of the sides shorter, and another side gets shorter:
So what exactly is the relationship between the lengths of the sides? It’s been known for thousands of years: if we call the lengths of the three sides a, b and c, then the sum of the squares of a and b is always equal to the square of c.
This is known as the Pythagorean theorem, although the relationship was known well before Pythagoras’s time; he just came up with a particularly simple way of proving it.
Pythagoras realised that when you arrange four copies of the same triangle into a large square, that leaves space for a smaller, rotated square in the middle. The length of each side of this square is c, so its area must be c²:
But by rearranging the triangles inside the large square, you can make space for two smaller squares. One of them has sides of length a, so its area is a², and the other one has sides of length b, so its area is b²:
These two arrangements have the same total area:
So if we take away the four triangles, we’re left with two smaller squares whose combined area must be the same as the area of the larger square:
You can draw this diagram for any right-angled triangle. So even if you make the triangle fatter or thinner, that just corresponds to the right-hand square being rotated at a different angle.
So that’s Pythagoras’s proof. Would you be able to draw the diagram from memory? No? Well, it doesn’t matter, you just need to know the formula.
Mathematics
I’ve been talking about abstraction, but there’s another word that comes to mind when discussing these ideas: mathematics. And that’s not an accident, because mathematics is the study of abstraction.
I’ve shown you a few general problems that we can solve by first solving individual examples of the problem and then looking for patterns in our solutions. Once we spot a pattern, we can use it to design an abstraction that lets us solve any instance of that problem, without having to think about or even understand why it works.
That’s essentially what mathematics is: spotting patterns and building reusable abstractions.
It’s pretty popular at the moment to say that you don’t need mathematics or computer science to do programming, and although I basically agree with what people mean when they say that, I do think that the words they’re actually saying are wrong.
Most of us have had terrible mathematics education where we’ve been taught that mathematics is about memorising meaningless formulas or symbols or mnemonics so that we can pass exams. But mathematics isn’t about that; it’s about building abstractions that magnify the force we can apply with our minds.
So it’s like saying that you don’t need to know music to be able to play the violin. Well, if by “music” you mean knowledge of music theory or the ability to read sheet music notation or other incidental things, then of course, you don’t need any of that to play the violin. You can just pick one up and learn to play it.
But that’s not what music is! Music isn’t blobs on a stave or theories about harmonic intervals. Music is what happens when you play the violin.
And in the same way, mathematics isn’t Greek symbols and strange words and remembering how to integrate every trigonometric function. Mathematics is what happens when you simplify the world by building abstractions. Mathematics is what happens when you write a program.
Poker
Let me show you a more practical example of this. There’s an exercise I often do with Ruby programmers: write some code that can compare two poker hands and decide which one beats the other one.
In case you don’t know, a hand in poker is five playing cards, and depending on what those cards are, the hand falls into a particular category.
The simplest category is a high card hand, which is just five cards with no particular pattern. To compare two high card hands you just compare their highest card; if those are the same then you fall back to the second-highest card, and so on.
Then there’s a one-pair hand, which contains two cards of the same rank — the one above has a pair of twos. A one-pair hand always beats a high card hand, and it beats another one-pair hand if the pair card is higher.
A hand with two pairs is even better.
And then there are three-of-a-kind hands with three cards of the same rank, then straight hands with five consecutively ranked cards, then flush hands where all the cards are the same suit. And it keeps going after that: full house, four of a kind, and finally straight flush.
How can we implement these rules?
The naïve way is to pick an easy way of representing cards, like a Struct
with rank
and suit
attributes, and then use an ordinary Fixnum
to represent the rank of a card, and a symbol or string for its suit:
Card = Struct.new(:rank, :suit)
three_of_diamonds = Card.new(3, :diamonds)
queen_of_hearts = Card.new(12, :hearts)
ace_of_spades = Card.new(14, :spades)
For the face cards — jack, queen and king — we can just use 11
, 12
and 13
, and it makes most sense to use 14
to represent an ace, because an ace is (mostly) the highest card.
Next we can create a Hand
class that wraps up a collection of cards:
Hand = Struct.new(:cards)
Then we’ll mix in Comparable
, and start implementing #<=>
so that two hands can be compared:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
# TODO
end
end
To support high-card hands, the common case is that we just compare the rank of the best card in each hand:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
highest_card_rank <=> other.highest_card_rank
end
# …
end
(I’ll leave out the implementation of helper methods like #highest_card_rank
.)
If those highest cards are ranked the same, we can compare the hands again with the highest card removed, so the recursive call will be checking the second-highest cards, and so on:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
if highest_card_rank == other.highest_card_rank
without { |c| c.rank == highest_card_rank } <=>
other.without { |c| c.rank == highest_card_rank }
else
highest_card_rank <=> other.highest_card_rank
end
end
# …
end
And we need a base case for that recursion:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
if empty? && other.empty?
0
elsif highest_card_rank == other.highest_card_rank
without { |c| c.rank == highest_card_rank } <=>
other.without { |c| c.rank == highest_card_rank }
else
highest_card_rank <=> other.highest_card_rank
end
end
# …
end
If both hands are empty, we say that neither one wins, which is what 0
means here.
To support one-pair hands we need to mostly duplicate the code we’ve already written. First we check if we’re comparing a one-pair hand with another one-pair hand, and compare the pairs if so, otherwise we fall through and do (mostly) what we were doing before:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
if one_pair?
if other.one_pair?
if highest_pair_rank == other.highest_pair_rank
without { |c| c.rank == highest_pair_rank } <=>
other.without { |c| c.rank == highest_pair_rank }
else
highest_pair_rank <=> other.highest_pair_rank
end
else
1
end
else
if empty? && other.empty?
0
elsif other.one_pair?
-1
elsif highest_card_rank == other.highest_card_rank
without { |c| c.rank == highest_card_rank } <=>
other.without { |c| c.rank == highest_card_rank }
else
highest_card_rank <=> other.highest_card_rank
end
end
end
# …
end
Then we add support for two-pair hands in much the same way:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
if two_pair?
if other.two_pair?
if highest_pair_rank == other.highest_pair_rank
without { |c| c.rank == highest_pair_rank } <=>
other.without { |c| c.rank == highest_pair_rank }
else
highest_pair_rank <=> other.highest_pair_rank
end
else
1
end
elsif one_pair?
if other.two_pair?
-1
elsif other.one_pair?
if highest_pair_rank == other.highest_pair_rank
without { |c| c.rank == highest_pair_rank } <=>
other.without { |c| c.rank == highest_pair_rank }
else
highest_pair_rank <=> other.highest_pair_rank
end
else
1
end
else
if empty? && other.empty?
0
elsif other.one_pair? || other.two_pair?
-1
elsif highest_card_rank == other.highest_card_rank
without { |c| c.rank == highest_card_rank } <=>
other.without { |c| c.rank == highest_card_rank }
else
highest_card_rank <=> other.highest_card_rank
end
end
end
# …
end
Then three of a kind:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
if three_of_a_kind?
if other.three_of_a_kind?
if highest_triple_rank == other.highest_triple_rank
without { |c| c.rank == highest_triple_rank } <=>
other.without { |c| c.rank == highest_triple_rank }
else
highest_triple_rank <=> other.highest_triple_rank
end
else
1
end
elsif two_pair?
if other.three_of_a_kind?
-1
elsif other.two_pair?
if highest_pair_rank == other.highest_pair_rank
without { |c| c.rank == highest_pair_rank } <=>
other.without { |c| c.rank == highest_pair_rank }
else
highest_pair_rank <=> other.highest_pair_rank
end
else
1
end
elsif one_pair?
if other.two_pair? || other.three_of_a_kind?
-1
elsif other.one_pair?
if highest_pair_rank == other.highest_pair_rank
without { |c| c.rank == highest_pair_rank } <=>
other.without { |c| c.rank == highest_pair_rank }
else
highest_pair_rank <=> other.highest_pair_rank
end
else
1
end
else
if empty? && other.empty?
0
elsif other.one_pair? || other.two_pair? || other.three_of_a_kind?
-1
elsif highest_card_rank == other.highest_card_rank
without { |c| c.rank == highest_card_rank } <=>
other.without { |c| c.rank == highest_card_rank }
else
highest_card_rank <=> other.highest_card_rank
end
end
end
# …
end
Then straight:
Hand = Struct.new(:cards) do
include Comparable
def <=>(other)
if straight?
if other.straight?
if highest_straight_rank == other.highest_straight_rank
without { |c| c.rank == highest_straight_rank } <=>
other.without { |c| c.rank == highest_straight_rank }
else
highest_straight_rank <=> other.highest_straight_rank
end
else
1
end
elsif three_of_a_kind?
if other.straight?
-1
elsif other.three_of_a_kind?
if highest_triple_rank == other.highest_triple_rank
without { |c| c.rank == highest_triple_rank } <=>
other.without { |c| c.rank == highest_triple_rank }
else
highest_triple_rank <=> other.highest_triple_rank
end
else
1
end
elsif two_pair?
if other.three_of_a_kind? || other.straight?
-1
elsif other.two_pair?
if highest_pair_rank == other.highest_pair_rank
without { |c| c.rank == highest_pair_rank } <=>
other.without { |c| c.rank == highest_pair_rank }
else
highest_pair_rank <=> other.highest_pair_rank
end
else
1
end
elsif one_pair?
if other.two_pair? || other.three_of_a_kind? || other.straight?
-1
elsif other.one_pair?
if highest_pair_rank == other.highest_pair_rank
without { |c| c.rank == highest_pair_rank } <=>
other.without { |c| c.rank == highest_pair_rank }
else
highest_pair_rank <=> other.highest_pair_rank
end
else
1
end
else
if empty? && other.empty?
0
elsif other.one_pair? || other.two_pair? || other.three_of_a_kind? || other.straight?
-1
elsif highest_card_rank == other.highest_card_rank
without { |c| c.rank == highest_card_rank } <=>
other.without { |c| c.rank == highest_card_rank }
else
highest_card_rank <=> other.highest_card_rank
end
end
end
# …
end
And so on. The code very quickly gets unmanageable. We’re essentially just typing in an extremely large flowchart about how to compare every possible combination of hands.
It’s possible to do a certain amount of refactoring here to reduce the repetition, but that just moves the mess around and tends to produce overly clever, overly metaprogrammed, templatey code where it’s hard to see where the actual work is getting done.
Let’s start again and think a bit harder. First off, we were using numbers as ranks, but ranks aren’t numbers, not really: you can’t add them or multiply them. “Queen” isn’t a number. They’re mainly just unique identifiers.
So let’s make an empty Card::Rank
class and create inside it thirteen constants, one for each rank:
Card = Struct.new(:rank, :suit)
class Card::Rank
TWO, THREE, FOUR, FIVE,
SIX, SEVEN, EIGHT, NINE,
TEN, JACK, QUEEN, KING, ACE = 13.times.map { new }
end
Now a rank is just an instance of this vanilla class, so it doesn’t have any special behaviour or structure.
We can do the same thing with suits. We don’t need them to actually contain the name of the suit or support any special methods, they just need separate identities:
class Card::Suit
CLUBS, HEARTS, SPADES, DIAMONDS = 4.times.map { new }
end
And now creating cards looks like this:
three_of_diamonds = Card.new(Card::Rank::THREE, Card::Suit::DIAMONDS)
queen_of_hearts = Card.new(Card::Rank::QUEEN, Card::Suit::HEARTS)
ace_of_spades = Card.new(Card::Rank::ACE, Card::Suit::SPADES)
(If it’s the sort of thing you like, you could add a cute Card::Rank#of
method, so that you can create a card by writing QUEEN.of(HEARTS)
.)
We do need to be able to compare the ranks, so let’s add that structure to the Rank
class:
class Card::Rank
include Comparable
ORDER = (
TWO, THREE, FOUR, FIVE,
SIX, SEVEN, EIGHT, NINE,
TEN, JACK, QUEEN, KING, ACE = 13.times.map { new }
)
def <=>(other)
ORDER.index(self) <=> ORDER.index(other)
end
end
The ranks are already declared in ascending order, so we can just remember that order and then write an implementation of #<=>
that compares them by their position in that list.
Now, briefly, we can introduce another class to represent the different categories of hand:
class Hand::Category
include Comparable
ORDER = (
HIGH_CARD, ONE_PAIR, TWO_PAIR, THREE_OF_A_KIND,
STRAIGHT, FLUSH, FULL_HOUSE, FOUR_OF_A_KIND,
STRAIGHT_FLUSH = 9.times.map { new }
)
def <=>(other)
ORDER.index(self) <=> ORDER.index(other)
end
end
We do a similar trick again: a constant to represent each category, and a #<=>
method to compare them.
Then we can add singleton methods to each category. #matches?
decides whether a hand meets the requirements for that category, and #compare
compares two hands that both belong to the category, in this case by comparing their highest cards:
class << Hand::Category::HIGH_CARD
def matches?(hand)
true
end
def compare(a, b)
# …
end
end
(We won’t go into the details of how #compare
works, but there are other interesting abstractions to discover in there.)
Those methods are implemented differently for the ONE_PAIR
category:
class << Hand::Category::ONE_PAIR
def matches?(hand)
pairs(hand).size == 1
end
def compare(a, b)
# …
end
end
Similarly for TWO_PAIR
, THREE_OF_A_KIND
, STRAIGHT
and so on.
Finally we can make a Hand::Rank
class. This is a decorator for Hand
objects; it adds the ability to find the best category for a hand, which lets us compare two ranks according to the order of categories and the special per-category rules:
Hand::Rank = Struct.new(:hand) do
include Comparable
def category
Hand::Category::ORDER.
select { |category| category.matches?(hand) }.max
end
def <=>(other)
if category == other.category
category.compare(hand, other.hand)
else
category <=> other.category
end
end
end
When you have two Hand
objects, you can compare them by getting their respective Rank
objects and comparing those:
Hand = Struct.new(:cards) do
def rank
Hand::Rank.new(self)
end
end
if my_hand.rank > your_hand.rank
puts 'I win!'
end
(It’s good to keep this separate because two hands having equal rank doesn’t necessarily mean they’re equal; those are separate ideas. Two royal flushes of different suits rank identically, but consist of different cards.)
As before, we can enrich these simple abstractions by building operations on top of them. For example, we can give card ranks a new operation called #succ
(for “successor”) that just returns the next highest rank:
class Card::Rank
ORDER = (
TWO, THREE, FOUR, FIVE,
SIX, SEVEN, EIGHT, NINE,
TEN, JACK, QUEEN, KING, ACE = 13.times.map { new }
)
def succ
ORDER[ORDER.index(self).succ]
end
end
So TEN.succ
is JACK
, and ACE.succ
is nil
because ACE
is the highest rank. Notice that Fixnum
already has its own #succ
method that we can use here.
That #succ
operation gives us an elegant way of identifying a straight, which is a hand containing five cards of consecutive ranks: the one below has ranks 5, 6, 7, 8 and 9. First we take the successor of each card; after a 6 comes a 7, after a 9 comes a 10, and so on:
Then we try to pair up cards of the same rank between these two hands:
If and only if the original hand was a straight, we’ll end up with one unpaired card in the original hand (here the five of clubs), and one unpaired card in the hand of successors (the ten of spades). Even better than that, the unpaired card from the original hand must be the lowest card in the straight, and if we look at which card produced the other unpaired card as its successor (the nine of spades), that must be the highest card in the straight, which is what we’ll use to rank it against other straights.
(Sorry, that’s a very brief explanation, but you can have fun thinking about why it must work.)
This technique is beautiful because it distils the idea of a straight down to its abstract essence, which is that its cards are of consecutive ranks. As a bonus, we also get to identify the highest card without relying on the overall order of ranks; we don’t have to sort them or take the maximum or anything, it just falls out.
That makes it easy to implement straights containing low aces. We just make #succ
wrap around, so that the successor of an ACE
is a TWO
:
class Card::Rank
ORDER = (
TWO, THREE, FOUR, FIVE,
SIX, SEVEN, EIGHT, NINE,
TEN, JACK, QUEEN, KING, ACE = 13.times.map { new }
)
def succ
ORDER[ORDER.index(self).succ % ORDER.size]
end
end
Now we can have straights that go ACE
, TWO
, THREE
, FOUR
, FIVE
, and the FIVE
will come out as the highest card even though ACE
is larger than FIVE
. (Then we need to filter out invalid straights where the ACE
appears partway through, but that’s another story.)
Maybe this is all just “object-oriented design”, but it works by identifying minimal abstractions and plugging them together to get more complex emergent behaviour. The final solution has a structure that more closely mirrors the actual rules of poker, and you never have to do anything clever.
Conclusion
I watched a documentary recently where Richard Feynman said offhand:
When a thing looks complicated, it’s possible that we’re looking at it wrong and missing some of the pieces of the puzzle.
He was talking about particle physics, which is a classic example of this: real experiments produce lots of complex-looking results, and the challenge is to find the elegant abstract model that predicts them.
To a lesser degree, we have the same problem when making software, or cooking a meal, or writing a novel, or understanding someone else’s feelings. We need abstractions to take apart the complex things from the real world and reconstruct them in our heads so that we can get some purchase on them.
Alright, it’s time to wrap up. How do you make use of all the stuff I just said?
Firstly, when building software, you should make time to find the honest abstraction. It should go without saying that the wrong abstraction is worse than no abstraction at all. But the right abstraction in the right place can make the difference between an incomprehensible solution and an effortless one.
So pay attention to your domain; take a step back from the detail, look at the overall shape of the problem, and separate its essential features from its incidental ones. Once you recognise the true structure of your problem, you can allow the answer to naturally assume the same shape.
If you’ve ever seen two developers write almost-identical code to solve the same problem, this is probably why: they both saw through the fog and found the problem’s underlying structure, and that structure guided them towards the suddenly obvious solution.
That takes time, but it’s worth it, because it feels so good when you build an honest model of your problem and the solution just falls out; when the abstraction works like a lever, like a crowbar, magnifying the weak effort of your mind at the surface of the problem and producing a more powerful force deep down below.
It’s also important to find the minimal abstraction. Even a very simple abstraction, like a number or a graph or a triangle, can actually be very rich and support a lot of complex ideas. So when you’re making abstractions, start from nothing and only add structure and behaviour that are absolutely essential.
This is especially important in a dynamically typed language like Ruby, where you have to think of the public API of an object as a sort of attack surface that’s exposed to every other object in the program. So keep the number of ideas and operations as small and as honest as possible, and your code’s more likely to be good.
Trapped inside every boring problem is an interesting problem, waiting to be discovered. A lot of problems appear superficially boring, but your code will be better, and you will be happier, if you take a little time to discover that nugget of interestingness. First you can enjoy it, and then you can exploit the hell out of it.
Be excited! Mathematics is exciting and beautiful and it belongs to all of us. The most powerful thing you can do with your mind is to strap it into the hulking metal exoskeleton of mathematics and throw complexity straight out of an airlock. Don’t waste it because you had a bad experience in high school.
But, finally, remember to keep your perspective. All of this advice comes with a caveat: don’t lose sight of why we write software, which most of the time is to deliver value to users, not to satisfy our own curiosity and appetite for beauty.
The measure of greatness for most software is “does it make people's lives better?”, not “does it contain beautiful abstractions?”. But those goals don't have to be in opposition either. A bit of both is healthy; better abstractions can lead to more satisfaction for you and for your software’s users.
I’d like to leave you with this thought: if you can write a program, you’re good at mathematics. You might not choose to call it that, you might not even like it, but that’s the way it is. Mathematics is the study of abstraction, and abstraction is what we talk about when we talk about programming.
Sorry to drop that on you like this! (I’m not really sorry.)
Thanks for listening.