Hello and welcome back to my blog!

In a kind of follow up to 10 steps to becoming a better programmer, this time I wanted to write about binary and binary operations, because I’ve been rather surprised by the number of people (some even running highly successful tech businesses) who simply don’t understand such a useful tool.

If you’ve seen this kind of thing before and been confused, this article is for you:

const uint kSignal = 1<<0; const uint kBreak = 1<<1; const uint kWait = 1<<2; uint flags = Wait(kSignal|kBreak|kWait); flags &= ~kWait; |

I’m going to be talking about the binary representation of unsigned integers in this article.

## What is binary?

The binary numeral system is a way of representing numbers, in the same way that the decimal system that humans use every day is. The difference between the two is the ‘base‘, which is simply a count of the unique number of digits representable in the system, including 0. So binary is base 2 (since we have 0 and 1 as the full set of digits) and decimal is base 10 (since we have from 0-9).

We wouldn’t get very far in life as humans if we could only count up to 9, though. We needed a way to count the number of tens while still being able to accumulate units, which we solved by pre-pending the tens count to the start of the number and resetting the units to zero. So 9+1 = one ten and 0 units or 10. Obvious right? With binary it’s a similar thing:

## Bits

In binary, each one of the digits is called a bit which is short for **b**inary dig**it**. Bits can be numbered, like so (always starting from 0):

Computers usually deal with bits a chunk at a time, either 8, 16, 32, 64, 128 bits etc depending on the register size. For the rest of this article I’m going to be assuming a fictional register size of 4 bits (known as a nibble), for the sake of clarity.

## Converting from binary to decimal

It’s useful to be able to covert from binary to decimal. To see how we can do this, it’s helpful to explore the composition of numbers in base ten and base two.

Decimal numbers which represent tens, hundreds, thousands and so on can be called ‘powers of ten‘:

**10 ^{x} in decimal:**

`10`

^{0} = 1

10^{1} = 10

10^{2} = 100

10^{3} = 1000

Looking at them in this way, it’s clear to see that when we raise ten to the power of an integer x, we can write it as: **one** appended with x **zeros**. In binary it’s a very similar thing, except we’re now dealing with base two, instead of base ten:

**2 ^{x} in binary:**

`2`

^{0} = 1

2^{1} = 10

2^{2} = 100

2^{3} = 1000

This time, it’s binary **one** appended with x binary **zeros**. If we do the same thing but convert to decimal:

**2 ^{x} in decimal:**

`2`

^{0} = 1

2^{1} = 2

2^{2} = 4

2^{3} = 8

The individual powers of two above correspond to a particular bit in a binary number. The bit which is ‘set’ (filled with a one) is the bit numbered with the corresponding power:

So, now we have a way to calculate the value of any particular binary digit. In order to convert an arbitrary binary number into decimal, all we have to do is add the powers of two together:

Adding together only the powers of two which have a corresponding 1 in the bit:

2^{0} + 2^{2} + 2^{3} =

…we arrive at the decimal equivalent

1 + 4 + 8 = **13**

## Bitwise operations

Since a computer is built out of transistors (which have two states, on or off) binary is the natural language of the computer. As such, bitwise operations are often the fastest possible performing operations since there is no need to mess about dealing with the human’s silly decimal system.

### Logical left shift

The logical left shift operator, or **<<** (in most programming languages) simply takes a binary number and moves every digit left by the number of bits in the 2nd argument, filling the bottom most bit(s) with zero. Here is the binary representation of decimal one:

Shifting this left by two bits:

The astute among you will recognise this as left shift of one by two as being equivalent of multiplying by two and then multiplying by two again. Indeed, every time we left-shift a number by one, it’s equivalent to multiplying it by two. This is usually much faster than actually multiplying the number by two.

### Logical right shift

Logical right shift, or >> is the opposite of left shift. Instead of it being equivalent to multiply, it’s equivalent to divide:

12 >> 2 = 12 / 4 = 3

### Underflow and overflow

What happens when we take a binary number, shift it and some of the digits get shifted off one end or the other? In this example we only have 4 bits of storage for our numbers.

Above, decimal three has been right shifted by one and we’ve lost a digit off the end:

3>>1 = 3/2 = 1

As you can see, this is equivalent to the integer decimal operation 3/2. We lost a bit of information, hence the approximation. This is called underflow.

Overflow is arguably worse:

This time we don’t end up with an approximate answer, it’s just plain wrong! This is why it’s very important to understand the **storage capacity** of variables that we are working with. It’s worth noting that this result is no different to what we would end up with if we performed the equivalent decimal multiply operator in the same storage capacity.

### Bitwise Or

Bitwise or, represented by | in most languages ‘combines’ the two binary numbers together. The resulting binary number has a **one** in each bit where either of the two source numbers have a **one**, and only has a **zero** where both of the two source numbers have **zero**.

This is incredibly useful when dealing with binary numbers which represent a set of ‘flags’. Such numbers are called bitfields. Flags are pieces of state information which can either be set or clear, which is a nice fit for binary. Flags are often defined like this:

const uint kSignal = 1<<0; // binary 0001 const uint kBreak = 1<<1; // binary 0010 const uint kWait = 1<<2; // binary 0100 |

Each flag is given a meaningful name and assigned the **unique** binary bit which represents it. Because a bitfield is designed to hold many such flags together, it’s often useful to be able to set multiple flags together, which is where **bitwise or** comes into play:

const uint kSignal = 1<<0; // binary 0001 const uint kBreak = 1<<1; // binary 0010 const uint kWait = 1<<2; // binary 0100 // 0011 = 0001 | 0010 // bitField = 0011 uint bitField = kSignal | kBreak; |

Above we have composed a new bitfield containing the flags *kSignal* and *kBreak* but not *kWait*. Because our flags are unique, when we **or** them together we end up with the combination of both flags set in our bitfield.

### Bitwise And

Bitwise and, represented by & in most languages combines the two input numbers in a different way. The resulting binary number has a **one** in each bit **only** where both source numbers have a **one**, and has a **zero** in all other locations.

**Bitwise and** is also very useful when dealing with flags. For example, what if we want to tell if a given bitfield contains a particular flag?

const uint kSignal = 1<<0; // binary 0001 const uint kBreak = 1<<1; // binary 0010 const uint kWait = 1<<2; // binary 0100 // 0011 = 0001 | 0010 // bitField = 0011 uint bitField = kSignal | kBreak; // (0011 & 0100) = 0, so // containsWait = false bool containsWait = (bitField & kWait) != 0; |

Here, *containsWait* will be false because our bitfield does not have the bit represented by *kWait* set, so when it gets **bitwise anded** with the flag kWait (which of course only contains one particular bit) the resulting integer will be zero. You can also write that like this:

bool containsWait = (bitField & kWait) == kWait; |

### Bitwise not

Bitwise not, represented by ~ in most languages, simply takes each bit in a single number and flips it.

This is useful in conjunction with **bitwise and** when we’d like to remove a flag from a bitfield:

const uint kSignal = 1<<0; // binary 0001 const uint kBreak = 1<<1; // binary 0010 const uint kWait = 1<<2; // binary 0100 // 0011 = 0001 | 0010 // bitField = 0011 uint bitField = kSignal | kBreak; // clear kBreak // ~kBreak = ~0010 = 1101 // 0011 & 1101 = 0001 // bitField = 0001 bitField &= ~kBreak; |

Because ~ flips every bit, performing that operation on a flag forms a mask which represents every possible flag except the given one. When this mask is then **bitwise anded** with a bitfield, it clears the bit from the bitfield which is zero in the mask, thus removing it.

Multiple flags can be cleared at the same time by combining all three operators:

const uint kSignal = 1<<0; // binary 0001 const uint kBreak = 1<<1; // binary 0010 const uint kWait = 1<<2; // binary 0100 // 0011 = 0001 | 0010 // bitField = 0011 uint bitField = kSignal | kBreak; // clear kBreak and kSignal // kBreak|kSignal = 0010 | 0001 = 0011 // ~(kBreak|kSignal) = ~0011 = 1100 // 0011 & 1100 = 0 // bitField = 0 bitField &= ~(kBreak|kSignal); |

### Exclusive or

Exclusive or, represented by ^ in most languages takes two numbers and produces a **one** for each bit which is set in one number but **not** the other and **zero** elsewhere.

This operator is useful with flags if we want to toggle a flag:

const uint kSignal = 1<<0; // binary 0001 const uint kBreak = 1<<1; // binary 0010 const uint kWait = 1<<2; // binary 0100 // 0011 = 0001 | 0010 // bitField = 0011 uint bitField = kSignal | kBreak; // toggle kSignal // bitField = bitField ^ kSignal // 0010 = 0011 ^ 0001 // bitField = 0010 bitField ^= kSignal; |

Again, we can use it with a mask to toggle various flags together:

const uint kSignal = 1<<0; // binary 0001 const uint kBreak = 1<<1; // binary 0010 const uint kWait = 1<<2; // binary 0100 // 0011 = 0001 | 0010 // bitField = 0011 uint bitField = kSignal | kBreak; // toggle kSignal and kBreak // bitField = bitField ^ (kSignal | kBreak) // 0000 = 0011 ^ 0011 // bitField = 0 bitField ^= kSignal | kBreak; |

## Maintainability

Because it’s more difficult to read the true meaning of bitwise operations on a set of flags you should consider very carefully whether you need the extra level of compactness offered by using these techniques before writing new code. It might be just as easy to use multiple boolean variable types, which will be much easier to read and maintain.

Understanding bitwise operations will, however, set you in good stead should you encounter them in legacy code and a good programmer should be able to fully understand all the tools at his disposal.

That’s it for now, hope this has helped you get a better understanding of how binary works!

Until next time, have fun!

Cheers, Paul.

very useful ,thanks very much!!!!

This article is so useful I read it 0010 times

There are just 10 sorts of people. The ones who understand binary and the one who don’t.

This is really really helpful thanks a lot for the hard work

No problem