Characters - the char data type
In C/C++, characters are represented by the char data type. A char is just a small integer occupying a single byte (8 bits on virtually every modern platform), used to represent a single character. In memory it looks something like 01001010.
01001010 is a binary number. It’s made up of 8 bits, each of which can be either a 1 or a 0. To arrive at the decimal integer value it is storing, we start at the right-hand side with a place value of 1 and work our way left, doubling the place value at each step. If the bit is a 1, we add its place value to our total. If the bit is a 0, we don’t add anything. Let’s take a look at the above binary number:
128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
---|---|---|---|---|---|---|---|
0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 |
So, to arrive at the decimal value of the above binary number, add the place values of the columns that contain a 1 (2, 8 and 64) to get 74. Now we need a way of turning these numbers into characters. This is achieved by mapping each value to a character. In this way, we can construct text from binary values.
So how does this mapping occur? There are many ways that we could map numbers to characters, and as a result there are different standards, such as ASCII and Unicode (with encodings such as UTF-8). UTF-8 is now the dominant encoding for things like webpages, and it builds upon the earlier ASCII mapping: the first 128 values in UTF-8 are identical to ASCII. To understand how this works, we can look at the details of the ASCII standard, and how it maps numbers to characters.
The ASCII Format
ASCII, standing for American Standard Code for Information Interchange, is a character encoding standard for electronic communication. You can represent this mapping as a table, showing which character each numeric value maps to.
The ASCII table is a standardised lookup table that maps the integers 0 to 127 to characters, and it is supported by virtually all modern computers. It’s a great example of a low-level programming construct, as it’s a simple, efficient way to map integers to characters. Let’s take a look and see what our binary sequence 01001010 maps to.
So we consult our ASCII table, and find that 74 maps to the character J! So, the binary number 01001010 is the integer 74, which is the character J! We now have a simple, reproducible method to go from 0s and 1s in memory to characters!
Now that we know how to store a single character in memory, let’s take a look at how we can store an array of characters (a string) in memory.