Characters

Characters can also be represented in binary. Characters are usually grouped together in a character set. A character set includes:

  • alphanumeric data (letters and numbers)
  • symbols (*, &, : etc.)
  • control characters (Shift, Escape etc.)
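
As a minimal sketch of this idea, the Python snippet below looks up the numeric code of a few characters and prints it as an 8-bit binary pattern. The helper name `to_binary` and the sample characters are illustrative choices, not part of any standard.

```python
def to_binary(character: str, width: int = 8) -> str:
    """Return the character's numeric code as a zero-padded binary string."""
    return format(ord(character), f"0{width}b")


# One letter, one digit, one symbol and one control character (Escape).
for ch in ["A", "7", "*", "\x1b"]:
    print(repr(ch), ord(ch), to_binary(ch))
```

Each character is stored as a number, and that number is what is actually held in binary.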

ASCII

ASCII uses 8 bits to represent a character.

However, one of the bits is a parity bit, which is used to perform a parity check (a form of error checking). Only the remaining 7 bits identify the character, so ASCII represents 128 characters (2^7) rather than the 256 (2^8) that 8 bits could otherwise hold.

For example, the ASCII code for lower case z is 122 and is shown below:

Parity bit   64   32   16   8   4   2   1
0            1    1    1    1   0   1   0
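
The pattern above can be reproduced with a short sketch. The text does not say whether even or odd parity is used, so the function below supports both. As an inference rather than a stated fact, a parity bit of 0 alongside the five 1s in 1111010 is consistent with odd parity. The names `parity_bit` and `with_parity` are hypothetical helpers.

```python
def parity_bit(code: int, scheme: str = "odd") -> int:
    """Return the parity bit for a 7-bit character code.

    'even' makes the total number of 1s even; 'odd' makes it odd.
    """
    if not 0 <= code < 128:
        raise ValueError("code must fit in 7 bits")
    ones = bin(code).count("1")
    if scheme == "even":
        return ones % 2
    return (ones + 1) % 2


def with_parity(character: str, scheme: str = "odd") -> str:
    """Return the 8-bit pattern: parity bit followed by the 7-bit code."""
    code = ord(character)
    return f"{parity_bit(code, scheme)}{code:07b}"


print(ord("z"))                  # 122 = 64 + 32 + 16 + 8 + 2
print(with_parity("z", "odd"))   # 01111010
print(with_parity("z", "even"))  # 11111010
```

A receiver can recount the 1s in all 8 bits. If the count no longer matches the agreed scheme, a bit was changed in transmission.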

Extended ASCII

If the parity bit is not used, all 8 bits can identify a character, which allows ASCII to represent 256 characters. This is known as extended ASCII. There are different versions of extended ASCII in use.
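
To illustrate why the different versions matter, the sketch below decodes the same byte using two 8-bit encodings that ship with Python, `latin-1` and `cp437`. These two are examples chosen here and are not named in the text above. The byte value 233 is also an arbitrary choice.

```python
raw = bytes([0xE9])  # a single byte with the top bit set (decimal 233)

as_latin1 = raw.decode("latin-1")  # ISO 8859-1 (Western European)
as_cp437 = raw.decode("cp437")     # code page used by the original IBM PC

print(as_latin1)               # é
print(as_latin1 == as_cp437)   # False: same byte, different character
```

Codes 0 to 127 mean the same thing in both encodings. Only the upper 128 codes differ, so sender and receiver must agree on which version is in use.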

Key facts
  • ASCII uses 8 bits to represent a character
  • ASCII can represent 128 characters
  • ASCII sets the most significant bit as a parity bit
  • Extended ASCII allows 256 characters to be represented and does not use a parity bit
  • ASCII is less demanding on memory use than Unicode

Limitation of ASCII

The 128 or 256 character limit of ASCII and extended ASCII restricts the number of character sets that can be held. Representing the character sets of several different languages is not possible in ASCII because there are simply not enough codes available.
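
A small sketch makes the limitation concrete: Python's built-in `ascii` codec refuses any character outside the 128 ASCII codes. The function name `fits_in_ascii` and the sample strings are illustrative assumptions.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character in text has an ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


for sample in ["hello", "café", "привет"]:
    print(sample, fits_in_ascii(sample))
```

Only the first sample can be stored as plain ASCII. The accented letter and the Cyrillic letters have no ASCII code.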

Unicode

Unicode was created to allow more character sets than ASCII.

The original Unicode standard uses 16 bits to represent each character. This means it is capable of representing 65,536 (2^16) different characters and a much wider range of character sets than ASCII.
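
The figure of 65,536 is simply the number of distinct patterns that 16 bits can hold. The sketch below checks that arithmetic and compares the storage needed for one character. It uses Python's `utf-16-be` codec as a stand-in for a 16-bit representation, which is an assumption made for illustration.

```python
ascii_codes = 2 ** 7        # 7 data bits
extended_codes = 2 ** 8     # 8 data bits
unicode16_codes = 2 ** 16   # 16 data bits

print(ascii_codes, extended_codes, unicode16_codes)  # 128 256 65536

one_byte = "z".encode("ascii")        # 8 bits
two_bytes = "z".encode("utf-16-be")   # 16 bits, no byte-order mark

print(len(one_byte), len(two_bytes))  # 1 2
```

The same letter takes twice the space in the 16-bit form, which is why ASCII is less demanding on memory.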

Key facts
  • Unicode can represent 65,536 characters
  • Unicode uses 16 bits to represent each character
  • Unicode can represent a greater range of character sets than ASCII
  • There are adapted forms of the original Unicode standard capable of representing over a million characters
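
As a hedged illustration of the last point, the sketch below takes a character whose code is too large for 16 bits and shows that modern encodings store it in more than two bytes. The emoji is an arbitrary example, and `sys.maxunicode` reports the highest code that Python itself supports.

```python
import sys

emoji = "\U0001F600"          # a character outside the 16-bit range
code_point = ord(emoji)

print(hex(code_point))        # 0x1f600
print(code_point > 0xFFFF)    # True: does not fit in 16 bits

utf16 = emoji.encode("utf-16-be")  # stored as two 16-bit units
utf8 = emoji.encode("utf-8")       # stored as four 8-bit units
print(len(utf16), len(utf8))       # 4 4

print(sys.maxunicode + 1)     # 1114112 possible codes
```

Common characters still use one or two bytes in these encodings. Only the less common ones need the extra space.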