3. String storage
ASCII is a widely used standard for storing characters and groups of characters, i.e. strings.
It uses 1 byte per character. The string:
MyString = "hello"
is five characters in length and it needs 5 bytes of storage.
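The count above can be checked with a short Python sketch. The variable names are only illustrative, and the sketch assumes the string is encoded as plain ASCII:

```python
# Count the characters in the string and the bytes it needs as ASCII.
my_string = "hello"
ascii_bytes = my_string.encode("ascii")  # one byte per character

char_count = len(my_string)
byte_count = len(ascii_bytes)

print(char_count)  # 5
print(byte_count)  # 5
```

Note that this measures only the encoded text itself, not any extra bookkeeping a programming language keeps alongside the string.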
ASCII is used for standard 'Latin' characters (this includes English). Standard ASCII is a 7-bit code that uses denary values 0-127; 8-bit extended versions use denary values 0-255. Between them they can store:
- upper case letters
- lower case letters
- some control characters
- symbols
- a few accented characters.
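As a rough illustration of these character groups, the Python sketch below looks up the denary code of a few sample characters. It assumes Latin-1 as the 8-bit extended set, which is only one of several such extensions; strict 7-bit ASCII stops at 127, so the accented letter does not fit in it.

```python
# Denary codes for an upper case letter, a lower case letter,
# a control character (newline) and a symbol.
samples = ["A", "a", "\n", "!"]
codes = {ch: ord(ch) for ch in samples}
print(codes)

# An accented character: e with an acute accent (code point 233).
accented = "\u00e9"
latin1_bytes = accented.encode("latin-1")  # fits in one byte in Latin-1

# Strict ASCII only covers 0-127, so this encode attempt fails.
try:
    accented.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

print(latin1_bytes[0], fits_in_ascii)
```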
However, the world speaks many languages which use non-Latin characters such as Ў (a Cyrillic letter used in Belarusian). To handle this, Unicode was developed. One common Unicode encoding, UTF-16, uses 2 bytes for most characters.
Many computer languages support both, and will switch storage size depending on whether a single byte or a 2-byte Unicode version is needed.
Python, for example, uses Unicode by default, but when a string is encoded as UTF-8, any character that is part of the ASCII set takes a single byte.
For example, the string
mystring = "helЎlo"
is six characters long and takes 7 bytes of storage in UTF-8 because the Cyrillic character Ў needs 2 bytes.
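The figures for this string can be checked with a short Python sketch. It assumes UTF-8 for the 7-byte result and adds UTF-16 (without a byte order mark) for comparison; the variable names are only illustrative.

```python
my_string = "helЎlo"

# Number of characters in the string.
char_count = len(my_string)

# Bytes needed in UTF-8: ASCII characters take 1 byte, Ў takes 2.
utf8_count = len(my_string.encode("utf-8"))

# Bytes needed in UTF-16 (little-endian, no byte order mark):
# every character in this string takes 2 bytes.
utf16_count = len(my_string.encode("utf-16-le"))

# Per-character UTF-8 sizes, to show where the extra byte comes from.
per_char = [(ch, len(ch.encode("utf-8"))) for ch in my_string]

print(char_count, utf8_count, utf16_count)  # 6 7 12
print(per_char)
```

This counts the bytes of the encoded text only. The space a language actually uses in memory for a string may be larger, depending on how it stores strings internally.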
Challenge: see if you can find out one extra fact on this topic that we haven't already told you.
Click on this link: How much storage is needed for a string?