Characters: Unicode
Many of the world's languages cannot be represented in an 8-bit code.
Unicode
16-bit representation
65,536 (64K) different symbols
ASCII included as a subset (zero-extended to 16 bits)
evolving standard
version 2.0 supports 38,885 distinct characters from many languages
supported by Java (char type is 2 bytes)
Java therefore needs a separate byte type for 8-bit data
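
The points above can be checked with a short Java sketch. It is illustrative only: the class name UnicodeDemo and the helper asciiToChar are hypothetical names chosen for this example, not part of the slide. It shows that char is 16 bits wide, that byte is a separate 8-bit type, and that an ASCII code keeps its value when zero-extended to 16 bits.

```java
public class UnicodeDemo {
    // Hypothetical helper: zero-extend an 8-bit ASCII code to a 16-bit char.
    // Masking with 0xFF clears the upper bits, so no sign extension occurs.
    static char asciiToChar(byte ascii) {
        return (char) (ascii & 0xFF);
    }

    public static void main(String[] args) {
        // Width in bits of the two types discussed on the slide.
        System.out.println("char bits: " + Character.SIZE);
        System.out.println("byte bits: " + Byte.SIZE);

        // ASCII 'A' is 0x41; as a Unicode char it is 0x0041.
        char a = asciiToChar((byte) 0x41);
        System.out.println("ASCII 0x41 as char: " + a + " = " + (int) a);

        // A character outside ASCII: Greek small letter alpha, U+03B1.
        char alpha = '\u03B1';
        System.out.println("alpha code: " + (int) alpha);
    }
}
```

Because byte is signed in Java, the mask in asciiToChar matters for values above 0x7F; for 7-bit ASCII codes a plain cast would give the same result.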