Just when you thought you knew everything there is to know about the Java language, along comes something new to challenge your sense of complete mastery. For example, did you know that Java lets you declare a class within an interface, as in interface A ? Also, were you aware that you can "add" cast operators together in an assignment statement such as byte i = (byte) + (short) + (double) 2000 ? (You're not really adding cast operators.) Neither of these oddities is the subject of this post. Instead, I'm going to show you that not every value that can be assigned to a char variable denotes a character.

Adding cast operators: not

Expression byte i = (byte) + (short) + (double) 2000 first casts 32-bit integer 2000 to a double-precision floating-point value, then applies the unary plus (identity) operator to the result (leaving the result unchanged), then casts the double-precision floating-point value to a 16-bit short integer, then applies the unary plus operator to the result, and finally casts the short integer to an 8-bit byte integer before assigning the result to i.

Characters and Unicode

A character is a minimal unit of text that doesn't have a shape (a font's glyph provides the shape) and doesn't have an associated numeric value (e.g., "a" – I've placed this character in double quotes to signify its abstractness).

A character set is a collection of characters, and a coded character set is a character set in which code points (numeric values) are associated with characters. For example, the American Standard Code for Information Interchange (ASCII) is a coded character set (e.g., hexadecimal value 41 is assigned to "A").

ASCII is an old coded character set standard with an English language bias. In 1987, work began on a universal coded character set that could accommodate all of the characters of the world's living (and, eventually, dead) languages. The resulting standard became known as Unicode. Unicode-based text is encoded for storage or transmission by using a Unicode Transformation Format (UTF) encoding. Various UTF encodings have been devised, with the variable-length UTF-8 and UTF-16 encodings being commonly used. Unicode 1.0 fixed the size of a character at 16 bits, limiting the maximum number of characters that could be represented to 65,536. To support the thousands of rarely used or obsolete characters (e.g., Egyptian Hieroglyphs) found in historic scripts, Unicode 2.0 increased its codespace to more than one million code points by introducing a new architecture based on planes and surrogates.

A plane is a group of 65,536 code points; Unicode supports 17 planes. The first plane (code points 0 through 65535), known as the Basic Multilingual Plane (BMP), represents its code points via syntax U+hhhh, where each "h" represents a hexadecimal digit.
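To make the plane and surrogate discussion concrete, here is a minimal Java sketch. It assumes nothing beyond the standard java.lang.Character API; the class and method names (CharVersusCodePoint, isLoneSurrogate, describe) are hypothetical and chosen only for illustration. It shows that a char can hold a surrogate value, which is a legal 16-bit value but not a character by itself, and that a code point outside the BMP needs two char values.

```java
import java.util.Locale;

public class CharVersusCodePoint {

    // Returns true when the 16-bit value is a UTF-16 surrogate code unit.
    // Such a value can be stored in a char, yet on its own it denotes no character.
    public static boolean isLoneSurrogate(char c) {
        return Character.isSurrogate(c);
    }

    // Describes a code point: its U+ notation, its plane number, and how many
    // char values Java needs to store it (1 inside the BMP, 2 outside it).
    public static String describe(int codePoint) {
        int plane = codePoint >>> 16;               // each plane holds 65,536 code points
        int units = Character.charCount(codePoint);
        return String.format(Locale.ROOT, "U+%04X plane=%d chars=%d",
                             codePoint, plane, units);
    }

    public static void main(String[] args) {
        char letter = 'A';            // U+0041, an ordinary BMP character
        char half = (char) 0xD800;    // first high surrogate: assignable, not a character

        System.out.println(isLoneSurrogate(letter)); // false
        System.out.println(isLoneSurrogate(half));   // true

        int hieroglyph = 0x13000;     // first code point of the Egyptian Hieroglyphs block
        System.out.println(describe(letter));
        System.out.println(describe(hieroglyph));

        // A supplementary code point is stored as a surrogate pair of two chars.
        char[] pair = Character.toChars(hieroglyph);
        System.out.println(pair.length);
        System.out.println(Character.isHighSurrogate(pair[0])
                           && Character.isLowSurrogate(pair[1]));
    }
}
```

The plane number is computed by shifting the code point right by 16 bits, which works because every plane contains exactly 65,536 code points. Character.charCount is used instead of a manual range check so the sketch relies on the library's own definition of where the BMP ends.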