WindowsNT®
Japanese Processing Guidebook


Q What is Unicode?

A A character set containing characters from around the world.

It was created by the Unicode Consortium, a group consisting mainly of American private corporations.
It is not a standard defined by a public body, but roughly the same content is specified by the international standard ISO/IEC 10646-1, and the equivalent domestic standard is JIS X 0221.
Unicode is characterized by expressing the characters of the world's major languages in units of 16 bits (see the note below). (A bit is the basic unit of information and is either 0 or 1.) For this reason, it is not compatible with traditional character codes, which describe characters in units of 7 or 8 bits.
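
As a rough illustration of this incompatibility, the Python sketch below encodes the same two kanji in Shift JIS and in a 16-bit Unicode form. The codec names `shift_jis` and `utf-16-be` are those of Python's standard library. The sample string and the helper `hex_bytes` are arbitrary choices made for this example, not something taken from this guidebook.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated hexadecimal pairs."""
    return " ".join(f"{b:02x}" for b in data)

# The two kanji of the word "kanji": U+6F22 and U+5B57.
text = "\u6f22\u5b57"

sjis_bytes = text.encode("shift_jis")    # traditional code, two 8-bit bytes per kanji
utf16_bytes = text.encode("utf-16-be")   # one 16-bit unit per kanji here, big-endian, no BOM

print(hex_bytes(sjis_bytes))    # 8a bf 8e 9a
print(hex_bytes(utf16_bytes))   # 6f 22 5b 57

# Reading the Unicode bytes as if they were Shift JIS does not give back the text.
misread = utf16_bytes.decode("shift_jis", errors="replace")
print(misread == text)          # False
```

Both encodings happen to use four bytes for this string, but none of the byte values agree, so data written in one code cannot be read as the other without conversion.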
Unicode is a character set that contains many characters from around the world, but using a Unicode-based system does not mean that all the world's characters can be used. In fact, no font contains all the characters specified by Unicode (and without such a font, the characters can be neither displayed nor printed). In addition, because the rules for writing differ between languages (for Japanese, these include end-of-line processing), application software capable of handling those rules is also necessary.
In practice, Unicode is used to handle a particular language's characters better, rather than to use several different languages simultaneously.
For example, in Japanese, the benefits of using Unicode rather than Shift JIS are as follows.

  • The characters equivalent to JIS X 0212 can be used.

  • The area for user-defined characters is expanded from 1880 characters to 6400 characters.
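
The two figures for the user-defined area can be checked with a little arithmetic. The sketch below assumes the Shift JIS user-defined layout commonly used on Windows (lead bytes 0xF0 to 0xF9, trail bytes 0x40 to 0x7E and 0x80 to 0xFC) and the Unicode Private Use Area U+E000 to U+F8FF. These ranges are not stated in this guidebook, so treat them as assumptions of the example.

```python
import unicodedata

# Assumed Shift JIS user-defined area (Windows layout):
# lead bytes 0xF0-0xF9, trail bytes 0x40-0xFC excluding 0x7F.
SJIS_LEAD_BYTES = range(0xF0, 0xF9 + 1)
SJIS_TRAIL_BYTES = [b for b in range(0x40, 0xFC + 1) if b != 0x7F]

sjis_user_defined = len(SJIS_LEAD_BYTES) * len(SJIS_TRAIL_BYTES)

# Unicode Private Use Area within the 16-bit range.
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF
unicode_user_defined = PUA_LAST - PUA_FIRST + 1

print(sjis_user_defined)     # 1880
print(unicode_user_defined)  # 6400


def is_private_use(ch: str) -> bool:
    """Return True if the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


print(is_private_use("\ue000"))  # True, first code of the area
print(is_private_use("\uf900"))  # False, just past the end of the area
```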

Unicode has been criticized for not allowing the simultaneous use of Japanese and Chinese typefaces, but this is not necessarily true. Characters that differ only minimally in appearance were unified according to certain rules, and as a result some Japanese and Chinese characters were given the same character code. However, location and language information should be specified elsewhere, not by the character code, and if that information is provided correctly, typefaces of different languages will not be confused with each other.
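
One way to picture "language information determined elsewhere" is to carry a language tag alongside the text and choose the typeface from the tag. The following is only a minimal sketch: the class `TaggedText`, the function `select_font`, and both font names are hypothetical and invented for this example. U+76F4 is used as a sample because it is a unified character often cited as being drawn differently in Japanese and Chinese typefaces.

```python
from dataclasses import dataclass

# Hypothetical font names, used only for illustration.
FONT_BY_LANGUAGE = {
    "ja": "ExampleMincho-JP",
    "zh": "ExampleSong-CN",
}


@dataclass(frozen=True)
class TaggedText:
    text: str      # character codes only; they carry no language information
    language: str  # supplied separately, e.g. by the document or system settings


def select_font(run: TaggedText) -> str:
    """Choose a typeface from the language tag, not from the character code."""
    return FONT_BY_LANGUAGE[run.language]


japanese = TaggedText("\u76f4", "ja")
chinese = TaggedText("\u76f4", "zh")

print(japanese.text == chinese.text)  # True, the same character code
print(select_font(japanese))          # ExampleMincho-JP
print(select_font(chinese))           # ExampleSong-CN
```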

Note: Some materials state that Unicode uses 16 bits per character; however, this is incorrect. Just as JIS X 0208 expresses one character using two pieces of 7- or 8-bit data, Unicode may express one character as a combination of several pieces of 16-bit data.
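
The note does not say which mechanism it has in mind. Two cases in which one character takes more than one 16-bit piece in UTF-16 are surrogate pairs and combining sequences, and the Python sketch below shows one of each. The helper `utf16_units` and the sample characters are assumptions of this example.

```python
import struct


def utf16_units(text: str) -> list:
    """Return the 16-bit code units of text in UTF-16 (big-endian, no BOM)."""
    data = text.encode("utf-16-be")
    return list(struct.unpack(f">{len(data) // 2}H", data))


def show(units: list) -> str:
    """Format code units as four-digit hexadecimal values."""
    return " ".join(f"{u:04X}" for u in units)


single = utf16_units("\u6f22")             # an ordinary kanji
surrogates = utf16_units("\U00020b9f")     # a kanji beyond the 16-bit range
combining = utf16_units("\u304b\u3099")    # hiragana KA + combining voiced sound mark

print(show(single))      # 6F22
print(show(surrogates))  # D842 DF9F
print(show(combining))   # 304B 3099
```

In the second and third cases, what a reader sees as one character occupies two 16-bit pieces.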