EBCDIC 930

CCSID 930 (sometimes known as CP930 or code page 930) is one of several Japanese EBCDIC code pages created by IBM for the representation of Japanese text. It is commonly used on IBM z/OS and IBM System i systems.

It encodes halfwidth Katakana, fullwidth Katakana, Hiragana and Kanji.

Contents

1 Technical detail

2 Practical considerations

3 References

4 External links

Technical detail

CCSID 930 uses a stateful EBCDIC encoding scheme: halfwidth Katakana and Latin characters are encoded in 1 byte, and all other Japanese characters in 2 bytes. The single-byte portion is CCSID 290, also known as EBCDIK (Extended Binary Coded Decimal Interchange Kana). The double-byte portion is CCSID 300, which is shared with CCSID 939.^[1]^[2] If only halfwidth Katakana mixed with Latin characters is used, which was standard practice until the 1980s, CCSID 930 can be treated as a pure 8-bit encoding. When other Japanese or fullwidth characters are used, it is a multibyte encoding in which the Shift-Out (0x0E) and Shift-In (0x0F) bytes mark the start and end of each double-byte sequence.
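The stateful scheme can be illustrated with a minimal sketch that splits a byte string into single-byte and double-byte runs. It relies on the standard EBCDIC control codes, where Shift-Out (0x0E) opens a double-byte run and Shift-In (0x0F) closes it. The function name `split_segments` and the sample bytes are illustrative assumptions; the sketch does not map bytes to characters, which would require the CCSID 290 and CCSID 300 tables.

```python
SO = 0x0E  # Shift-Out: switch to double-byte (CCSID 300) mode
SI = 0x0F  # Shift-In: return to single-byte (CCSID 290) mode


def split_segments(data: bytes):
    """Split mixed data into ("sbcs" | "dbcs", bytes) runs.

    Hypothetical helper: it only tracks the shift state and does not
    translate any byte to a character.
    """
    segments = []
    mode = "sbcs"          # a stream starts in single-byte mode
    run = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if mode == "sbcs":
            if b == SO:
                if run:
                    segments.append(("sbcs", bytes(run)))
                run = bytearray()
                mode = "dbcs"
            else:
                run.append(b)
            i += 1
        else:
            if b == SI:
                if run:
                    segments.append(("dbcs", bytes(run)))
                run = bytearray()
                mode = "sbcs"
                i += 1
            else:
                # In double-byte mode, bytes are consumed in pairs.
                if i + 1 >= len(data):
                    raise ValueError("truncated double-byte character")
                run += data[i:i + 2]
                i += 2
    if mode == "dbcs":
        raise ValueError("double-byte run not closed by Shift-In")
    if run:
        segments.append(("sbcs", bytes(run)))
    return segments


# Placeholder bytes: one single-byte code, two double-byte codes, one single-byte code.
sample = bytes([0xC1, SO, 0x44, 0x81, 0x44, 0x83, SI, 0xC2])
segments = split_segments(sample)
```

Each "sbcs" run could then be handed to a CCSID 290 table and each "dbcs" run to a CCSID 300 table.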

The most recent version of CCSID 930, CCSID 1390, supports JIS X 0213.

It was invented by Alan Lloyd Jones at IBM Hursley Laboratories, UK.^[citation needed]

Practical considerations

CCSID 930 and its encoding scheme have a number of idiosyncrasies that make them hard to work with in practice (see also EBCDIC for idiosyncrasies of the EBCDIC standard).

Because of the Shift-Out and Shift-In codes, parsing a byte sequence from the middle is hard: interpreting the bytes requires backing up until one of the shift bytes is encountered.
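The backward scan described above can be sketched as follows. The helper name `mode_at` is hypothetical, and the sketch assumes that the shift bytes (Shift-Out 0x0E opening a double-byte run, Shift-In 0x0F closing it) never occur as part of a double-byte character.

```python
SO = 0x0E  # Shift-Out: start of a double-byte run
SI = 0x0F  # Shift-In: end of a double-byte run


def mode_at(data: bytes, offset: int) -> str:
    """Return "sbcs" or "dbcs" for the byte at `offset`.

    Walks backwards from `offset` until a shift byte is found.
    Assumption: 0x0E and 0x0F appear only as shift codes.
    """
    for i in range(offset - 1, -1, -1):
        if data[i] == SO:
            return "dbcs"
        if data[i] == SI:
            return "sbcs"
    return "sbcs"  # no shift byte before offset: still in the initial state


sample = bytes([0xC1, SO, 0x44, 0x81, 0x44, 0x83, SI, 0xC2])
```

In the worst case the scan runs back to the start of the buffer, which is why random access into long CCSID 930 strings is costly. Within a double-byte run, the distance from the Shift-Out byte is also needed to tell whether the offset falls on the first or second byte of a character.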

Although CCSID 930 allows text that mixes halfwidth and fullwidth characters, many database schemas strictly distinguish between columns containing only single-byte halfwidth Katakana and columns containing only double-byte fullwidth characters. This convention makes it easier for software developers to predict how much text fits in a column of a given size in bytes, and vice versa.
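The length arithmetic behind this convention can be sketched as below. The function `encoded_length` is a hypothetical helper that takes one width per character (1 for single-byte, 2 for double-byte) and assumes every double-byte run is bracketed by one Shift-Out and one Shift-In byte.

```python
from itertools import groupby


def encoded_length(widths):
    """Bytes needed for a character sequence, given each character's width.

    `widths` holds 1 for a single-byte character and 2 for a double-byte
    character. Each run of double-byte characters costs 2 extra bytes
    for its Shift-Out / Shift-In pair.
    """
    total = 0
    for width, run in groupby(widths):
        count = len(list(run))
        if width == 1:
            total += count
        elif width == 2:
            total += 2 * count + 2
        else:
            raise ValueError("width must be 1 or 2")
    return total


contiguous = encoded_length([2, 2, 2, 1, 1])   # one double-byte run
interleaved = encoded_length([2, 1, 2, 1, 2])  # three double-byte runs
```

The two sequences contain the same characters by width, yet the interleaved one needs more bytes because every switch into double-byte mode adds a shift pair. With columns restricted to one width, the byte length depends only on the character count.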

The downside is that, for consistency, Latin text in such a fullwidth column has to be entered as, or converted to, fullwidth alphabetic characters so that it is encoded in double bytes, which complicates database searches.
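Once such data has been converted to Unicode, the search problem can be illustrated with a short sketch. The helper names `to_fullwidth` and `fold_for_search` are assumptions for illustration; the sketch relies on the Unicode fullwidth forms block, where U+FF01 to U+FF5E mirror ASCII 0x21 to 0x7E at a fixed offset, and on NFKC normalization from the standard `unicodedata` module.

```python
import unicodedata

FULLWIDTH_OFFSET = 0xFEE0  # U+FF01 - U+0021


def to_fullwidth(text: str) -> str:
    """Map printable ASCII to its fullwidth form (space becomes U+3000)."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("\u3000")
        elif "!" <= ch <= "~":
            out.append(chr(ord(ch) + FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def fold_for_search(text: str) -> str:
    """Fold width variants so fullwidth and halfwidth text compare equal."""
    return unicodedata.normalize("NFKC", text)


stored = to_fullwidth("IBM 930")   # what a fullwidth-only column would hold
matches = fold_for_search(stored) == fold_for_search("IBM 930")
```

Note that NFKC also folds halfwidth Katakana into fullwidth Katakana, so it is only suitable for building a search key, not for round-tripping the stored value.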

When database columns are implicitly defined as pure fullwidth text, the Shift-Out and Shift-In codes are often omitted, which strictly speaking results in an incorrect encoding. When the shift codes are missing, CCSID 290 or CCSID 300 usually has to be used instead for proper conversion to another character set, such as the more portable Unicode.
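An alternative to switching converters, shown here only as a sketch, is to re-insert the missing shift codes so that the field becomes a well-formed CCSID 930 byte string. The helper `wrap_dbcs` is hypothetical and assumes the field is known to hold only double-byte characters, with Shift-Out being 0x0E and Shift-In being 0x0F.

```python
SO = 0x0E  # Shift-Out: start of a double-byte run
SI = 0x0F  # Shift-In: end of a double-byte run


def wrap_dbcs(field: bytes) -> bytes:
    """Bracket a pure double-byte field with Shift-Out / Shift-In.

    Assumption: the caller knows from the schema that `field` holds
    only double-byte characters stored without shift codes.
    """
    if not field:
        return field
    if field[0] == SO and field[-1] == SI:
        return field  # already bracketed, leave unchanged
    if len(field) % 2 != 0:
        raise ValueError("double-byte field must have an even length")
    if SO in field or SI in field:
        raise ValueError("unexpected shift code inside field")
    return bytes([SO]) + field + bytes([SI])


wrapped = wrap_dbcs(bytes([0x44, 0x81, 0x44, 0x83]))  # placeholder bytes
```

The wrapped value can then be passed to any converter that implements CCSID 930; whether this or a direct CCSID 300 conversion is preferable depends on the tools available.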

References

Lunde, Ken. CJKV Information Processing. Sebastopol, Calif.: O'Reilly & Associates, 1998. ISBN 1-56592-224-7.

[1] http://www.ibm.com/software/globalization/ccsid/ccsid930.jsp

[2] http://www.ibm.com/software/globalization/ccsid/ccsid939.jsp

External links

Graphical view of CCSID 930 in ICU Converter Explorer

