INT 21 - DOS 3.3+ - SET GLOBAL CODE PAGE TABLE
        AX = 6602h
        BX = active code page (see #01757)
        DX = system code page (active page at boot time)
Return: CF set on error
            AX = error code (see #01680 at AH=59h/BX=0000h)
        CF clear if successful
            AX = EB41h (Novell NWDOS v7.0 when NLSFUNC not installed and
                  request was for previously-active code page)
SeeAlso: AX=6601h,INT 2F/AX=14FFh

(Table 01757)
Values for code page:
 0      Reduced 7-bit ASCII [NetWare]
 37     EBCDIC: US/Canada English (CECP) [Windows NT 3.51+]
 38     EBCDIC: International (old)
 111    Greek
 112    Turkish
 113    Yugoslavian
 161    Arabic [Linux]
 162    Arabic [Linux]
 163    Arabic [Linux]
 164    Arabic [Linux]
 165    Arabic [Linux]
 237    EBCDIC: Germany (CECP)
 273    EBCDIC: ??? (CECP)
 274    EBCDIC: Belgium
 275    EBCDIC: Brazilian
 277    EBCDIC: Danish/Norwegian (CECP)
 278    EBCDIC: Finnish/Swedish (CECP)
 280    EBCDIC: Italian (CECP)
 281    EBCDIC: Japanese-E
 284    EBCDIC: Latin-American/Spanish (CECP)
 285    EBCDIC: UK English (CECP)
 290    EBCDIC: Japanese Kana
 297    EBCDIC: French (CECP)
 367    US-ASCII (ISO 646-US, 7-bit)
 420    EBCDIC: Arabic 1
 423    EBCDIC: Greek
 424    EBCDIC: Hebrew
 437    US / English / PC-8 / IBM-2
 500    EBCDIC: Belgium/Switzerland (CECP)
 500    EBCDIC: International
 646    (??? reserved for ISO 646 7-bit codepages)
 667    Eastern Europe (Polish)
 668    Eastern Europe (Slavic)
 708    Arabic/Middle East
 737    Greek (2)
 775    Baltic / Baltic Rim
 819    Latin-1 (ISO 8859-1)
 850    Multilingual (Latin-1)
 851    Greek
 852    Slavic/Eastern Europe (Latin-2) [DOS 5+]
 853    Turkish (Latin-2)
 854    Spanish
 855    Cyrillic (1)
 857    Turkish
 860    Portuguese
 861    Icelandic
 862    Hebrew
 863    French Canadian
 864    Arabic/Middle East
 865    Nordic (Norwegian/Danish)
 866    Russian (Cyrillic 2)
 867    Czech
 868    Arabic
 869    Greek (1)
 870    EBCDIC: Yugoslavia (Roece)
 871    EBCDIC: Icelandic
 874    Thailand
 875    EBCDIC: Greek
 880    Russian (Cyrillic GOST)
 880    EBCDIC: Cyrillic
 881    Latin 1 (ISO 8859-1)
 882    Latin 2 (ISO 8859-2)
 883    Latin 3 (ISO 8859-3)
 884    Latin 4 (ISO 8859-4)
 885    Latin 5 (ISO 8859-5)
 891    unknown
 897    Japanese (Shift-JIS)
 903    unknown
 904    unknown
 905    EBCDIC: Turkish
 912    Latin 2 (ISO 8859-2: Eastern Europe)
 913    (??? reserved for Latin 3)
 914    (??? reserved for Latin 4)
 915    Cyrillic (ISO 8859-5: Latin/Cyrillic)
 916    (??? reserved for ISO 8859-6: Latin/Arabic)
 917    (??? reserved for ISO 8859-7: Latin/Greek)
 918    EBCDIC: Arabic 2
 919    (??? reserved for ISO 8859-9: Latin 5)
 920    (??? reserved for ISO 8859-10: Latin 6/Sami)
 932    DBCS: Japanese (Shift-JIS)
 934    DBCS: Korean
 936    DBCS: Chinese (PRC/ROC, Simplified/xGB)
 938    DBCS: Taiwan
 938    DBCS: Chinese (PRC/ROC)
 942    DBCS: Japanese SAA
 944    DBCS: Korean SAA
 948    DBCS: Chinese SAA (PRC/ROC)
 949    Korean (Unified Hangul; Extended Wansung)
 950    Chinese Traditional, Big5 (Taiwan, Hong Kong)
 966    Saudi Arabian
 972    Hebrew (Israeli VT100)
 999    reserved for user-definable codepages
 1004   Desktop Publishing
 1026   EBCDIC: Turkish (Latin 5)
 1047   EBCDIC: International (CECP, de-facto EBCDIC-US)
 1250   MSWIN: Eastern Europe (Latin 2)
 1251   MSWIN: Cyrillic
 1252   MSWIN: English/W. Europe/Standard (Latin 1)
 1253   MSWIN: Greek (GRC)
 1254   MSWIN: Turkish
 1255   MSWIN: Hebrew
 1256   MSWIN: Arabic
 1257   MSWIN: Baltic (Estonian, Latvian, Lithuanian)
 1258   MSWIN: Vietnamese
 1361   ANSI???: Korean (Johab)
 10000  MAC: International/Standard (Roman)
 10006  MAC: Greek
 10007  MAC: Cyrillic
 10029  MAC: Latin 2
 10079  MAC: Icelandic
 10081  MAC: Turkish
 10646  (should be reserved for the future ISO 10646 32-bit codepage???)
 65400  OS/2: reserved for Glyphs
Notes:  not all code pages are available in all versions of DOS or
          DOS-compatibles, and many (particularly EBCDIC) have not been
          implemented for *any* DOS to date
        CECP = 'Country Extended CodePage' by IBM.
        Unicode (UCS-2) is a 16-bit character codeset, covering all commonly
          used characters from almost any language.  Not all definitions
          are fixed at the time of this writing.  Unicode is expected to
          remain the basis of character coding for the foreseeable future,
          but it is only the "basic multilingual plane" (BMP) subset of the
          32-bit ISO 10646 codes (UCS-4), a single character set standard
          covering requirements for all countries and languages, which is
          still under construction.
        The MS Windows 'ANSI' codepage 1252 (based on the MS Windows 3.0+
          implementation) appears to be 100% compatible with the code sets
          used by Amiga OS and Acorn Archimedes RISC-OS and, apart from the
          Windows-specific characters in the range 80h-9Fh, is also a
          linear subset of the 16-bit Unicode code set (UCS-2); the actual
          ANSI codepage is defined by ISO 8859-1 (Latin 1).
        Applications for OS/2 Warp 3 Presentation Manager, at least, can
          use EBCDIC codepages, but the codepage ID assignments for EBCDIC
          codepages are not known for OS/2.  OS/2 SAA codepages are not
          supported in CONFIG.SYS.
        Codepage 65400 "Glyphs" is not actually a codepage, but a way to
          directly access the first 256 of the 383 glyphs from the current
          font set.
        Novell DOS 7/DR DOS 6/Caldera OpenDOS undocumented codepage 853
          does not necessarily match MS-DOS' undocumented codepage 853.
        Undocumented codepages 667 and 668 can be found in the DISPLAY.CPI
          of the Russian PTS/DOS 6.51 and S/DOS 1.x and contain some
          Eastern European characters.
        Novell NetWare 3.xx clients support Unicode and codepages 437, 850,
          860, 863, 865, 897, 932, and 1252 (possibly more).  NetWare 4.xx
          clients also support 1250, 1251, and 1256.  Personal NetWare 1.0
          (PNW), as it was distributed in Europe, supports Unicode and
          codepages 437, 850, and 1252.  Novell's Client32 for DOS/Windows
          supports 874, 932, 936, 949, 950, and 1250 - 1257.
        For codesets not yet available, Novell offers reduced 7-bit ASCII
          support through a codepage 0, used as a translation table to
          Unicode, which supports characters 32-127 except 92 ('\').

Format of DOS .CPI (Code Page Information) file header:
Offset  Size    Description     (Table 01758)
 00h    BYTE    ID tag
                FFh FONT file (standard for generic display or printer font
                      files used by MS-DOS, PC-DOS, DR DOS and Novell DOS)
                7Fh DRFONT file (used by DR DOS 6.0 / Novell DOS 7 for
                      enhanced & compressed display font files.  DR DOS 6.0
                      and Novell DOS 7 still support the standard FONT
                      files, so .CPI files from MS-DOS can be used under
                      DR DOS / Novell DOS!)
 01h  7 BYTEs   ID string (blank-padded)
                "FONT   " = FONT file (standard for display or printer)
                "DRFONT " = DRFONT file (enhanced compressed format used by
                      DR DOS 6.0 / Novell DOS 7 for display fonts)
 08h  8 BYTEs   reserved (0)
 10h    WORD    number of pointers (1)
 12h    BYTE    type of pointers (1)
 13h    DWORD   pointer to file offset of FontInfoHeader
                (generally pointing to the byte just after FontFileHeader,
                  that is 0000h:0017h.  Due to extra data at offset 17h,
                  this value has changed with DR DOS 6.0 / Novell DOS 7
                  DRFONTs!  "MS-DOS 4.0 programmers reference" claimed word
                  offset +15h as an endmarker (0000h), but actually it is
                  the high word of the pointer.)
---Extended FontFileHeader with DR DOS 6.0 / Novell DOS 7 DRFONTs---
 17h    BYTE    number of fonts per codepage supported by this file
                (N=4 with both DR DOS 6.0 / Novell DOS 7 DRFONT files)
 18h  N BYTEs   cellsize (height) of fonts 1..N
                the cellsize corresponds to the height of the character
                  box, but is also the count of bytes used for each
                  character inside the font data (as currently all fonts
                  are organized height x 8, and a width of 8 pixels is
                  just one byte).
 var  N DWORDs  file offsets of DisplayFontData

Format of DOS .CPI file Font Information Header:
Offset  Size    Description     (Table 01759)
 00h    WORD    number of codepage entries
 var    N codepage entry headers (see #01760)
SeeAlso: #01758

Format of DOS .CPI file CodePage Entry Header:
Offset  Size    Description     (Table 01760)
 00h    WORD    size of this header (normally 1Ch)
 02h    DWORD   offset of next entry, or 0000h:0000h or FFFFh:FFFFh if last
                (if this is a valid "next" pointer but all of the fonts
                  indicated in the .CPI header have been processed, this
                  field normally points at an optional text area at the
                  end of the .CPI file containing copyright information)
 06h    WORD    device type
                01h display (FONT or DRFONT)
                02h printer (FONT)
 08h  8 BYTEs   blank-padded device name string
 10h    WORD    code page (see #01757)
 12h  3 WORDs   reserved (0)
 18h    DWORD   pointer to Font Data Header (see #01761)
                normally immediately follows this header
SeeAlso: #01758

Format of DOS .CPI file Font Data Header:
Offset  Size    Description     (Table 01761)
 00h    WORD    record type
                0001h FONT
                0002h DRFONT (DR DOS 6.0/Novell DOS 7 display font)
 02h    WORD    number of fonts
 04h    WORD    length of font data (display fonts)
                ??? (printer fonts)
 06h    var     font data (#fonts * fontlength) bytes
SeeAlso: #01758

Format of DOS .CPI file ScreenFONT Header:
Offset  Size    Description     (Table 01762)
 00h  6 BYTEs   display-font header (see #01764)
 06h    var     display font data
SeeAlso: #01758

Format of .CPI file DRFONT Header:
Offset  Size    Description     (Table 01763)
 00h 6N BYTEs   DisplayFONT headers for N fonts (see #01764)
      M WORDs   character index table for cell offsets in font data
                  currently 256 words in length
SeeAlso: #01758

Format of .CPI file DisplayFONT header:
Offset  Size    Description     (Table 01764)
 00h    BYTE    height of character cell
 01h    BYTE    width of character cell (currently always 08h)
 02h    BYTE    aspect ratio (height) (currently 00h, unused)
 03h    BYTE    aspect ratio (width) (currently 00h, unused)
 04h    WORD    number of characters per font (256)
SeeAlso: #01758

Format of .CPI file PrinterFONT header:
Offset  Size    Description     (Table 01765)
 00h    WORD    type of printer
                0001h (4201.CPI, 1050.CPI, EPS.CPI)
                0002h (4208.CPI, 5202.CPI, PPDS.CPI)
 02h    WORD    bytes per hardware/download codepage-select escape sequence
                (max 31, typically 12)
 04h  N BYTEs   escape sequence to select hardware codepage
      N BYTEs   escape sequence to select download codepage
        var     download data for printer font (including escape sequence
                  to transfer data)
SeeAlso: #01758
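
Example: the header layouts in #01758 through #01760 can be walked with a
few fixed-size unpack operations.  The Python sketch below is illustrative
only and is not taken from any DOS utility.  It assumes that every DWORD
pointer can be read as a flat 32-bit little-endian file offset (that is,
the high word is zero, as in 0000h:0017h), and it follows the "next entry"
pointer for at most the number of entries given in the Font Information
Header.  The names parse_cpi_headers and build_sample are hypothetical,
and build_sample produces a synthetic image that contains no font data.

```python
import struct

FILE_HEADER = "<B7s8sHBI"    # Table 01758: 17h bytes
ENTRY_HEADER = "<HIH8sH6sI"  # Table 01760: 1Ch bytes


def parse_cpi_headers(data: bytes):
    """Return (id_tag, id_string, entries) for a .CPI image held in memory."""
    tag, ident, _rsvd, _nptrs, _ptype, info_off = struct.unpack_from(
        FILE_HEADER, data, 0)
    if (tag, ident) not in ((0xFF, b"FONT   "), (0x7F, b"DRFONT ")):
        raise ValueError("not a FONT or DRFONT .CPI file")

    # Table 01759: WORD count, then the first codepage entry header.
    (count,) = struct.unpack_from("<H", data, info_off)
    entries = []
    off = info_off + 2
    for _ in range(count):
        size, next_off, dev_type, name, codepage, _r, font_off = \
            struct.unpack_from(ENTRY_HEADER, data, off)
        entries.append({
            "header_size": size,
            "device_type": dev_type,          # 1 = display, 2 = printer
            "device": name.decode("ascii").rstrip(),
            "codepage": codepage,
            "font_data_offset": font_off,
        })
        if next_off in (0, 0xFFFFFFFF):       # explicit "last entry" markers
            break
        off = next_off                         # assumption: flat file offset
    return tag, ident.decode("ascii").rstrip(), entries


def build_sample() -> bytes:
    """Build a synthetic two-entry FONT image (headers only, dummy pointers)."""
    header = struct.pack(FILE_HEADER, 0xFF, b"FONT   ", bytes(8), 1, 1, 0x17)
    info = struct.pack("<H", 2)
    first = 0x17 + 2
    second = first + 0x1C
    e1 = struct.pack(ENTRY_HEADER, 0x1C, second, 1, b"EGA     ", 437,
                     bytes(6), 0)
    e2 = struct.pack(ENTRY_HEADER, 0x1C, 0, 1, b"EGA     ", 850,
                     bytes(6), 0)
    return header + info + e1 + e2


if __name__ == "__main__":
    tag, ident, entries = parse_cpi_headers(build_sample())
    print(hex(tag), ident, [e["codepage"] for e in entries])
```

Because the entry count bounds the loop, a "next" pointer that leads into
the optional copyright text after the last font is never followed.  A real
DRFONT file carries the extended header at offset 17h, so its
FontInfoHeader pointer is larger than 17h; the sketch reads the pointer
rather than assuming a fixed position.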