The Kermit Project
Columbia University
New York City USA
fdc@columbia.edu
http://www.columbia.edu/kermit/
31 March 2000
Format: Plain text with line breaks. Encoding: ISO 8859-1 (1)
(1) So accents look right when viewed through a Web browser.
There is no way to announce UTF-8 in a plain-text file.
STATUS
UTC Document L2/00-159
ISO WG2 Document N2265

May 2000: UTC Motion [83-M24]: Move to Accept (for Unicode 3.2).
Sep 2000: Approved by ISO WG2 for ISO/IEC 10646-1:2000/Amd.1 at Athens;
  Resolution M39.20 (Terminal Graphic Symbols): Unanimous.

Assigned Codes:

  2071 SUPERSCRIPT LATIN SMALL LETTER I
  23B7 RADICAL SYMBOL BOTTOM
  23B8 LEFT VERTICAL BOX LINE
  23B9 RIGHT VERTICAL BOX LINE
  23BA HORIZONTAL SCAN LINE-1
  23BB HORIZONTAL SCAN LINE-3
  23BC HORIZONTAL SCAN LINE-7
  23BD HORIZONTAL SCAN LINE-9
  2596 QUADRANT LL
  2597 QUADRANT LR
  2598 QUADRANT UL
  2599 QUADRANT UL AND LL AND LR
  259A QUADRANT UL AND LR
  259B QUADRANT UL AND UR AND LL
  259C QUADRANT UL AND UR AND LR
  259D QUADRANT UR
  259E QUADRANT UR AND LL
  259F QUADRANT UR AND LL AND LR

Added in Unicode 3.2, 27 March 2002:

  http://www.unicode.org/unicode/reports/tr28/#12_5_technical_symbols
  http://www.unicode.org/charts/

In Unicode 3.0, the control picture glyphs at U+2400-2421 were revised to have a diagonal presentation, as recommended in earlier versions of this proposal.
ABSTRACT
A selection of terminal graphics characters is proposed to allow Unicode-based terminal emulation software to display glyphs that are found on popular types of terminals but that are not currently available in Unicode, and to exchange these characters with other Unicode-based applications. Approval of this proposal will promote the migration of terminal-based technical and forms-filling applications from physical terminals or emulators with custom fonts to standard Unicode-based emulators, and it will promote interoperability of terminal emulators with other Unicode applications and with each other.
INTRODUCTION
This is an update of the November 1998 proposal, which was composed of the following pieces:
  TERMINAL GRAPHICS FOR UNICODE (plain text)
    ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal_03.txt
    The full November 1998 proposal.
    STATUS: Revised and resubmitted as the present document.

  HEX BYTE PICTURES FOR UNICODE (plain text)
    ftp://kermit.columbia.edu/kermit/ucsterminal/hex.txt
    STATUS: Rejected by UTC, December 1998.

  ADDITIONAL CONTROL PICTURES FOR UNICODE (plain text)
    ftp://kermit.columbia.edu/kermit/ucsterminal/control.txt
    STATUS: Rejected by UTC, December 1998.

  Glyph Map (PDF, contributed by Michael Everson)
    ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-emulation.pdf

  Exhibits (PDF, contributed by Markus Kuhn)
    ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-exhibits.pdf

  Clarification of SNI Glyphs (Microsoft Word 7.0)
    ftp://kermit.columbia.edu/kermit/ucsterminal/sni-charsets.doc

  Discussion (plain text e-mail)
    ftp://kermit.columbia.edu/kermit/ucsterminal/mail.txt
Since the original proposal in 1998, Unicode 3.0 was released and after that an extended set of mathematics and technical symbols, STIX [35], was accepted in principle by the UTC and WG2, pending forthcoming ballots. And SHARE (the IBM mainframe users society) indicated no interest in the IBM 3270 operator status glyphs.
Therefore, the terminal graphics proposal is revised as follows:
 . 15 glyphs now available in the STIX Math Set have been withdrawn.

 . 11 glyphs unique to Data General terminals have been withdrawn, since all remnants of Data General host-terminal culture seem to have disappeared from the planet.

 . 8 glyphs for IBM 3270 Terminal Operator Status Indicators have been withdrawn.

 . 2 characters currently available in Unicode 3.0 have been withdrawn.
The total number of characters now proposed is 23 (18 with suggested unifications). They are from the non-IBM-3270 terminal types that are still widely used or emulated: DEC, Wyse, Siemens-Nixdorf, Televideo, IBM, Heath/Zenith. This is down from 59 in the 1998 proposal.
Some of the characters proposed are candidates for unification with existing Unicode characters, but only if the Unicode Standard is modified to specify their "semantics" with respect to terminal-emulation (monospace or duospace) font cell boundaries, line weights, and so on, as indicated in the discussion of each character. The Unicode Standard leaves much unsaid about box-drawing characters, block elements, and geometric shapes; one assumes that they all have the same width, but this is nowhere stated and it becomes a more serious issue with the approval of the STIX group, in which continuation lines must match in weight, angle, and position, and in which brace, bracket, and other symbol pieces must line up to be joined properly, a development not anticipated by the statement in 12.6[24] that the "Unicode Standard does not encourage this kind of character-based graphics model". Perhaps a new notion of "Connecting Class" or "Alignment Class" would be helpful for the characters in the 2300 (STIX) and 2500 blocks.
Since the number of proposed characters is small, I will simply list each character and its properties, with a brief discussion. The U+Exxx reference numbers remain unaltered from the original proposal, and are retained to key with the original glyph map. The final Unicode values for these characters should be assigned in the appropriate blocks of Plane 0 (so they can be used in Windows 95 and 98).
Grateful acknowledgements to those whose comments on previous drafts are reflected in this one: Kevin Bracey, Michael Everson, Doug Ewell, Asmus Freytag, Christine Gianone, Tony Harminc, Elliotte Rusty Harold, Edwin Hart, Kent Karlsson, Paul Keinanen, Markus Kuhn, Alain LaBonté, Heinz Lohse, Rick McGowan, Sean O'Leary, Jonathan Rosenne, Otto Stolz, Geoffrey Waigh, Kenneth Whistler, and Paul Williams. Special thanks to Michael Everson for his rendition of the proposed glyphs and to Markus Kuhn for scanning the exhibits.
The text of this proposal is available on the Internet as:
ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt
MOTIVATION
NOTE: This section is unchanged from the first proposal.
Terminal-host communication was the dominant form of interaction between human and computer from about 1974 (when CRTs became affordable)(1) to about 1994 (when the Web and Windows took over the mass market). Terminal-host communication is still widespread, especially in large organizations, and is expected to remain so for decades to come, playing an important part in organizations like universities, hospitals, government agencies, and corporations with central computing facilities. It is used in applications ranging from software development and system/network administration, to email and text-based Web access, to data entry and inquiry, to transaction processing, and it is also important to people who use speech or Braille devices and Telecommunications Devices for the Deaf (TDDs).
A text terminal, for purposes of this document, is a device for entry and display of text in a fixed-pitch font on a screen (or on paper) in which graphic characters are displayed as glyph images in rows and columns of "cells" of fixed and uniform size, one glyph image per cell. Text terminals generally display (or otherwise handle) the characters of ASCII [1] or EBCDIC [13], and often also accented or non-Roman letters (or ideograms), and often also "graphics" (2) (non-alphabetic, non-digit, non-punctuation) characters for purposes of line- and box-drawing, mathematics, or other special effects, and they also accept control characters or escape sequences for formatting.
In recent years, physical terminals have largely disappeared from the scene, their functions subsumed into PCs running terminal-emulation software alongside other applications. Unicode (viewed as a process) has effectively met the need for encoding the earth's writing systems, but so far it is not as well suited to terminal emulation as it might be since it lacks some of the required graphics characters.
Without a standard encoding for the missing glyphs, each maker of terminal emulation software must create or contract for custom fonts with private encodings. Such fonts are not compatible with other (otherwise compatible) fonts on the same platform (e.g. when copying from a terminal window and pasting to a word processor), nor with each other. Furthermore, should Unicode printers become standard equipment on PCs, terminal graphics characters will not print correctly on them (e.g. when used with the terminal's transparent printing, autoprinting, or dump-screen features).
This document proposes a modest repertoire of terminal graphics characters to be added to Unicode and ISO 10646, to supplement those already there (e.g. the line and box drawing characters at U+2500) to which all makers of fonts, code pages, and printers can refer when designing their products, and upon which all makers of terminal emulation and/or debugging software can base their screen displays.
To state the motivation for this proposal as clearly as I can:
 1. There are numerous terminal emulation products on the market, with a user base numbering in the millions.

 2. Increasingly, these products are designed for and used on systems, like Windows NT, that have Unicode fonts.

 3. Many terminal based applications take full advantage of the features and glyph repertoires of the terminals they are designed for (far beyond the simple models supported, e.g. by termcap/terminfo).

 4. The glyph repertoires of many common terminals (VT100/VT220, Wyse, Siemens Nixdorf, Data General, etc.) include glyphs that are not presently in Unicode.

 5. Customers of terminal emulation products often demand complete and accurate emulation.

 6. In order to succeed, makers of terminal emulation software must create private fonts containing the missing glyphs (which, as an aside, unnecessarily drives up the cost of the product for the end user) in the Private Use area.

 7. Because of the closed and proprietary nature of this process, each terminal emulation product potentially (and in fact) encodes the same characters at different places.

 8. Other applications use the Private Use Area for other purposes (and other glyphs).

 9. The result is that terminal emulation products do not interoperate with each other or with other applications on the same platform.
For example, a VT100 or HP forms-based screen can not be pasted into a word processing document without changing the forms borders (etc, depending on exactly how they are encoded) into whatever other glyphs happen to be defined at the same code points in the font used by the other application. Ditto for mathematical formulae displayed on DEC or Siemens Nixdorf screens. Ditto for character-cell illustrations or tables in numerous online texts intended for display on any of the widespread terminals.
Notes:
(1) Strictly speaking, terminals predate electronic computers by some decades; the Teletype (used as the control terminal on many mainframes and most minicomputers in the 1950s through 1970s) dates back to 1929.

(2) Note the distinction between "graphic" meaning "printing" (as in "ISO 8859-1 is a graphic character set") versus "graphics" meaning having something to do with pictures. Graphics terminals (such as the Tektronix 4010) also exist, but are not relevant to this proposal.
SCOPE
NOTE: This section is unchanged from the first proposal.
This document represents a survey of the following terminals:
  Data General D210,215,217,413,463 [2]
  Digital Equipment Corporation VT100 through VT520 [3-9]
  Heath / Zenith 19 [10]
  Hewlett Packard HP-2621 and HP-2648 [11,12]
  IBM 3164 and 3270 [15,16,27]
  Siemens Nixdorf 97801 [21]
  Televideo 922 and 965 [22,23]
  Wyse 60 and 370 [25,26]
as well as:
IBM PC code page 437 [14]
which is the basis for numerous PC-oriented so-called ANSI emulations.
Even within this fairly narrow scope, arriving at a sufficient set of character-cell terminal graphics for Unicode is complicated by the well-known problems that affect other preexisting character sets to varying degrees:
 1. Lack of official names for the characters of some of the sets.
 2. Lack of definitive, high-quality pictures of the glyphs in some cases.
 3. Lack of descriptions of the purpose and intended use of the glyphs.
 4. Lack of a current registration authority or owner in some cases.
 5. Questions of unification of glyphs from different terminal makers.
 6. End-user demand for specific characters or sets.
The issue of unification is complicated by the fact that some of the terminal graphics characters are designed to join at cell boundaries to form "pictures" (such as boxes or forms to be filled out) or large characters (such as big math symbols) spanning multiple rows and/or columns. The relationship of similar-looking glyphs for different terminals is difficult to determine — e.g. exactly where does a line touch an edge, and at what angle, and does it make a difference?
The question of unification should be considered not only in the GUI environment but also for platforms where only one font is available — a fixed-pitch "console" font — and in "DOS"-like windows or fullscreen sessions, where only one fixed-pitch font may be used; this sort of environment is often host to terminal applications. Examples: a full-screen Windows NT session; the new Unicode-based Linux console driver and font.
This proposal does not require any action for well-known terminal presentation forms such as double-high and/or double-wide characters, bold, blinking, inverse, italic, underlining, color, etc, since these are not encoding issues. In particular, no special code points are needed for double-high or double-wide characters, such as those seen on the DEC VT100 family of terminals, nor for compressed characters as seen on Data General and DEC terminals.
This proposal also does not cover true graphics terminals, such as Tektronix vector graphics units, DEC ReGIS or Sixel graphics, BBN Bitgraph, etc, since these graphics regimes are not character-cell based.
No attempt was made to account for the many Viewdata, Videotex, Minitel, NAPLPS, or similar character sets. These should be tackled, if at all, by someone who knows something about them.
Note that the graphic characters listed in this proposal rarely, if ever, appear on keyboard key labels. In general, these characters are never typed, not even on real terminals, but are displayed when the terminal is commanded into a special mode by the host; for example, with ISO 2022 [17] character-set designation and invocation escape sequences.
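To illustrate how a host commands the terminal into such a mode, here is a minimal Python sketch of the emulator side. It assumes the widely documented VT100 convention in which ESC ( 0 designates the DEC Special Graphics set as G0 and ESC ( B designates ASCII; the function name decode and the partial table are illustrative only and cover just a few positions that already have Unicode equivalents in the U+2500 block.

```python
ESC = "\x1b"

# Partial DEC Special Graphics table (illustration only): positions that
# already map onto existing Unicode box-drawing characters.
DEC_SPECIAL_GRAPHICS = {
    "j": "\u2518",  # lower right corner
    "k": "\u2510",  # upper right corner
    "l": "\u250c",  # upper left corner
    "m": "\u2514",  # lower left corner
    "q": "\u2500",  # horizontal line (Scan 5)
    "x": "\u2502",  # vertical line
}


def decode(stream: str) -> str:
    """Translate host output to Unicode, tracking the G0 designation.

    Simplification: only ESC ( <final> is recognized, and any final
    character other than "0" is treated as selecting ASCII.
    """
    out = []
    graphics = False
    i = 0
    while i < len(stream):
        if stream.startswith(ESC + "(", i) and i + 2 < len(stream):
            graphics = stream[i + 2] == "0"
            i += 3
            continue
        ch = stream[i]
        out.append(DEC_SPECIAL_GRAPHICS.get(ch, ch) if graphics else ch)
        i += 1
    return "".join(out)


print(decode(ESC + "(0lqqk" + ESC + "(B ok"))
```

The same bytes ("l", "q", "k") are ordinary letters before the designation and line-drawing glyphs after it, which is why these characters never need to appear on a keyboard.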
The characters proposed in this document are assigned temporary Unicode values from the Private Use area, strictly for reference within (or to) this document only. Final values should be assigned outside of the Private Use range. The temporary allocations are:
  E0A0-E0BF   Math Symbols
  E0D0-E0EF   Line and Box Drawing
There are many holes in the sequence; this reflects the withdrawal of numerous characters during the evolution of this proposal (see Appendix).
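As a quick sanity check of the temporary allocations, the following Python sketch confirms that both ranges lie inside the Private Use Area of Plane 0 (U+E000-F8FF) and that a sample code point carries the general category "Co" in the Unicode Character Database. The names TEMP_RANGES and in_private_use_area are hypothetical and serve only this illustration.

```python
import unicodedata

# Private Use Area of the Basic Multilingual Plane.
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF

# Temporary reference ranges used in this document.
TEMP_RANGES = {
    "Math Symbols": (0xE0A0, 0xE0BF),
    "Line and Box Drawing": (0xE0D0, 0xE0EF),
}


def in_private_use_area(code_point: int) -> bool:
    """True if the code point falls in the Plane 0 Private Use Area."""
    return PUA_FIRST <= code_point <= PUA_LAST


for label, (first, last) in TEMP_RANGES.items():
    ok = in_private_use_area(first) and in_private_use_area(last)
    print(f"{label}: U+{first:04X}-{last:04X} in PUA: {ok}")

# Private-use code points have category "Co" and no character name,
# so no application can infer their meaning from the standard alone.
print(unicodedata.category(chr(0xE0D0)))
print(unicodedata.name(chr(0xE0D0), "<no name>"))
```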
Legend:

  UL = Upper Left
  LL = Lower Left
  UR = Upper Right
  LR = Lower Right

Reference key:

  DGL = Data General Line Drawing Character Set [D3]
  DGM = Data General Word-Processing, Greek, and Math Character Set [D2]
  DSG = The DEC Special Graphics Character Set [A3]
  DTC = The DEC Technical Character Set [C2]
  H19 = The Heath/Zenith 19 Graphics Character Set [L1]
  IBM = IBM Graphic Character Global Identifier (GCGID) [14]
  SNI = Siemens Nixdorf Mathematisch [E5], SNI 97801.
  TVI = The Televideo 965 Multinational Character Set [23]
  WG3 = The Wyse Graphics 3 Character Set [F2]
  WYA = Wyse 60 "Standard ANSI", "UK ANSI", and "ANSI Graphics" [F3]
PROPOSED NEW CHARACTERS
Proposed character names should be changed as needed to conform to UTC and WG2 naming rules or conventions. References to STIX U+23xx values are based on L2/00-033R, and are subject to change. Suggested encodings in the appropriate blocks are given in case it is helpful, but these are in no way indicative of what the final encodings, if any, might be.
1. RADICAL SYMBOL BOTTOM
  Code:                    U+E0B0 (reference key to original November 1998 proposal)
  Suggested encoding:      U+23B7
  General Category:        Symbol, math (Sm)
  Combining Class:         None
  Bidirectional Category:  Other Neutral (ON)
  Decomposition Mapping:   None
  Decimal Digit Value:     None
  Digit Value:             None
  Numeric Value:           None
  Mirrored:                Yes (1)
Discussion: This character is from the DEC Technical Character Set, position 02/01 (column/row notation), used on VT330 and higher terminals and in DECterm windows to form large radicals. The vertical stroke extends to the top of the cell and connects with a centered vertical line (DEC Technical 02/06 = U+2502 or STIX U+23AE, whichever is appropriate (2)) or Upper Left Box Corner (DEC Technical 02/02 = U+250C), which in turn can be extended to the right with centered horizontal lines (DEC Technical 02/03 = U+2500 or STIX U+23AF as appropriate).
(1) Mirrored because (a) the radical signs at U+221A-C are mirrored, and (b) the Integral pieces at U+2320-1 are mirrored. Or not, if the STIX brace and bracket pieces are not mirrored.

(2) If the STIX extensions are appropriate but the U+25xx lines are not, then a STIX equivalent of Upper Left Box corner might also be needed.
Also note: The glyph shown for this character in the glyph table is not the right shape; the vertical stroke must be truly vertical and centered, not slanted.
2. SUPERSCRIPT LATIN SMALL LETTER i
  Code:                    U+E0B2
  Suggested encoding:      U+208F, U+209F, or U+2071
  General Category:        Symbol, math (Sm)
  Combining Class:         None
  Bidirectional Category:  Other Neutral (ON)
  Decomposition Mapping:   None
  Decimal Digit Value:     None
  Digit Value:             None
  Numeric Value:           None
  Mirrored:                No
References: IBM GCGID [29] LI011000 and Siemens-Nixdorf SNI 97801 [21] Math Character 03/00.
Discussion: In the GUI environment, one would argue that the rendering software should change the size and baseline of the regular small "i" at U+0069 to create the small superscript letter, and therefore a new character would not be needed. But this can not be done on a character-cell terminal or in (e.g.) a Windows NT Console screen; thus a new character is required if the terminal is to be emulated accurately and the meaning of the display not altered (does "3i" mean "3 times i" or "3 to the ith power"?). An objection to this character might be: "If we encode superscript i, then what about all the other letters (etc) in all the other scripts of the world?". The answer would be that none of the character-cell terminals surveyed include any superscript letter but "i" and "n", and "n" is already encoded at U+207F.
3. LEFT VERTICAL BOX LINE
  Code:                    U+E0D0
  Suggested Encoding:      U+23B8
  General Category:        Symbol, other (So)
  Combining Class:         None
  Bidirectional Category:  Other Neutral (ON)
  Decomposition Mapping:   None
  Decimal Digit Value:     None
  Digit Value:             None
  Numeric Value:           None
  Mirrored:                No
Discussion: This line goes along the left edge of the cell, extending to the top and bottom cell boundaries. Used with the Horizontal Line Scan 9 character (item 9 below) or the Low Line character, U+005F, to make square corners. Should have the same weight as the Box Drawings Light lines in the U+2500 block. Should abut with U+2500 to form a sideways T shape. References: IBM GCGID SF640000 and SV330000, Heath/Zenith 19 Graphics Character Set 07/12.
Note: The glyph shown for this character in the glyph table is not in the correct position. It should be all the way to the left in the cell, not near the center.
4. RIGHT VERTICAL BOX LINE
  Code:                    U+E0D1
  Suggested Encoding:      U+23B9
  General Category:        Symbol, other (So)
  Combining Class:         None
  Bidirectional Category:  Other Neutral (ON)
  Decomposition Mapping:   None
  Decimal Digit Value:     None
  Digit Value:             None
  Numeric Value:           None
  Mirrored:                No
Discussion: Like (3) but on the right. References: IBM GCGID SF650000, Heath/Zenith 19 Graphics Character Set 07/13.
5-9. HORIZONTAL LINES
  Codes:                   U+E0D6-E0DA
  Suggested Encodings:     (see discussion)
  General Category:        Symbol, other (So)
  Combining Class:         None
  Bidirectional Category:  Other Neutral (ON)
  Decomposition Mapping:   None
  Decimal Digit Value:     None
  Digit Value:             None
  Numeric Value:           None
  Mirrored:                No

  Code  Name    References
  E0D6  Scan 1  DSG 06/15, H19 07/10, WG3 05/00, TVI 09/00, IBM SV300400
  E0D7  Scan 3  DSG 07/00, WYA 01/01, WG3 05/00, IBM SV300200
  E0D8  Scan 5  DSG 07/01, WYA 02/02, IBM SV300300, IBM SM920000
  E0D9  Scan 7  DSG 07/02, WYA 01/03, WG3 05/01, IBM SV300100
  E0DA  Scan 9  DSG 07/03, H19 07/11, WG3 05/01, TVI 09/01, IBM SV300600
Discussion: These are horizontal lines at different heights, designed to be joined to lines in adjacent cells that terminate at the same height and therefore extending the full width of the character cell. They should have the same weight as the Box Drawings Light lines in the U+2500 block. The Scan numbers refer to the pixels of the VT100 character cell; these are the names given in the VT terminal manuals; similar terminology ("Horizontal Line 7/9ths Height") is used in the IBM GCGID.
Scan 1 is at the top of the cell; Scan 5 is vertically centered; Scan 9 is at the bottom. Scans 1 and 9 should make square corners with Left and Right Vertical Box Lines (items 3 and 4 above). Scans 3, 5, and 7 should abut squarely against the Vertical Box Lines, forming a sideways T shape.
Note: The glyphs shown for these characters in the glyph table are not wide enough; they should extend the full width of the character cell.
Unification: Since Scan 5 is centered vertically in the cell, it corresponds to U+2500 if the latter (a) extends to the edges of the cell and (b) is centered vertically in the cell. These properties are unstated, but obtain in currently available monospace Unicode fonts such as Lucida Console and Courier New. The same might be true of STIX U+23AF, Horizontal Line Extension, and if so, this would be the more appropriate unification.
Suggested encodings:
  Scan 1: U+23BA
  Scan 3: U+23BB
  Scan 5: Unify with U+23AF or U+2500
  Scan 7: U+23BC
  Scan 9: U+23BD
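The column/row references for the DEC Special Graphics set (DSG 06/15 through 07/03) can be turned into byte values and then into Unicode with a short Python sketch. It uses the code points listed under STATUS for Scans 1, 3, 7, and 9, and it assumes, purely for illustration, that Scan 5 is unified with U+2500 rather than U+23AF. The names col_row_to_byte, DSG_SCAN_LINES, and DSG_TABLE are hypothetical.

```python
def col_row_to_byte(col: int, row: int) -> int:
    """Convert column/row notation (e.g. 06/15) to a byte value."""
    return col * 16 + row


# DEC Special Graphics scan-line positions and their Unicode targets.
DSG_SCAN_LINES = {
    (6, 15): 0x23BA,  # Scan 1 (top of cell)
    (7, 0): 0x23BB,   # Scan 3
    (7, 1): 0x2500,   # Scan 5 (assumed unified with U+2500)
    (7, 2): 0x23BC,   # Scan 7
    (7, 3): 0x23BD,   # Scan 9 (bottom of cell)
}

# Translation table keyed by byte value, usable with str.translate().
DSG_TABLE = {
    col_row_to_byte(col, row): code_point
    for (col, row), code_point in DSG_SCAN_LINES.items()
}

# In the special graphics set the bytes for "opqrs" select Scans 1 to 9.
for byte, code_point in sorted(DSG_TABLE.items()):
    print(f"0x{byte:02X} ({chr(byte)!r}) -> U+{code_point:04X}")

print("opqrs".translate(DSG_TABLE))
```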
10-13. WEDGES
  Codes:                   U+E0D2-E0D5
  Suggested Encodings:     (see discussion)
  General Category:        Symbol, other (So)
  Combining Class:         None
  Bidirectional Category:  Other Neutral (ON)
  Decomposition Mapping:   None
  Decimal Digit Value:     None
  Digit Value:             None
  Numeric Value:           None
  Mirrored:                No

  Code  Name      References
  E0D2  UL Wedge  H19 07/02, IBM SF870000
  E0D3  UR Wedge  H19 05/14, IBM SF860000
  E0D4  LL Wedge  IBM SF850000
  E0D5  LR Wedge  IBM SF840000
Discussion: A wedge is a character cell with a diagonal line connecting opposite corners, dividing the entire cell into two triangles: one dark, the other light; the wedge is the dark part. These characters are used for mosaic graphics in the Heath/Zenith-19 Graphics and IBM character sets.
Unification: An Upper Left Wedge is similar to U+25E4, except it fills the entire character cell. It can be inferred from the Unicode 2.0 and 3.0 books that U+25E4 does NOT extend to the edges of the cell, since its vertical and horizontal edges are the same length as Black Square, U+25A0, which is contrasted with (and smaller than) Full Block, U+2588. If this inference is not true, then the wedges do not need to be encoded. However, in that case, the Unicode Standard should state that U+25E2-25E5 extend to the edges of the cell when implemented in a terminal-emulation font.
Suggested encoding: Unify with U+25E2-25E5.
14-23. QUADRANTS
  Codes:                   U+E0DB-E0E4
  Suggested Encodings:     U+2596-259F.
  General Category:        Symbol, other (So)
  Combining Class:         None
  Bidirectional Category:  Other Neutral (ON)
  Decomposition Mapping:   None
  Decimal Digit Value:     None
  Digit Value:             None
  Numeric Value:           None
  Mirrored:                No

  Code  Name                       References
  E0DB  Quadrant LL                H19 06/13, WG3 05/05, TVI 09/05
  E0DC  Quadrant LR                H19 06/12, WG3 05/04, TVI 09/04
  E0DD  Quadrant UL                H19 06/14, WG3 05/06, TVI 09/06
  E0DE  Quadrant UL and LL and LR  WG3 05/11, TVI 09/11
  E0DF  Quadrant UL and LR         H19 06/10 (3)
  E0E0  Quadrant UL and UR and LL  WG3 05/12, TVI 09/12
  E0E1  Quadrant UL and UR and LR  WG3 05/13, TVI 09/13
  E0E2  Quadrant UR                H19 111, WG3 83, TVI 09/03
  E0E3  Quadrant UR and LL
  E0E4  Quadrant UR and LL and LR  WG3 05/14, TVI 09/14
Discussion: A character cell can be divided into four equal quadrants by a horizontal and vertical line that intersect in the center. Each quadrant can be light or dark. There are 16 possible combinations, of which six are already encoded: all white (U+0020 or U+00A0), all dark (U+2588), the left and right half blocks at U+258C and U+2590, and the top and bottom half blocks at U+2580 and U+2584. Nine of the ten remaining combinations are used on the terminals listed above for mosaic graphics; the tenth (E0E3) should be included for completeness.
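The count of 16 combinations can be made concrete with a small Python sketch that assigns one bit to each quadrant and looks up the corresponding character: the six already-encoded characters named above plus the ten code points listed under STATUS (U+2596-259F). The bit assignment and the names QUADRANT_CHARS and render are arbitrary choices made for this illustration, not part of the proposal.

```python
# One bit per quadrant (arbitrary assignment for this sketch).
UL, UR, LL, LR = 8, 4, 2, 1

QUADRANT_CHARS = {
    0: 0x0020,                  # all light (space)
    UL | UR | LL | LR: 0x2588,  # all dark (full block)
    UL | LL: 0x258C,            # left half block
    UR | LR: 0x2590,            # right half block
    UL | UR: 0x2580,            # upper half block
    LL | LR: 0x2584,            # lower half block
    LL: 0x2596,
    LR: 0x2597,
    UL: 0x2598,
    UL | LL | LR: 0x2599,
    UL | LR: 0x259A,
    UL | UR | LL: 0x259B,
    UL | UR | LR: 0x259C,
    UR: 0x259D,
    UR | LL: 0x259E,
    UR | LL | LR: 0x259F,
}


def render(pixels):
    """Render a grid of 0/1 values (even width and height) as mosaic text.

    Each 2x2 group of pixels becomes one character cell.
    """
    lines = []
    for y in range(0, len(pixels), 2):
        top, bottom = pixels[y], pixels[y + 1]
        cells = []
        for x in range(0, len(top), 2):
            mask = ((UL if top[x] else 0) | (UR if top[x + 1] else 0)
                    | (LL if bottom[x] else 0) | (LR if bottom[x + 1] else 0))
            cells.append(chr(QUADRANT_CHARS[mask]))
        lines.append("".join(cells))
    return "\n".join(lines)


print(len(QUADRANT_CHARS))  # every one of the 16 masks has a character
print(render([[1, 0, 1, 1],
              [0, 1, 1, 1]]))
```

With all sixteen combinations available, a mosaic image at twice the cell resolution can be drawn using nothing but standard characters.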
SUMMARY
The following supplemental terminal graphic characters are proposed:
   1. E0B0  Small Radical Symbol
   2. E0B2  Superscript Latin Small Letter i
   3. E0D0  Left Vertical Box Line
   4. E0D1  Right Vertical Box Line
   5. E0D6  Horizontal Line - Scan 1
   6. E0D7  Horizontal Line - Scan 3
   7. E0D8  Horizontal Line - Scan 5
   8. E0D9  Horizontal Line - Scan 7
   9. E0DA  Horizontal Line - Scan 9
  10. E0D2  Upper Left Wedge
  11. E0D3  Upper Right Wedge
  12. E0D4  Lower Left Wedge
  13. E0D5  Lower Right Wedge
  14. E0DB  Quadrant LL
  15. E0DC  Quadrant LR
  16. E0DD  Quadrant UL
  17. E0DE  Quadrant UL and LL and LR
  18. E0DF  Quadrant UL and LR
  19. E0E0  Quadrant UL and UR and LL
  20. E0E1  Quadrant UL and UR and LR
  21. E0E2  Quadrant UR
  22. E0E3  Quadrant UR and LL
  23. E0E4  Quadrant UR and LL and LR
Of these, the following are candidates for unification:
   7. Horizontal scan line 5 could be unified with U+2500, Box Drawings Light Horizontal, or with U+23AF, Horizontal Line Extension, if either of these is centered vertically in the cell, extends to the right and left cell boundaries, and is the same weight as the other scan lines.

  10-13. The wedges can be unified with U+25E2-25E5 if the latter are guaranteed to extend to the edges of the cell in terminal emulation fonts.
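For software that has stored text using the temporary reference codes, the relationship between those codes and the values listed under STATUS can be sketched as a Python translation table. The entries for Scan 5 and the wedges are assumptions: they take the suggested unifications at face value (Scan 5 with U+2500; the wedges with the black triangles at U+25E2-25E5, matched by corner). The names PUA_TO_ASSIGNED and migrate are hypothetical.

```python
# Reference code (Private Use) -> assigned or unified code point.
PUA_TO_ASSIGNED = {
    0xE0B0: 0x23B7,  # Radical Symbol Bottom
    0xE0B2: 0x2071,  # Superscript Latin Small Letter i
    0xE0D0: 0x23B8,  # Left Vertical Box Line
    0xE0D1: 0x23B9,  # Right Vertical Box Line
    0xE0D6: 0x23BA,  # Horizontal Line - Scan 1
    0xE0D7: 0x23BB,  # Horizontal Line - Scan 3
    0xE0D8: 0x2500,  # Scan 5 (assumption: unified with U+2500)
    0xE0D9: 0x23BC,  # Horizontal Line - Scan 7
    0xE0DA: 0x23BD,  # Horizontal Line - Scan 9
    # Wedges (assumption: unified with the black triangles).
    0xE0D2: 0x25E4,  # Upper Left Wedge
    0xE0D3: 0x25E5,  # Upper Right Wedge
    0xE0D4: 0x25E3,  # Lower Left Wedge
    0xE0D5: 0x25E2,  # Lower Right Wedge
}

# The ten quadrants were assigned in the same order as the reference codes.
PUA_TO_ASSIGNED.update({0xE0DB + i: 0x2596 + i for i in range(10)})


def migrate(text: str) -> str:
    """Replace temporary reference codes with standard code points."""
    return text.translate(PUA_TO_ASSIGNED)


sample = "\ue0d0\ue0db\ue0e4\ue0d1"
print([f"U+{ord(ch):04X}" for ch in migrate(sample)])
```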
APPENDIX: DISPOSITION OF WITHDRAWN CHARACTERS
The following characters from the original proposal have been withdrawn for the reasons shown:
  Code       Description                               Reason
  E080       Human stick figure                        Withdrawn (3270)
  E081       Human stick figure in box                 Withdrawn (3270)
  E082       Clock at 6:10 (or 1:30)                   Withdrawn (3270)
  E083       White rectangle with stroke               Withdrawn (3270)
  E084       Black rectangle with stroke               Withdrawn (3270)
  E085       Lightning with stroke                     Withdrawn (3270)
  E086       Security key                              Withdrawn (3270)
  E087       Black and White Right-Pointing Triangles  Withdrawn (3270)
  E0A0       Extensible left brace middle              STIX U+23A8
  E0A1       Extensible left parenthesis bottom        STIX U+239D
  E0A2       Extensible left parenthesis top           STIX U+239B
  E0A3       Extensible left SB bottom                 STIX U+23A3
  E0A4       Extensible left SB top                    STIX U+23A1
  E0A5       Extensible right brace middle             STIX U+23AC
  E0A6       Extensible UR or LL brace section         STIX U+23B1
  E0A7       Extensible LR or UL brace section         STIX U+23B0
  E0A8       Extensible right parenthesis bottom       STIX U+23A0
  E0A9       Extensible right parenthesis top          STIX U+239E
  E0AA       Extensible right SB bottom                STIX U+23A6
  E0AB       Extensible right SB top                   STIX U+23A4
  E0AC       Summation symbol bottom                   STIX U+23B3
  E0AD       Summation symbol top                      STIX U+23B2
  E0AE       Right ceiling corner                      U+2309 (1)
  E0AF       Right floor corner                        U+230B (1)
  E0B1       Radical symbol with stroke                Withdrawn (DG)
  E0E5       Full black diamond                        STIX U+29EB
  E0E6       Black framus (2)                          Withdrawn (DG), STIX U+29D7
  E0E7       Black framus + H center bar               Withdrawn (DG)
  E0E8       White framus                              Withdrawn (DG), STIX U+29D6
  E0E9       White framus + H center bar               Withdrawn (DG)
  E0EA       R & L arrow to V center bar               Withdrawn (DG)
  E0EB       Up arrow to H center line                 Withdrawn (DG)
  E0EC       R arrow to V center line                  Withdrawn (DG)
  E0ED       L arrow to V center line                  Withdrawn (DG)
  E0EE       Down arrow to H center line               Withdrawn (DG)
  E0EF       Box drawing double dash H                 Withdrawn (DG)
  E000-E09C  Additional Control Pictures               Rejected by UTC
  E100-E1FF  Hex byte pictures                         Rejected by UTC
(1) These characters were in Unicode all along, but the shape shown in the Unicode book was different from the shape on the terminal. However, this is not sufficient reason to have two versions of the same symbol.
(2) STIX uses the term Hourglass for this shape.
REFERENCES
Note: The references are preserved from the original proposal, even though some of them apply only to portions of it that were rejected or withdrawn. Reference [24] is updated. Reference [35] is new.
[1] American National Standards Institute, ANSI X3.4-1986, Code for Information Interchange (ASCII), 1986.
[2] Data General, Programming the Display Terminal: Models D217, D413, and D463, Westboro, MA, 1991.
[3] Digital Equipment Corporation, VT100 User Guide, EK-VT100-UG-002, Maynard, MA, 1979.
[4] Digital Equipment Corporation, VT102 Video Terminal User Guide, EK-VT102-UG-003, Maynard, MA, 1982.
[5] Digital Equipment Corporation, VT220 Owner's Manual, EK-VT220-UG-003, Maynard, MA, 1984.
[6] Digital Equipment Corporation, VT220 Series Programmer Reference Manual, EK-VT240-RM-002, Maynard, MA, 1984.
[7] Digital Equipment Corporation, VT330/VT340 Programmer Reference Manual, Volume 1: Text Programming, ED-VT3XX-TP-002, Maynard, MA, 1988.
[8] Digital Equipment Corporation, Installing and Using the VT420 Video Terminal EK-VT420-UG.002, Maynard, MA, 1988.
[9] Digital Equipment Corporation, VT520/VT525 Video Terminal Programmer Information, EK-VT520-RM.A01, Maynard, MA, 1994.
[10] Heathkit Manual for the Video Terminal Model H19, The Heath Company,
Benton Harbor, MI, 1979.
[11] Hewlett Packard 2621A/P Interactive Terminal Owner's Manual, 1978.
[12] Hewlett Packard 2648A Graphics Terminal Reference Manual, 1977.
[13] IBM System/360 Principles of Operation, GA22-6821-8, Poughkeepsie,
NY, 1970.
[14] IBM National Language Design Guide, Volume 2: National Language
Support Reference Manual, 4th Edition, SE09-8002-03, North York ON, 1994.
[15] IBM 3270 Information Display System, Component Description,
GA27-2749-10, 1980.
[16] IBM 3164 ASCII Color Display Station Description, GA18-2317-1, 1986.
[17] ISO International Standard 2022, Information processing — ISO
7-bit and 8-bit coded character sets — Code extension techniques, Third Edition, Geneva, 1986.
[18] ISO/IEC International Standard 6429, Information technology —
Control functions for coded character sets, Third Edition, Geneva, 1992.
[19] ISO/IEC 10646-1, International Standard 10646,
Information Processing — Multiple-Octet Coded Character Set, 1993-now.
[20] Perkin Elmer Model 1100 User's Manual, Randolph, NJ, 1978.
[21] Siemens Nixdorf, Bildschirmeinheit 97801-5xx Schnittstellen,
Benutzerhandbuch, München, 1991.
[22] Televideo 922 Video Terminal Display Operator's Manual, Sunnyvale, CA,
1984.
[23] Televideo 965 Video Terminal Display Operator's Manual, Sunnyvale, CA,
1988.
[24] The Unicode Standard, Version 3.0, Addison-Wesley, 2000.
[25] Wyse WY-60 Programmer's Guide, Wyse Technology, San Jose, CA, 1987.
[26] Wyse WY-370 Programmer's Guide, Wyse Technology, San Jose, CA, 1990.
[27] IBM 3270 Information Display System, Data Stream Programmer's Reference,
GA23-0059-06, 1991.
[28] ISO International Register of Coded Characters to Be Used with Escape
Sequences, European Computer Manufacturers Association (ECMA), Geneva, 1985-present.
[29] IBM Character Data Representation Architecture, Level 1 Registry, IBM
Canada Ltd., National Language Technical Centre, Ontario, SC09-1391-00, 1990 (superseded by: IBM Character Data Representation Architecture, Registration and Registry, IBM Canada Ltd., Toronto, SC09-2190-00, 1995).
[30] Knuth, Donald, "TeX and METAFONT, New Directions in Typesetting",
American Mathematical Society / Digital Press, Bedford MA, 1979.
[31] Apple Computer Corporation, Inside Macintosh, 1984.
[32] HDS-3200 Terminal Series Owner's Manual, Philadelphia PA, 1987.
[33] Zenith Data Systems Video Terminal Z-19-CN Operation Manual, Saint
Joseph, MI, 1981.
[34] Interview 30A/40A Operator's Field Reference Guide, Atlantic Research
Corporation, ATLC-107-919-101, Alexandria, VA, 1982.
[35] Unicode Consortium Document L2/00-033R, STIX Math Symbols, 9 Feb 2000.
EXHIBITS
The following exhibits, which may be viewed at:
ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-exhibits.pdf
are reproduced from the terminal manuals indicated by the numeric reference number. Each exhibit is 1 page unless otherwise indicated.
[A1] VT220 Display Controls Font (Left Half) [5].
[A2] VT220 Display Controls Font (Right Half) [5].
[A3] VT220 DEC Special Graphics Character Set [5].
[B1] VT320 Display Controls Font (Left Half) [7].
[B2] VT320 Display Controls Font (Right Half) [7].
[C1] VT420 Display Controls Font (Both Halves) [8].
[C2] VT420 DEC Technical Character Set [8].
[C3] HDS-3200 DEC Technical Character Set [32].
[D1] Data General US ASCII Character Set [2].
[D2] Data General Word-Processing, Greek, and Math Character Set [2].
[D3] Data General Line Drawing Character Set [2].
[D4] Data General Special Graphics Character Set [2].
[D5] Data General VT Multinational Character Set [2].
[D6] Data General VT Special Graphics Character Set [2].
[D7] Data General ISO 8859/1.2 Character Set [2].
[E1] Siemens Nixdorf 97801 ISO 8859-1 Character Set [21].
[E2] Siemens Nixdorf 97801 Klammern (Brackets) Character Set [21].
[E3] Siemens Nixdorf 97801 Facet Character Set [21].
[E4] Siemens Nixdorf 97801 IBM Character Set [21].
[E5] Siemens Nixdorf 97801 Math Character Set [21].
[E6] Siemens Nixdorf 97801 Character Generator (8 pages) [21].
[F1] Wyse 60 Native, Multinational, PC, and ASCII Character Sets [25].
[F2] Wyse 60 Graphics 1, 2, and 3 Character Sets [25].
[F3] Wyse 60 Standard ANSI, ANSI Graphics, and UK ANSI Character Sets [25].
[G1] Wyse 370 Controls Display Mode (74Hz) [26].
[G2] Wyse 370 Controls Display Mode (60Hz) [26].
[G3] Wyse 370 C0, ASCII, and Special Graphics Character Sets [26].
[G4] Wyse 370 C1, Multinational, and Latin-1 Character Sets [26].
[H1] IBM 3270 Operator Information Area Symbols (10 pages) [15].
[I1] TeX Standard Extension Font [30].
[J1] Apple Symbol Font (2 pages) [31].
[K1] Hewlett Packard 2621A/P National Terminal Character Set [11].
[L1] Heath/Zenith-19 Graphic Symbols (2 pages) [33].
[M1] Televideo 922 ASCII, Supplemental, Special Character Sets (4 pages) [22].
[N1] Sample screen from a data analyzer showing hex display [34].
(End)