The original Atom VDG chip's text font uses a simple 5 x 7 pixel array for each of its 64 characters. Lower case letters were omitted to reduce the ROM needed, and to avoid dealing with their pesky descenders. Instead, the VDG provides 32 inverted characters and 64 semi-graphic characters. The latter are economical on ROM, and can even be generated with simple logic instead. Other niggles include the lack of characters such as the British Pound sign (£) or the upright bar (|). To make things worse, the font order is not an exact match to ASCII. Some re-ordering is required to turn ASCII into the equivalent VDU code.
Original |
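The re-ordering can be sketched in a few lines. This is only an illustration: it assumes the usual 6-bit layout in which '@', 'A' to 'Z' and the bracket characters occupy codes 00 to 1F, while space, digits and punctuation occupy 20 to 3F. The function name `ascii_to_vdg` is made up for this example.

```python
def ascii_to_vdg(ch: str) -> int:
    """Map one ASCII character to an assumed 6-bit VDG character code."""
    code = ord(ch.upper())  # the original font has no lower case
    if not 0x20 <= code <= 0x5F:
        raise ValueError(f"no glyph for {ch!r} in the 64-character font")
    # Assumed layout: ASCII 40-5F lands on 00-1F, ASCII 20-3F stays put.
    # Masking to 6 bits produces both ranges at once.
    return code & 0x3F


if __name__ == "__main__":
    for ch in "ATOM 1":
        print(repr(ch), format(ascii_to_vdg(ch), "02X"))
```

Under that assumption the conversion is a single mask, which is why the font order is only a minor nuisance rather than a real obstacle.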
These limitations were just about acceptable in 1982, but they are definitely not today. Even a simple text editor requires lower case, so it is desirable to have some way of providing a more versatile font.
ASCII has only 7 bits for the character code, the 8th bit being reserved for parity. Many systems used an 8-bit character set where the lower half was ASCII but the upper half was anything they wished. Since this led to incompatibility, these ad hoc 8-bit codings were superseded by ISO-8859.
ISO-8859-1 Character set |
This uses 8 bits for the character code. Codes 00 to 7F are
compatible with the most common form of ASCII.
Codes A0 to FF are regional variants, mainly holding accented
characters:
See http://czyborra.com/charsets/iso8859.html for more details.
If designing a display chip from scratch I would be tempted to implement the common characters in ROM, and the variations in RAM. However, FPGA chips often have blocks of RAM that can be used equally easily for implementing ROM or RAM.
Translating ISO 8-bit codes to Unicode
Useful table that highlights the characters that have to be changed for regional variants of ISO-8859.
Unicode, in its basic form, allocates 16 bits per character, to cope with the wide range of characters in the world's languages. Codes are allocated in pages of 256 codes. The most-significant 8 bits are the page number.
There isn't room to support 64K of character glyphs, so just the first 256 are to be implemented.
0000-001F | Basic Control codes |
0020-007F | Basic Latin characters (20-7F) |
0080-009F | Extra Control codes |
00A0-00FF | Extra Latin characters (A0-FF) |
Other interesting groups
0100-017F | Latin extended A |
0180-024F | Latin extended B |
0250-02AF | International Phonetic Alphabet |
0900-097F | Devanagari (Hindi script) |
2800-28FF | Braille dot patterns |
3040-309F | Hiragana (Japanese) |
30A0-30FF | Katakana (Japanese) |
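The page arithmetic is easy to illustrate. The sketch below splits a 16-bit code into its page number and the offset within that page, and checks whether the code falls in page 0, the only page the 256-glyph font covers. The helper names are invented for this example.

```python
def split_code(code_point: int) -> tuple:
    """Return (page, offset) for a 16-bit character code."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("only 16-bit codes are handled in this sketch")
    return code_point >> 8, code_point & 0xFF


def in_font(code_point: int) -> bool:
    """True when the code lies in page 0 (codes 0000 to 00FF)."""
    return split_code(code_point)[0] == 0


if __name__ == "__main__":
    for cp in (0x00A3, 0x0900, 0x3042):
        page, offset = split_code(cp)
        print(f"{cp:04X}: page {page:02X}, offset {offset:02X}, in font: {in_font(cp)}")
```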
Many of the world's fonts are rather ornate and hard to pack in an 8-pixel wide cell. For these it might be best to render the characters on the graphic screen.
5 x 7 dots are enough to define the basic Latin characters, though lower case letters look a bit kludgy as they have to be shifted up to fit in the matrix. One economical solution might be to have a programmable attribute bit to shift the 5x7 cell down by 3 lines in a 5x10 cell.
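The shift idea can be sketched as follows. This is a hypothetical model rather than a description of any real chip: a 7-row glyph is placed in a 10-row cell, either at the top or 3 lines lower when an assumed per-glyph attribute bit is set, so that a descender can hang below the normal baseline.

```python
def place_in_cell(glyph_rows, shift_down, cell_height=10, shift=3):
    """Place a 7-row glyph in a taller cell, optionally shifted down."""
    offset = shift if shift_down else 0
    if offset + len(glyph_rows) > cell_height:
        raise ValueError("glyph does not fit in the cell")
    cell = [0] * cell_height            # blank scan lines
    for i, row in enumerate(glyph_rows):
        cell[offset + i] = row          # copy each glyph row into place
    return cell


if __name__ == "__main__":
    glyph = [1, 2, 3, 4, 5, 6, 7]       # placeholder row values, not real pixels
    print(place_in_cell(glyph, shift_down=False))
    print(place_in_cell(glyph, shift_down=True))
```

The attraction is that the font store stays at 7 rows per glyph while the display still gains room for descenders.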
The simplest method is to have an 8x12 character cell for all 256 characters. Since I am not strapped for FPGA RAM at the moment, this is the method I intend to use.
I currently use 6K of the 8K block-RAM for firmware. This leaves 2K, enough for 256 character cells of 8x8 pixels, but not the 3K needed to define them in 8x12 pixels. The font memory has to be inside the FPGA, whereas the firmware can be in external ROM or RAM, so the firmware has to give way.
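The memory budget is simple arithmetic, sketched here for checking. It assumes one bit per pixel and one byte per 8-pixel row; the constant and function names are illustrative only.

```python
BLOCK_RAM_BYTES = 8 * 1024   # total block-RAM quoted in the text
FIRMWARE_BYTES = 6 * 1024    # currently used by firmware


def font_bytes(chars: int, width: int, height: int) -> int:
    """Bytes needed for a 1-bit-per-pixel font."""
    bytes_per_row = (width + 7) // 8   # round each row up to whole bytes
    return chars * height * bytes_per_row


if __name__ == "__main__":
    free = BLOCK_RAM_BYTES - FIRMWARE_BYTES
    for height in (8, 12):
        need = font_bytes(256, 8, height)
        verdict = "fits" if need <= free else "does not fit"
        print(f"8x{height}: {need} bytes needed, {free} free, {verdict}")
```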
. | . | . | # | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | Space for accents over capitals |
. | . | # | . | # | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | |
. | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | |
. | . | . | # | . | . | . | . | . | . | . | . | . | # | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
. | . | # | . | # | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | |
. | # | . | . | . | # | . | . | # | # | # | # | . | # | . | . | # | # | # | . | . | # | . | # | # | . | . | Main space for lower case letters |
. | # | . | . | . | # | . | # | . | . | . | . | . | # | . | # | . | . | . | # | . | # | # | . | . | # | . | |
. | # | # | # | # | # | . | . | # | # | # | . | . | # | . | # | . | . | . | # | . | # | . | . | . | # | . | |
. | # | . | . | . | # | . | . | . | . | . | # | . | # | . | # | . | . | . | # | . | # | . | . | . | # | . | |
. | # | . | . | . | # | . | # | # | # | # | . | . | # | . | . | # | # | # | # | . | # | . | . | . | # | . | |
. | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | # | . | . | . | . | . | . | . | Space for descenders |
. | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | # | # | # | . | . | . | . | . | . | . | . |
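As an illustration of how such a cell could be stored, the sketch below packs the first glyph in the grid (a capital A with a circumflex) into 12 bytes, one per scan line. The padding to 8 columns and the choice of the leftmost pixel as the most significant bit are assumptions made for this example.

```python
# 12 rows of 8 pixels: '#' is a lit pixel, '.' is background.
A_CIRCUMFLEX = [
    "...#....",  # space for accents over capitals
    "..#.#...",
    "........",
    "...#....",  # the capital letter starts here
    "..#.#...",
    ".#...#..",
    ".#...#..",
    ".#####..",
    ".#...#..",
    ".#...#..",
    "........",  # space for descenders (unused by this glyph)
    "........",
]


def pack_glyph(rows):
    """Convert 12 rows of 8 '.'/'#' characters into 12 bytes."""
    if len(rows) != 12 or any(len(r) != 8 for r in rows):
        raise ValueError("expected 12 rows of 8 pixels")
    return bytes(int(r.replace(".", "0").replace("#", "1"), 2) for r in rows)


if __name__ == "__main__":
    print(pack_glyph(A_CIRCUMFLEX).hex(" "))
```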
The best compromise seems to be a default character coding
that corresponds to codes 0000 to 00FF of Unicode.
The ISO-8859-1 character codes then map directly onto their
Unicode equivalents.
It also incurs the least work when modifying the font for national
variants.
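The direct mapping can be checked with Python's standard `latin-1` codec, which implements ISO-8859-1. The sketch below confirms that every 8-bit code decodes to the Unicode code point with the same numeric value; the function name is invented for this example.

```python
def latin1_to_unicode(byte_value: int) -> int:
    """Decode one ISO-8859-1 byte and return its Unicode code point."""
    return ord(bytes([byte_value]).decode("latin-1"))


if __name__ == "__main__":
    # Every one of the 256 codes maps onto the same number in Unicode.
    print(all(latin1_to_unicode(b) == b for b in range(256)))
    # The pound sign, missing from the original font, sits at A3.
    print(bytes([0xA3]).decode("latin-1"))
```

So a font indexed by ISO-8859-1 code needs no translation table for Unicode text that stays within 0000 to 00FF.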
There is another matter to consider: the control codes do not have associated glyphs that appear in a font, but their values can still be poked into display RAM. Therefore it seems sensible to use such values for semigraphic glyphs (as the original Atom VDG does) or for attribute-changing bytes (as teletext chips do).
The 64 control codes (00 to 1F and 80 to 9F) are enough to enumerate the full 64 semigraphic characters of the original Atom VDG.
Given the choice between Atom semigraphics and Teletext attributes, the latter seems the more appealing because it allows coloured words and graphics to appear on screen without needing extra memory to store colour data.
The 256-character font has been implemented and the results are as expected. Some minor details have been changed.
4F | Letter O rounded off to improve appearance. |
30 | Number 0 has stroke added to differentiate from letter O. |
23 | Hash sign improved |
2A | Star sign modified |
5E | Up arrow replaced with caret (^) |
5F | Left-arrow replaced with tilde (~) |
The first two changes make it easier to read listings.
The last two changes are more than cosmetic: they will display different glyphs, but this should not be significant since they are so seldom used. If any program does find this a problem, those glyphs may be changed as required.
There is also a register to control various features. For instance, one bit selects either all 256 characters when set, or Atom-compatible mapping when clear (the default):
Atom char | Font char | Output |
00 to 3F | 40 to 5F | Alphanumeric |
40 to 7F | 00 to 1F | Graphic |
80 to BF | 40 to 5F | Alphanumeric inverted |
C0 to FF | 80 to 9F | Graphic |
Note that the inverted glyphs really are inverted: any change to the normal alphanumeric glyphs also changes the inverted glyphs.
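A minimal sketch of why the inverted glyphs track the normal ones: if inversion is applied to the font row at output time, rather than stored as a second set of glyphs, then one font entry serves both. The code below assumes the top two bits of the Atom character select the output type, as the table suggests; the function names are hypothetical and the translation to a font character is deliberately left out.

```python
KINDS = ("Alphanumeric", "Graphic", "Alphanumeric inverted", "Graphic")


def classify_atom_char(code: int) -> str:
    """Output type for an 8-bit Atom character, from its top two bits."""
    return KINDS[(code >> 6) & 0x03]


def output_row(font_row: int, code: int) -> int:
    """Return the 8 pixels to display for one scan line of a glyph."""
    inverted = classify_atom_char(code) == "Alphanumeric inverted"
    # Inversion is a bitwise complement of the same font data, so editing
    # the normal glyph automatically changes the inverted one too.
    return (font_row ^ 0xFF) if inverted else font_row


if __name__ == "__main__":
    row = 0x7C  # an example font row, .#####..
    print(format(output_row(row, 0x01), "08b"))
    print(format(output_row(row, 0x81), "08b"))
```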
Teletext requires 24 rows of 40 characters, or 25 rows if a status row is used. This is not a problem for a TV, but a QVGA screen has only 240 lines. Thus it can show 24 rows where character cells are 10 lines high. To show 25 rows one would have to settle for cells 8 lines high (8x25=200) with 40 lines free, or 9 lines high (9x25=225) with 15 lines free. Programmable character cell height (up to 12 lines) would be very useful.
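The trade-off between cell height and row count on a 240-line screen can be tabulated with a short sketch. The names are illustrative, and the only input taken from the text is the QVGA height of 240 lines.

```python
SCREEN_LINES = 240  # QVGA vertical resolution


def rows_that_fit(cell_height: int) -> int:
    """Whole character rows that fit on the screen."""
    return SCREEN_LINES // cell_height


def spare_lines(rows: int, cell_height: int) -> int:
    """Scan lines left over (negative if the rows do not fit)."""
    return SCREEN_LINES - rows * cell_height


if __name__ == "__main__":
    for height in (8, 9, 10, 12):
        rows = rows_that_fit(height)
        print(f"{height} lines per cell: {rows} rows, "
              f"{spare_lines(rows, height)} lines spare")
```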
Teletext Character set demo Warning text is enhanced when ramping the intensity. Not wise to demonstrate if taking through customs! :-) |
The text is taken from the self-destruct mechanism of the spaceship Nostromo, from the movie Alien.
DANGER EMERGENCY DESTRUCTION SYSTEM ON ACTIVATION SHIP WILL DETONATE IN T MINUS 10 MINUTES
SCUTTLE PROCEDURE |
DANGER, THE EMERGENCY DESTRUCT SYSTEM IS NOW ACTIVATED. THE SHIP WILL DETONATE IN T MINUS TEN MINUTES. THE OPTION TO OVER-RIDE AUTOMATIC DETONATION EXPIRES IN T MINUS FIVE MINUTES.
ATTENTION. THE COOLING UNITS FOR THE LIGHT-PLUS ENGINES ARE NOT FUNCTIONING. |
TOO LATE FOR REMEDIAL
ACTION. THE CORE HAS BEGUN TO MELT. |
ATTENTION. ENGINES WILL OVERLOAD IN TWO MINUTES. ATTENTION. ENGINES WILL EXPLODE IN NINETY SECONDS. ATTENTION. ENGINES WILL EXPLODE IN SIXTY SECONDS. |