Color Graphics Adapter (CGA) Programming

These are some notes after making The Return of Traxtor for DOS.

The code snippets target GCC (the IA16 port) and use the fixed-width types from stdint.h.

Setting the graphic mode

This should be done with the BIOS for compatibility.

Int Function Params
0x10 ah: 0x00 al: mode

Example:

void set_mode(uint8_t mode)
{
    asm volatile (
        "movb $0, %%ah\n\t"
        "int $0x10\n\t"
        : /* no output */
        : "al"(mode)
        : "ah"
    );
}

Interesting CGA Modes

Mode Text Resolution Colors
0x04 40x25 320x200 4
0x03 80x25 Text mode 16
0x06 80x25 640x200 2

Selecting the palette

In mode 0x04 (320x200, 4 colors), the palette is set with the color select register on port 0x3d9.

Bit Description
0 Background color blue
1 Background color green
2 Background color red
3 Background intensity
4 Palette intensity
5 Palette selection
6-7 Unused

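The palette selection bit chooses the colors used for pixel values 1, 2 and 3 (value 0 is the background color):

  • 0: green, red, brown (yellow when the palette intensity bit is set)
  • 1: cyan, magenta, light gray (white when the palette intensity bit is set)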

Examples:

/* palette 0, bright */
outportb(0x3d9, 16);

/* palette 1, bright, blue background */
outportb(0x3d9, 1 | 16 | 32);

These writes may have no effect on a VGA card.
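
The register value can also be built from named fields instead of magic numbers. This is a minimal sketch: cga_color_select and the CGA_* masks are hypothetical names, not part of the original code, and the helper only computes the byte, leaving the port write to outportb.

```c
#include <stdint.h>

/* Bit masks for the register on port 0x3d9 (illustrative names). */
#define CGA_BG_MASK        0x0f  /* bits 0-3: background color and intensity */
#define CGA_PAL_INTENSITY  0x10  /* bit 4: bright palette colors */
#define CGA_PAL_SELECT     0x20  /* bit 5: 0 = palette 0, 1 = palette 1 */

/* Hypothetical helper: build the value to write to port 0x3d9. */
static uint8_t cga_color_select(uint8_t background, int bright, int palette)
{
    uint8_t value = background & CGA_BG_MASK;

    if (bright)
        value |= CGA_PAL_INTENSITY;
    if (palette)
        value |= CGA_PAL_SELECT;
    return value;
}
```

With it, outportb(0x3d9, cga_color_select(1, 1, 1)) writes the same value as the second example above.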

Drawing on the screen

In mode 0x04 (320x200, 4 colors), the video memory is mapped at b800:0000.

The pixel data is encoded using 2 bits per pixel, so one byte holds 4 pixels, with the two most significant bits corresponding to the left-most pixel.

For 4 pixels (a, b, c, d), we would have:

Bits 7-6 Bits 5-4 Bits 3-2 Bits 1-0
a1a0 b1b0 c1c0 d1d0
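
The packing above can be sketched in C as follows. cga_pack4 is a hypothetical helper name used only for illustration; each argument is a color value from 0 to 3.

```c
#include <stdint.h>

/* Pack four 2-bit color values (0-3) into one byte; a is the left-most pixel. */
static uint8_t cga_pack4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint8_t)(((a & 3) << 6) | ((b & 3) << 4) | ((c & 3) << 2) | (d & 3));
}
```

For example, cga_pack4(1, 2, 3, 0) gives 0x6c (binary 01 10 11 00).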

Each row takes 80 bytes, and the memory is split in two banks: the even rows (0, 2, 4, 6, …, 198) come first, and the odd rows (1, 3, 5, 7, …, 199) start at an offset of 8192 bytes.
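
As a sketch of that layout, the offset of any pixel can be computed like this. cga_offset and cga_put_pixel are hypothetical names, and the code works on an ordinary 16 KB buffer; on DOS the same offsets would be applied to a far pointer to b800:0000, and how to form one depends on the compiler.

```c
#include <stdint.h>

#define CGA_ROW_BYTES  80      /* 320 pixels / 4 pixels per byte */
#define CGA_ODD_BANK   0x2000  /* odd rows start 8192 bytes in */

/* Byte offset, from the start of video memory, of the byte holding (x, y). */
static uint16_t cga_offset(uint16_t x, uint16_t y)
{
    return (uint16_t)((y & 1) * CGA_ODD_BANK + (y >> 1) * CGA_ROW_BYTES + (x >> 2));
}

/* Set one pixel in a 16 KB buffer laid out like CGA video memory. */
static void cga_put_pixel(uint8_t *vram, uint16_t x, uint16_t y, uint8_t color)
{
    uint8_t shift = (uint8_t)((3 - (x & 3)) * 2);  /* left-most pixel is bits 7-6 */
    uint8_t *p = vram + cga_offset(x, y);

    /* Clear the two bits of this pixel, then store the new color. */
    *p = (uint8_t)((*p & ~(3 << shift)) | ((color & 3) << shift));
}
```

Going through a function like this for every pixel is slow on the original hardware, so it is better suited to checking offsets than to drawing sprites.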

Example clearing the screen:

void clear_screen()
{
    asm(
        "push %%es\n\t"
        "movw $0xb800, %%ax\n\t"
        "movw %%ax, %%es\n\t"
        "movw $0, %%di\n\t"
        "movw $8192, %%cx\n\t"
        "xorw %%ax, %%ax\n\t"
        "cld\n\t"
        "rep stosw\n\t"
        "pop %%es\n\t"
        : /* no output */
        : /* no input */
        : "c", "a", "di"
    );
}

A reasonably efficient “blit” function:

/* x and w in bytes */
/* y and h in pixels */
typedef struct {
    uint8_t x;
    uint8_t y;
    uint8_t w;
    uint8_t h;
} rect;

/* sprite expects 2bpp data, non-interlaced, non-masked */
void blit(const uint8_t *sprite, const rect *dst)
{
    asm volatile (
        "movw %%di, %%bx\n\t"
        "cld\n\t"
        "blit_loop:\n\t"
        "movw %%bx, %%di\n\t"
        "movw %%dx, %%cx\n\t"
        "rep movsb\n\t"
        "xorw $0x2000, %%bx\n\t"
        "testb $1, %%al\n\t"
        "je blit_skip\n\t"
        "addw $80, %%bx\n\t"
        "blit_skip:\n\t"
        "incb %%al\n\t"
        "decb %%ah\n\t"
        "jne blit_loop\n\t"
        : /* no output */
        : "e"(0xb800),
        "S"(sprite), "D"(((dst->y & 1) ? 0x2000 - 40: 0) + dst->x + dst->y * 40),
        "a"((dst->h << 8)|dst->y), "d"((uint16_t)dst->w)
        : "b", "c"
    );
}

And a faster version with width fixed to 4 bytes (16 pixels wide):

void blit4(const uint8_t *sprite, const rect *dst)
{
    asm volatile (
        "movw %%di, %%bx\n\t"
        "cld\n\t"
        "blit4_loop:\n\t"
        "movw %%bx, %%di\n\t"
        "movsw\n\t"
        "movsw\n\t"
        "xorw $0x2000, %%bx\n\t"
        "testb $1, %%al\n\t"
        "je blit4_skip\n\t"
        "addw $80, %%bx\n\t"
        "blit4_skip:\n\t"
        "incb %%al\n\t"
        "decb %%ah\n\t"
        "jne blit4_loop\n\t"
        : /* no output */
        : "e"(0xb800),
        "S"(sprite), "D"(((dst->y & 1) ? 0x2000 - 40: 0) + dst->x + dst->y * 40),
        "a"((dst->h << 8)|dst->y)
        : "b"
    );
}
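
The address stepping used by both routines may be easier to follow in plain C. This is only a reference sketch, not the code used in the game: blit_ref and blit_rect are hypothetical names, and the destination is an ordinary 16 KB buffer standing in for the video memory at b800:0000.

```c
#include <stdint.h>
#include <string.h>

/* Same layout as the rect above: x and w in bytes, y and h in pixel rows. */
typedef struct {
    uint8_t x, y, w, h;
} blit_rect;

/* Copy a linear 2bpp sprite into a 16 KB buffer laid out like CGA memory. */
static void blit_ref(uint8_t *vram, const uint8_t *sprite, const blit_rect *dst)
{
    /* Offset of the first row: odd rows live 0x2000 bytes further in. */
    uint16_t off = (uint16_t)(((dst->y & 1) ? 0x2000 - 40 : 0) + dst->x + dst->y * 40);
    uint8_t row = dst->y;
    uint8_t left = dst->h;

    while (left--) {
        memcpy(vram + off, sprite, dst->w);
        sprite += dst->w;
        off ^= 0x2000;      /* switch to the other bank */
        if (row & 1)
            off += 80;      /* leaving an odd row: advance one row in the bank */
        row++;
    }
}
```

The XOR works as a bank toggle because offsets inside a bank never reach 0x2000 (the last row ends at byte 7999), so that bit only ever indicates the bank.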

Waiting for VSYNC

There is no interrupt to signal the start of vertical sync, so we have to poll the status register on port 0x3da.

Bit Description
0 Display enable
1 Light pen trigger set
2 Light pen switch status
3 Vertical retrace
4-7 Unused

Example:

void wait_vsync()
{
    while (inportb(0x3da) & 8);
    while (!(inportb(0x3da) & 8));
}
Last updated May 5, 2024