Article 41931 of comp.sys.cbm:
Xref: undergrad.math.uwaterloo.ca comp.sys.cbm:41931
Newsgroups: comp.sys.cbm
Path: undergrad.math.uwaterloo.ca!csbruce
From: csbruce@ccnga.uwaterloo.ca (Craig Bruce)
Subject: Re: VBM
Message-ID:
Sender: news@undergrad.math.uwaterloo.ca (system PRIVILEGED account)
Nntp-Posting-Host: ccnga.uwaterloo.ca
Organization: University of Waterloo, Canada (eh!)
Date: Wed, 6 Sep 1995 15:15:03 GMT

Alan Jones writes:

>From: 057184449-0001@btxgate.de (Arndt Dettke)
>Hi Alan, I finished both a VBM-loader and saver for GoDot (640x400 saver)
>and will post it anywhere in the internet when I know how to do it.
>
>I just received this message from Arndt Dettke.  We now have a way to
>create modest size VBM images using a C64.  I don't have these
>routines, but I do have the Godot demo package and it seems to work
>well.  The docs have not been translated into English yet.  Arndt is
>still struggling with very limited Internet access.

Programming for the VBM format isn't really very difficult.  The basic idea
is to have a header followed by data in the basic format of how the C128's
VDC chip stores bitmaps (the format is officially named after "VDC BitMap").
Unfortunately, there are three different versions of the format: version #2,
version #3 uncompressed, and version #3 compressed.  The version #2 format
exists because I didn't get the format right the first time.

You can tell which format the VBM file is in by reading the header.  The
header of all the formats is as follows (you can extract all of this
information from the "pbmtovbm" version 1.99 conversion program):

POS   SIZ   DESC
---   ---   -----
  0     1   the character 'b': $42
  1     1   the character 'm': $4d
  2     1   the binary value $cb
  3     1   the VBM format version number: $02 or $03
  4     2   the width (X) of the image in Hi/Lo format
  6     2   the height (Y) of the image in Hi/Lo format

If the image is in version #2 format, then this is it for the header.
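As a rough illustration of reading the eight header bytes listed above, here
is a minimal Python sketch.  It is not taken from "pbmtovbm"; the function
name parse_vbm_header and the returned dictionary are made up for this
example.  It assumes the signature bytes are $42 $4d $cb ("BM" followed by
$cb) and that "Hi/Lo format" means the high byte comes first.

```python
import struct

# Assumed signature: 'B' ($42), 'M' ($4d), then the binary value $cb.
VBM_MAGIC = b"BM\xcb"


def parse_vbm_header(data: bytes) -> dict:
    """Parse the 8-byte header shared by all VBM versions (hypothetical helper)."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes for a VBM header")
    if data[:3] != VBM_MAGIC:
        raise ValueError("not a VBM file (bad signature)")
    version = data[3]
    if version not in (2, 3):
        raise ValueError(f"unknown VBM version {version}")
    # Width and height are 16-bit values stored high byte first (Hi/Lo).
    width, height = struct.unpack(">HH", data[4:8])
    return {"version": version, "width": width, "height": height}


# Example: a version #2 header for a 640x400 image.
sample = bytes([0x42, 0x4D, 0xCB, 0x02, 0x02, 0x80, 0x01, 0x90])
info = parse_vbm_header(sample)
```

For a version #2 file the image data would start right after these eight
bytes.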
Version #3 images have the following additional header information:

POS   SIZ   DESC
---   ---   -----
  8     1   data-encoding type: $00=uncompressed, $01=RLE-compressed
  9     1   byte code for general RLE repetitions
 10     1   byte code for repeated zeroes
 11     1   byte code for repeated $ff values
 12     1   byte code for two repeated zeroes
 13     1   byte code for two repeated $ff values
 14     2   reserved := 0
 16     2   length of comment text (0 == no comment text) (Hi/Lo format)
 18     n   characters of comment text in PETSCII

If the data-encoding type is "uncompressed", then the "repeated" fields are
ignored; otherwise, they contain the binary byte code that is to be used to
trigger an RLE expansion when uncompressing.

For version #3, I allowed the data to be either compressed or uncompressed
because uncompressed data will be able to be processed and displayed faster,
and compressed data will be shorter.  Or, depending on the storage device
and the image involved, the compressed format may turn out to be faster
since fewer characters will be read from a slow I/O device.

For the uncompressed formats, the raw data is stored row by row from top to
bottom, with each row stored left to right, eight pixels per byte.  The
most-significant bit of the byte will contain the left-most pixel, and the
least-significant bit, the right-most pixel.  Where an image has a width in
pixels that is not evenly divisible by eight, the image is padded with black
pixels to make it so.  Thus, each row of the image occupies an integral
number of uncompressed bytes.  This is also roughly the format that the VDC
chip uses to store bitmaps.

The difference between version #2 and uncompressed version #3 images is that
for version #2 images, "1" bits mean black pixels and "0" bits mean white
pixels, after the X-Windows format.  For version #3 images, the bits have
the opposite meanings.
Yes, this is an unfortunate blunder, but easy to fix; you just scan through
the version #2 bytes of display data and EOR them with $ff before using them
on a display with a black background color and a white foreground color.

The compressed format uncompresses into exactly the uncompressed format
(sic).  To uncompress compressed data, you read one byte of the data at a
time and compare it against all five of the "repeated" byte values given in
the header.  If the byte doesn't match any of them, then you display the
byte as-is; it's a literal.  Otherwise, if the byte is the code for the
"general repetition", then you read the next byte value, which is the
literal value that will be repeated, and then read the next byte after that,
which gives the number of repetitions to make.

This is sufficient to give you Run-Length Encoding compression, but I found
that I got better performance if I included additional, shorter RLE
sequences for repeated $00 and $ff strings of both general length and of
specific length 2.  You always use the one that gives you the shortest data.
The sequences are summarized as follows:

CODE          SEQUENCE
----          ---------
general rep   general-rep code, value, count
rep $00       rep-$00 code, count
two $00       two-$00 code
rep $ff       rep-$ff code, count
two $ff       two-$ff code

If you run into a string of only one $00 or $ff byte, then you encode it
simply as a literal.  If you have a literal that you want to encode but it
equals one of the RLE repetition codes, then you must encode it using the
general repetition sequence.  Since this one literal must be encoded using
three bytes, it is possible that you could end up with data that is longer
when compressed than when it is uncompressed, although this is extremely
unlikely.  You can examine the "pbmtovbm" program for algorithms for
compressing and uncompressing the data according to this scheme.
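The EOR fix and the uncompression loop described above can be sketched in
Python roughly as follows.  This is only an illustration, not the "pbmtovbm"
code.  The names invert_bits and vbm_rle_decode are made up here, and two
things are assumptions: that the "rep $00" and "rep $ff" codes are followed
by a single count byte, and that a count byte is taken at face value (how a
count of zero is treated is not stated in this article).  The assignment of
code values to roles in the example is arbitrary.

```python
def invert_bits(data: bytes) -> bytes:
    """EOR every byte with $ff, e.g. to flip version #2 data to version #3 sense."""
    return bytes(b ^ 0xFF for b in data)


def vbm_rle_decode(data: bytes, *, gen: int, rep00: int, repff: int,
                   two00: int, twoff: int) -> bytes:
    """Expand RLE data given the five code bytes from the version #3 header."""
    out = bytearray()
    i = 0
    n = len(data)

    def take() -> int:
        # Fetch the next input byte, failing cleanly on truncated input.
        nonlocal i
        if i >= n:
            raise ValueError("compressed data ends in the middle of a sequence")
        b = data[i]
        i += 1
        return b

    while i < n:
        b = take()
        if b == gen:                 # general repetition: value, then count
            value = take()
            count = take()
            out.extend(bytes([value]) * count)
        elif b == rep00:             # assumed layout: count of $00 bytes follows
            out.extend(b"\x00" * take())
        elif b == repff:             # assumed layout: count of $ff bytes follows
            out.extend(b"\xff" * take())
        elif b == two00:             # exactly two $00 bytes
            out.extend(b"\x00\x00")
        elif b == twoff:             # exactly two $ff bytes
            out.extend(b"\xff\xff")
        else:                        # anything else is a literal
            out.append(b)
    return bytes(out)


# Example with arbitrarily chosen code bytes (illustrative only).
codes = dict(gen=0x31, rep00=0x8C, repff=0xCC, two00=0x39, twoff=0x9C)
packed = bytes([0xAA, 0x31, 0x55, 3, 0x8C, 4, 0x39, 0xCC, 2, 0x9C,
                0x31, 0x31, 1])
unpacked = vbm_rle_decode(packed, **codes)
```

The last three bytes of the example show the escape case: a literal $31,
which collides with the general repetition code, is stored as a general
repetition of $31 with a count of 1.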
You can statistically determine the best code values to use for these five
RLE code sequences for each file you compress, but I have found that the
following values work quite well for the sample data I have tried (mostly
dithered images): $31, $8c, $39, $cc, and $9c, respectively.

>BTW Bruce, you said "A pair beats no cards."  I think you meant, "An
>ACE beats no cards."  ;)   alan.jones@qcs.org

My name is _CRAIG_.

Keep on Hackin'!

-Craig Bruce
 csbruce@ccnga.uwaterloo.ca
 "The irony, of course, is that if the entire herd of deer were to turn and
 trample the tiger, the tiger wouldn't stand a chance."