Article 62469 of comp.sys.cbm:
Xref: undergrad.math.uwaterloo.ca comp.sys.cbm:62469 comp.emulators.cbm:14489 comp.arch.embedded:12875
Newsgroups: comp.sys.cbm,comp.emulators.cbm,comp.arch.embedded
Path: undergrad.math.uwaterloo.ca!csbruce
From: csbruce@ccnga.uwaterloo.ca (Craig Bruce)
Subject: Re: 6502 relocatable bin format?
Sender: news@undergrad.math.uwaterloo.ca (news spool owner)
Message-ID:
Date: Sat, 9 Nov 1996 13:48:53 GMT
References: <54ndk4$mtu@narses.hrz.tu-chemnitz.de> <550ujf$44i@bolt.Lakeheadu.Ca> <55vt38$6nd@narses.hrz.tu-chemnitz.de>
Nntp-Posting-Host: ccnga.uwaterloo.ca
Organization: University of Waterloo, Canada (eh!)

In article <55vt38$6nd@narses.hrz.tu-chemnitz.de>, Andre Fachat wrote:
>That's why I use a relocation table, where the positions of the
>address bytes are stored by the assembler (either saying that there is
>a low byte, high byte or full address to be relocated.  For a high byte
>the low byte of that address is stored in the relocation table, to allow
>finer granularity in relocating - need for carry bit).
>The assembler knows about a label being an address through the
>"label: opcode" syntax.  It then only allows some restricted arithmetic
>operations on these addresses.  When they are used in an opcode or table,
>a relocation table entry is then generated.

The ACE-Assembler works in a similar way, by keeping track of the type of
each identifier as either: number, address, address-high-byte, or
address-low-byte.  It maintains these types as sensibly as possible in the
face of operations performed.  For example, if "a" is an address and you
refer to "a+50", the expression is still an address.  But, if you refer to
"b-a", where both "b" and "a" are addresses, then the result is not an
address, since really, it is a constant that doesn't have to be relocated.
Similarly, if "h" is an address-high-byte, then so is "h+1", etc.

The ACE-Assembler keeps track of these types internally, but it doesn't
currently do anything with this information.  I invented a relocatable-
binary format a long time ago but I have never done anything with it.
The posting that described the format is included below.

Keep on Hackin'!

-Craig Bruce   csbruce@ccnga.uwaterloo.ca
"Planting a seed doesn't make it grow." -- Sass Jordan

C=256,64K-VDC,REU,RL16,HD200,FD4000,SL,USR28.8,C=128,1581,1571,2x64,16,1541,VIC
Pentium120,32M-RAM,1.6G-HD,4xCDROM,SB16,3.5"FD,2MB-ATIMach-64,15"Mon,Linux,X

-----

From csbruce@ccnga.uwaterloo.ca Tue Mar 7 13:53:55 EST 1995
Article: 34516 of comp.sys.cbm
Newsgroups: comp.sys.cbm
From: csbruce@ccnga.uwaterloo.ca (Craig Bruce)
Subject: Re: Thoughts on BOS
Message-ID:
Sender: news@undergrad.math.uwaterloo.ca (news spool owner)
Nntp-Posting-Host: ccnga.uwaterloo.ca
Organization: University of Waterloo, Canada (eh!)
Date: Tue, 7 Mar 1995 02:41:22 GMT

abaugher@worf.infonet.net (Aaron J. Baugher) writes:

>>There is also no way to dynamically load external programs, so the test
>>programs have to be assembled with the kernel code.  Loading external
>>programs in this type of environment has the requirement that the program
>>will have to be relocated upon being loaded to an address that would only
>>be known at load time.
>
>I've been wondering how this should be done.  I'd come up with the idea
>of having a table of pointers at the end (beginning?) of the program.  Each
>pointer would point to a two-byte value that needs to be relocated.  The
>shell would then add the value of the load address to the value
>referenced by each pointer.
>The assembler then would need to create
>these values relative to the beginning of the code, rather than at
>absolute addresses.  For example:
>
>Of course, JSR and JMPs to kernel routines should not be adjusted, so
>the assembler will need to be able to tell the difference between
>addresses within the code and those outside it.
>
>Does any of this make sense?  I'm curious as to how you've planned to do
>it, and if it can be done better than this.

1. EXECUTABLE-FILE FORMAT

The way that I am planning to do it is to have a special format for
executable programs in BOS.  The executable format would have three
sections: the header, the raw program image, and a relocatable-references
section.  The format would look something like the following:

OFFSET  SIZE  DESCRIPTION
------  ----  -----------
     0     3  magic number for executables
     3     1  flag: executable or object module
     4     2  size of raw code image
     6     2  raw code start address
     8     2  raw address of entry point
    10     2  size of "bss" (uninitialized storage) section
    12     4  size of relocatable-references section
    16     4  size of exported-symbol section
    20     4  size of unresolved-symbol section
    24     4  size of unresolved-expression section
    28     8  reserved for future expansion
    36     x  raw code image
    xx     x  relocatable-references section

This format will also double as the object module format, which will be a
superset of the executable format (since there is more information to keep
track of).  The "magic number" field would tell the type of the file (and
prevent you from accidentally executing something that isn't a program)
and the "flag" field would distinguish between an executable file and an
object module.  An object module would not be executable because it
wouldn't be a "complete" program.  An object module would require a number
of other sections to be appended to the file format, which are mentioned
in the header but not included in an executable file.

The "raw code image size" field tells the number of bytes in the "raw code
image" section of the file, which contains simply the raw image of the
program, assembled to execute at some address given by the "raw code start
address" field (it's not really necessary to assemble the program from
address 0; in fact, this can cause some confusion in the assembler in
deciding between absolute and zero-page addressing modes).  The "raw
address of entry point" gives the point to start execution in the program
image, in terms of the raw addresses used in the code.

The "size of bss" field tells how many bytes of uninitialized variable
space the program requires.  The bss is logically part of the raw code
image, but it is not included there because it would be wasteful to keep
its bytes in the executable file since they are supposed to be "just
random" (or all zeroes if you're talking about a Unix system).

The "size of relocatable references" field tells how many bytes are in the
"relocatable-references section" in the file.  The format of this section
can be:

OFFSET  SIZE  DESCRIPTION
------  ----  -----------
     0     2  number of full-address relocations
     2     2  number of low-byte relocations
     4     2  number of high-byte relocations
     6     x  full-address relocation addresses
    xx     x  low-byte relocation addresses
    xx     x  high-byte relocation addresses

There are three things that need to be relocated: full addresses (like
JSR SUB), low bytes (like LDA #<TABLE_ADDR), and high bytes (like
LDA #>TABLE_ADDR).  The first three fields give the number of each type of
relocation that has to be done, and the next three fields give the
addresses in the raw code range where these relocations need to be
applied.  Each address in this section takes two bytes.
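To make this concrete, here is a rough sketch, in C rather than 6502 code
just for readability, of how a loader might walk this section and patch a
freshly loaded image.  The names are invented for illustration and this is
not meant as the actual loader; it also assumes that programs are always
loaded at a page boundary, so that a high-byte fixup never needs a carry
from its low byte (this table doesn't record the low bytes that would be
needed to compute one).

  /* Illustrative sketch only; not part of any actual BOS or ACE format. */

  #include <stdint.h>

  /* Patch a program image that was assembled at raw_start but has been
   * loaded at load_addr.  "rel" points at the start of the
   * relocatable-references section.  Both addresses are assumed to be
   * page-aligned, so (load_addr - raw_start) has a zero low byte.
   */
  void apply_relocs(uint8_t *image, uint16_t raw_start,
                    uint16_t load_addr, const uint8_t *rel)
  {
      uint16_t nfull = rel[0] | (rel[1] << 8);   /* full-address count */
      uint16_t nlow  = rel[2] | (rel[3] << 8);   /* low-byte count     */
      uint16_t nhigh = rel[4] | (rel[5] << 8);   /* high-byte count    */
      uint16_t delta = load_addr - raw_start;
      const uint8_t *p = rel + 6;                /* the address lists  */
      uint32_t i;

      for (i = 0; i < nfull; i++, p += 2) {      /* e.g. JSR SUB          */
          uint16_t at = (uint16_t)((p[0] | (p[1] << 8)) - raw_start);
          uint16_t v  = image[at] | (image[at + 1] << 8);
          v += delta;
          image[at]     = v & 0xff;
          image[at + 1] = v >> 8;
      }
      for (i = 0; i < nlow; i++, p += 2) {       /* e.g. LDA #<TABLE_ADDR */
          uint16_t at = (uint16_t)((p[0] | (p[1] << 8)) - raw_start);
          image[at] += delta & 0xff;             /* zero when page-aligned */
      }
      for (i = 0; i < nhigh; i++, p += 2) {      /* e.g. LDA #>TABLE_ADDR */
          uint16_t at = (uint16_t)((p[0] | (p[1] << 8)) - raw_start);
          image[at] += delta >> 8;
      }
  }

If programs could be loaded at arbitrary, non-page-aligned addresses, each
high-byte entry would also have to record the low byte of its target
address so that the loader could tell when to add a carry.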
For a really big and complicated program, this section could exceed 64K in
size, so a 32-bit value for this section's size is given in the file
header.  For the three example instructions given above, the relocation
address would always be the second byte of the instruction.  For assembler
directives like:

tableA:  db  <code1, <code2, <code3
tableB:  db  >code1, >code2, >code3
tableW:  dw  code1, code2, code3

where the "codeN"s are all addresses in the program range, the relocation
addresses would all be where the specified bytes and words get put into
memory.

Automatically coming up with the relocation addresses requires inside
information that is available only at the symbolic level of an assembler,
so generating this relocation information has to be built into an
assembler.  The assembler has to keep track of which operand values are
addresses that are inside of the code range and which are not.  The
assembler also has to carefully evaluate expressions like:

LDA  #>code3-code2
LDX  #<code3-code2+code1
LDY  #>code3-code2+code1

(Note: the "<" and ">" initial operators here are low-precedence
extract-low-byte and extract-high-byte operators, respectively.)  The
operand for the LDA instruction is not an address high-byte; it's a
constant!  Two code sections will have a constant difference.  The LDX
example IS a relocatable reference, and so is the LDY (the distance
between code3 and code2 is constant, and a constant plus a relocatable
reference gives a relocatable reference).

Fortunately, such an assembler exists.  Well, almost.  It's called "the
ACE assembler".  It doesn't fully keep track of addresses now, nor create
relocatable executable files, but it will.  It has the basic ingredients
for doing this already implemented.  And, even more important, I am on the
inside track with respect to how it works and how to change it, since I
created it.

There are, of course, expressions that can't sensibly be resolved, like:

code1 * code2
code1 + <code2

>I assume that disk, modem, and screen access would be done by server
>processes completely separate from the C=Kernal, so that other processes
>could continue to run while a certain process uses a drive, for example.
>Then the C=Kernal Server could be saved for unusual things that don't yet
>have driver servers written.  It seems that the ideal would be to
>eliminate the need for the C=Kernal Server altogether, right?

As much as possible, since the C= Kernal is notoriously slow.  It also
likes to disable interrupts while it busy-waits, making it a bit
"multitask unfriendly".  You do have to appreciate that I/O will slow down
the whole machine, since most data transfer is done by processor-intensive
polling.

>When the system is complete, the null process would be replaced by a
>shell, or simple login shell, right?  I guess you'd need some kind of
>shell to load up along with the kernel, since the ability to load further
>programs will be in the shell; and you'd always want at least one shell
>process running at all times.

I think that you've misunderstood.  The Null process is necessary in order
for the basic design to work, but its operation will be invisible to the
user.  After the system starts, there are two processes on a machine: the
Null process and the Init process (well, the Kernel (with an "E") process
too, but it isn't implemented yet).  The Null process goes into an endless
loop at extremely low priority, and the Init process brings up the system,
including Command-shell processes, Device-driver processes, etc.
The ability to load programs will NOT be in the shell; it will be in the
Kernel process and in the user process itself (it will bootstrap itself,
like in my Master's thesis).  This was mentioned in part I of my design.
The command shells will just be silly little user processes with no
special powers, like in Unix.

>>What it needs
>>in terms of software is: device drivers, a command shell, utility programs,
>>and an assembler that can produce relocatable code.  Oh where, oh where shall
>>I ever find such software???  ;-)
>
>So, Craig, are you saying that you'll add these features if we'll create
>the software? :-)  Do you think the device drivers in ACE are anything
>like what will be needed for BOS?  I haven't seen any screen updates near
>the speed of those in ACE, so hopefully those could be converted to BOS,
>and save reinventing them completely.

I was kidding.  With a little bit of reorganization, *ALL* of the code in
ACE will be used in BOS.  The application programs will hardly have to
change at all (if at all).  You didn't think that ACE was anything but
another step in my master plan, did you?  ;-)

>Have you looked at supporting Marko's 1meg expansion?  You'd need to add
>two more bytes to the PCB block for the PIA registers, but it would allow
>each process to have any group of 4 - 16k blocks of RAM.  I think it
>would also eliminate the problem of zero and stack pages being restricted
>to RAM0.  I'm going to expand at least one of my 128's this spring, so
>I'll play around with it then and see what can be done.  I think this
>would also require some customizing of the memory management driver.

I'm looking into it.  The memory allocation will be organized in a
page-oriented way like it is in ACE rather than in a 16K-block-oriented
way.

Keep on Hackin'!

-Craig Bruce   csbruce@ccnga.uwaterloo.ca
"Very funny Scottie, now beam me my clothes!" -- Logan Van Tassell(?)