The Structure of a .COM File
A .COM file consists entirely of executable code and data, with no
header or relocation information. When the file Hello.COM is executed,
for example (by typing either Hello or Hello.COM at the DOS prompt),
the contents of the file are simply loaded into memory. When the file
has been loaded, execution starts with the first byte. All of the segment
registers are set to point to a single 64 KB segment starting 256 bytes
before the address where the program was loaded, so in fact execution
starts at CS:0100. The first 256 bytes of the segment comprise
the Program Segment Prefix (PSP), which contains a variety of pieces of
information about the executing program. Most of these fields are
obsolete holdovers from the days when MS-DOS was first designed as a
clone of the old CP/M operating system, which was developed in the
mid-1970s to run on the original Intel 8080.
The most useful field in the PSP is the tail of the command line, that
is, everything typed after the program name. For example, if Hello.COM
had been executed by typing Hello/full C:\Temp, then the string
/full C:\Temp would be stored in the PSP. The program can access this
argument string starting at offset 80h: the byte at 80h gives the
length of the tail (13 in the example), and that many bytes starting at
81h contain the string itself. The string is terminated with a carriage
return character (ASCII code 0Dh), which is not included in the count.
Any space separating the program name from its arguments is part of the
tail, so typing Hello /full C:\Temp instead would store a 14-character
tail beginning with a space.
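As a minimal sketch of reading the command tail described above, the
following program echoes the tail to standard output. It assumes the
usual DOS services (INT 21h function 40h to write to a handle, function
4Ch to exit) and that the file is assembled as a flat binary; the file
name tail.asm and the labels start and done are made up for this
illustration.

```nasm
; tail.asm - echo the command tail stored in the PSP (illustrative sketch)
; Assumed build command: nasm -f bin -o tail.com tail.asm

        org     100h            ; a .COM program is loaded at offset 100h

section .text
start:
        xor     cx, cx
        mov     cl, [80h]       ; CX = length of the tail (CR not counted)
        jcxz    done            ; nothing was typed after the program name
        mov     dx, 81h         ; DS:DX -> first character of the tail
        mov     bx, 1           ; handle 1 = standard output
        mov     ah, 40h         ; DOS function 40h: write CX bytes to handle BX
        int     21h
done:
        mov     ax, 4C00h       ; DOS function 4Ch: exit with return code 0
        int     21h
```

Because the length byte is used rather than the terminating carriage
return, the program never has to scan the string or modify the PSP.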
Since all of the segment registers point to the same segment, the
structure of a typical .COM program in memory is as follows:
    PSP                   (offsets 0000h-00FFh)
    Program Text          (starts at offset 0100h)
    Initialized Data
    Uninitialized Data
    Free Space
    Stack                 (grows down from the top of the segment)
The program text and initialized data are the bytes that are read in
from the .COM file, corresponding to the .text and
.data sections of the NASM source. The PSP is generated by the
operating system, and the stack is automatically arranged to grow down
from the top of the segment. The uninitialized data, corresponding to
the bytes reserved in the .bss section, are carved out of the
free space between the loaded bytes and the growing stack; since they
were not explicitly initialized before execution, they will start out
containing whatever garbage was left in those locations of physical
memory by the previous programs.
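The layout above might correspond to a NASM source file along the
following lines. This is only a sketch: the names message, buffer, and
BUFSIZE are hypothetical, and it assumes NASM's flat binary output
format, which places .text first, then .data, then .bss. Since the
reserved bytes start out holding garbage, the program clears the buffer
itself before doing anything else.

```nasm
; layout.asm - one section of each kind in a .COM program (illustrative sketch)
; Assumed build command: nasm -f bin -o layout.com layout.asm

        org     100h            ; the PSP occupies offsets 0000h-00FFh

BUFSIZE equ     64              ; hypothetical buffer size

section .text                   ; program text: read in from the .COM file
start:
        mov     di, buffer      ; ES:DI -> buffer (ES = DS in a .COM program)
        mov     cx, BUFSIZE     ; number of bytes to clear
        xor     al, al          ; fill value 0
        cld                     ; store upward in memory
        rep stosb               ; zero the uninitialized buffer ourselves

        mov     dx, message     ; DS:DX -> '$'-terminated string
        mov     ah, 09h         ; DOS function 09h: print string
        int     21h
        mov     ax, 4C00h       ; DOS function 4Ch: exit with return code 0
        int     21h

section .data                   ; initialized data: also stored in the file
message db      'Hello, world!', 0Dh, 0Ah, '$'

section .bss                    ; uninitialized data: not stored in the file
buffer  resb    BUFSIZE         ; only reserves address space after .data
```

The resulting .COM file contains only the code and the message string;
the 64 bytes of buffer add nothing to its size.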