RAW32 Release 3 documentation


Preface

If you use RAW32 (RAW32.ASM and/or the included examples) you accept full
responsibility for any damage that RAW32 may cause; I am not responsible for
any damage it may do to any computer you run it on. Note: I have not
encountered a PC on which RAW32 caused any serious damage; see the 'Bugs'
section.

You are allowed to use RAW32 for anything except for commercial purposes. I
give these sources away for free, but if you want to use any part of it for a
commercial purpose you are required to contact me first. See 'Final words'
for how to contact me.

If you use RAW32 or any part of it in a program you make you are required to
mention the name RAW32 and the source you got it from in the documentation
accompanying the program.


Introduction

This is the third release of RAW32. See 'What's new' section.
I release the source code because it is hard for beginners to find useful
info about protected mode (PM) programming. The extender/PM header is called
'RAW32' because it is intended for programming under any memory
configuration as if under a raw system (it also had to have a name ;). By
'raw' I mean access to all the available extended memory using only two
pointers: one that points to the start of the available extended memory and
one that points to the top of the available extended memory. Under a real raw
system the first pointer typically points at 1MB; under DPMI it can point
anywhere in the address space. So, RAW32 does static memory allocation, i.e.,
at startup it allocates all the memory it can get and does not free it until
the RAW32 program exits. Raw also means that this is not yet another one of
those extenders that implement the DPMI standard; it does not implement any
standard, which is nice when you want to write PM programs that are just PM
and not DPMI (for example, writing your own OS).

RAW32 is not perfect but it is very stable and very well commented. It is
based on Thomas Pytel's (Tran) START32 from which I learned much. START32
supports only a raw system, and contains some bugs. After figuring out how
the PM mechanisms worked I wanted START32 to work also under XMS, VCPI and
finally DPMI. XMS and DPMI were easy, but VCPI was much more difficult,
mostly because of paging. After some time my version of START32 was very
different compared to the original one so I decided to give it another name
and put it on the internet.
RAW32 does run under Windows (3.1/3.11/95/98), but it is not designed (if it
can be called that) for it. See 'Inner workings' section.


What's new

Bug fixes and changes between RAW32 R3 and RAW32 R2.2

bugs fixed:
-Moved the check for enough high memory under VCPI to after the check for
presence of XMS; if XMS is present XMSinit does the check. Now RAW32 programs
can run under EMM386.EXE with the 'noems novcpi' option.
-Fixed alignment of segments in the NASM version. Now the object modules of
the TASM and the NASM version are 100% compatible.
-<cough> fixed second flat RM bug <cough>...
-Switching to a stack that is located above 1MB won't give problems under
VCPI anymore. It gave problems when calling VCPIcall_int. According to the
specification ss:esp must point to a stack located below 1MB (linear).
-"Fixed" the mysterious bug under VCPI by changing the order of calls to VCPI
function 1 and 3.
-RAW32.ASM now contains a stack segment (i.e., the segment class is 'stack').
This solves the memory corruption when making large (>64KB) RAW32 programs.

changes:
-Changed the handling of IRQs again; the PICs are programmed to the value of
the constants MASTER_BASE and SLAVE_BASE. The PICs keep these base values
when switching to RM. See 'Inner workings' section.
-Because the real mode callbacks work I removed the option to make a version
of RAW32.ASM without RMCBs.
-There is now a more sound solution for segments with names other than
code16/code32/codeend (codeend was called codeend32); segments with other
names than the mentioned ones have to be declared in RAW32.INC and RAW32.ASM
has to be reassembled. Note that the order in which the segments are declared
in RAW32.INC determines the order of the segments in the executable.
-RMCALL_VECT now has the value 34h instead of 32h so that RAW32 modules are
able to call PM int 33h under Windows, see 'Examples' section; MOUSEx.ASM.
-The carry flag isn't a parameter to setvect/getvect anymore.
-Changed FPS.ASM so that one can see something when running FPS.EXE on a very
fast computer ;)
-Changed the module loader so that it works under DPMI servers that don't
allow a flat memory model (full 4GB address space available). (The example
modules have been changed accordingly.) Windows NT and dosemu (Linux) are
such DPMI servers.
-Moved the check for HIGHMIN out of check_memory, because under dosemu all
interrupt 15h calls fail. Reading from the CMOS is no good either. The check
for enough memory is done in DPMIinitPM.
-The IRQ stacksize is now independent of RM_STACKSIZE/PM_STACKSIZE.
-better handling of nested IRQs/RMCBs
-I found a free linker suitable for linking RAW32 programs. It's called
ALINK. R.BAT in the NASM directory now calls ALINK.EXE. You will need a DPMI
host if you want to run ALINK under DOS. CWSDPMI is a good one. See 'Links'
section for where to find these programs.
-Added support for 386SWAT. See 'Debugging' section.


Bug fixes and changes between RAW32 R2.2 and RAW32 R2.1

changes:
-translated the remaining examples to NASM (except the mouse examples)
-many cosmetic changes
-FPS.ASM contained a minor error; the 2 putpixel macro definitions were
placed after the statements that used them...


Bug fixes and changes between RAW32 R2.1 and RAW32 R2

bugs fixed:
-stack dump routine dumps the stack properly if esp!=200h...
-The interrupt mask registers are restored to the right values under a PL 0
DPMI server.
-Under VCPI the IRQx_vect variables are now initialised to the values
returned by the VCPI server...
-better handling of the situation where one IRQ handler interrupts another in
RM (with RMCBs enabled)

changes:
-added a Netwide Assembler (NASM 0.97) version of RAW32; RAW32 can now be
built without using commercial tools (if you can find a free OMF linker
somewhere)
-changed some ancient comments
-Some jumps that used a pointer variable as target are now hardcoded.
-added some assume directives to avoid some segment overrides with segment
registers other than ds


Bug fixes and changes between the first release of RAW32 and RAW32 R2

bugs fixed:
-setdsc now sets descriptors with RPL <> 0 in the right way
-bug in gateA20; I came across a 486 which required that there is a delay
after the value of the 8042 output port is read.
-bug in _dosshell; can now be called more than once...
-proper virtualization of flags register
-now works under Quarterdeck's VCPI server (VCPIsel0 is a code selector... of
course this says something about M$ not complying with specifications too ;)
I tested only EMM386.EXE's VCPI...
-bug in 32-bit startup code; clearing the NT flag *is* necessary...
-RM_int now loads _e_sp instead of sp (very frustrating to find that one!).
Look at UNDOC.ASM
-stack is used in RM_int/VCPI_int *after* ss:sp is valid...

changes:
-changed the name of several variables (many underscores removed, and the
case of some variable names has been changed)
-setdsc can set descriptors in the LDT
-changed the handling of IRQs; the default IRQ base numbers are maintained in
protected mode. Now the PIC is not reprogrammed on every IRQ. This solved
some unexplainable behaviour like the need to enable IRQs in VCPIinitPM, the
messing up of page table entries, the COM ports that stopped raising IRQs
after some data was received etc.
The ugly thing about this is that IDT entries 8-15(14) are shared between IRQs
and exceptions...
'bug_buffer' could also be removed.
nIRQm21 and nIRQma1 aren't necessary anymore; they have been removed too.
-changed DPMI/VCPI/XMS detection order; the check for DPMI is made last.
VCPI, XMS and raw make PL 0 possible, most DPMI hosts not. PL 0 is more 'RAW'
;)
-I came across several PCs where int 15h/eax=e820h enables IF (bug?), so
interrupts are now disabled *after* check_memory.
-dosshell doesn't pass arguments given to a RAW32 program to COMMAND.COM
anymore
-TR contains a valid value at startup under raw and XMS
-The IDT is now built dynamically. I made this change to fully comply with
the VCPI specification without reprogramming the PICs.
-I added real mode callbacks. See 'Inner workings' section.
-Almost all 'public' statements in RAW32.ASM have moved to RAW32.INC.
-I have added some examples. See 'Examples' section.


Features

Yes, this little thing even has unique features!
* It supports >64MB systems under all memory configurations. Yes, even under
VCPI with a VCPI server which does not support that (Watcom's DOS/4GW does
not do this for you). See 'Inner workings' section.
* It has a physical to linear memory mapping function as opposed to Tran's
PMODE 3.07...
* New in release 2 is an executable loader that loads RAW32 programs in
extended memory. Now you can make RAW32 programs that are several MBs in
size.
* A TASM and a NASM version of which the object modules are 100% compatible.
Well, that's it...;)


Inner workings

If you want to know every detail, look in RAW32.ASM; it is very well
commented.

64MB
RAW32 can use >64MB even under VCPI servers that support less (for example
EMM386.EXE, which supports at most 32MB). It does this by checking whether
XMS is present too when VCPI has been detected.
With EMM386.EXE this will always be the case, but the VCPI specification is
independent of the XMS specification. XMS supports 64MB (SXMS 4GB). If XMS is
present RAW32 allocates memory through XMS instead of through VCPI. This is
not only nice when you want to access all memory under a VCPI server that has
limitations, it is also *much* faster than allocating memory VCPI-style at
4KB increments. Under VCPI paging could be - if programs are loaded high it
is - enabled. So, if RAW32 allocates memory through XMS it creates page table
entries for it. The page tables are created in extended memory, so the
precious DOS memory is not wasted (I do not know why CWSDPMI does not do
this); for every MB of memory 1KB of page table entries is needed...
Another reason for creating the page tables in extended memory is that in
that way the maximum amount of accessible memory under VCPI is not limited to
the amount of free DOS memory. Ok, 500KB free memory after a RAW32 program
starts is not uncommon and would provide for enough page table memory to make
500MB accessible. But why introduce a limit when it's not necessary? To
support the full 4GB the PM VCPI startup code needs 3 pages in extended
memory to begin with: one for the page directory, one for the page table
returned by the VCPI server which makes the first 4MB accessible, and one
page table to be sure that page tables can be created for the not-yet-mapped
memory in case there are no free entries left in the first table (no free
memory under linear address 4MB).
At this point nothing stands in the way of supporting 4GB; the page directory
contains two entries, one of which will provide 4MB of 'fresh' memory. One of
the two current page tables contains an entry that maps the second page table
in memory. When the page table creation loop starts it makes the linear
addresses following the second page table accessible (by filling in the
entries) which then in turn can be used to contain page table entries during
the loop. No more than the mentioned page directory entry is needed, because
4MB of page table entries is enough to map 4GB of memory ;)
The 'dark' side of this is that you need 12KB of extended memory under
_VCPI_. That's why VCPIinit checks for HIGHMIN+12.
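
The numbers in this section (1KB of page table entries per MB, 4MB of page
tables for 4GB, 12KB for the three startup pages) follow from the 80386 paging
layout of 4KB pages and 4-byte entries. A minimal sketch in Python that redoes
the arithmetic; it is not RAW32 code, and all names in it are made up for
illustration:

```python
PAGE_SIZE = 4096                            # bytes per 80386 page
PTE_SIZE = 4                                # bytes per page table entry
ENTRIES_PER_TABLE = PAGE_SIZE // PTE_SIZE   # 1024 entries fit in one page
MB = 1024 * 1024
GB = 1024 * MB
STARTUP_PAGES = 3   # page directory + VCPI server's table + one spare table


def pages_for(mem_bytes):
    """Number of 4KB pages needed to cover mem_bytes (rounded up)."""
    return -(-mem_bytes // PAGE_SIZE)


def page_table_bytes(mem_bytes):
    """Bytes of page table entries needed to map mem_bytes of memory."""
    return pages_for(mem_bytes) * PTE_SIZE


def page_tables_needed(mem_bytes):
    """Number of whole page tables needed to map mem_bytes of memory."""
    return -(-pages_for(mem_bytes) // ENTRIES_PER_TABLE)
```

With these definitions page_table_bytes(1 * MB) is 1024 (1KB per MB),
page_table_bytes(4 * GB) is 4MB spread over 1024 page tables, and
STARTUP_PAGES * PAGE_SIZE is the 12KB that is added to HIGHMIN.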

Windows
It is called RAW32, yet I wanted it to run under Windows too. But many people
would agree with me that Windows is one big bug (have a look at DPMIPL0.ASM).
RAW32's DPMI code behaves strictly according to the DPMI 0.9 specification.
First it asks how much memory it can lock, and allocates that amount. Then
(this should not be necessary, but it is) it tries to lock an ever smaller
amount until it succeeds. Under Windows version 3.1 and 3.11 this results in
a very unstable system, i.e., it can hang or reset at any time. Sometimes
this can result in a couple of lost clusters (run ScanDisk (...) and everyt-
hing should be ok again). Windows 9x survives, but since it does some -
inferior way of - multitasking, locking all the memory you can get slows down
the system terribly (why does it give a too large amount?!). To make life a
little easier under Windows, remove the semicolon (if present) in front of
the line:
WINFRIENDLY     =       0 
If RAW32 is then assembled again, it locks (under Windows) only half the
amount of memory it should be allowed to lock. Another way of being more
friendly towards Windows is by not locking the allocated memory so that
Windows can swap your memory to disk, but then the meaning of raw would be
completely lost...
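
The "lock an ever smaller amount until it succeeds" strategy can be sketched
as follows. This is a Python model, not the DPMI code in RAW32.ASM: the DPMI
host is replaced by a hypothetical try_lock callback, the 4KB step size is an
assumption, and WINFRIENDLY is modelled simply as starting from half the
reported amount.

```python
PAGE = 4096  # assumed step size; the decrement RAW32 really uses may differ


def lock_largest(reported_lockable, try_lock, win_friendly=False, step=PAGE):
    """Return the number of bytes that could be locked, or 0 if none.

    reported_lockable: bytes the host claims can be locked
    try_lock: hypothetical callable(size) -> bool standing in for the host
    win_friendly: start from half the reported amount (WINFRIENDLY model)
    """
    size = reported_lockable // 2 if win_friendly else reported_lockable
    size -= size % step          # keep the request a multiple of the step
    while size > 0:
        if try_lock(size):
            return size
        size -= step             # retry with an ever smaller amount
    return 0


def fake_host(size):
    """Pretend host: claims more than it will actually lock (10MB at most)."""
    return size <= 10 * 1024 * 1024
```

For a host that reports 16MB but behaves like fake_host, lock_largest ends up
with 10MB after many failed attempts; with win_friendly=True the first request
(8MB) already succeeds.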

API
RAW32 has a little API - allocating/setting of descriptors, getting/setting
of interrupt vectors, allocating memory from its heap, mapping physical
addresses to linear addresses, putting a string to screen and a DOS shell
routine - but it doesn't make sense to copy the headers of those functions in
this document; they speak for themselves. Look at the examples and headers(!)
to see how those routines should be used.

Real mode callbacks
When I wrote the real mode callback code I didn't have a good VCPI server at
my disposal. By 'good' I mean a VCPI server that programs the PICs to base
interrupt values other than 8 and 70h. I assume that in V86 mode IRQs are
intercepted by the VCPI server, and that the VCPI server calls the
conventional real mode vectors of 8 and 70h (how else could DOS ever run
under such a VCPI server?).

I added the real mode callback (RMCB) code so that even in RM the PM IRQ
handlers will be called. When for example RM int 16h is called (which by the
way enables interrupts) while the PIC is programmed to generate IRQ 0 at
1000Hz the original RM IRQ 0 handler will be called at this rate too...
Of course, RAW32 could reprogram the PIT at every IRQ (as System 64(?) does),
but I dislike that, and it solves the problem for IRQ 0 only (in a way); one
can imagine worse scenarios when there is no RM handler or the RM handler
must not be called.
Because the 'execution chain' is quite complex RMCBs are slow, but the nice
thing is that RMCBs are active only when a RM int is called and interrupts
are enabled. By default redirecting IRQs to PM looks like this (better get a
cup of coffee or tea first :)
The RAW32 program calls a RM int with int RMCALL_VECT. If the int handler
enables interrupts (RM_int and VCPI_int disable interrupts) and takes too
long to do its job or waits for something, an IRQ could be generated. Because
the processor is executing in real mode (under VCPI V86 mode, but the VCPI
server should make the system behave like when in RM) and because RM_int
loads the IDTR with a RM compatible value (i.e., base=0 limit=4*256-1), the
RM interrupt vector table (IVT) is active. So, the IRQ vectors through the
IVT. IRQs 0-7 correspond with RM ints MASTER_BASE - MASTER_BASE+7 (fill in
the value that MASTER_BASE gets at the beginning of RAW32.ASM), IRQs 8-15
correspond with RM ints SLAVE_BASE - SLAVE_BASE+7.
Now the RMCBs come into action, because init_callbacks changed all these
vectors and made them point to the labels RMcallbackIRQn in the code16
segment of RAW32. If for example IRQ 0 is generated execution arrives at
RMcallbackIRQ0. Then the interrupt number which corresponds with the current
IRQ is loaded to determine which PM IRQ handler should be called. This value
is used to find the selector and offset of the PM IRQ handler.
Then a switch is made to PM and the PM IRQ handler is called. By default the
RAW32 PM IRQ handlers do nothing and just reflect the IRQ back to RM (i.e.,
they call the RM IRQ handler). Then things get a bit more complicated.
RMcall_int and VCPIcall_int save the stack registers ss:esp (they are
destroyed by RM_int and by switching to V86 mode via the VCPI server) and
restore ss:esp when the RM int returns. But when a RMCB calls the PM IRQ
handler ss:esp points to another stack than when an IRQ is generated while in
PM. Therefore more than one variable is necessary to save those stackpointers
(ss:esp). That is why an extra offset is added to ebx (stackoff). If a RMCB
called the PM IRQ handler, stackoff contains the offset from the first
stackpointer variable to the second, multiplied by the nesting level. I
tested how many nesting levels can occur under extreme circumstances, and I
got nesting levels up to 5. stackoff has the value 0 if the PM IRQ handler
was not called by a RMCB (i.e., an IRQ occurred in PM or a RAW32 program
called a DOS or BIOS routine).
If only one variable were used to save the PM stackpointer, that variable
would contain the proper value as long as no RMCB becomes active. When a RMCB
becomes active the variable would be overwritten, which goes unnoticed until
the RM int returns and the original PM stack should be restored. The stack
would then be restored to the PM stack left when exiting the RMCB...
When execution has finally arrived at RM_int or VCPI_int a check is made if
it is an IRQ handler that should be called or a general int handler. Not that
there is any difference, but the RM IRQ vectors now point to the RMCBs in
RAW32 and just doing INT RM_intnum/VCPI_intnum would result in an infinite
loop if MASTER_BASE or SLAVE_BASE has a value that one of the PIC bases had
before the RAW32 program started. So, if a RM IRQ handler must be called a
jump is made to RM_IRQ which calls the appropriate address in orgIVT, which
contains the original RM IRQ handler addresses (initialised by
init_callbacks).
When the RM IRQ handler returns, execution is given back to RM_int/VCPI_int
which returns to RMcall_int/VCPIcall_int which returns to the PM IRQ handler.
When the PM handler returns, a switch is made back to RM and RAW32 returns to
the RM code that was interrupted by the IRQ.
In short:
IRQ in PM -> PM IRQ handler -> RM IRQ handler -> PM IRQ handler's exit ->
execution in PM continues

RM int called in PM -> RM int handler -> if too long in RM and ints enabled
-> IRQ in RM -> RMCB -> PM IRQ handler -> RM IRQ handler -> PM IRQ handler's
exit -> RMCB's exit -> execution in RM continues
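
Why one saved stackpointer variable is not enough, and how an offset per
nesting level fixes it, can be shown with a small Python model. It only mimics
the bookkeeping described above; SLOT_SIZE and the function names are
assumptions and do not come from RAW32.ASM.

```python
SLOT_SIZE = 8   # assumed distance in bytes between two saved-ss:esp variables


def stackoff(rmcb_nesting_level):
    """0 for calls that start in PM, otherwise SLOT_SIZE * nesting level."""
    return SLOT_SIZE * rmcb_nesting_level


def restored_stack(use_stackoff):
    """Model: PM code calls a RM int, an IRQ hits in RM and enters via a RMCB.

    Returns the PM stack that is restored when the outer RM int returns.
    """
    saved = {}                          # offset -> saved PM stackpointer
    outer = stackoff(0)                 # the program's own RM int call
    inner = stackoff(1) if use_stackoff else outer

    saved[outer] = "program stack"      # saved before switching to RM
    saved[inner] = "RMCB stack"         # PM IRQ handler reflects the IRQ to RM
    _ = saved[inner]                    # RM IRQ handler returns, RMCB exits
    return saved[outer]                 # the outer RM int finally returns
```

restored_stack(True) gives back "program stack", while restored_stack(False)
returns "RMCB stack", i.e. the stack left when exiting the RMCB.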

Two things about MASTER_BASE and SLAVE_BASE. First, if you give one of them a
value of 0 or 8 (i.e., all possible PIC base values below 32 that don't
conflict with standard interrupt numbers), an IRQ handler will be called
instead when an exception occurs. This is so because buildIDT first fills in
the interrupt gates for the exceptions, then those for the IRQ handlers. In
this way, you can use the standard DOS values of 8 and 70h and have a working
program with RMCBs (obviously without correct handling of most of the
exceptions). Second, a problem with MASTER_BASE and SLAVE_BASE with the
mentioned values arises with many (all?) int 33h implementations. The problem
is that int 33h/ax=0 (Mouse reset) reinstalls interrupt vector 12d or 13d for
serial mice. This effectively removes the RMCB that was installed in that
place in the IVT.
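
The relation between IRQ numbers, the two PIC bases and the real mode IVT can
be written down in a few lines. The Python below is only an illustration with
made-up names; the bases used in the remarks are the DOS defaults of 8 and 70h
mentioned above, not necessarily the values set in RAW32.ASM.

```python
IVT_ENTRY_SIZE = 4      # a real mode vector is a 16:16 far pointer
NUM_VECTORS = 256
IVT_LIMIT = IVT_ENTRY_SIZE * NUM_VECTORS - 1    # limit of a RM style IDTR
NUM_EXCEPTION_VECTORS = 32                      # vectors 0-31 are reserved


def irq_to_vector(irq, master_base, slave_base):
    """Interrupt number raised for an IRQ with the given PIC bases."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ must be in the range 0-15")
    return master_base + irq if irq < 8 else slave_base + (irq - 8)


def ivt_offset(vector):
    """Byte offset of a vector's far pointer inside the IVT."""
    return vector * IVT_ENTRY_SIZE


def shadowed_exceptions(pic_base):
    """Exception vectors that end up shared with the 8 IRQs of one PIC."""
    return [v for v in range(pic_base, pic_base + 8)
            if v < NUM_EXCEPTION_VECTORS]
```

With a master base of 8, shadowed_exceptions(8) lists vectors 8 to 15, which
are the IDT entries shared between IRQs and exceptions; a base of 70h shadows
nothing. IVT_LIMIT works out to 3FFh, the 4*256-1 limit that RM_int loads.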

V86 mode
The whole idea of RMCBs wouldn't be necessary if V86 mode were used to
handle RM int calls. Actually, I planned to let RAW32 R2 handle them in V86
mode.
Drawback: it's a little slower - an INT instruction versus a complete
V86-monitor. That's why I wanted to make an extra version.
Advantages:
* IRQs always go through RAW32, so no IRQs will be missed (without the
ugliness of RMCBs).
* The same code can be used under raw/XMS/VCPI to handle RM ints
The version I made (not included) works, but has some problems under VCPI -
especially when DOS is loaded high. Then many RM int calls result in things
like execution that ends up in some data area in the HMA or other nice things
:( The dosshell function for example can give nice results... However, the
code is good enough to let most of the examples run that don't call DOS (with
DOS loaded low). Mail me if you want it.
The problem is that handling RM ints via V86 mode under VCPI needs VCPI
implementation specific code. That's not only ugly and incompatible, it's
also difficult to find out the VCPI server specific details :(


Debugging

If you want to step through a RAW32 program use a debugger with capabilities
like 386SWAT (see 'Links' section). That is, if you want to debug your code
under raw/XMS/VCPI. If the line which defines the constant 'SWAT' isn't
commented out you can use 386SWAT to debug your code.
If you want to debug under Windows' DPMI you could use SoftICE. Note that
SoftICE is a commercial program.
If you want to debug under DPMI (DOS) you'll have to search for a debugger
yourself, because I haven't seen one that can debug RAW32 programs. I have
mainly used Turbo Debugger - an old version which doesn't load its own DPMI
server, so that I could debug RAW32 also under raw, XMS and VCPI. That
means starting TD, running my program and afterwards using watches to look at
the values of test variables. I wish I had tried 386SWAT earlier; it could
have saved me a lot of debugging time...


Bugs

In VCPIinitPM you see that I use the set8259vectors macro. Yes, as I wrote in
the comment this is ugly, but it is necessary under VCPI when XMS is not
present as well. When I make some modifications to force memory allocation
through VCPI (VCPIinit, VCPIexit: remove the jump after check_XMS) my
keyboard stops generating IRQs... However, the serial port, the harddisk and
the timer all happily continue generating IRQs. The problem doesn't come up
when the allocation loop takes less time (for example by manually giving
VCPImem a small value).

Not a bug but more an imperfection is that RAW32 doesn't reload all segment
registers before switching to RM when calling a RM int/IRQ. This means that
under raw or XMS all selectors should have a limit > 64KB or exception 13
could occur while in RM (look at UNDOC.ASM). All segment registers are given
a value in RM, but as you can see in UNDOC.ASM that doesn't reset the segment
limit that was active in PM. If you want to be able to call a RM int while
one of the segment registers has a limit < 64KB, add some lines in RMcall_int
that load all segment registers with data16sel. By default es has a limit of
256 bytes and is the only selector that could cause trouble (guess how I
discovered this imperfection ;) Therefore, RMcall_int loads _es_ with a RM
compatible value to access some variables.

The getvect and setvect functions don't get/set exception handlers under
DPMI. They call DPMI function 204h/205h respectively which do nothing
concerning exceptions. It shouldn't be a problem though; if you want to do
anything serious that needs to install new exception handlers you shouldn't
do it under DPMI anyway. I (you?) could change getvect and setvect in such a
way (for example one extra parameter) that they call the appropriate DPMI
function under DPMI, but I don't feel like implementing something I won't
ever use.

Since Windows 3's DPMI is totally lame RAW32 will crash if 'WINFRIENDLY' is
not defined at compile time. See 'Inner workings' section.

Under Windows NT and dosemu totalextmem has an incorrect value;
all the int 15h functions in check_memory fail under Windows NT/dosemu and
the values read from CMOS addresses 30h and 31h are not the real values.
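
For reference, CMOS addresses 30h and 31h conventionally hold the low and high
byte of the amount of extended memory in KB. A minimal Python sketch of how
the two bytes combine (not RAW32 code; the function name is made up, and
actually reading the CMOS needs port I/O, which is left out here):

```python
def cmos_ext_mem_kb(cmos_30h, cmos_31h):
    """Combine CMOS bytes 30h (low) and 31h (high) into a size in KB."""
    if not (0 <= cmos_30h <= 0xFF and 0 <= cmos_31h <= 0xFF):
        raise ValueError("CMOS registers are single bytes")
    return (cmos_31h << 8) | cmos_30h
```

Since the result is a 16-bit number of KB, this source can never report more
than 65535KB, i.e. just under 64MB, even when it does return real values.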

If you have solved one of these bugs, know a possible solution or have found
a new bug please mail me.


Examples

To assemble an example and link it to RAW32.OBJ use the batch file R.BAT. For
example, to assemble HELLO.ASM type 'R HELLO'. If you haven't assembled
RAW32.OBJ yet or if you have made some change to RAW32.ASM type 'R HELLO.ASM
-r'. That's '-r' and not '-R'. R.BAT calls the assembler and linker directly,
so they should be in the current directory or in one of the directories in
the PATH environment variable.
If you are in a hurry have a look at DPMIPL0.ASM and LOADER\LOADER.ASM.

HELLO.ASM
'Hello, world!' in bright white on bright blue ;)

FLATVID.ASM
This example shows the basics of using a VBE linear framebuffer.

PLSWITCH.ASM
This example shows some aspects of switching between privilege levels (PLs).

MULTIT.ASM
This is a little multitasking example ('round robin'). I made it when I
experimented a little with task gates, so it isn't the best source if you are
really interested in multitasking. I tried to keep it simple.

FPS.ASM
This example shows some use of the timer to dynamically measure and display
how many frames can be drawn in some frame drawing loop.
FPS.ASM contains three different ways to fill a screen with pixels. The
strange thing is that not all PCs (video cards) show the same one as the
fastest.
This example also shows how to use the BIOS fonts to write text to a graphics
screen.

MOUSEx.ASM
MOUSE2.ASM shows some aspects of using the mouse by means of RM int 33h calls.
It also shows some graphics palette programming stuff that I added to look at
something nicer than a dull black screen while programming the mouse examples
;)
MOUSE3.ASM does a little more in that it draws the mouse cursor itself.
MOUSE4.ASM hooks the appropriate IRQ to avoid polling as in MOUSE2.ASM and
MOUSE3.ASM.
MOUSE4 shows two noteworthy things:
1.) How to get the mouse coordinates and button status without the option of
allocating a real mode callback and without calling RM int 33h itself. It's
even better than using a callback as one would normally do under DPMI,
because MOUSE4.ASM needs only 2 mode switches, while when using a RMCB it
would need 4 mode switches (assuming the program runs in PM when the mouse
IRQ occurs). In short:
(regular DPMI (under DOS) callback solution)
PM -> IRQ -> PM IRQ handler -> RM IRQ handler (e.g. mouse.com) -> user
routine (=your installed RMCB) -> your PM mouse event code -> RM IRQ
handler's exit -> PM IRQ handler's exit -> execution in PM continues (=4 mode
switches)
(MOUSE4 solution)
PM -> IRQ -> PM IRQ handler ('mouseIRQ') -> calls RM IRQ handler -> user
routine ('mouse_sr') -> user routine sets variables -> RM IRQ handler's exit
-> execution in PM continues (=2 mode switches)
Things get worse for the MOUSE4 solution when the IRQ occurs in RM. Then 4
mode switches occur before execution in _RM_ continues. Then the regular DPMI
solution needs 2 mode switches, but your code shouldn't be doing so much in
RM anyway ;) The dark side of the MOUSE4 solution is that it won't work under
Windows (that's actually the bright side ;) , therefore:
2.) It shows the use of Windows' PM int 33h interface.
 
V86-x.ASM
These are two examples that show some aspects of V86 mode. V86-1 shows how to
get into V86 mode and out again. It also shows something I read in the Intel
80386 Programmer's Reference Manual about calling PM int handlers from V86
mode.
V86-2 shows how to handle a RM int called from V86 mode.

DPMIPL0.ASM
This is an example I made after I read about it on the PM mailing list.
Basically I just followed the algorithm given by Vadim Drubetsky (Black
Phantom) with a few improvements (he was talking about DPMI under Windows
which doesn't use the last two entries in the GDT). So credits go to him. It
works under Win 3.x/Win 9x/CWSDPMI/QDPMI and probably under all DPMI servers
that don't have page level protection. To make it work under Quarterdeck's
DPMI just disable ints before going to PL 0.

UNDOC.ASM
This is an example that shows some undocumented behaviour of the 80386 and
above. I knew about the so called 'flat real mode' (or unreal mode etc) and
have used it for a little RM VBE program, but I never really thought about it
until I came across the esp/sp bug. I wrote UNDOC.ASM to test a few things of
the flat real mode. The funny thing was that *all* fields of the hidden part
of a segment register stay active when switching from PM to RM. That includes
the default bit for a data selector as was the case in the esp/sp bug.

LOADER.ASM
I made this loader with three things in mind:
-to be able to make programs with RAW32 that are larger than 640KB
-do this with TASM/NASM only (no separate RAW32 specific tools)
-do this with as few modifications to a standard RAW32 program source as
possible

Look at FLATVID.ASM and MOUSE4.ASM in the directory LOADER and compare them
with their 'normal' counterparts to see what modifications a RAW32 program
needs to make it work with the RAW32 loader.

Use the batch file MAKEL.BAT to assemble the loader. Use MAKEM.BAT to
assemble a RAW32 'module'. A module is just an MZ EXE with a different
extension (RX, to avoid accidental starting of a module) and normally
consists only of 32-bit code. This as opposed to a normal RAW32 program which
has the 16-bit part of RAW32 linked to it.
When LOADER.EXE is started without arguments it tries to load the file
MODULE.RX. If you want it to load another module you can change the default
name in LOADER.ASM or give the module name as an argument, e.g. 'LOADER
FLATVID.RX'.
When making a module you have to follow these rules:
1.)
Add the following lines to the RAW32 program source
TASM:
before including raw32.inc:
NO_EXTERN       =       0
include raw32mod.inc

NASM:
before including raw32.inc:
%define NO_EXTERN       0
after including raw32.inc:
segment code32
%include "raw32mod.inc"

Note the place of the segment directive in a NASM module source. Note also
that segment attributes are omitted; they are specified in RAW32.INC.
2.)
The relocation code of the loader treats the address to which a relocation
item points as a dword. Therefore you must always use 32-bit operands when
loading a segment name (which results in a relocation item) as in
  mov eax,code32
or
  mov var1,seg var2
where code32 is a segment name and var1 a 32-bit variable. If you use a
16-bit operand as the target, 2 bytes in the code will be overwritten by the
relocation code of the loader...
3.)
TASM:
After the 'end' directive you must put 'main' so that the linker will fill in
that label's offset as the entry point in the MZ header. The loader jumps to
the entry point indicated by that field in the header.
NASM:
Change the 'main' label into '..start'.
4.)
The 'main' (NASM: '..start') label must be within the first 64KB of the RAW32
program. If you have more than 64KB of static data in your program, just put
it somewhere after 'main' ;)
5.)
Because a RAW32 module is loaded in extended mem you can't directly pass
variables in your program's segment to RM BIOS routines. First you have to
allocate a buffer in low memory with getlomem, then copy the data to that
buffer and then pass the address of that buffer to the BIOS routine. Again,
look at FLATVID.ASM in the directory LOADER for an example.
6.)
Relocation items may occur only within the 64KB in a segment and within the
1MB of the module. TLINK won't link your program if this is not the case, so
you won't be surprised, but at least you have been warned.

NASM specific:
7.)
Because NASM uses a stricter (better, I think) convention to distinguish
between offset constants and memory references, all the references to labels
in RAW32.ASM have to be placed between square brackets. This, because in a
RAW32 module these labels refer to far pointers instead of labels. E.g. 'jmp
exit' has to be changed into 'jmp far [exit]' and 'call putstr' into 'call
far [putstr]'.
8.) 
If the module is very large you need an appropriate compile of NASM, e.g., a
DJGPP compiled NASM. Don't forget to get CWSDPMI.EXE. If you're desperate use
NASMW.EXE...
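
Rule 2 above can be made concrete with a small Python model of a fixup that
is applied as a dword. This is a sketch, not the loader's relocation code:
that the loader adds a value (here called delta) to the dword is an
assumption, as are the function name and the example value of delta. The
instruction bytes are the standard encodings of 'mov eax,imm32' and
'mov ax,imm16' in a 32-bit code segment, each followed by NOPs (90h).

```python
import struct


def apply_reloc_dword(image, offset, delta):
    """Add delta to the little-endian dword at image[offset:offset+4]."""
    (value,) = struct.unpack_from("<I", image, offset)
    struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)


DELTA = 0x00200000                      # hypothetical load-time adjustment

# mov eax,1000h ; nop        -> the fixup covers exactly the 4-byte operand
code32_operand = bytearray.fromhex("b8 00 10 00 00 90")
apply_reloc_dword(code32_operand, 1, DELTA)

# mov ax,1000h ; nop ; nop   -> the fixup also covers the 2 bytes that follow
code16_operand = bytearray.fromhex("66 b8 00 10 90 90")
apply_reloc_dword(code16_operand, 2, DELTA)
```

After the two calls the NOP behind the 32-bit operand is still 90h, but the
first NOP behind the 16-bit operand has been changed, because the dword that
was patched extended two bytes past the operand.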

Using the loader under Windows NT/dosemu
When you want to write a module that also works under Windows NT/dosemu you
must not forget that you can't make use of flat memory model style
addressing, i.e., making use of segment wraparound. This means that you can't
access memory below the module relative to code32sel. However, the solution
is simple; when addressing memory below the module, calculate the linear
address and use zerosel.
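
The difference between the two addressing styles is plain 32-bit arithmetic.
The Python sketch below assumes that zerosel is a selector with base 0 (as its
name suggests) and uses a made-up module base of 400000h; the function names
are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def offset_with_wraparound(linear, seg_base):
    """Offset that reaches `linear` through a 4GB segment based at seg_base."""
    return (linear - seg_base) & MASK32


def reachable_without_wraparound(linear, seg_base, seg_limit):
    """True if `linear` lies inside [seg_base, seg_base + seg_limit]."""
    return seg_base <= linear <= seg_base + seg_limit


def offset_via_zero_base(linear):
    """Through a selector with base 0 the offset is the linear address."""
    return linear


MODULE_BASE = 0x00400000    # hypothetical linear address of the module
VIDEO_MEM = 0x000A0000      # VGA memory, which lies below the module
```

In the flat model VIDEO_MEM is reached relative to the module with offset
FFCA0000h, which only works because base + offset wraps around at 4GB. A
server that gives the module's segment a limit smaller than 4GB rejects that
offset, while offset_via_zero_base(VIDEO_MEM) is simply A0000h.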


Links

NASM - Netwide Assembler
http://www.web-sites.co.uk/nasm/index.html
ftp://ftp.simtel.net/pub/simtelnet/msdos/asmutl/

CWSDPMI - DPMI host
ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2misc/

ALINK - Linker
http://alink.home.dhs.org
which redirects to
http://www.geocities.com/SiliconValley/Network/4311/

386SWAT - Debugger
http://www.sudleyplace.com/swat/swat.htm


Credits

Give credit where credit is due, so credits go to
Thomas Pytel (Tran)     tran@phantom.com                - START32
Adam Seychell           s921880@minyos.xx.rmit.edu.au   - DOS32 v1.2
Desmond Germans (Simm)  germans@cs.vu.nl                - SYSTEM64 v1.210
                        npgerman@nat.vu.nl
                        nigerman@nat.vu.nl
David Jurgens                                           - HelpPC 2.10
Emil Vaara              ezy@hem.passagen.se             - HelpPC update
Ralf Brown              ralf@pobox.com                  - The interrupt list
Herman Dullink          csg669@wing.rug.nl              - D32
                        herman.dullink@prgbbs.idn.nl
Charles Sandmann        sandmann@clio.rice.edu          - CWSDPMI


Final words

If you make something with RAW32 please let me know; I am very interested. If
you have questions about PM programming in general do not mail me but
subscribe to the protected mode mailing list.
To subscribe:
  send mail to: pmode-l-request@phys.uu.nl
  subject: none
  body: subscribe pmode-l email@yourisp.name
To unsubscribe:
  send mail to: pmode-l-request@phys.uu.nl
  subject: none
  body: unsubscribe pmode-l email@yourisp.name
Note: That's PMODE-L (not PMODE-1). Use the 'end' command at the end of your
email to prevent junk at the end of your message from being interpreted as
commands (several free email providers add a signature at the end of your
emails). Use pmode-l@phys.uu.nl to send email to others on the list.
At the risk of labouring the obvious: send a mail to Majordomo@phys.uu.nl
with the command 'help' in the body to find out which commands the mail
server recognizes.

If you have questions about RAW32 you can send them to me. I would be happy
to answer them :) You can contact me via email at dbjh@gmx.net

Daniël Hörchner
