Protected Mode Programming

Protected Mode Guide

Part Of The TiTan OS Development Project

Author : Amgad Magdy Madkour

Operating System Sector : Boot

Programming Language : Assembly

Revision : 2

Last Update : 8/8/2002

Preface :

This is an introductory guide to protected mode programming, with a little history of how protected mode evolved starting with the 286. This guide is also part of the TiTan OS project developed by our group.

Introduction :

Protected mode first appeared in the 286. Intel Corporation's engineers wanted to add features that earlier processors such as the 8086 lacked. An important point to clarify first is the set of operating modes the processor supports: (Real Mode), (Protected Mode) and (V86 Mode). Processors before the 286 supported only Real Mode. Starting with the 286, processors support both Real Mode and Protected Mode, and V86 Mode was added with the 386. Although Protected Mode was limited on the 286, it was still considered a major enhancement over previous processors.

Some basic Terms:

->Logical Address: Consists of a segment value and an offset address, each ranging from 0000H to FFFFH

->Physical Address: The actual address, consisting of 20 bits and ranging from 00000H to FFFFFH

->Segment Base Address : The segment value multiplied by 10H

(Note: To change from a logical to a physical address, append a hex zero to the end of the segment value and then add the offset to it)
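
The note above can be sketched in C as follows. This is only an illustration and not part of the TiTan OS sources: the function name real_mode_physical is made up for this example, and the 20 bit mask assumes an 8086 style address bus that wraps around at 1MB.

```c
#include <stdint.h>

/* Real mode translation: physical = segment * 10H + offset.
   The 0xFFFFF mask models the 20 address lines of the 8086
   (an assumption for this sketch). */
uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    uint32_t base = (uint32_t)segment << 4;   /* append one hex zero */
    return (base + offset) & 0xFFFFFu;        /* keep 20 bits */
}
```

For example, the logical address 1234:0010 gives the segment base 12340H, and adding the offset 0010H gives the physical address 12350H.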

Protected Mode Advantages:

Memory protection, which does not allow programs to overwrite operating system data stored in memory.
Access to up to 4GB of memory on the 386 and later (the 286 could address 16MB), which is one of the biggest advantages of protected mode over real mode. Earlier processors such as the 8086 could address only 1MB of memory.
Virtual memory, which is used by most operating systems today and is implemented by the paging mechanism of the Memory Management Unit, is also available in protected mode. This mechanism lets programs behave as if the machine had up to 4GB of memory by keeping part of that memory on the hard disk instead of in physical memory chips.
Use of the MMU, which started with protected mode (286+). It involves two important concepts: segmentation and, from the 386 on, paging.
Multitasking, thanks to context switching between processes.

80386 Registers :

In the 386 and later the general purpose registers are extended to 32 bits instead of 16 bits. The 386 also added two segment registers, (fs) and (gs). They are 16 bits wide and let a program access more than one data segment at the same time without reloading a segment register as before.

The flag register is also 32 bits wide now instead of 16 bits. The 386 added a few more registers as well:

CR0-CR3 : Four control registers used to configure the processor's operating modes, in particular the one that concerns us, protected mode (CR1 is reserved)
DR0-DR7 : Debug Registers
TR : Task register

Protected Mode And Protected Mode Segments :

Enough introduction, let's get to the real stuff. First of all you have to know that segments in real mode differ from segments in protected mode. In protected mode the values held in the segment registers are called ( Selectors ). This is because the value in a segment register is an index to something called a (Segment descriptor).

* One important point has to be clear. In real mode, to get the physical address we append a hex zero to the value of the segment register and add the offset (for code, the IP, instruction pointer) to the result. In protected mode it is different: the segment register acts as an index into a table, and the table entry contains the base address of the segment. Also, in real mode segments are 64k long and can start only on 16-byte boundaries, whereas in protected mode segments can be as big as 4GB and can be placed anywhere in memory.

The selector contains three main fields

 15                                   3  2    1   0
|--------------------------------------|----|-------|
|                Index                 | TI |  RPL  |
|--------------------------------------|----|-------|

where RPL : Requested Privilege Level (bits 0-1)

TI : Table Indicator (bit 2 ; GDT=0 , LDT=1)

Index : Index into the descriptor table (bits 3-15)

TI can select only the GDT or the LDT. The IDT is never referenced through a selector; its entries are chosen by the interrupt number.

(Note : The selector value itself says nothing about where the segment resides in memory! The descriptor that the index points to determines that.)
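
To make the three fields concrete, here is a small C sketch that builds a selector and takes it apart again. The names selector_fields, make_selector and split_selector are invented for this example only.

```c
#include <stdint.h>

/* The three fields of a 16 bit selector. */
typedef struct {
    uint16_t index;  /* bits 3-15 : entry number in the descriptor table */
    uint8_t  ti;     /* bit 2     : 0 = GDT , 1 = LDT */
    uint8_t  rpl;    /* bits 0-1  : requested privilege level */
} selector_fields;

/* Combine index, table indicator and RPL into one selector value. */
uint16_t make_selector(uint16_t index, uint8_t ti, uint8_t rpl)
{
    return (uint16_t)((index << 3) | ((ti & 1u) << 2) | (rpl & 3u));
}

/* Split a selector value back into its fields. */
selector_fields split_selector(uint16_t selector)
{
    selector_fields f;
    f.index = (uint16_t)(selector >> 3);
    f.ti    = (uint8_t)((selector >> 2) & 1u);
    f.rpl   = (uint8_t)(selector & 3u);
    return f;
}
```

With this layout, entry 1 of the GDT at privilege level 0 has the selector value 08H and entry 2 has the value 10H, because each index step adds 8 to the selector.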

Descriptors :

A Descriptor is basically an entry in the descriptor table (explained below). It contains the base address and limit of a segment, and it also contains the privilege level and access bits of the segment being described.

There are two types of descriptors, code/data descriptors and system descriptors.

The operating system builds one descriptor for each segment of code or data.

Descriptor Table :

In protected mode the system uses what are called descriptor tables. There are 3 types of descriptor tables

GDT : Global descriptor table
LDT : Local descriptor table
IDT : Interrupt descriptor table

The difference between them is mostly logical. The global descriptor table contains descriptors that describe segments available to the whole system, such as the segments of the operating system.

The local descriptor table, on the other hand, contains descriptors that describe the segments of an individual application (task).

The interrupt descriptor table contains descriptors for the interrupt and exception handlers of the system.

Note that these data structures, as I like to call them, are used only in protected mode.

In real mode the physical address is computed directly from the segment value and the offset, whereas in protected mode the descriptor in the descriptor table supplies the base address of the segment.

The location of each descriptor table in memory is held in a special register, one for each kind:

1. GDTR : for global descriptor table

2. LDTR : for local descriptor table

3. IDTR : for interrupt descriptor table

The GDTR and IDTR are 48 bits wide (6 bytes) and hold the address and size of their tables directly. The LDTR is different: it holds a 16-bit selector that picks a descriptor in the GDT, and that descriptor describes the LDT.

The first 2 bytes of the GDTR contain the LIMIT, which is the size of the GDT in bytes minus one. If LIMIT is 00FFH then the table is 256 bytes long ( 2 power 8 ). Because LIMIT is 16 bits wide, the GDT can be up to 65,536 bytes long (2 power 16). The other four bytes of the GDTR, generally labeled BASE, locate the beginning of the GDT in memory. This 32 bit address allows the GDT to be placed anywhere in memory.

There is also an important point about descriptors. Descriptors in the GDT are eight bytes long, thus if the size of the table is 256 bytes then the table contains 32 descriptors (256/8). The number of descriptors can increase if the LIMIT in the GDTR increases. The values of BASE and LIMIT must be loaded before we switch from real mode to protected mode. Once we are in protected mode they change only if privileged code executes LGDT again.

The GDT limit is computed as: number of descriptors * 8 - 1

It is also important to note the difference between the GDT and the LDT at this point. For setting up protected mode the LDT is not as important as the GDT, so this guide uses only the GDT, which keeps information about the different parts of memory.
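
The limit arithmetic and the 6 byte layout of the GDTR can be checked with a short C sketch. The function names here are hypothetical and exist only for this illustration; the byte order assumed is the little-endian order the x86 uses in memory.

```c
#include <stdint.h>

/* LIMIT for a table holding descriptor_count 8 byte descriptors. */
uint16_t gdt_limit(unsigned descriptor_count)
{
    return (uint16_t)(descriptor_count * 8u - 1u);
}

/* Number of descriptors that fit in a table with the given LIMIT. */
unsigned gdt_descriptor_count(uint16_t limit)
{
    return ((unsigned)limit + 1u) / 8u;
}

/* Build the 6 byte value read by LGDT: 2 bytes LIMIT, then 4 bytes BASE. */
void pack_gdtr(uint8_t out[6], uint16_t limit, uint32_t base)
{
    out[0] = (uint8_t)(limit & 0xFF);
    out[1] = (uint8_t)(limit >> 8);
    out[2] = (uint8_t)(base & 0xFF);
    out[3] = (uint8_t)((base >> 8) & 0xFF);
    out[4] = (uint8_t)((base >> 16) & 0xFF);
    out[5] = (uint8_t)(base >> 24);
}
```

A table with three descriptors (null, code and data) therefore has a LIMIT of 23, and a LIMIT of 00FFH corresponds to 32 descriptors.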

Switching to protected mode :

This is the first part that actually deals with programmatically switching from real mode to protected mode.

The first step is to define what is called a null descriptor. The processor requires the first entry of the GDT (selector 0) to be an unused, all-zero descriptor. It also shows the layout of the 8 bytes that every other descriptor we define will follow.

It is defined as follows :

gdt:    dw 0 ; segment limit , bits 0-15
        dw 0 ; base address , bits 0-15
        db 0 ; base address , bits 16-23
        db 0 ; access byte : present bit , privilege level , segment type (code/data or system)
        db 0 ; low 4 bits : limit bits 16-19 , high 4 bits : flags (granularity , 16/32 bit)
        db 0 ; base address , bits 24-31

As you can see, this covers all 64 bits (8 bytes) of the descriptor.

Secondly we define the GDT code and data segments our OS will be using

For the code segment it looks like this :

CODE equ $-gdt ; offset of this descriptor from the start of the gdt (8 , the size of the null descriptor) , which is also its selector value

gdtcode: dw 0xFFFF ; limit , bits 0-15
        dw 0 ; base address , bits 0-15 , filled in at run time as shown later
        db 0 ; base address , bits 16-23
        db 0x9A ; access byte : present , ring 0 , code segment , executable and readable
        db 0xCF ; high 4 bits : flags (4KB granularity , 32-bit) , low 4 bits : limit bits 16-19
        db 0 ; base address , bits 24-31

The data segment is almost the same :

DATA equ $-gdt ; offset of this descriptor from the start of the gdt (16) , which is also its selector value

gdtdata: dw 0xFFFF ; limit , bits 0-15
        dw 0 ; base address , bits 0-15 , filled in at run time as shown later
        db 0 ; base address , bits 16-23
        db 0x92 ; access byte : present , ring 0 , data segment , readable and writable
        db 0xCF ; high 4 bits : flags (4KB granularity , 32-bit) , low 4 bits : limit bits 16-19
        db 0 ; base address , bits 24-31
gdt_end: ; marks the end of the gdt , used when computing its limit

Note that the code segment can be executed and read but not written, while the data segment can be read and written. This document does not describe each bit of the access byte and flags. If you want to know them, check the Intel reference manual for details about the privilege levels and permission bits.
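
To see how a base address, a limit, an access byte and the flags end up in those eight bytes, here is a C sketch of the packing. The function pack_descriptor is invented for this illustration; it is a model of the byte layout, not code from the TiTan OS boot sector.

```c
#include <stdint.h>

/* Pack a code/data descriptor into its 8 byte form.
   limit is a 20 bit value, flags is the 4 bit flag nibble
   (0xC = 4KB granularity + 32 bit). */
void pack_descriptor(uint8_t out[8], uint32_t base, uint32_t limit,
                     uint8_t access, uint8_t flags)
{
    out[0] = (uint8_t)(limit & 0xFF);            /* limit bits 0-7   */
    out[1] = (uint8_t)((limit >> 8) & 0xFF);     /* limit bits 8-15  */
    out[2] = (uint8_t)(base & 0xFF);             /* base bits 0-7    */
    out[3] = (uint8_t)((base >> 8) & 0xFF);      /* base bits 8-15   */
    out[4] = (uint8_t)((base >> 16) & 0xFF);     /* base bits 16-23  */
    out[5] = access;                             /* access byte      */
    out[6] = (uint8_t)(((flags & 0x0F) << 4)     /* flags nibble     */
                       | ((limit >> 16) & 0x0F)); /* limit bits 16-19 */
    out[7] = (uint8_t)(base >> 24);              /* base bits 24-31  */
}
```

Packing base 0, limit FFFFFH, access byte 9AH and flags CH reproduces the bytes of gdtcode above (FF FF 00 00 00 9A CF 00). The byte CFH is therefore the flag nibble CH combined with the top four limit bits FH.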

The next thing is to set up the interrupt descriptor table. It is as simple as the global descriptor table, but the descriptor layout is different :

idt:
%rep 32
        dw handle ; handler offset , bits 0-15
        dw CODE ; selector of the code segment that contains the handler
        db 0 ; reserved , must be 0
        db 0x8E ; flags : present , Ring 0 , 32 bit interrupt gate
        dw 0 ; handler offset , bits 16-31
%endrep
idt_end: ; marks the end of the idt , used when computing its limit

The entry is repeated 32 times (here with NASM's %rep) to cover the 32 interrupt vectors reserved by the processor, which we should set up.
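
The layout of one interrupt descriptor can be modelled in C in the same spirit. The function pack_interrupt_gate and its parameter names are assumptions made for this sketch only.

```c
#include <stdint.h>

/* Pack one 8 byte interrupt descriptor (gate).
   The handler offset is split in two halves around the selector,
   the reserved byte and the flags byte. */
void pack_interrupt_gate(uint8_t out[8], uint32_t handler_offset,
                         uint16_t selector, uint8_t flags)
{
    out[0] = (uint8_t)(handler_offset & 0xFF);          /* offset bits 0-7   */
    out[1] = (uint8_t)((handler_offset >> 8) & 0xFF);   /* offset bits 8-15  */
    out[2] = (uint8_t)(selector & 0xFF);                /* code selector     */
    out[3] = (uint8_t)(selector >> 8);
    out[4] = 0;                                         /* reserved          */
    out[5] = flags;                                     /* e.g. 0x8E         */
    out[6] = (uint8_t)((handler_offset >> 16) & 0xFF);  /* offset bits 16-23 */
    out[7] = (uint8_t)(handler_offset >> 24);           /* offset bits 24-31 */
}
```

In the listing above the upper half of the offset is simply 0, which works as long as the handler lies within the first 64k of the code segment.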

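Now that we know what our descriptors look like, let's fill them in.

        xor ebx,ebx ; clear ebx
        mov bx,cs ; bx = real mode code segment value
        shl ebx,4 ; ebx = segment base address (cs * 16)
        mov eax,ebx ; work on a copy , ebx is needed again later
        mov [gdtcode + 2],ax ; base bits 0-15
        mov [gdtdata + 2],ax
        shr eax,16 ; move base bits 16-31 down into ax
        mov [gdtcode + 4],al ; base bits 16-23
        mov [gdtdata + 4],al
        mov [gdtcode + 7],ah ; base bits 24-31
        mov [gdtdata + 7],ah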

In this section our first step is to clear the ebx register with an XOR operation, which is a compact and fast alternative to a MOV of zero. Next we copy the value of the code segment into bx and shift it left by 4 bits (one hex digit), which gives us the segment base address (i.e. if the value of the segment was 1234 it is now 12340). We copy the result to eax, then take the value in the ax register and save it as bits 0-15 of the base address in the gdtcode and gdtdata descriptors of the global descriptor table. In the next step we shift eax right by 16 bits, which equals 4 hex digits, and save the rest of the base address in the proper places in the descriptor, at byte positions 4 and 7.

Example : assume cs is 1234. The shift makes it 12340, but ax holds only 2340 (16 bits) and the 1 is in the upper half of the eax register. So we save 2340 in the base address field for bits 0-15. Then we shift the register right by 16 bits (4 hex digits), which makes ax look like 0001, because the 1 has now moved from the upper half down into the lower half. That makes al = 01 HEX and ah = 00 HEX. So we store 01 HEX in bits 16-23 of the base address, which is 8 bits wide, and then store ah, which contains 00 HEX, in bits 24-31.

Thus we have the base address of the segment stored in the descriptor as a 32 bit base address with a value of ((00012340)) ..
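
The same walk-through can be replayed in C to check the numbers. The type base_parts and the function split_base are hypothetical names for this sketch; they mirror what the ax, al and ah registers hold at each step.

```c
#include <stdint.h>

/* The three pieces of the base address as stored in a descriptor. */
typedef struct {
    uint16_t low;   /* bytes 2-3 of the descriptor : base bits 0-15  */
    uint8_t  mid;   /* byte 4 of the descriptor    : base bits 16-23 */
    uint8_t  high;  /* byte 7 of the descriptor    : base bits 24-31 */
} base_parts;

/* Compute the segment base from a real mode cs value and split it. */
base_parts split_base(uint16_t cs)
{
    uint32_t base = (uint32_t)cs << 4;          /* shl ebx,4  */
    base_parts p;
    p.low  = (uint16_t)(base & 0xFFFF);         /* ax         */
    p.mid  = (uint8_t)((base >> 16) & 0xFF);    /* al after shr eax,16 */
    p.high = (uint8_t)(base >> 24);             /* ah after shr eax,16 */
    return p;
}
```

For cs = 1234 this gives low = 2340, mid = 01 and high = 00, matching the example.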

After filling in the global descriptor table it is time to point the processor to it. As we said before, in our case we use the GDTR and IDTR for that.

The GDTR and IDTR cannot be written like ordinary registers. They are loaded from a 6 byte structure in memory, so in NASM we define those structures as follows

gdtr:   dw gdt_end - gdt - 1 ; define limit
        dd gdt ; linear/physical address of GDT , corrected at run time
idtr:   dw idt_end - idt - 1 ; define limit
        dd idt ; linear/physical address of IDT , corrected at run time

What we are saying here is that gdtr and idtr contain the limit of the gdt and idt, which is the end of the table minus its start minus 1

We also store the 32 bit linear/physical address of each table, which makes each structure 48 bits wide

(( dw = 16 bit and dd = 32 bit ))

Now to really set these values we do the following :

        add ebx,gdt ; ebx holds the 32 bit segment base address , adding the offset of gdt gives its linear address
        mov [gdtr + 2],ebx
        add ebx,idt - gdt ; advance from the gdt to the idt
        mov [idtr + 2],ebx

This is a very simple step in which we obtain the physical address by adding the base address of the segment to the offset

((In our case ebx contains 00012340, to which the offset of gdt is added, and the result is stored back in ebx.

We then save that value in the linear/physical address part of the gdtr. The same applies to the idt, except that we add the difference between the offsets of idt and gdt (idt - gdt). Since ebx already holds the address of the gdt, this gives the actual address of the idt, which is saved in the linear/physical address part of the idtr.))
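
As a quick check of this arithmetic, the two additions can be written out in C. The function names and the sample offsets used in the note below are made up for this sketch; the real offsets of gdt and idt depend on where the assembler places them.

```c
#include <stdint.h>

/* Linear address of something at the given offset inside segment cs. */
uint32_t linear_address(uint16_t cs, uint32_t offset)
{
    return ((uint32_t)cs << 4) + offset;
}

/* Address of the idt, derived from the already known gdt address
   by adding the distance between the two tables (idt - gdt). */
uint32_t idt_linear_from_gdt(uint32_t gdt_linear,
                             uint32_t gdt_offset, uint32_t idt_offset)
{
    return gdt_linear + (idt_offset - gdt_offset);
}
```

Assuming cs = 1234, a gdt at offset 0100 and an idt at offset 0118, the gdt is at 12440 and the idt at 12458, which is the same result as adding the idt offset to the segment base directly.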

An important step at this point is to disable interrupts before we load the gdt and idt, so we issue the following

        cli ; clear the interrupt flag (disable maskable interrupts)

Next we load the gdt and idt through the gdtr and idtr structures, because they contain the pointers to the tables :

        lgdt [gdtr] ; load global descriptor table register
        lidt [idtr] ; load interrupt descriptor table register

Of the 4 control registers mentioned earlier (CR0-CR3), the one we care about first is CR0. Its low 16 bits are known as the MSW (machine status word), and the 5 lowest bits of CR0 are

ET (Extension Type, bit 4): indicates which coprocessor is used , 1 is for an 80387 and 0 for an 80287

TS (Task Switched, bit 3): automatically gets set when switching from one task to another

EM (Emulate, bit 2): Set to 1 to indicate that a software emulator does the numerical coprocessing instead of the hardware

MP (Math Present, bit 1): Set to 1 to indicate that a numerical coprocessor is present in the microcomputer. Normally only one of MP and EM is set.

PE (Protection Enable, bit 0):

This is the most important bit. It tells whether the system is in real mode or in protected mode. When the machine starts PE=0, and to enter protected mode we set PE=1. We can go back to real mode by setting PE=0 again.

*Outside this 5 bit range there is a bit called PG (bit 31) which enables paging (PG=1)

Now comes the most important part, which is enabling the protected mode bit in the control register CR0. This is a very easy step that is done in 3 lines :

        mov eax,cr0 ; copy the content of cr0 to eax to be modified
        or al,1 ; set bit 0 (PE) by performing an OR operation
        mov cr0,eax ; put the content back to cr0
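
The effect of the OR on the CR0 bits can be illustrated in C. The macro and function names are chosen for this sketch, and the functions work on a plain 32 bit copy of the register value; they do not touch the real CR0, which can only be accessed with the mov instructions shown above.

```c
#include <stdint.h>

/* Bit masks for the CR0 bits discussed in this guide. */
#define CR0_PE 0x00000001u  /* bit 0  : protection enable */
#define CR0_MP 0x00000002u  /* bit 1  : math present      */
#define CR0_EM 0x00000004u  /* bit 2  : emulate           */
#define CR0_TS 0x00000008u  /* bit 3  : task switched     */
#define CR0_ET 0x00000010u  /* bit 4  : extension type    */
#define CR0_PG 0x80000000u  /* bit 31 : paging            */

/* Same effect as "or al,1" on a copy of cr0: only PE changes. */
uint32_t cr0_with_protected_mode(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* Non-zero if the given cr0 value has PE set. */
int cr0_is_protected_mode(uint32_t cr0)
{
    return (cr0 & CR0_PE) != 0;
}
```

Using OR instead of writing a fixed value matters because it leaves the other bits, such as ET, exactly as they were.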

After that we must perform a far jump. Its purpose, to be precise, is to clear the prefetch queue of instructions that were fetched in real mode, and to load cs with our protected mode code selector. This is done as follows

        jmp CODE:promode ; CODE is the selector of our code segment and promode is the offset/label to go to

Congratulations! We are now in protected mode and working in a 32 bit environment. All that is left is to initialize the data segment, the stack segment, and the es, fs and gs registers with the DATA selector we defined. ( remember that segment register values are called selectors in protected mode )

[BITS 32] ; tells NASM to generate 32 bit code from here on

promode: mov ax,DATA ; ax = selector of our data descriptor
        mov ds,ax
        mov ss,ax
        mov es,ax
        mov fs,ax
        mov gs,ax

With this step we have finished setting up a complete 32 bit protected mode environment. Congratulations !

Conclusion :

I hope I have cleared up some of the most important points about protected mode in this document. I have demonstrated how to program your way into protected mode by setting up the required tables, and which bits to enable in order to be in protected mode.

Finally, I would be glad to hear any comments about this document. It is meant as a beginner's tutorial on protected mode.

Reference : Intel Programming Reference

Christopher Geise “Protected Mode”

TiTan OS Notes :

This File Depends On:

None

This File Is Used By:

Kernel Initialization Structure

Any questions and comments about this document are appreciated. You can send them to my mail , amgadmadkour@hotmail.com