[Info-vax] x86S Specification
Dan Cross
cross at spitfire.i.gajendra.net
Sun Nov 3 20:08:45 EST 2024
In article <672813c5$0$719$14726298 at news.sunsite.dk>,
Arne Vajhøj <arne at vajhoej.dk> wrote:
>On 11/3/2024 11:38 AM, Camiel Vanderhoeven wrote:
>> Arne Vajhøj wrote:
>>> On 11/3/2024 9:06 AM, Camiel Vanderhoeven wrote:
>>>> Arne Vajhøj wrote:
>>>>> x86-64 in long mode only support 2 modes in PTE's, so
>>>>> VMS x86-64 is a hardware 2 mode OS 4 mode OS - U in ring 3,
>>>>> S, E and K in ring 0.
>>>>
>>>> Not exactly.
>>>>
>>>> Ring 3 is used for Exec, Super, and User
>>>>
>>>> Ring 0 is used for kernel and for transitions between modes (SWIS)
>>>>
>>>> Running Exec and Super in ring 0 would blow away the separation
>>>> (which, I might add, is there more for stability than for security,
>>>> before I unintentionally re-start that debate)
>>>
>>> You are more afraid that DCL or RMS would step on VMS than
>>> applications would step on DCL or RMS?
>>
>> No, certainly not. That is why we have a separate set of page tables for
>> each mode. For instance, a page that has kernel write / exec read
>> protections is represented by the following PTEs in these 4 sets of page
>> tables:
>>
>> kernel mode: S(upervisor) W(riteable)
>> exec mode: U(ser) R(eadable)
>> super mode: not present
>> user mode: not present
>
>The more I think about the more fascinating it sounds.
>
>So if I write a C program with:
>
>char __align(13) buf[8192];
>
>and the C code call SYS$SETPRT with PRT$C_UREW on that, then
>it works like.
>
>logical/application level:
>
>1 page of 8 KB with:
> logK : write
> logE : write
> logS : read
> logU : read
>
>physical/hardware level:
>
>2 pages of 4 KB each in four different page tables:
>
>logK => page table with: physK : write, physU : ? (should not matter)
>logE => page table with: physK : write, physU : write
>logS => page table with: physK : write, physU : read
>logU => page table with: physK : write, physU : read

It's pretty obvious that VMS has to use multiple page tables to
emulate multiple protection modes on hardware that doesn't
provide them. There's no other reasonable architecture.
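
To make that concrete, here is a rough sketch in C of how the flags
for one page could be derived for each of the four per-mode page
tables, following Camiel's example above (kernel in ring 0, the outer
three modes in ring 3). All of the names (`struct prot`,
`pte_flags_for_mode`, and so on) are made up for illustration and are
not taken from the VMS sources; only the PTE bit positions are
architectural.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; not taken from the VMS sources. */
enum mode { MODE_K, MODE_E, MODE_S, MODE_U, MODE_COUNT };
enum access { ACC_NONE, ACC_READ, ACC_WRITE };

/* x86-64 PTE bits used here (architectural positions). */
#define PTE_P  (UINT64_C(1) << 0)  /* present */
#define PTE_RW (UINT64_C(1) << 1)  /* writable */
#define PTE_US (UINT64_C(1) << 2)  /* accessible from ring 3 */

/* Logical protection: the access granted to each of the four modes. */
struct prot { enum access acc[MODE_COUNT]; };

/*
 * Flags for the PTE in the page table that is active while running in
 * mode `m`. Kernel runs in ring 0, so its mapping is a supervisor page;
 * the three outer modes run in ring 3, so their mappings need U/S set.
 */
static uint64_t pte_flags_for_mode(const struct prot *p, enum mode m)
{
    assert((int)m >= 0 && (int)m < MODE_COUNT);

    if (p->acc[m] == ACC_NONE)
        return 0;                   /* not present in this mode's table */

    uint64_t flags = PTE_P;
    if (p->acc[m] == ACC_WRITE)
        flags |= PTE_RW;
    if (m != MODE_K)
        flags |= PTE_US;
    return flags;
}
```

For the "kernel write / exec read" page this gives P|RW (supervisor,
writable) in the kernel table, P|US (user, read-only) in the exec
table, and a not-present entry in the super and user tables.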

On x86 in long mode, specifically, page table entries have bits
for readability (the "P", for "Present", bit implies that the
page is readable, unless memory protection keys are used, in
which case a page can be made execute-only); writability (if
R/W is set, the page is writable, otherwise not); and
non-executability (if NX is set, the page is not executable,
otherwise it is).
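
As a small illustration of those three bits, the following C sketch
decodes and encodes them for a single leaf entry. It assumes that
EFER.NXE is enabled (so bit 63 is honored), that protection keys are
not in use, and it ignores the fact that the effective permissions are
combined across all levels of the paging hierarchy. The struct and
function names are mine, chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P  (UINT64_C(1) << 0)   /* present: implies readable */
#define PTE_RW (UINT64_C(1) << 1)   /* set: writable */
#define PTE_NX (UINT64_C(1) << 63)  /* set: NOT executable */

/* Hypothetical helper type for the example. */
struct page_perms { bool read, write, exec; };

/* Permissions implied by P, R/W and NX in one leaf entry. */
static struct page_perms decode_pte(uint64_t pte)
{
    struct page_perms pp = { false, false, false };

    if (!(pte & PTE_P))
        return pp;                      /* not present: no access */
    pp.read  = true;                    /* present implies readable */
    pp.write = (pte & PTE_RW) != 0;
    pp.exec  = (pte & PTE_NX) == 0;     /* NX is an inverted sense bit */
    return pp;
}

/* Inverse of decode_pte for a present page. */
static uint64_t encode_pte(struct page_perms pp)
{
    /* These three bits cannot express a present but unreadable page. */
    assert(pp.read);

    uint64_t pte = PTE_P;
    if (pp.write)
        pte |= PTE_RW;
    if (!pp.exec)
        pte |= PTE_NX;
    return pte;
}
```

Note that a present page with neither R/W nor NX set comes out as
readable and executable, which is why NX has to be set explicitly on
data pages.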

Separately, there is a bit for whether the page is accessible
from userspace or not (the U/S bit): if set, the page can be
accessed from ring 3, in accordance with the other permission
bits, otherwise not. By default, page-level write permission
bits are ignored for supervisor mode stores (that is, stores
from any ring other than ring 3) unless the `WP` bit in
control register CR0 is set; if CR0.WP is set and the page is
not marked writable, then the kernel can't write to it, unless
the same page is also mapped with suitable permissions at
some other address.
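
The store rule just described can be written down as a short C
predicate. This is a deliberately simplified model, not a full
rendering of the architecture: it looks at a single leaf entry and
ignores SMAP, protection keys, and anything else controlled by CR4 or
MSRs. The function name `store_allowed` is invented for the example;
CR0.WP really is bit 16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P  (UINT64_C(1) << 0)   /* present */
#define PTE_RW (UINT64_C(1) << 1)   /* writable */
#define PTE_US (UINT64_C(1) << 2)   /* accessible from ring 3 */
#define CR0_WP (UINT64_C(1) << 16)  /* write protect for supervisor stores */

/*
 * Simplified model: may code running at privilege level `cpl` store to
 * a page mapped by `pte`, given the current value of CR0?
 */
static bool store_allowed(uint64_t pte, unsigned cpl, uint64_t cr0)
{
    assert(cpl <= 3);

    if (!(pte & PTE_P))
        return false;                   /* not mapped at all */

    if (cpl == 3)                       /* user-mode store */
        return (pte & PTE_US) != 0 && (pte & PTE_RW) != 0;

    /* Supervisor-mode store: R/W only matters when CR0.WP is set. */
    if (cr0 & CR0_WP)
        return (pte & PTE_RW) != 0;
    return true;
}
```

With CR0.WP clear, `store_allowed(PTE_P, 0, 0)` is true even though
the page is marked read-only; with CR0.WP set, the same store is
refused.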

A number of bits in CR4 and a handful of MSRs will also affect
behavior around page permission enforcement.

- Dan C.