[Info-vax] Intel junk...Kernel-memory-leaking Intel processor design flaw forces Linux, Windows redesign
johnwallace4 at yahoo.co.uk
Sat Jan 6 14:45:48 EST 2018
On Saturday, 6 January 2018 16:30:46 UTC, Johnny Billquist wrote:
> On 2018-01-06 17:11, Alan Browne wrote:
> > On 2018-01-06 11:06, Jan-Erik Soderholm wrote:
> >> Den 2018-01-06 kl. 16:27, skrev Alan Browne:
> >>> On 2018-01-05 16:00, DaveFroble wrote:
> >>>> Alan Browne wrote:
> >>>>> On 2018-01-05 09:15, DaveFroble wrote:
> >>>>>> Jan-Erik Soderholm wrote:
> >>>>>
> >>>>>>> Because the designers, for performance reasons, have mapped kernel
> >>>>>>> memory into the user process address space and rely on the OS to
> >>>>>>> check protection before any kernel memory (or code) is accessed.
> >>>>>>>
> >>>>>>> The problem with the current issues is that the hardware (the CPU)
> >>>>>>> does these accesses "under the hood", without control by the OS.
> >>>>>>>
> >>>>>>> If you map your kernel memory in another way that uses the hardware
> >>>>>>> protection facilities, you are (as I understand it) safe, at the cost
> >>>>>>> of worse performance when switching between user and kernel mode.
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> As I wrote, someone dropped the ball on this one.
> >>>>>>
> >>>>>> Speculative execution is part of the HW, not software. It appears
> >>>>>> the HW doesn't follow its own rules. Or, perhaps I don't
> >>>>>> actually understand the problem?
> >>>>>
> >>>>> At least as well as I do. These are very complex mechanisms and
> >>>>> complexity is usually where you're most likely to get problems.
> >>>>>
> >>>>> In this case the h/w implementation didn't reflect the design goal.
> >>>>>
> >>>>> This means Intel had very poor design review and abysmal testing of
> >>>>> security features.
> >>>>>
> >>>>
> >>>> There seem to be a whole bunch of us "speculating" about things we
> >>>> probably don't know enough about.
> >>>
> >>> I am very certain that they either did not design the testing
> >>> correctly or didn't test per the test plan correctly. Or, a worse
> >>> scenario: they saw it and swept it under the carpet.
> >>>
> >>>>
> >>>> :-)
> >>>>
> >>>> It seems to me that before memory is fetched into cache, the CPU
> >>>> should be determining whether it should indeed be fetching that
> >>>> memory. Yeah,
> >>>
> >>> The CPU memory controller is (usually) the arbiter of whether a fetch
> >>> is "legal" in the privilege scheme - so if something is allowed to be
> >>> fetched, then it is fetched. So (hierarchically) the fetch goes to
> >>> the decoding pipeline(s) -and- is simultaneously copied to the cache.
> >>> At that point the MC has "allowed" the fetch. Writes to memory are
> >>> also written to cache. The issue seems to be that post-fetch from
> >>> kernel-assigned memory, the cache makes some privileged data
> >>> available to lower-privilege tasks after the context switch. That is
> >>> the gist.
> >>>
> >>
> >> As I understand it, the CPU fetched privileged data "under the hood" even
> >> before the processor had decided that it was privileged data. By the time
> >> the user process gets a "slap on the hand", the tracks are already in the
> >> cache.
> >>
> >> There was never any "context switch" involved at all; it happened way below
> >> such constructs. Everything was done from user level, "under the radar"
> >> from the point of view of any OS (or of the protection facilities in the
> >> CPU itself, it also seems).
> >
> > I didn't mean OS level CS but privilege level switching. Sorry for the
> > ambiguity.
>
> Nothing like that either. You just speculatively read data that is in
> YOUR address space, but protected from user access. The speculative
> execution fetches the data, even though you are not allowed to read it.
> No CS, not even any privilege level change, happened.
>
> You just make a speculative execution of an instruction that would fail
> because you do not have the right to access the memory, but since it is
> speculative, the trap never happens, as it later turns out that the
> speculation was wrong. However, the data was still read, since the
> speculative execution itself actually bypasses the protection. The
> protection trap would only hit you if it were decided that the
> instruction actually would execute.
> But unfortunately, the cache will still hold the read data that you were
> not supposed to see.
> And then they figured out a clever way of mining the contents of the cache.
>
> One could argue that the cache should be invalidated in such a scenario,
> but that is not happening either.
>
> Johnny
>
> --
> Johnny Billquist || "I'm on a bus
> || on a psychedelic trip
> email: bqt at softjar.se || Reading murder books
> pdp is alive! || tryin' to stay hip" - B. Idol
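
To make Johnny's description a bit more concrete before
going any further, here is a minimal C sketch of the access
pattern involved. It is illustrative only, not a working
exploit: the names kernel_ptr and probe_array are invented,
and a real proof of concept also has to survive or suppress
the fault (signal handler, TSX and so on) and flush
probe_array from the cache beforehand.

/* Illustrative sketch only, not a working exploit: the shape of the
 * transient access described above. kernel_ptr and probe_array are
 * invented names.
 */
unsigned char probe_array[256 * 4096];  /* one page per possible byte value */

void transient_access(const volatile unsigned char *kernel_ptr)
{
    /* Architecturally this load faults (no read permission), but it
     * may already have been executed speculatively by the time the
     * fault is actually taken.
     */
    unsigned char secret = *kernel_ptr;

    /* The speculatively obtained value selects which page of
     * probe_array gets touched, so one cache line becomes "hot" as a
     * side effect, and that side effect survives even after the
     * faulting instruction and its result are squashed.
     */
    (void)probe_array[secret * 4096];
}
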
One could argue, and smart people have argued, that
side effects of speculative instructions should not
become visible until the instruction in question is
confirmed as one which will actually be executed.
(NB: in cases like this, an instruction which is
initially executed is not always an instruction which
completes, e.g. it may be interrupted because of an
exception or whatever, and therein lies another world
of fun where great engineering care is always needed
but is not always applied.)

A speculative fetch from main memory directly into a
(shadowed) CPU register, a reference which bypasses
cache because it has been told to (e.g. because the
reference in question is to a region declared
non-cacheable), can in general be discarded when the
instruction is discarded because the speculation
turned out to be wrong. The affected shadowed
register never becomes visible, and on a good day
there won't be any visible side effects.

A speculative fetch from main memory into a register,
one which does go into cache, has side effects which
cannot always be discarded when the speculation or
prediction turns out to be wrong. But hey, the
unnecessary and inappropriate prefetching means it's
faster in the performance tests, innit. What could
possibly go wrong; nobody will ever notice even if
anything does go wrong. Well, apparently people not
only noticed, they found a way of making the
manufacturers and the media notice.
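
The "clever way of mining the contents of the cache" that
Johnny mentions is, as I understand it, essentially a timing
probe along the lines of the sketch below. Again a sketch
only: the rdtscp intrinsic is x86-specific, and a real probe
uses a calibrated cache-hit threshold rather than simply
picking the fastest access.

/* Sketch of the "mining" step: after the transient access, time a
 * load from each candidate page of probe_array; the one that comes
 * back fast was cached, and its index is the leaked byte value.
 */
#include <stdint.h>
#include <x86intrin.h>          /* __rdtscp() */

extern unsigned char probe_array[256 * 4096];

int recover_byte(void)
{
    uint64_t best_time = UINT64_MAX;
    int best = -1;
    unsigned int aux;

    for (int i = 0; i < 256; i++) {
        volatile unsigned char *p = &probe_array[i * 4096];
        uint64_t t0 = __rdtscp(&aux);
        (void)*p;                       /* timed load */
        uint64_t dt = __rdtscp(&aux) - t0;
        if (dt < best_time) {           /* fastest access: was in cache */
            best_time = dt;
            best = i;
        }
    }
    return best;                        /* recovered byte value */
}
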
Going back a decade or four, some readers may remember
devices whose state changed when the device was read,
e.g. some Control and Status Registers would clear a
"data available" bit if a particular device register
was read.

That change of device state was a "side effect" which
would have been a Bad Thing if a speculative read had
got as far as a real-world device. But most "modern"
computers don't have that particular challenge with
CSRs, in part because memory has different (predictable)
behaviour; whatever you write into memory, you get
the same back. Otherwise it's not memory. OK there
might sometimes be side effects, and misunderstandings,
and that's where all this fun starts.
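
For anyone who never met one of those devices, here is a toy
C illustration of the idea; the MMIO address and the bit
layout are entirely invented.

/* Toy illustration of a read-to-clear CSR of the kind described
 * above. The MMIO address and the bit layout are invented.
 */
#include <stdint.h>

#define DEV_CSR     ((volatile uint32_t *)0xFFFE0000u)  /* hypothetical device CSR */
#define CSR_DAVAIL  (1u << 7)                           /* hypothetical "data available" bit */

int data_available(void)
{
    /* The act of reading the CSR clears CSR_DAVAIL in the device, so
     * the read itself changes device state. A speculative read that
     * reached the device and was then thrown away would still have
     * cleared the bit, which is why such regions are marked
     * non-cacheable and must never be touched speculatively.
     */
    uint32_t csr = *DEV_CSR;
    return (csr & CSR_DAVAIL) != 0;
}
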
There's more to it than that, such as the difference
between a genuine RISC architecture (where load and
store are the usual ways of addressing memory) and
a non-RISC architecture (VAX, legacy x86) where an
incoming instruction may have multiple references to
memory locations within one particular instruction.
Other factors to consider include the different
possible ways of architecting and implementing on-chip
caches.
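
As a simplified illustration of that difference, consider
what a single C statement can turn into; the instruction
sequences in the comments are illustrative rather than the
output of any particular compiler.

/* One C statement, two very different shapes of machine code
 * (illustrative sequences, not real compiler output):
 *
 *   Non-RISC (x86-ish): one instruction may both read and write memory
 *     mov  eax, [rbx]      ; load *b
 *     add  [rcx], eax      ; read-modify-write *a in a single instruction
 *
 *   RISC (load/store): memory is only touched by explicit loads/stores
 *     ldr  r0, [r1]        ; load *b
 *     ldr  r2, [r3]        ; load *a
 *     add  r2, r2, r0
 *     str  r2, [r3]        ; store *a
 */
void add_in_place(int *a, const int *b)
{
    *a += *b;
}
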
Is modern x86 a RISC machine? For spin purposes it is
both RISC and non-RISC. Software stays x86-compatible,
performance is largely RISC-class, and who cares if
checks that are possible to do properly on RISC can't
realistically be done on modern x86.

Corrections and clarifications very welcome.

I quite like the Computerphile video referenced earlier
in this here thread. A bit less hype, a bit more detail,
than much of the coverage around at the moment. Respect
is due.

Inevitably, when one chip builder makes almost all the
chips in the volume market, that chip builder's design
problem(s) will be liable to affect almost all the
products in their volume market.