[Info-vax] OpenVMS async I/O, fast vs. slow

Johnny Billquist bqt at softjar.se
Sat Nov 11 17:37:32 EST 2023


On 2023-11-11 16:06, Dan Cross wrote:
> In article <uij9oc$st3$1 at news.misty.com>,
> Johnny Billquist  <bqt at softjar.se> wrote:
>> On 2023-11-09 17:50, Dan Cross wrote:
>>> In article <uig3nn$2ke$2 at news.misty.com>,
>>> Johnny Billquist  <bqt at softjar.se> wrote:
>>>> On 2023-11-08 03:00, Dan Cross wrote:
>>>>> [snip]
>>>>> Yes.  See below.
>>>>
>>>> :-)
>>>>
>>>> And yes, I know how the cache coherency protocols work. Another thing
>>>> that was covered already when I was studying at University.
>>>
>>> Cool.  MESI wasn't presented until 1984, so you must have seen
>>> it pretty early on.
>>
>> I think I was looking at it in 1995 (what made you think I would have
>> looked at it close to 1984? How old do you think I am??? :) ).
> 
> *cough* Uh, ahem...sorry.... *cough*

Well, it's kind of funny. :-)

>> My
>> professor was specifically focused on CPU caches. He was previously the
>> chief architect for high-end server division at SUN (Erik Hagersten if
>> you want to look him up).
>> I didn't even start at University until 1989.
> 
> Gosh, I think like some other folks I just ASSumed you were
> older given your PDP-11 interests.  Apologies!

Admittedly I'm not young anymore either. I started playing around on 
PDP-11s when they were still kind of widespread.

>>>> The owning CPU then tries to
>>>> release the lock, at which time it also access the data, writes it, at
>>>> which point the other CPUs will still have it in shared, and the owning
>>>> CPU gets it as owned.
>>>
>>> Hey now?  Seems like most spinlock releases on x86 are simply
>>> going to be a memory barrier followed by a store.  That'll push
>>> the line in the local into MODIFIED and invalidate all other
>>> caches.
>>
>> Why would it invalidate other caches?
> 
> Because that's what the protocol says it must do?  :-)

If you want to move it to MODIFIED, then yes, you need to invalidate all 
other CPU caches. But there is no good reason to do that in general.
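
To make it concrete: the "release" in question is, in portable C, just a 
store with release ordering. A minimal sketch (my own illustration, 
nothing VMS- or compiler-specific):

#include <stdatomic.h>

typedef struct { atomic_int locked; } spinlock_t;

static void spin_lock(spinlock_t *l)
{
    /* Spin until the exchange observes the lock free (0). */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        ;
}

static void spin_unlock(spinlock_t *l)
{
    /* The release under discussion: a single store with release
     * ordering.  On x86 this typically compiles to a plain store.
     * Whether that store invalidates the other copies of the line
     * or updates them is exactly the protocol question here. */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}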

To quote the Wikipedia article:

"Owned
This cache is one of several with a valid copy of the cache line, but 
has the exclusive right to make changes to it—other caches may read but 
not write the cache line. When this cache changes data on the cache 
line, it must broadcast those changes to all other caches sharing the 
line. The introduction of the Owned state allows dirty sharing of data, 
i.e., a modified cache block can be moved around various caches without 
updating main memory. The cache line may be changed to the Modified 
state after invalidating all shared copies, or changed to the Shared 
state by writing the modifications back to main memory. Owned cache 
lines must respond to a snoop request with data."

Note the part that says "when this cache changes data on the cache line, 
it must broadcast those changes to all other caches sharing the line".

So basically - no. If you have a cache line in OWNED state and you change 
its contents, you already have the data modified, but you own it, and any 
changes get broadcast to everyone else who also has the data in their 
cache. All the others have the cache line in SHARED state, so they cannot 
modify it.

But eventually you need to flush it, at which point it goes back to main 
memory, and your copy is invalidated. Others still have it shared. Of 
course, someone else might want to try and write, at which point the 
ownership needs to move over there. More handshaking...

Basically, OWNED is there to allow sharing of a dirty cache line without 
it being written back to main memory. You're the one who even reminded me 
about it. :D
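
If it helps, here is a toy model of the owner-side write path, following 
only the Wikipedia paragraph quoted above. The bus_* hooks are 
placeholders I made up for illustration, not any real controller 
interface:

#include <string.h>

enum line_state { INVALID, SHARED, EXCLUSIVE, OWNED, MODIFIED };

struct line {
    enum line_state state;
    unsigned char   data[64];
};

/* Placeholder bus operations, purely for illustration. */
static void bus_broadcast_update(const struct line *l)   { (void)l; }
static void bus_invalidate_sharers(const struct line *l) { (void)l; }

static void owned_line_write(struct line *l, const unsigned char *newdata,
                             int keep_dirty_sharing)
{
    memcpy(l->data, newdata, sizeof l->data);

    if (keep_dirty_sharing) {
        /* Dirty sharing: broadcast the change to the caches holding the
         * line in SHARED; this cache stays OWNED, memory stays stale. */
        bus_broadcast_update(l);
    } else {
        /* The other option from the quoted text: invalidate all shared
         * copies and become the sole dirty copy, i.e. go to MODIFIED. */
        bus_invalidate_sharers(l);
        l->state = MODIFIED;
    }
}

The disagreement is really about which branch a given implementation 
takes on an ordinary write hit.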

>> It could just push the change into
>> their caches as well. At which point the local CPU would have "owned"
>> and others "shared".
> 
> Well, no....  Even in MOESI, a write hit puts an OWNED cache
> line into MODIFIED state, and a probe write hit puts the line
> into INVALID state from any state.  A probe read hit can move a
> line from MODIFIED into OWNED (presumably that's when it pushes
> its modified contents to other caches).

No. OWNED means you already modified it. You can modify it additional 
times if you want to; it doesn't change the cache state.

But anyone else who has that same line in SHARED needs to either be 
invalidated or updated. And as noted above, the normal thing is to update.

You don't get into the OWNED state until you actually do modify the cache 
line. It's not that modifying the cache moves you out of OWNED. It moves 
you INTO OWNED.

>>>>>>>> (And then we had the PDP-11/74, which had to implement cache coherency
>>>>>>>> between CPUs without any hardware support...)
>>>>>>>
>>>>>>> Sounds rough. :-/
>>>>>>
>>>>>> It is. CPU was modified to actually always step around the cache for one
>>>>>> instruction (ASRB - used for spin locks), and then you manually turn on
>>>>>> and off cache bypass on a per-page basis, or in general of the CPU,
>>>>>> depending on what is being done, in order to not get into issues of
>>>>>> cache inconsistency.
>>>>>
>>>>> This implies that stores were in a total order, then, and
>>>>> these uncached instructions were serializing with respect to
>>>>> other CPUs?
>>>>
>>>> The uncached instruction is basically there in order to be able to
>>>> implement a spin lock that works as you would expect. Once you have the
>>>> lock, then you either deal with data which is known to be shared, in
>>>> which case you need to run with cache disabled, or you are dealing with
>>>> data you know is not shared, in which case you can allow caching to work
>>>> as normal.
>>>>
>>>> No data access to shared resources are allowed to be done without
>>>> getting the lock first.
>>>
>>> Sure.  But suppose I use the uncached instructions to implement
>>> a lock around a shared data structure; I use the uncached instr
>>> to grab a lock, I modify (say) a counter in "normal" memory with
>>> a "normal" instruction and then I use the uncached instruction
>>> to release the lock.  But what about the counter value?  Is its
>>> new value --- the update to which was protected by the lock ---
>>> immediately visible to all other CPUs?
>>
>> Like I said - if you are dealing with shared data, even after you get
>> the lock, you then need to turn off the cache while working on it. So
>> basically, such updates would immediately hit main memory. And any reads
>> of that data would also be done with cache disabled, so you get the
>> actual data. So updates always immediately visible to all CPUs.
> 
> Oh, I understand now: I had missed that if you wanted the data
> protected by the lock to be immediately visible to the other
> CPUs you had to disable caching (which, I presume, would flush
> cache contents back to RAM).  Indeed, it's impressive that they
> were able to do that back then.

The PDP-11 never had write-back caching. Only write-through. So no need 
to flush as such.
But you want to bypass (as well as invalidate) what's in there, in case 
physical memory holds something else.
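
For anyone following along, the access pattern amounts to something like 
this, in C-flavoured pseudocode. The uncached test-and-set stands in for 
the ASRB trick, and cache_bypass_on()/cache_bypass_off() stand in for the 
per-page (or CPU-wide) cache control; none of these names are real PDP-11 
or RSX interfaces.

volatile int lock_word;       /* the spin lock word                  */
volatile int shared_counter;  /* shared data protected by the lock   */

/* Placeholder for the uncached ASRB-based primitive; here just a
 * non-atomic stand-in so the sketch compiles. */
static int uncached_test_and_set(volatile int *w)
{
    int old = *w;
    *w = 1;
    return old;
}

/* Placeholders for turning cache bypass on/off for the shared pages. */
static void cache_bypass_on(void)  { }
static void cache_bypass_off(void) { }

static void bump_shared_counter(void)
{
    /* The lock primitive goes around the cache, so all CPUs agree
     * on the value of the lock word. */
    while (uncached_test_and_set(&lock_word))
        ;

    /* Shared data is touched with the cache bypassed; with a
     * write-through cache, main memory always holds the latest write,
     * and bypassing the cache on reads avoids stale local copies. */
    cache_bypass_on();
    shared_counter++;
    cache_bypass_off();

    /* Release: plain store to the lock word (also uncached/bypassed). */
    lock_word = 0;
}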

   Johnny



