[Info-vax] OpenVMS I64 V8.1 "Evaluation Release"?

John Wallace johnwallace4 at yahoo.co.uk
Thu Mar 22 18:02:05 EDT 2012


On Mar 22, 8:51 pm, glen herrmannsfeldt <g... at ugcs.caltech.edu> wrote:
> Johnny Billquist <b... at softjar.se> wrote:
>
> (snip)
>
> >> Well, if you put it that way, IA32 has a 45 bit virtual address
> >> space, which should have been plenty big enough. That is, 16 bit
> >> segment selectors minus the local/global bit and ring bits,
> >> and 32 bit offsets.
> > I don't know exactly how the virtual addresses look on the IA32 so I
> > can't make more explicit comments. But if it actually forms a 45-bit
> > virtual address space, then sure. But it depends on how the virtual
> > address is calculated. Maybe someone can make a more accurate comment,
> > if we want to pursue that.
>
> IA32 still has the segment selector system that was added with
> the 80286. While there were many complaints about the size of 64K
> segments, that isn't so much of a problem at 4GB. A task can
> have up to 8192 segments, each up to 4GB.
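
For what it's worth (I'm recalling the protected-mode layout from
memory, so treat the exact bit assignments as an assumption): a 16-bit
selector has a 13-bit table index, one table-indicator bit and two
privilege bits, so per descriptor table that gives

    2^13 segments x 2^32 bytes/segment = 2^45 bytes (32TB)

of virtual space, which is where the 45-bit figure comes from.
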
>
> >> Also, many IA32 processors have a 36 bit physical address, again
> >> plenty big enough for most people even now.
> > Right. But the physical address space becomes a question for the OS
> > allocation and resource utilization. Good in its own way, but it
> > won't allow your program to use more memory space than what you
> > can address in your virtual address space.
>
> The 32 bit MMU that came after the above mentioned system and
> before the 36 bit address bus did complicate things, but with the
> appropriate OS support, it could have been done.
>
> (snip, I wrote)
>
> >> Having a large virtual address space is nice, but you can't
> >> practically run programs using (not just allocating, but actually
> >> referencing) 8, 16, or 32 times the physical address space.
> > You perhaps can't use all of it at the same time, for various reasons.
> > But you might definitely want to spread your usage out over a larger
> > address space than 32 bits allows.
>
> Maybe 2 or 3 times, but not 16 or 32. Note that disks haven't
> gotten faster nearly as fast as processors, especially latency.
>
> If you do something, such as matrix inversion, that makes many
> passes through a large array you quickly find that you can't
> do it if it is bigger than physical memory.

It's somewhat strange to say you *can't* do it. You *can* do it;
people have been doing it for years and sometimes still do. In a
demand-paged environment the application will run to completion
[assuming a sufficient pagefile etc], and it will get the same answers
as it would with lots more real RAM. The application may simply take
a while longer to run when the available physical memory is
significantly less than the needed physical memory.

Please note the words "available" and "needed".

This is just the way virtual memory systems work (when done right).

Almost no real application *needs* all of its virtual address space in
physical memory at once, or even in quick succession. Many will work
fine with a small portion in physical memory at any given time. That's
why demand paging is useful (and why swapping a whole process is less
useful).

In real-world systems the physical memory is shared among the
application of interest, what the OS needs, and the memory needed by
any other applications on the system at the time. There aren't many
virtual-memory applications that will fail because they don't have
enough physical memory available, which is fortunate really.
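
To make the point concrete, here's a minimal C sketch of my own (the
4GB figure and the page size are purely illustrative, and it assumes a
64-bit build; nothing here comes from the earlier posts). Given enough
pagefile/swap, it runs to completion and prints the same checksum
whether or not the whole buffer fits in RAM - only the elapsed time
changes:

#include <stdio.h>
#include <stdlib.h>

#define SIZE_BYTES ((size_t)4 << 30)   /* 4GB of virtual space - illustrative */
#define PAGE_BYTES ((size_t)4096)      /* assumed page size - illustrative */

int main(void)
{
    unsigned char *buf = malloc(SIZE_BYTES);
    unsigned long sum = 0;
    size_t off;

    if (buf == NULL) {
        perror("malloc");
        return 1;
    }

    /* Touch one byte per page; each touch materialises that page, and
     * pages not used recently can be paged out and back on demand. */
    for (off = 0; off < SIZE_BYTES; off += PAGE_BYTES)
        buf[off] = (unsigned char)(off / PAGE_BYTES);

    /* Second pass: same answer however much was resident at any instant. */
    for (off = 0; off < SIZE_BYTES; off += PAGE_BYTES)
        sum += buf[off];

    printf("checksum %lu\n", sum);
    free(buf);
    return 0;
}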


>
> >> The rule for many years, and maybe still not so far off, is that
> >> the swap space should be twice the physical memory size. (Also,
> >> that was when memory was allocated out of backing store. Most now
> >> don't require that.)
> > That has not been true for over 10 years on any system. It's actually a
> > remnant from when memory was managed in a different way in Unix, and
> > originally the rule was that you needed 3 times physical memory in swap.
>
> Ones I worked with, it was usually 2, but 3 probably also would have
> been fine.
>
> > The reason for the rule, if you want to know, was that way back,
> > physical memory was handled somewhat similar to cache, and swap was
> > regarded as "memory". So, when a program started, it was allocated room
> > in swap. If swap was full, the program could not run. And when running,
> > pages from swap were read into physical memory as needed. (And paged out
> > again if needed.)
>
> The first system that I remember this on was OS/2, I believe 1.2
> but maybe not until 2.0. If you ran a program from floppy, it required
> that the swap space exist, as you might take the floppy out.
>
> Well, using the executable file as backing store for itself is a
> slightly different question, but for many years they didn't even
> do that. Allocating in swap avoids the potential deadlock when the
> system finds no available page frames on the swap device, and needs
> to page something out. It made the OS simpler, at a cost in swap space.
> (And the ability to sell more swap storage.)
>
> > This should make it pretty obvious that you needed more swap than
> > physical memory, by some margin, or you could start observing effects
> > like a program not being able to run because there was no memory, but
> > you could at the same time see that there was plenty of free physical
> > memory. A very silly situation.
>
> When main memory was much smaller, that was much less likely to
> be true, but yes.
>
> > No system today works that way. You allocate memory, and it can be in
> > either swap, or physical memory. You do not *have* to have space
> > allocated in swap to be able to run. You don't even need to have any
> > swap at all today.
>
> Reminds me of wondering if any processors could run entirely off
> built-in cache, with no external memory at all.

If I remember rightly, some (all?) Alpha processors start up by
reading code from a serial ROM into the on-chip cache, and the system
setup code continues from there.

Whether you could do anything *useful* just from on-chip cache is a
different question. The PDP11 era of being able to do useful things in
32KW seems to have got lost somewhere.

>
> >> If you consider that there are other things (like the OS, other
> >> programs and disk buffers) using physical memory, you really
> >> won't want a single program to use more than 4GB virtual on
> >> a machine with 4GB real memory. Without virtual memory, you
> >> would probably be limited to about 3GB on a 4GB machine.
> > It's called paging, and every "modern" OS does it, all the time,
> > for all programs. Not a single program you are running today is
> > entirely in memory at the same time. Only parts of it are.
>
> As I noted above, it isn't hard to write a program, working with
> a large matrix, which does pretty much require it all in memory.
>
> With matrix operations, they tend to run sequentially through
> blocks of memory. I once wrote a very large finite-state
> automaton that pretty much went randomly through about 1GB of
> memory. No hope at all to swap.

But systems that swap rather than page also mostly got lost in the
PDP11 era. OK, there was a brief period in the life of some UNIXes when
they'd swap and not page, but that wasn't a long-term success - as was
noted earlier, in the wrong circumstances an application might well
fail to run even though the system appeared to have enough free memory
for useful work to be done. For this and other reasons, demand paging
is in general more useful than swapping, if you have the choice
[exceptions may well have applied in the past, but it's hard to see
where they'd apply nowadays].

>
> > So, even if you are running a program that is 4 GB in size,
> > it will not be using 4 GB of physical memory at any time.
>
> Maybe for the programs you write...

No program, no processor, ever uses a whole 4GB of physical memory at
one time. Not one. How could it? The data bus isn't wide enough for a
start :)

So it becomes a question of how much of that 4GB (or whatever) the
application needs, and at what access time. The access-time options
are typically a compromise between cost and size (faster is more
expensive per GB) and include on-chip cache (fastest, most expensive
per GB), local main memory, remote NUMA memory, a soft page fault
(satisfied from an in-memory cache), and a hard page fault (satisfied
from disk or network - biggest, slowest, cheapest per GB). They will
all work (in the right circumstances) and the application will get the
same results; only the timing will be different.
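
If you want to see the last two categories for a given process, most
Unix-ish systems will report them; here's a minimal C sketch using the
standard getrusage() call (the assumption being a POSIX-style
environment - on VMS you'd look at the equivalent accounting/MONITOR
figures instead):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        perror("getrusage");
        return 1;
    }

    /* ru_minflt: faults satisfied without I/O (the page was already
     * somewhere in memory); ru_majflt: faults that needed a read from
     * backing store (disk or network). */
    printf("soft (minor) page faults: %ld\n", ru.ru_minflt);
    printf("hard (major) page faults: %ld\n", ru.ru_majflt);
    return 0;
}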

In any real system with 4GB of physical memory, no real application
ever gets to use the whole 4GB at once. The OS wants some, for non-
pageable OS code and data, for a start.

Some applications may well have poor "locality of access". But once
again that affects performance, not whether the program will run at
all. "Locality of access" effects apply at various levels - a program
needs good locality on a fine scale to make best use of on-chip
caches, and good locality on a coarse scale to avoid excessive paging.

Application code such as matrix arithmetic can be written either from
first principles (which may lead to poor locality of access) or with a
bit more care to use techniques such as "tiling" to get better
locality of access (which may affect on-chip cache behaviour, or
paging behaviour, or both).
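
Purely as an illustration (N and BLOCK are numbers I've picked, not
anything from the thread), here are the two styles side by side for a
plain matrix transpose in C. Both produce identical results; the tiled
one just revisits memory in smaller chunks:

#define N     4096   /* illustrative matrix dimension */
#define BLOCK 64     /* illustrative tile size; divides N exactly */

/* First-principles version: the dst writes stride through memory,
 * touching a new cache line (and eventually a new page) every step. */
void transpose_naive(double *dst, const double *src)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            dst[j * N + i] = src[i * N + j];
}

/* Tiled version: the same work done one BLOCK x BLOCK tile at a time,
 * so each tile of src and dst stays warm in cache (or resident in
 * memory) while it is being used. */
void transpose_tiled(double *dst, const double *src)
{
    for (int ii = 0; ii < N; ii += BLOCK)
        for (int jj = 0; jj < N; jj += BLOCK)
            for (int i = ii; i < ii + BLOCK; i++)
                for (int j = jj; j < jj + BLOCK; j++)
                    dst[j * N + i] = src[i * N + j];
}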

Much of this is basic computer and OS and application design stuff,
whatever chip is involved, whatever OS is involved.

>
> > And if your program is only using 100 KB, the odds are that
> > not all 100 KB will be in physical memory either.
>
> -- glen



