[Info-vax] The Future of Server Hardware?

Stephen Hoffman seaohveh at hoffmanlabs.invalid
Wed Oct 3 09:44:16 EDT 2012


On 2012-10-03 01:04:10 +0000, JF Mezei said:

> On 12-10-02 18:53, David Froble wrote:
> 
>> For the disks, the MicroVAX 3100 systems had a rather stupid 50 pin SCSI
>> controller, and the speed was, well, "not speed".  Now, not that I was a
>> fan of the BI bus, but the 6000 class systems had much better
>> capabilities at moving data.  CPU speed is not everything in a system.
> 
> But today, does an IBM mainframe use a bus that is faster than
> PCI-Express ?
> 
> In terms of disks, wouldn't mainframes and commodity servers use the
> same type of disk array interfaces ?
> 
> Or is this a case of modern mainframes using the same types of
> interfaces, but supporting multiple instances, whereas commodity
> servers only have one PCI-Express and one interface card to FC or
> whatever type of interface to a disk array ?


Computers inherently have common components.  Probably even roller 
casters in common, too.

The question is not whether there are common components; it's whether 
a particular design makes what you have and need to do, well, fast.

Computers are hierarchies of trade-offs between component prices and 
bandwidth and latency.  Processors and on-chip cache are extremely 
expensive and silly-fast.  Cache is very expensive, and very fast.  
Main memory is expensive, and fast.  SSD is moderately priced, and 
adequately fast as a replacement for various disks.  Traditional 
rotating-rust disk is cheap and glacial.  Then you get into near-line 
storage and other forms of Slow.  You could size things so everything 
fits in on-chip cache, but that's hugely expensive.  Nobody could 
afford to buy that.  Or you can build around cheap and glacial 
rotating rust, and performance would stink large.

What's this hierarchy stuff?  Within the memory hierarchy in an 
Itanium, for instance, the processor stalls when data is not available. 
"Loads and add instructions complete within one cycle when the data is 
available in level one cache, which is the preferred case. Loads from 
level two cache complete in 6 cycles, roughly 13 cycles for level three 
cache, and 209 cycles for loads from main memory. For obvious reasons, 
prefetching and otherwise keeping data in the appropriate caches 
reduces latency. And avoids stalls."  - 
<http://labs.hoffmanlabs.com/node/160>   You can see how much slower 
main memory is in the various Itanium designs.  And other processors 
are no different here; Alpha saw cost and speed trade-offs here, too.  
Interestingly, EV7 went for much smaller caches because the EV7 design 
had a much faster off-chip connection to memory - EV7 main memory 
speeds were faster than EV68 level 1 cache accesses.  Details: 
<http://labs.hoffmanlabs.com/node/20> - and again, there are 
trade-offs inherent in making parts faster (or slower).  (Which gets 
back to that hypothetical AlphaServer DS15L discussion - could you 
even fit the necessary amount of RDRAM into the 1U box, keep the RDRAM 
and the EV7 from frying themselves, and then - and very importantly - 
would enough folks buy these boxes to make it worth the cost and 
effort of building them?)
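
Here's a minimal C sketch (mine, not from those articles) that makes 
those stalls visible: chase a randomly-shuffled pointer chain through 
working sets of increasing size, and watch the nanoseconds-per-load 
figure jump each time the working set spills out of a cache level.  
The sizes and iteration counts are arbitrary choices:

    /* Chase a pointer chain; each load depends on the prior one, so
     * the loop runs at roughly one cache (or memory) latency per
     * iteration. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void)
    {
        /* Working sets from 16 KB (fits in L1) up to 64 MB (memory). */
        for (size_t bytes = 16 * 1024; bytes <= 64 * 1024 * 1024;
             bytes *= 4) {
            size_t n = bytes / sizeof(size_t);
            size_t *chain = malloc(n * sizeof(size_t));
            if (chain == NULL)
                return 1;

            /* Sattolo's shuffle builds one random cycle, so hardware
             * prefetchers can't guess the next address. */
            for (size_t i = 0; i < n; i++)
                chain[i] = i;
            srand(42);
            for (size_t i = n - 1; i > 0; i--) {
                size_t j = (size_t)rand() % i;
                size_t t = chain[i]; chain[i] = chain[j]; chain[j] = t;
            }

            const size_t loads = 10 * 1000 * 1000;
            size_t p = 0;
            clock_t t0 = clock();
            for (size_t i = 0; i < loads; i++)
                p = chain[p];            /* one dependent load per step */
            clock_t t1 = clock();

            double ns = (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / loads;
            printf("%8zu KB working set: ~%5.1f ns/load (sink %zu)\n",
                   bytes / 1024, ns, p);
            free(chain);
        }
        return 0;
    }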

As for your current focus on disk storage: reading a rotating-rust 
disk is inherently glacially slow.  Writing is glacial, too.  I don't 
know the instruction-count comparison off-hand, but it wouldn't 
surprise me to hear it was on the order of a million instructions' 
worth of time to get a sector read from rust and into main memory, or 
longer.  Processing an alignment fault is roughly 10,000 to 15,000 
instructions, after all.  Reading a file a record at a time, off disk, 
RMS-style, is particularly stupid-slow.  (Why did I just hear the song 
"RMS Style" done as "Gangnam Style"?  Never mind.)  RMS tries, and 
does fairly well here with performance and with caching, but it'd be 
far faster to map (or page) the file into memory as a wad of blocks, 
and unmarshall the data in main memory rather than through RMS.  RMS 
tries to do that, sort of.  But VMS applications aren't expecting any 
unmarshalling.  And with OpenVMS on Itanium (or on Alpha or VAX, for 
that matter), the CPU gets to do all this dreck work.
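
Here's a minimal sketch of that map-it-as-blocks approach, using POSIX 
mmap() as a stand-in for what SYS$CRMPSC would be doing on OpenVMS; 
the file name and the fixed 512-byte record layout are hypothetical:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("records.dat", O_RDONLY);   /* hypothetical file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* One call maps the whole file; the pager hauls blocks in on
         * demand, instead of the application paying a $GET (or read())
         * per record. */
        const char *base = mmap(NULL, st.st_size, PROT_READ,
                                MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        /* Unmarshall fixed-length "records" straight out of memory. */
        enum { RECLEN = 512 };
        long records = st.st_size / RECLEN;
        unsigned long sum = 0;
        for (long r = 0; r < records; r++)
            sum += (unsigned char)base[r * RECLEN];  /* touch each one */

        printf("%ld records, checksum %lu\n", records, sum);
        munmap((void *)base, st.st_size);
        close(fd);
        return 0;
    }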

Cray had a clever scheme for rolling data in from glacial 
rotating-rust storage, an eon ago.  The CPU detected the current 
location of the disk heads, did the math to figure out where each 
passing sector belonged in main memory, and striped the contents of 
the disk into memory starting from the current head location.  This 
was in place of reading the sectors of rotating rust sequentially, 
with all the glacial disk head seeks, and the glacial waits while the 
disks spun the target sectors around under the heads, that sequential 
reading entails.  It wouldn't surprise me to find this algorithm used 
in other boxes.
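
Here's a toy C simulation of the core math - nothing in it is a real 
driver interface, and the "disk" is just an in-memory array - showing 
how reading one revolution from wherever the head happens to be, and 
scattering each sector to its proper offset, still yields a buffer in 
logical order:

    #include <stdio.h>
    #include <string.h>

    enum { SECTORS = 8, SECSIZE = 4 };      /* toy track geometry */

    int main(void)
    {
        char disk[SECTORS][SECSIZE];        /* sector s holds "s0".."s7" */
        for (int s = 0; s < SECTORS; s++)
            snprintf(disk[s], SECSIZE, "s%d", s);

        int head = 5;        /* sector under the heads right now */
        char buf[SECTORS][SECSIZE];

        /* One full revolution, starting at 'head': read in rotational
         * order, place in logical order - no wait for sector 0 to come
         * around. */
        for (int i = 0; i < SECTORS; i++) {
            int sector = (head + i) % SECTORS;          /* rotational */
            memcpy(buf[sector], disk[sector], SECSIZE); /* logical    */
        }

        for (int s = 0; s < SECTORS; s++)
            printf("buf[%d] = %s\n", s, buf[s]);  /* prints s0..s7 */
        return 0;
    }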

Mainframes have very fast (and very intelligent) I/O channels that 
offload mundane housekeeping from the CPU, and they very aggressively 
RAID and cache the (usually massive) storage to contend with its usual 
glacial speed.  And they have lots and lots (and lots) of 
rotating-rust disks, where the outboard controllers can deal with 
predicting accesses, selecting victims for flushes, and otherwise 
getting the data in from glacial storage and unmarshalled, and 
marshalled and back out, "quickly."
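
As a tiny illustration of one piece of that controller-side work, 
here's a minimal LRU victim-selection sketch in C; real controller 
cache policies are far more elaborate, and the names here are purely 
illustrative:

    #include <stdio.h>

    enum { NBLOCKS = 4 };

    struct cache_block {
        unsigned long lba;        /* disk block cached in this slot */
        unsigned long last_used;  /* logical clock of the last access */
        int dirty;                /* must be written back before reuse */
    };

    /* Return the index of the least-recently-used block: the victim. */
    static int pick_victim(const struct cache_block c[], int n)
    {
        int victim = 0;
        for (int i = 1; i < n; i++)
            if (c[i].last_used < c[victim].last_used)
                victim = i;
        return victim;
    }

    int main(void)
    {
        struct cache_block cache[NBLOCKS] = {
            { 100, 7, 1 }, { 104, 3, 0 }, { 212, 9, 1 }, { 316, 5, 1 },
        };
        int v = pick_victim(cache, NBLOCKS);
        printf("flush victim: slot %d (lba %lu%s)\n", v, cache[v].lba,
               cache[v].dirty ? ", dirty: write back first" : "");
        return 0;
    }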

One of the "jokes" I was told, back when I was dealing with IBM: last 
year's mainframe model is this year's channel controller model.  Which 
reminded me of a similar joke around VAX consoles, though the VAX 
consoles were never part of the main I/O path.  But I digress.

A good computer design balances these disparate factors, and lets you 
reach a build price that is sufficiently profitable at what you think 
you can sell the box for; costs and revenue and production volumes all 
factor into component and system designs.

Remember that Computer Architecture class I keep suggesting?  Well, 
that's how you learn most of this stuff.  (Not necessarily the 
financial stuff, but as you price out the components, it'll become 
apparent why everybody uses the same rotating-rust storage widgets.  
Well, unless you have the budget for shiploads of SLC NAND flash 
memory.  And you might still want shiploads of flash.  IBM offers 
flash-based SSD caches in their XIV, and probably in the DS8000 
storage controllers.  And IIRC, HP has recently added support for SSD 
flash on the i2 boxes, too.  All part of the go-fast.)

Anyway... Review one of the Computer Architecture classes mentioned 
earlier.  That grounding will help you learn about and understand how 
the pieces of a modern computer fit together, and the costs and 
trade-offs involved.  It can help you write faster programs, too.



-- 
Pure Personal Opinion | HoffmanLabs LLC



