[Info-vax] The Future of Server Hardware?

glen herrmannsfeldt gah at ugcs.caltech.edu
Wed Oct 3 15:18:58 EDT 2012


Stephen Hoffman <seaohveh at hoffmanlabs.invalid> wrote:
> On 2012-10-03 01:04:10 +0000, JF Mezei said:
>> On 12-10-02 18:53, David Froble wrote:
 
(snip) 
>> But today, does an IBM mainframe use a bus that is faster than
>> PCI-Express ?
 
>> In terms of disks, wouldn't mainframes and commodity servers use the
>> same type of disk array interfaces ?
 
>> Or is this a case of modern mainframes using the same types of
>> interfaces, but supporting multiple instances whereas commodity servers
>> only have one PCI-Express and one interface card to FC or whatever type
>> of interface to a disk array ?


> Computers inherently have common components.  Probably even roller 
> casters in common, too.

> The question is not whether there are components that are common, it's 
> whether a particular design makes what you have and need to do, well, 
> fast.

> Computers are hierarchies of trade-offs between component prices and 
> bandwidth and latency.  Processors and on-chip cache are extremely 
> expensive and silly-fast.  Cache is very expensive, and very fast.  
> Main memory is expensive, and fast.  SSD is relatively moderately 
> priced, and adequately fast for various disks.  Traditional 
> rotating-rust disk is cheap and glacial.  Then you get into near-line 
> storage and other forms of Slow.  You can size for and make everything 
> on-chip cache, but that's hugely expensive.  Nobody could afford to buy 
> that.  Or you can make stuff heavily rotating rust and cheap and 
> glacial, and performance would stink large.

And for a server that will spend a larger fraction of its time
actually computing, the tradeoffs move up a little. Larger caches.
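
Just to make that hierarchy concrete, here is a crude sketch that
times accesses over working sets of increasing size; the sizes,
stride, and pass count are arbitrary assumptions and the timing is
rough, but the jump as the working set spills out of each cache
level usually shows up:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void)
    {
        /* Touch one byte per 64-byte cache line across buffers of
           increasing size; time per access jumps when the working
           set no longer fits in a given cache level. */
        for (size_t kb = 16; kb <= 64 * 1024; kb *= 4) {
            size_t n = kb * 1024;
            volatile char *buf = malloc(n);
            if (buf == NULL)
                return 1;

            clock_t t0 = clock();
            for (int pass = 0; pass < 64; pass++)
                for (size_t i = 0; i < n; i += 64)
                    buf[i]++;
            double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

            printf("%7zu KB working set: %6.2f ns per access\n",
                   kb, secs * 1e9 / (64.0 * (double)(n / 64)));
            free((void *)buf);
        }
        return 0;
    }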

(snip on caching)

> <http://labs.hoffmanlabs.com/node/160>   You can see how much slower 
> main memory is here, in various of the Itanium designs.  And other 
> processors are no different here; Alpha saw cost and speed tradeoffs 
> here, too.  Interestingly, EV7 went for much smaller caches because the 
> EV7 design had a much faster off-chip connection to memory - EV7 main 
> memory speeds were faster than EV68 level 1 cache accesses.  Details: 
> <http://labs.hoffmanlabs.com/node/20> - and again, there are tradeoffs 
> inherent in making parts faster (or slower).  

(snip)

> As for your current preferred focus on disk storage, reading a 
> rotating-rust disk is inherently glacially slow.  Writing is glacial, 
> too.  I don't know the instruction-comparison count off-hand, but it 
> wouldn't surprise me to hear it was on the order of a million 
> instructions worth of time to get a sector read from rust and into main 
> memory, or longer.  

For many years, 3600 RPM disks were common, so 8.33 ms average
rotational delay. For servers now, 7200 RPM might be more usual,
or 4.17 ms. A million instructions doesn't sound far off.
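
A quick check of that arithmetic; the 1e9 instructions-per-second
execution rate below is just an assumed figure for illustration:

    #include <stdio.h>

    int main(void)
    {
        /* Average rotational delay is half a revolution; instruction
           count is that delay times an assumed execution rate. */
        double rpm[] = { 3600.0, 7200.0 };
        double ips = 1e9;               /* assumed instructions/second */

        for (int i = 0; i < 2; i++) {
            double rev_ms = 60000.0 / rpm[i];   /* one revolution */
            double avg_ms = rev_ms / 2.0;       /* half a turn on average */
            printf("%5.0f RPM: %5.2f ms average, ~%.1f million instructions\n",
                   rpm[i], avg_ms, avg_ms * 1e-3 * ips / 1e6);
        }
        return 0;
    }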

(snip)

> Cray had a clever system for rolling in from glacial rotating-rust 
> storage, an eon ago.  The CPU detected the location of the disk heads, 
> did the math to figure out where that sector should be in main memory, 
> and striped the contents of the disk into memory from the current disk 
> head location.  This in place of reading the sectors of rotating-rust 
> sequentially, with all the glacial disk head seeks and the glacial 
> waits as the disks spun the target sectors around under the heads that 
> was entailed.  It wouldn't surprise me to find this algorithm used in 
> other boxes.

I believe that many current disks do this internally. When asked
to read a sector, the drive seeks to the appropriate track and
starts reading the whole track into its cache. When the needed
sector comes around, it returns it, then finishes reading the
track. When requests come in for nearby sectors, they are already
in the cache.
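
Something like this, at least in spirit. A minimal sketch of track
read-ahead, where read_physical_sector() is a hypothetical stand-in
for the drive's low-level platter read and the geometry numbers are
made up:

    #include <string.h>

    #define SECTORS_PER_TRACK 63            /* assumed geometry */
    #define SECTOR_SIZE       512

    static char track_buf[SECTORS_PER_TRACK][SECTOR_SIZE];
    static long cached_track = -1;

    /* Hypothetical low-level read of one sector from the platter. */
    extern void read_physical_sector(long track, int sector, char *dst);

    /* On a miss, pull the whole track into the buffer; later requests
       for nearby sectors are then served from memory.  A real drive
       starts reading wherever the head lands and returns the wanted
       sector as soon as it comes around; this sketch just reads the
       track in order. */
    void read_sector(long lba, char *dst)
    {
        long track  = lba / SECTORS_PER_TRACK;
        int  sector = (int)(lba % SECTORS_PER_TRACK);

        if (track != cached_track) {
            for (int s = 0; s < SECTORS_PER_TRACK; s++)
                read_physical_sector(track, s, track_buf[s]);
            cached_track = track;
        }
        memcpy(dst, track_buf[sector], SECTOR_SIZE);
    }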

> Mainframes have very fast (and very intelligent) I/O channels, usually 
> work to offload the CPU of mundane housekeeping, and very aggressively 
> RAID and cache the (usually massive) storage to contend with its usual 
> glacial speed.  And lots and lots (and lots) of rotating-rust disks, 
> where the outboard controllers can deal with predicting accesses, 
> selecting victims for flushes, and otherwise getting the data in from 
> glacial storage and unmarshalled, and marshalled and out "quickly."

Yes, one of the things that distinguishes mainframe systems from
smaller ones is the number of disks attached. Some can be seeking
while others are transferring data.
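
A toy model of why the spindle count matters; the seek, rotation,
and transfer figures below are made-up assumptions, and the point
is only that the mechanical delays overlap across drives while the
transfers still share the channel:

    #include <stdio.h>

    int main(void)
    {
        /* Each request pays seek + rotational delay + transfer.  The
           mechanical parts overlap across spindles; transfers are
           assumed to serialize on the channel. */
        double seek_ms = 8.0, rot_ms = 4.17, xfer_ms = 0.5;
        double per_io  = seek_ms + rot_ms + xfer_ms;

        for (int disks = 1; disks <= 16; disks *= 2) {
            double gap  = per_io / disks;      /* time between completions */
            double busy = (gap > xfer_ms) ? gap : xfer_ms;
            printf("%2d disks: ~%4.0f I/Os per second\n",
                   disks, 1000.0 / busy);
        }
        return 0;
    }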

> One of the "jokes" I was told, back when I was dealing with IBM: last 
> year's mainframe model is this year's channel controller model.  Which 
> reminded me of a similar joke around VAX consoles, though the VAX 
> consoles were never part of the main I/O path.  But I digress.

Many of the smaller ones used the same hardware for CPU and channel.
It is, at least, known that the 370/158 and 370/168 were used as
channel controllers for the 30xx series. Many were leased, so when
users went up to the next level, the old ones came back. Being able
to reuse them made a lot of sense.

> A good computer design balances these disparate factors, and allows you 
> to reach a build price that will be sufficiently profitable for what 
> you think you can sell the box for; costs and revenue and production 
> volumes all factor into component and system designs.

> Remember that Computer Architecture class I keep suggesting?  Well, 
> that's how you learn most of this stuff.  

You don't need the class; just read Hennessy and Patterson's
"Computer Architecture: A Quantitative Approach."

Well, a new edition comes out about every five years, and some
things go away as new ones are added. The used prices of the older
ones are low enough that you can just buy all of them. So far,
I have never bought the newest one. 

> (Not necessarily the 
> financial stuff, but as you price out the components, it'll become 
> apparent why everybody uses the same rotating-rust storage widgets.  

Hennessy and Patterson discuss some pricing. Not in much detail,
but you really can't do the design right without it.

> Well, unless you have the budget for shiploads of SLC NOR flash memory. 
> And you might still want shiploads of flash.  IBM offers flash-based 
> SSD caches in their XIV, and probably in the DS8000 storage 
> controllers.  And IIRC, HP has recently added support for SSD flash on 
> the i2 boxes, too.  All part of the go-fast.)

Buy the books.  Much cheaper than taking actual classes.
(Which probably use those books.)

-- glen


