[Info-vax] VMware

Tue Dec 10 21:08:50 EST 2019

On 12/10/2019 7:27 PM, Bob Gezelter wrote:
> On Tuesday, December 10, 2019 at 1:32:41 PM UTC-5, Grant Taylor wrote:
>> On 12/10/19 5:05 AM, Bob Gezelter wrote:
>>> However useful VM migration is, it is not a functional replacement for
>>> OpenVMS clusters. VM migration allows controlled workload migration.
>>> In the event of an uncontrolled system failure, e.g. complete
>>> power failure without warning or system destruction, there will
>>> not be sufficient time to execute a migration.
>>
>> I don't completely agree with this.
>>
>> VMware has a High Availability mode where the same VM / system image is
>> running concurrently on multiple disparate physical hosts.  One of them
>> is connected to the outside world.  The other is disconnected and
>> receiving real time updates from the first; that is, VMware replicates
>> memory, processor, and disk state in near real time.  This means that when
>> one physical host falls out of the rack, the other physical host takes
>> over and continues running the VM with the exception that it is now
>> master and the VM is connected to the world.  As such, even established
>> network connections continue on the alternate physical host.
>>
>> VMware has, and apparently other hypervisors have, the ability to move
>> running VMs from one host to another host in a manner that is almost
>> imperceptible to clients.  Packets don't drop.  They may have a slight
>> increase in latency /during/ the transition.  But it really looks like a
>> momentary congestion in a router buffer somewhere.
>>
>>
>>
>> --
>> Grant. . . .
>> unix || die
>
> Grant,
>
> With all due respect, I want to see the fine-grain details on that
> implementation. Particularly the part about "packets do not drop".
> Ensuring granularity of file update is also quite a challenge.
>
> There is a large difference between "rarely are packets lost" and
> "packets are never lost". Pre-loading other virtual instances and
> keeping their memory state updated is one thing; ensuring mass
> storage state is something else.

A huge difference, but not an insurmountable problem when communications
are structured such that a transaction is not considered complete until
the receiver has verified it.  Having complete restart of transactions
built into the apps also makes this problem much smaller.

Everyone does that, right?

:-)
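
For anyone who has not built it into their apps, the idea above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not anyone's production code: the Receiver, LossyLink, and
send_until_acked names are made up for the example, and the "link" simply
drops a request or an acknowledgement on cue. The sender treats a
transaction as complete only once the receiver's acknowledgement arrives,
and the receiver keys on a transaction ID so that a restarted transaction
is not applied twice.

```python
class Receiver:
    """Hypothetical receiver that applies each transaction at most once."""

    def __init__(self):
        self.applied = {}      # transaction ID -> payload
        self.apply_count = 0   # how many times work was actually done

    def handle(self, txn_id, payload):
        # A resent transaction is recognised by its ID and not re-applied.
        if txn_id not in self.applied:
            self.applied[txn_id] = payload
            self.apply_count += 1
        return txn_id          # the acknowledgement


class LossyLink:
    """Hypothetical link that loses a request or an ack according to a script."""

    def __init__(self, receiver, fates):
        self.receiver = receiver
        self.fates = list(fates)

    def send(self, txn_id, payload):
        fate = self.fates.pop(0) if self.fates else "ok"
        if fate == "lose_request":
            return None                      # receiver never saw it
        ack = self.receiver.handle(txn_id, payload)
        if fate == "lose_ack":
            return None                      # applied, but sender does not know
        return ack


def send_until_acked(link, txn_id, payload, max_attempts=5):
    """Restart the transaction until the receiver confirms it; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        if link.send(txn_id, payload) == txn_id:
            return attempt
    raise RuntimeError(f"transaction {txn_id} not confirmed")


receiver = Receiver()
link = LossyLink(receiver, ["lose_request", "lose_ack"])
attempts = send_until_acked(link, "txn-1", {"debit": 100})
print(attempts, receiver.apply_count)
```

The request is lost once and the acknowledgement is lost once, so the
sender needs three attempts, yet the receiver applies the transaction
only one time.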

> I will not even get into questions like the state of attached
> non-storage peripherals, e.g. RNGs.
>
> My general advice is to deeply verify the precise nature of the
> implementation and its limitations before relying on it.

Agreed.

> A while back, I was at a user group event where there was a
> presentation on VM migration. The speaker made a statement that
> failover migration would handle all cases. Being from New York City,
> I inquired about a scenario we had experienced a few years earlier.

Reminds me of the young lady who declared that when I send out an 
inquiry over the internet, I would ALWAYS get a reply.  I casually 
mentioned backhoe operators, communication failures, and her tripping 
over the power cable, again.

:-)

> A Boeing 767 doing between 150 and 200 knots comes through your
> machine room window. How long does it take to traverse the 24 inches
> between the front of the cabinet and the back of the cabinet? Even that
> scenario does not include the fact that the infrastructure connecting
> one VM host to another has likely been severed before the VM host
> frame is hit.
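
For a rough sense of scale, the arithmetic behind that question can be
worked out as below. It is a back-of-the-envelope sketch only: it assumes
constant speed through the cabinet, and the function name is made up for
the example. The only fixed facts used are that one knot is 1852 metres
per hour and one inch is 0.0254 metres.

```python
KNOT_IN_M_PER_S = 1852 / 3600   # one nautical mile (1852 m) per hour
INCH_IN_M = 0.0254              # one inch in metres


def traverse_time_ms(depth_inches, speed_knots):
    """Milliseconds needed to cross depth_inches at a constant speed_knots."""
    distance_m = depth_inches * INCH_IN_M
    speed_m_per_s = speed_knots * KNOT_IN_M_PER_S
    return distance_m / speed_m_per_s * 1000


for knots in (150, 200):
    print(f"{knots} knots: {traverse_time_ms(24, knots):.1f} ms")
# prints:
# 150 knots: 7.9 ms
# 200 knots: 5.9 ms
```

That is between roughly 6 and 8 milliseconds from the front of the
cabinet to the back.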

Don't you just hate it when the real world intrudes ....

-- 
David Froble                       Tel: 724-529-0450
Dave Froble Enterprises, Inc.      E-Mail: davef at tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA  15486