[Info-vax] VMware

Wed Dec 11 08:32:09 EST 2019

On Tuesday, December 10, 2019 at 10:00:27 PM UTC-5, Grant Taylor wrote:
> On 12/10/19 5:27 PM, Bob Gezelter wrote:
> > With all due respect, I want to see the fine-grain details on 
> > that implementation.
> 
> Please see my reply from ~ 7:35.  (Adjust hour accordingly for your time 
> zone.)
> 
> I think that's about as granular as I can get without going and looking 
> things up.
> 
> I have no problem with you wanting to see the fine-grain details.  I 
> asked very similar questions 10+ years ago.  Hence why I have the 
> understanding that I do.  Also why it's now only a high level detail.
> 
> > Particularly the part about "packets do not drop".
> 
> I've routinely moved VMs between hosts without dropping packets.  I do 
> see latency at the epoch of the transition increase momentarily (usually 
> just one packet).  But the packet does make it through and is not dropped.
> 
> Frequently latency is something like this:
> 
> 1–3 ms
> 1–3 ms
> 1–3 ms
> 9–12 ms
> 1–3 ms
> 1–3 ms
> 1–3 ms
> 
> No packet drop.
> 
> TCP sessions continue without retransmissions.
> 
> > Ensuring granularity of file update is also quite a challenge.
> 
> Why?  (Please see my other message about what happens.)
> 
> > There is a large difference between "rarely are packets lost" and 
> > "packets are never lost". Pre-loading other virtual instances and 
> > keeping their memory state updated is one thing; ensuring mass 
> > storage state is something else.
> 
> All hosts in the cluster have access to the same storage.  So anything 
> written on one host is readable by other hosts.  Part of the migration 
> ensures that cached data is synced to disk and / or copied as part of 
> the memory for the system.
> 
> So there's no "mass storage state" to keep in sync because it is the 
> same back end storage.
> 
> > I will not even get into questions like the state of attached 
> > non-storage peripherals, e.g. RNGs.
> 
> Those would be the types of things that would prevent migration between 
> hosts.
> 
> Though, I think that VMware has an option to allow USB peripherals to be 
> used across the network.
> 
> If not VMware, there are other OS level solutions to allow some 
> peripherals to be used across the network.
> 
> I've personally used remote (TCP based) serial ports for fax servers. 
> The modem is physically connected to a network attached DigiBoard (or 
> the likes) and the VM is free to move from host to host to host because 
> its TCP connection to the serial port is still intact.
> 
> Given that faxing is time sensitive serial audio / data (depending on 
> the modem) there may be an issue with the momentary increased latency. 
> I don't know if that would ride through a migration or if it would rely 
> on error detection and correction in the modem / fax level.
> 
> > My general advice is to deeply verify the precise nature of the 
> > implementation and its limitations before relying on it.
> 
> I think that's a wonderful idea.
> 
> > A while back, I was at a user group event where there was a 
> > presentation on VM migration. The speaker made a statement that 
> > failover migration would handle all cases. Being from New York City, 
> > I inquired about a scenario we had experienced a few years earlier.
> 
> ~chuckle~
> 
> Absolutes are usually a problem in one way or another.  ;-)
> 
> > A Boeing 767 doing between 150 and 200 knots comes through your machine 
> > room window. How long does it take to traverse the 24 inches between 
> > the front of the cabinet and the back of the cabinet? Even that scenario 
> > does not include the fact that the infrastructure connecting one 
> > VM host to another has likely been severed before the VM host frame 
> > is hit.
> 
> I think that's a valid question.  I think it's an EXTREMELY ATYPICAL 
> failure scenario.  But it is decidedly within the "all cases" absolute 
> the speaker set themselves up for.
> 
> I think that would be very difficult to protect against.
> 
> I would question, what about a data center in an adjacent building that 
> you can extend the LAN / SAN / etc. into.  Though it could also 
> experience a similar problem (fate sharing).
> 
> When you start talking about failures that can take out multiple 
> buildings in close proximity to each other, you REALLY need an EXTREMELY 
> robust solution.
> 
> I do think that VMware has some solutions that can work over extended 
> distances.
> 
> 
> 
> -- 
> Grant. . . .
> unix || die

Grant,

Your post proves my point.

I do not disagree that within the context of "controlled" VM migration between hosts, it is possible to accomplish the migration without loss of packets or I/O inconsistency.

It is the uncontrolled case to which I referred. 
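
To put a rough number on the uncontrolled scenario quoted above, here is a back-of-the-envelope sketch in Python. It assumes constant speed across a 24-inch cabinet depth and nothing else; the function name and constants are mine, chosen only for illustration.

```python
# Rough time budget for the "767 through the window" scenario.
# Assumption: constant speed across the full 24-inch cabinet depth.

KNOT_IN_M_PER_S = 1852.0 / 3600.0   # one knot is one nautical mile (1852 m) per hour
INCH_IN_M = 0.0254                  # one inch in metres


def traverse_time_ms(depth_inches: float, speed_knots: float) -> float:
    """Milliseconds needed to cover depth_inches at speed_knots."""
    distance_m = depth_inches * INCH_IN_M
    speed_m_per_s = speed_knots * KNOT_IN_M_PER_S
    return distance_m / speed_m_per_s * 1000.0


for knots in (150, 200):
    print(f"{knots} knots: {traverse_time_ms(24, knots):.1f} ms")
```

That works out to roughly 6 to 8 ms, with no advance warning, and before accounting for the interconnect being severed first.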

Of course, in the controlled case, the connection to the switch can be blocked/queued AND acknowledged to prevent packet(s) from being caught during the transition. Alternatively, the MAC address can be changed and the packets queued at the new host. A similar argument applies to I/O. In a controlled case, active I/O can be completed before the transfer.
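
As a toy illustration of the queueing idea in the controlled case, the sketch below holds packets that arrive while a planned move is in progress and releases them, in order, at the new host. It is a simplified model, not a description of how VMware or any switch implements this; the class, method, and host names are hypothetical.

```python
from collections import deque


class ControlledSwitchover:
    """Toy model: hold packets during a planned move, then release them in order."""

    def __init__(self):
        self.active_host = "host-a"   # hypothetical host name
        self.in_transition = False
        self.held = deque()           # packets queued while the VM is moving
        self.delivered = []           # (host, packet) pairs in delivery order

    def receive(self, packet):
        if self.in_transition:
            self.held.append(packet)  # queue the packet instead of dropping it
        else:
            self.delivered.append((self.active_host, packet))

    def begin_move(self):
        self.in_transition = True

    def finish_move(self, new_host):
        self.active_host = new_host
        self.in_transition = False
        while self.held:              # drain the queue in arrival order
            self.delivered.append((new_host, self.held.popleft()))


def demo():
    sw = ControlledSwitchover()
    sw.receive(1)
    sw.receive(2)
    sw.begin_move()
    sw.receive(3)                     # arrives mid-move, so it is held
    sw.receive(4)
    sw.finish_move("host-b")
    sw.receive(5)
    return sw.delivered
```

The point of the sketch is that this only works because `begin_move` is called before the old host goes away. In the uncontrolled case there is no such call, and nothing is in place to hold the packets.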

Otherwise, one needs facilities not present in x86 (e.g., lock-step execution, as was implemented on some fault-tolerant architectures in the past). As an example, modern hardware RNGs make precise execution profiles on modern systems unlikely.

- Bob Gezelter, http://www.rlgsc.com