[Info-vax] Linux 40 GbE and 100 GbE NIC Performance

Dirk Munk munk at home.nl
Sat Jan 24 17:56:34 EST 2015


Stephen Hoffman wrote:
> On 2015-01-24 03:59:57 +0000, David Froble said:
>
>> Stephen Hoffman wrote:
>>>
>>> and whether there might be a need to dedicate one or more cores for
>>> the NIC processing?
>>
>> This is an interesting statement.
>>
>> It's been a while since there was "The VAX".  Different world.
>
> Yes, many things are different now.  Some are not.  Dedicating
> processors to tasks is not particularly new, though.  There were VAX
> processors embedded on some DEC Ethernet boards, and in a few printers.
> There were dedicated CPUs on some Q-bus boards.   There were boards that
> depended entirely on the host VAX, too.  Intel is now aiming Xeon at
> embedded systems, with some changes mentioned below, too. Dedicating a
> processor to a task or to a device within a VMS SMP configuration has
> been possible for a while, too.
>
>> This might have been questionable when it was one core per socket.
>> With multiple cores per socket, and with some of the on chip
>> communications, for some purposes planning on having a "network
>> processor" might be a very good idea.
>
> Recent Intel Xeon integrated the PCIe onto the processor, and has
> provided a way for PCIe devices to initiate and perform DMA into and out
> of processor cache directly; into and out of level 3 shared cache. What
> Intel refers to as DDIO.  This helps for some tasks as it is faster than
> DMA into memory (which then has to get loaded into processor cache), but
> then the code operating on the core has to figure out what to do with
> the data and either evict the data or transfer it into memory.  At least
> for now, DDIO is local, and is tied to the PCIe buses associated with a
> particular socket and its cores, but it wouldn't surprise me to see
> Intel extend this and allow DDIO operations into remote sockets and
> cores across QPI.  ~250 Gbps in lab tests, reportedly.
>
> Alternatively, there's been some discussion of TCP offload engine (TOE)
> support for OpenVMS over the years, but that hardware configuration
> hasn't ever become available and supported on OpenVMS AFAIK.

We had a small discussion on this matter not too long ago.

Even cheap Realtek GbE interfaces have offload engines. I would say that 
these days any proper network interface has offload engines. In addition 
to the standard TCP offload engine, a modern Ethernet NIC may also have 
an iSCSI offload engine.
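
The stock Linux stack doesn't expose a full TOE as far as I know, but
you can at least see which stateless offloads (checksums, TSO, GRO) a
given NIC advertises through the ethtool ioctl. A minimal sketch, with
"eth0" only as a placeholder interface name:

/* Query a few stateless offloads via the ethtool ioctl interface.
 * Pass the interface name as argv[1]; "eth0" is only a default. */
#include <linux/ethtool.h>
#include <linux/sockios.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <stdio.h>
#include <string.h>

static void show(int fd, const char *ifname, __u32 cmd, const char *label)
{
    struct ethtool_value eval = { .cmd = cmd };
    struct ifreq ifr;

    memset(&ifr, 0, sizeof ifr);
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_data = (char *)&eval;

    if (ioctl(fd, SIOCETHTOOL, &ifr) < 0)
        printf("%-22s: not reported\n", label);
    else
        printf("%-22s: %s\n", label, eval.data ? "on" : "off");
}

int main(int argc, char **argv)
{
    const char *ifname = (argc > 1) ? argv[1] : "eth0";
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0) { perror("socket"); return 1; }
    show(fd, ifname, ETHTOOL_GRXCSUM, "RX checksum offload");
    show(fd, ifname, ETHTOOL_GTXCSUM, "TX checksum offload");
    show(fd, ifname, ETHTOOL_GTSO,    "TCP segmentation (TSO)");
    show(fd, ifname, ETHTOOL_GGRO,    "Generic receive (GRO)");
    return 0;
}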

Fibre Channel interfaces also have quite a lot of RAM. I once tested 
TCP/IP over Fibre Channel; now that is fast. With IPv6 (jumbograms) you 
can have a TCP packet size of 16 MB.

In my view a proper network interface (for whatever type of network) 
should have a powerful CPU for all kinds of offload tasks, sufficient 
buffer memory, and DMA access.
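
On the DDIO and core-dedication points quoted above: on Linux you can
at least keep the handling thread on the same socket as the NIC's PCIe
slot, by reading the adapter's NUMA node from sysfs and pinning to a
CPU on that node. A rough sketch, again with "eth0" as a placeholder:

/* Pin the calling thread to a CPU on the NUMA node the NIC sits on,
 * so DMA (and DDIO, where present) stays socket-local. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int node = -1, cpu = 0;
    char path[128], cpulist[256];
    cpu_set_t set;

    /* NUMA node of the NIC's PCIe device (-1 means unknown/single node). */
    FILE *f = fopen("/sys/class/net/eth0/device/numa_node", "r");
    if (f) {
        if (fscanf(f, "%d", &node) != 1)
            node = -1;
        fclose(f);
    }
    if (node < 0)
        node = 0;

    /* First CPU listed for that node, e.g. "0-15" -> 0. */
    snprintf(path, sizeof path, "/sys/devices/system/node/node%d/cpulist", node);
    f = fopen(path, "r");
    if (f) {
        if (fgets(cpulist, sizeof cpulist, f))
            cpu = atoi(cpulist);
        fclose(f);
    }

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof set, &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("NIC on NUMA node %d, thread pinned to CPU %d\n", node, cpu);
    return 0;
}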



>
> There are also changes proposed to and wholesale replacements for TCP
> and IP, but that's fodder for another time.
>
>> Now, there might be a very good idea for VMS on x86.  Bring network
>> speeds up to date.
>
> Alpha and Itanium already have some processing that can be dedicated to
> a core, either for device-specific I/O processing — where fast path can
> be used to distribute I/O processing, and fast I/O is intended to be
> faster than $qio — or for the dedicated lock manager, for instance.
>
> OpenVMS network I/O is presently quite a bit slower than RHEL network
> I/O — that's using the VCI path on VMS, too — and that's before any
> consideration of the performance necessary for wire-speed operations of
> 40 GbE and 100 GbE NICs.
>
>
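
For what it is worth, the Linux-side analogue of spreading the NIC
processing over chosen cores is receive steering: IRQ affinity for the
hardware queues, or software RPS, both driven through sysfs. A small
sketch that points rx queue 0 of a (hypothetical) eth0 at CPUs 0-3:

/* Write an RPS cpumask so receive processing for one queue runs on a
 * chosen set of CPUs.  Needs root; the queue layout is driver-dependent
 * and "eth0" is only a placeholder. */
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/class/net/eth0/queues/rx-0/rps_cpus";
    FILE *f = fopen(path, "w");

    if (!f) { perror(path); return 1; }
    fprintf(f, "f\n");                 /* hex bitmask 0xf = CPUs 0-3 */
    if (fclose(f) != 0) { perror(path); return 1; }
    printf("RPS mask for %s set to f (CPUs 0-3)\n", path);
    return 0;
}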



