[Info-vax] Linux 40 GbE and 100 GbE NIC Performance

Bill Gunshannon bill at server3.cs.scranton.edu
Tue Jan 27 10:27:37 EST 2015


In article <4c42e$54c42324$5ed4324a$38838 at news.ziggo.nl>,
	Dirk Munk <munk at home.nl> writes:
> Stephen Hoffman wrote:
>> On 2015-01-24 03:59:57 +0000, David Froble said:
>>
>>> Stephen Hoffman wrote:
>>>>
>>>> and whether there might be a need to dedicate one or more cores for
>>>> the NIC processing?
>>>
>>> This is an interesting statement.
>>>
>>> It's been a while since there was "The VAX".  Different world.
>>
>> Yes, many things are different now.  Some are not.  Dedicating
>> processors to tasks is not particularly new, though.  There were VAX
>> processors embedded on some DEC Ethernet boards, and in a few printers.
>> There were dedicated CPUs on some Q-bus boards.   There were boards that
>> depended entirely on the host VAX, too.  Intel is now aiming Xeon at
>> embedded systems, with some changes mentioned below, too. Dedicating a
>> processor to a task or to a device within a VMS SMP configuration has
>> been possible for a while, too.
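
On the Linux side of the subject line, the usual way to dedicate a core to
the NIC today is to steer the card's interrupts onto that core and keep
everything else off it.  A minimal sketch in C, with made-up IRQ and CPU
numbers (the real ones come from /proc/interrupts); it needs root, and
irqbalance has to be stopped or it will simply undo the change:

    /* Pin a NIC's interrupt to one core by writing a CPU bitmask to
     * /proc/irq/<irq>/smp_affinity.  IRQ 95 and CPU 2 are placeholders. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const int irq = 95;        /* placeholder: see /proc/interrupts   */
        const unsigned cpu = 2;    /* core to dedicate to NIC interrupts  */
        char path[64];
        FILE *f;

        snprintf(path, sizeof path, "/proc/irq/%d/smp_affinity", irq);
        f = fopen(path, "w");
        if (!f) { perror(path); return EXIT_FAILURE; }

        /* smp_affinity takes a hex mask of allowed CPUs. */
        fprintf(f, "%x\n", 1u << cpu);
        fclose(f);
        return EXIT_SUCCESS;
    }

Whether that core should then also run the application doing the work is
the interesting part of the question above.
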
>>
>>> This might have been questionable when it was one core per socket.
>>> With multiple cores per socket, and with some of the on chip
>>> communications, for some purposes planning on having a "network
>>> processor" might be a very good idea.
>>
>> Recent Intel Xeons have integrated PCIe onto the processor, and provide a
>> way for PCIe devices to initiate and perform DMA directly into and out of
>> the shared level 3 processor cache, which Intel refers to as DDIO.  This
>> helps for some tasks as it is faster than
>> DMA into memory (which then has to get loaded into processor cache), but
>> then the code operating on the core has to figure out what to do with
>> the data and either evict the data or transfer it into memory.  At least
>> for now, DDIO is local, and is tied to the PCIe buses associated with a
>> particular socket and its cores, but it wouldn't surprise me to see
>> Intel extend this and allow DDIO operations into remote sockets and
>> cores across QPI.  ~250 Gbps in lab tests, reportedly.
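
One practical consequence of DDIO being socket-local: if the thread consuming
the packets runs on the other socket, the data lands in the wrong L3 and gets
dragged across QPI anyway.  A rough sketch of keeping the consumer on the
NIC's own node, assuming Linux, libnuma (link with -lnuma) and a placeholder
interface name of eth0:

    /* Keep the packet-consuming thread on the NIC's own socket so DDIO's
     * cache placement is local to the core doing the work. */
    #include <stdio.h>
    #include <numa.h>

    int main(void)
    {
        int node = -1;
        FILE *f = fopen("/sys/class/net/eth0/device/numa_node", "r");

        if (f) {
            if (fscanf(f, "%d", &node) != 1)
                node = -1;
            fclose(f);
        }
        if (node < 0 || numa_available() < 0) {
            fprintf(stderr, "no usable NUMA information\n");
            return 1;
        }
        /* Restrict this thread (and anything it spawns) to the NIC's node. */
        if (numa_run_on_node(node) != 0) {
            perror("numa_run_on_node");
            return 1;
        }
        printf("running on NUMA node %d\n", node);
        return 0;
    }
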
>>
>> Alternatively, there's been some discussion of TCP offload engine (TOE)
>> support for OpenVMS over the years, but that hardware configuration
>> hasn't ever become available and supported on OpenVMS AFAIK.
> 
> We had a small discussion on this matter not too long ago.
> 
> Even cheap Realtek GbE interfaces have offload engines. I would say that 
> these days any proper network interface has offload engines. In addition 
> to the standard TCP offload engine, a modern Ethernet NIC may also have 
> an iSCSI offload engine.
> 
> Fibre Channel interfaces also have quite a lot of RAM. I once tested 
> TCP/IP over Fibre Channel; now that is fast. With IPv6 you can have a 
> TCP packet size of 16 MB.
> 
> In my view a proper network interface (for whatever type of network) 
> should have a powerful CPU for all kinds of offload tasks, sufficient 
> buffer memory, and DMA access.
> 
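On Linux you can see which offload engines a given card's driver actually
exposes through the classic ethtool ioctls.  A small sketch in C, with eth0
as a placeholder interface name (the newer ETHTOOL_GFEATURES interface
reports far more, but this is the short version):

    /* Ask the kernel which offload engines a NIC's driver reports. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <net/if.h>
    #include <linux/ethtool.h>
    #include <linux/sockios.h>

    static int offload_enabled(int fd, const char *ifname, __u32 cmd)
    {
        struct ethtool_value ev = { .cmd = cmd };
        struct ifreq ifr;

        memset(&ifr, 0, sizeof ifr);
        strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
        ifr.ifr_data = (char *)&ev;

        if (ioctl(fd, SIOCETHTOOL, &ifr) < 0)
            return -1;                      /* query not supported */
        return ev.data != 0;
    }

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        printf("TCP segmentation offload: %d\n",
               offload_enabled(fd, "eth0", ETHTOOL_GTSO));
        printf("RX checksum offload:      %d\n",
               offload_enabled(fd, "eth0", ETHTOOL_GRXCSUM));
        return 0;
    }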

This is not a new concept.  The DEUNA Ethernet controller had a T-11
processor, and the DELUA had an M68000.  And I worked with a PC ISA bus
board from TRWIND 30 years ago that had an 80186 and ran TCP/IP natively
on the network card.  (To be honest, at the time this turned out to be a
losing proposition and most of the cards ended up being used in dumb mode
with the processor removed.)

The only thing that still gets me is that I don't think any current machine
has an internal bus speed as fast as these network speeds, so how do they
fill the pipe?
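
For what it's worth, here is how the raw arithmetic looks on paper for the
slot the NIC sits in; it says nothing about what memory, descriptor handling
and the rest of the host can actually sustain:

    /* Raw numbers only: PCIe 3.0 lane rate vs. Ethernet line rate,
     * ignoring TLP/DLLP framing, descriptor traffic and the rest. */
    #include <stdio.h>

    int main(void)
    {
        const double lane_gtps = 8.0;            /* PCIe 3.0: 8 GT/s per lane */
        const double encoding  = 128.0 / 130.0;  /* 128b/130b line code       */
        const double lane_gbps = lane_gtps * encoding;
        const int    widths[]  = { 8, 16 };
        const double eth[]     = { 40.0, 100.0 };

        for (int i = 0; i < 2; i++)
            printf("PCIe 3.0 x%-2d : %6.1f Gb/s raw\n",
                   widths[i], widths[i] * lane_gbps);
        for (int i = 0; i < 2; i++)
            printf("%3.0f GbE      : %6.1f Gb/s on the wire\n",
                   eth[i], eth[i]);
        return 0;
    }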

bill

-- 
Bill Gunshannon          |  de-moc-ra-cy (di mok' ra see) n.  Three wolves
billg999 at cs.scranton.edu |  and a sheep voting on what's for dinner.
University of Scranton   |
Scranton, Pennsylvania   |         #include <std.disclaimer.h>   


