[Info-vax] Facebook service outage

John Wallace johnwallace4 at yahoo.co.uk
Tue Oct 5 13:58:07 EDT 2021


On 05/10/2021 01:22, chris wrote:
> On 10/05/21 00:16, Scott Dorsey wrote:
>> On 10/4/2021 6:04 PM, Dave Froble wrote:
>>>
>>> For those who don't get out much, how about explaining what "BGP routing
>>> announcements" are.
>>>
>>> I could try looking it up, but, you used it, so you explain what it 
>>> is ...
>>
>> If you are old you remember when there were all kinds of different ways
>> that backbone sites used to decide where to route packets to.  RIP, GGP,
>> EGP, there were a bunch of different ways to decide what the "best" path
>> from here to there is.
>>
>> That's all gone now.  There is one way to transfer routing data around,
>> and it is BGP.  There is no more picking up the phone and calling up
>> Jon Postel to ask if he thought it was better to route through here
>> or there.  There are no more static routing tables that need manual
>> updating at inopportune moments.  BGP just works, and it works well
>> when it's fed good data.
>>
>> Unfortunately BGP was designed in an era when ISPs could trust one
>> another, and that's not really the case any more.  It is possible to
>> publish bad routing data for sites you don't like and get other sites
>> to accept it.
>> --scott
> 
> If you have routing tables that are working, wouldn't it be a good idea
> to verify any new routes before admitting them to the working set tables,
> or perhaps it does that already?  Some sort of failover mechanism.
> 
> Doesn't seem very robust as is...
> 
> Chris
> 
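
For what it's worth, one existing approach along the lines Chris
describes is RPKI route origin validation: an announcement is compared
against signed records saying which AS may originate which prefix.
Below is a minimal sketch of that classification step only, assuming
the records are already in memory. The Roa class, the validate_origin
name and the sample data (documentation prefixes and AS numbers) are
made up for illustration, and it is not a description of how any
particular operator's routers are configured.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    """Hypothetical in-memory record: which AS may originate a prefix."""
    prefix: ipaddress.IPv4Network
    max_length: int   # longest prefix length the origin may announce
    origin_asn: int


def validate_origin(prefix, origin_asn, roas):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        # Only records whose prefix covers the announced prefix are relevant.
        if net.version == roa.prefix.version and net.subnet_of(roa.prefix):
            covered = True
            if roa.origin_asn == origin_asn and net.prefixlen <= roa.max_length:
                return "valid"
    # Covered by some record but matched by none: reject.
    # Covered by no record at all: nothing to check against.
    return "invalid" if covered else "not-found"


# Illustrative data only (documentation prefix, documentation AS number).
ROAS = [Roa(ipaddress.ip_network("192.0.2.0/24"), 24, 64496)]

if __name__ == "__main__":
    print(validate_origin("192.0.2.0/24", 64496, ROAS))
    print(validate_origin("192.0.2.0/24", 64511, ROAS))
    print(validate_origin("198.51.100.0/24", 64496, ROAS))
```

Note that a check like this only guards against somebody else
announcing your prefixes. It does nothing if the legitimate origin
withdraws or breaks its own routes.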

There was a time when wise people, including some round here, used to 
advocate various radical but untrendy tactics such as keeping the 
"production" network as separate as possible from the "remote 
management" network, in the interests of robustness and defence in depth.

Presumably that's been a rather dated concept in some circles in recent 
years?

If some network config update screws up both the "production" network 
and the "remote management" network it sounds rather as though someone 
forgot to check their architecture (procedures, etc) for single points 
of failure and shared failure modes.
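
A check for shared failure modes can be sketched in a few lines,
assuming someone has written down what each network depends on. The
component names and the shared_failure_modes function below are
hypothetical, chosen only to illustrate the idea, and say nothing
about any real operator's setup.

```python
def shared_failure_modes(dependencies):
    """Return components that more than one network depends on.

    dependencies maps a network name to the set of components it needs.
    The result maps each shared component to the sorted list of networks
    that would be affected if that component failed.
    """
    users = {}
    for network, components in dependencies.items():
        for component in components:
            users.setdefault(component, set()).add(network)
    return {
        component: sorted(networks)
        for component, networks in sorted(users.items())
        if len(networks) > 1
    }


# Hypothetical dependency map, for illustration only.
EXAMPLE = {
    "production": {"core-routers", "config-push-tool", "internal-dns"},
    "remote-management": {"oob-links", "config-push-tool", "internal-dns"},
}

if __name__ == "__main__":
    for component, networks in shared_failure_modes(EXAMPLE).items():
        print(component, "is shared by", ", ".join(networks))
```

Anything the function reports is a candidate for taking out both
networks at once. An empty result means only that the dependency map
shows no overlap, not that none exists.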



