[arin-ppml] Routing Research Group is about to decide its scalable routing recommendation

Fri Dec 18 04:20:54 EST 2009

Short version:    Why no-one is seriously suggesting fixing the
                  routing scaling problem by upgrading BGP alone.

                  Constraints on solutions posed by the need for
                  voluntary adoption.

                  My objections to requiring all hosts to do more
                  Routing and Addressing work than at present.  At
                  present, they don't do any - unless they are
                  mobile, in which case they do just a little.

                  I think current host stack and app functionality
                  and the current BGP-based DFZ will be fine
                  in the long-term future provided we have:

                   1 - A core-edge separation scheme such as Ivip.

                   2 - The TTR approach to mobility, built on the
                       core-edge separation scheme, to scalably
                       support global mobility for both IPv4 and
                       IPv6 in a way which requires no changes
                       to non-mobile hosts.

Hi Leo,

Opinions on scalable routing vary widely.  Here's my response.

>> the IRTF RRG is about to decide on its recommendation to the
>> IETF about scalable routing.
> 
> I'm hoping you can fill in a gap for me, as I don't follow this
> work very closely.
> 
> In looking at the problems with the current Internet it appears
> most of the blame is laid at the feet of BGP4.  BGP has a number
> of properties that lead to scaling issues, some well documented,
> some not so well documented.

Can you mention those which you think are not well documented?

> All the solutions I see proposed though are fundamental changes to
> how we do addressing.  Locator-Identifier separation ideas, alternate
> lookup databases (e.g. DNS), translation solutions, including simple
> NAT and address embedding.

Core-edge separation schemes are sometimes also known as "map encap"
or (incorrectly) "locator/identifier separation" schemes.  In my next
message to Bill Herrin I discuss what I think is the incorrect use of
"locator/identifier separation" for core-edge separation schemes.
HIP is a real "locator/identifier separation" scheme.

Core-edge separation schemes (LISP, APT, Ivip and TRRP) don't alter
the functions of hosts or most internal routers.  They just provide a
new set of end-user address prefixes which are portable to any ISP
with an ETR, and which are not advertised directly in the DFZ.  A
covering prefix, which includes many such longer prefixes, is
advertised by special ITRs ("Open ITRs in the DFZ" is the Ivip term,
"Proxy Tunnel Routers" is the LISP term).  These routers collect
packets sent by hosts in networks without ITRs and tunnel them to
the correct ETR address.  APT and TRRP have functionally similar
arrangements for packets sent from networks without ITRs.
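
As a rough sketch of what such an ITR does with a packet (my own
illustration in Python, not taken from the LISP, APT, Ivip or TRRP
specifications; the prefixes, addresses, MAPPING table and function
names are all hypothetical):

```python
import ipaddress

# Hypothetical mapping: end-user (edge) prefix -> ETR address.
# All values are documentation ranges, not real data.
MAPPING = {
    ipaddress.ip_network("203.0.113.0/28"): ipaddress.ip_address("192.0.2.1"),
    ipaddress.ip_network("203.0.113.16/28"): ipaddress.ip_address("198.51.100.7"),
}

# The covering prefix which an Open ITR / Proxy Tunnel Router would
# advertise in the DFZ on behalf of the longer edge prefixes.
COVERING_PREFIX = ipaddress.ip_network("203.0.113.0/24")


def find_etr(dst):
    """Return the ETR address for dst, or None if none is known."""
    addr = ipaddress.ip_address(dst)
    if addr not in COVERING_PREFIX:
        return None  # not edge space, so it is forwarded normally
    # Longest-prefix match over the mapped edge prefixes.
    matches = [net for net in MAPPING if addr in net]
    if not matches:
        return None  # edge space, but no mapping known
    best = max(matches, key=lambda net: net.prefixlen)
    return MAPPING[best]


def tunnel(dst, payload):
    """Return (outer destination, inner destination, payload),
    or None when the packet is not tunnelled."""
    etr = find_etr(dst)
    if etr is None:
        return None
    return (str(etr), dst, payload)
```

The sending host's packet is unchanged: the inner destination is
still the edge address, and only the outer header carries the ETR
address.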

From the point of view of host stacks and applications, the adoption
of a core-edge separation scheme is not a change to addressing or to
anything else.  A true "locator/identifier separation" scheme such as
HIP certainly does change the host's functionality regarding addressing.

> What I haven't seen is anything that makes the leap from "BGP is
> broken" to "the whole architecture must be changed".  More specifically,
> I haven't seen anyone look at BGPv5, or a brand new replacement
> routing protocol.  It seems that improving the system and fixing
> some of the known issues may be useful if nothing else as a stopgap,
> and yet no one seems to be working seriously on the issue.

The first difficulty is that a new routing protocol can never be
introduced to replace BGP4 unless it is fully backwards compatible -
and no-one has devised such a thing AFAIK.

The second is that it is pretty tricky to come up with a protocol for
the Internet's interdomain routing system which could cope with the
growth in the number of separately advertised end-user networks.

There could be millions or billions of separate prefixes which
end-user networks, including mobile devices, need to keep no matter
where they physically connect to the Net.

> I also haven't seen any analysis on how much of the issues with BGP
> scaling come from things that have been lumped on to BGP, like MPLS
> VPN's.  That is, if we hadn't added these things to BGP would we
> have no scaling problems at the current time?

AFAIK, the problems are inherent in trying to make 200k or more DFZ
routers (my estimate - 123k was a minimum figure two years ago - see
my message to Bill) maintain BGP conversations with all their
neighbours.  This involves a separate conversation about each
advertised prefix, of which there are currently about 300k
(http://bgp.potaroo.net/), with a doubling time of 4 or 5 years.
The goal is to be able to accommodate millions or billions of
separate, multihomable, portable subnets of end-user address space
in a scalable manner.
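
To put the doubling time in perspective, here is a small Python
sketch.  It assumes steady exponential growth from about 300k
prefixes, which is a simplification, and the function name is my own:

```python
import math

CURRENT_PREFIXES = 300_000  # approximate figure quoted above


def years_to_reach(target, doubling_years, start=CURRENT_PREFIXES):
    """Years for `start` prefixes to grow to `target`, assuming the
    count doubles every `doubling_years` years."""
    return doubling_years * math.log2(target / start)


for target in (1_000_000, 1_000_000_000):
    for doubling in (4, 5):
        years = years_to_reach(target, doubling)
        print(f"{target:>13,} prefixes, doubling every {doubling} years: "
              f"{years:.1f} years")
```

Under these assumptions the DFZ would reach a million prefixes in
roughly 7 to 9 years.  The goal of millions or billions of end-user
networks is far beyond what this growth alone would produce in that
time.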

Even if the DFZ routers' FIBs could handle this, there's no way the
RIB and the global BGP control plane can scale to these numbers.

All this was pretty much agreed in the RAWS workshop in late 2006.

In my view, there are two major classes of proposal:

  1 - Those which can work with IPv4 and IPv6 and have a hope of
      being accepted widely enough on a voluntary basis.

  2 - Those which don't - they either don't work for IPv4 or they
      involve host changes or some other major thing which means
      they can't be adopted widely enough on a voluntary basis.

I am only interested in the first set.  No-one has figured out a way
to replace BGP4 in a manner which could be voluntarily adopted - that
is, in a way which would generate immediate net benefits for early
adopters, without significant disruption to anyone.

The first class, I think, consists of the core-edge separation
schemes LISP-ALT, APT, Ivip and Bill Herrin's TRRP.

My list of constraints imposed by the need for widespread voluntary
adoption is here:

  http://www.firstpr.com.au/ip/ivip/RRG-2009/constraints/

In RRG discussions, I have had some support for this list being
accurate - though of course no-one likes having such constraints.
No-one has suggested that this list is flawed.

Many of the proposals now being made for the RRG process seem to
involve host changes - specifically making the host responsible for
more routing and addressing things than in the past.  This is for
*every* host, not just for mobile hosts.

Even if it were possible to change host stack and/or app
functionality, for IPv4 and/or IPv6, I would still argue against it.
As I wrote to the RRG nearly two weeks ago:

  http://www.firstpr.com.au/ip/ivip/RRG-2009/host-responsibilities/

there are major objections to requiring all hosts to be more complex
and to handle the extra management traffic which comes with their
new Routing and Addressing responsibilities.

Even if these objections could be overcome, I would still object,
because it significantly delays the first packet of user data.  The
delay depends on the RTT between the hosts, and is greatly
exacerbated if a packet is lost in the initial management exchange
which (AFAIK) must precede the first packet carrying user traffic.
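
A back-of-envelope model of this delay, in Python.  The one-RTT
exchange and the 1 second retransmission timeout are assumptions for
illustration only, not figures from any of the proposals:

```python
def first_data_delay(rtt_s, exchange_rtts=1, lost_packets=0,
                     retransmit_timeout_s=1.0):
    """Seconds before the first user-data packet can be sent.

    rtt_s                -- round-trip time between the two hosts
    exchange_rtts        -- round trips the management exchange needs
                            (assumed to be 1)
    lost_packets         -- management packets lost along the way
    retransmit_timeout_s -- wait before each retransmission
                            (assumed to be 1 second)
    """
    return exchange_rtts * rtt_s + lost_packets * retransmit_timeout_s


# Hosts 300 ms apart, with and without one lost management packet.
print(first_data_delay(0.3))
print(first_data_delay(0.3, lost_packets=1))
```

With no such exchange, as with hosts today, the first data packet
can be sent immediately.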

These objections are at odds with some of the current RRG proposals,
which involve new host functions along the lines of HIP.  I haven't
had any bites yet, but I am sure I will when I raise these objections
during the discussions of the next two months.

I think it will be a pretty good long-term arrangement to keep BGP4
running the interdomain routing system, as long as the number of
prefixes doesn't grow too much.  I think the routing and addressing
system should provide hosts with IP addresses they can actually use,
rather than requiring all hosts to manage physical and logical
addresses (locator address and identifier address) themselves.

In the case of mobility, I think hosts need to do a little extra work.
The best way I can think of doing it is the TTR mobility approach:

  http://www.firstpr.com.au/ip/ivip/RRG-2009/host-responsibilities/

which would work with any core-edge separation scheme.  Mapping
changes are only required when the MN moves a large distance, such as
1000km or more - not whenever it changes its physical address.
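
A minimal Python sketch of that decision.  It assumes, purely for
illustration, that the trigger is the great-circle distance between
the MN and the TTR it currently uses; the function names, the 1000 km
default and the coordinates are my own choices, not part of any TTR
specification:

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def mapping_change_needed(mn_pos, ttr_pos, threshold_km=1000.0):
    """True only when the MN is far from its current TTR.

    A new physical address (care-of address) on its own never
    triggers a mapping change in this sketch.
    """
    return great_circle_km(mn_pos, ttr_pos) >= threshold_km


MELBOURNE = (-37.81, 144.96)
SYDNEY = (-33.87, 151.21)
PERTH = (-31.95, 115.86)

print(mapping_change_needed(SYDNEY, MELBOURNE))  # MN in Sydney, TTR in Melbourne
print(mapping_change_needed(PERTH, MELBOURNE))   # MN in Perth, TTR in Melbourne
```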

My scalable routing proposal - Ivip - is summarised here:

  http://www.firstpr.com.au/ip/ivip/Ivip-summary.pdf  10371 words
  http://www.firstpr.com.au/ip/ivip/Ivip-RRG.html      2100 words
  http://www.firstpr.com.au/ip/ivip/Ivip-RRG-1kw.html   995 words

  - Robin