[ppml] Oxymorons (was: Geo PI)

Mon Mar 20 23:30:29 EST 2006

Vince,

I enjoyed your talk at GROW (IETF's Global Routing Operations WG) today.
I was going back through my many as-yet-unread mailing list messages, and
thought I'd respond to some of your points here.  Further comments
inline...

On 02/16/06 at 3:04pm -0800, Vince Fuller <vaf at cisco.com> wrote:

> These messages attempt to explain why "geo-topo addresses" and "provider-
> independent addresses" make no sense if the goal is a scalable Internet
> routing system built on the (flawed) ipv6 routing architecture (fatally
> flawed in the sense that it confuses the concept of transport identifier
> and routing locator into a single "address").

First off, I'm not opposed to ID/locator splits, either full (requiring
some degree of IPv6 redesign) or partial (aka shim6).  However, I'm from
an operational background, and my first instinct is to see if we can make
something work from the tools we have...

> Those who propose "geo-topological" addressing, an oxymoron if ever there
> were one, are effectively dictating how the network topology is to be
> organized, with rather profound implications for provider business models.
> If addresses are assigned in this manner, then service providers whose
> networks span multiple address assignment domains (connect to more than
> one city or however the geographic areas are split up) must:
>
>   a) connect to all designated interconnection facilities associated with
>      the address assignment authorities in the geographic areas they wish
>      to serve

IMO "designated interconnection facilities" are unnecessary.  RIRs have
managed to stay out of the routability game thus far, and I think they
should and can continue to do so.

> and
>
>   1) carry all more-specific routes for all providers in all of the cities
>      that they serve (which eliminates aggregation)

It only eliminates aggregation *for that city*.  For the rest of the
Internet, the provider's routers will see only aggregated routes.  There is
still a much higher degree of aggregation compared to the current IPv4 PI
swamp.

> or
>
>   2) provide free transit service for any customer of a competitor in a
>      geographic area whose addresses are aggregated

IMO generally undesirable, but it is indeed one option...

> or
>
>   3) enter into a settlement agreement (which implies a regulatory regime
>      unprecedented in the Internet business) with all other providers in
>      geographic areas which they serve

And IMO this would only happen if we completely fail to solve the problem
ourselves and "let" Congress "solve" it for us.

> > Forcing people to join an unnecessary IX is not the way
> > to solve the problem of regional aggregation of routes.
> > This is a purely technical problem which can be solved
> > by the RIR practices in allocating IPv6 addresses. If they
> > would allocate addresses in a geo-topological manner then
> > end users and ISPs would be free to aggregate routes
> > outside of their region without any involvement of governments
> > or any requirement to join consortia or IXes. It does
> > require the users of such geo-topological addresses to
> > ensure that in THEIR region, there is sufficient
> > interconnectivity (physical and policy) between ISPs for
> > the addressing to work. But that does not need to be determined
> > or managed centrally.

I agree wholeheartedly with Michael here...

> In the interests of demonstrating why "geo-topo" addressing can't possibly
> work without radical changes to the business and regulatory models of the
> Internet, consider the simple example of a provider who has connections
> to two popular "geo-topo" addressing domains, say the Bay Area and the
> DC area. Let's say that 10.0.0.0/8 is the "geo-topo" address block in the
> Bay Area and 172.16.0.0/12 is the "geo-topo" block in the DC area. This
> provider has four customers in the Bay Area:
>
>   10.1.1.0/24
>   10.10.4.0/22
>   10.100.8.0/21
>   10.200.0.0/16
>
> How is the provider supposed to make use of the 10.0.0.0/8 aggregate? Does
> he advertise it to other providers in the DC area or anywhere else where
> he offers service (Asia, Europe, etc.)? By doing so, he is stating that he
> can provide connectivity to all hosts which are numbered in that address
> range. But he only provides transit service to the address ranges associated
> with his customers.

IMO the provider should not advertise the 10.0.0.0/8 aggregate to anyone
but his own transit customers.  The obvious next question is, how does the
provider deal with peers?  IMO, there are a couple of options:

 1) Only provide local routes to peers.  This means telling peers who
don't have a global presence of their own that they must also buy transit
connectivity from someone who can carry their traffic to the Bay Area.
Obviously, they could have both local peering and non-local transit to the
same peer/provider.
 2) For peers where only providing local routes is not an option, set up
a multihop or tunneled BGP session from a Bay Area router to the peer's
nearest routers.
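
One way to read the above as an export policy is the rough Python sketch
below.  It uses the example prefixes from the quoted message; the function
name routes_to_advertise and its parameters are hypothetical, made up for
illustration only.  A real network would express this in BGP export policy
on the routers, not in Python.

```python
import ipaddress

# Example values from the quoted message: the Bay Area "geo-topo" block
# and the provider's four Bay Area customers.
REGION_AGGREGATE = ipaddress.ip_network("10.0.0.0/8")
CUSTOMER_ROUTES = [
    ipaddress.ip_network("10.1.1.0/24"),
    ipaddress.ip_network("10.10.4.0/22"),
    ipaddress.ip_network("10.100.8.0/21"),
    ipaddress.ip_network("10.200.0.0/16"),
]


def routes_to_advertise(neighbor_type, in_region, multihop_to_region=False):
    """Return the Bay Area routes exported to one neighbor (hypothetical)."""
    if neighbor_type == "transit_customer":
        # Transit customers pay for reachability to the whole region,
        # so they are the only ones who hear the aggregate.
        return [REGION_AGGREGATE]
    if neighbor_type == "peer":
        # Option 1: peers hear only local routes, i.e. the provider's own
        # customer prefixes, and only where the peering is in the region.
        # Option 2: a multihop or tunneled session from a Bay Area router
        # carries the same more-specifics to a peer located elsewhere.
        if in_region or multihop_to_region:
            return list(CUSTOMER_ROUTES)
        return []
    raise ValueError(f"unknown neighbor type: {neighbor_type!r}")


if __name__ == "__main__":
    cases = [
        ("transit_customer", False, False),
        ("peer", True, False),
        ("peer", False, False),
        ("peer", False, True),
    ]
    for neighbor_type, in_region, multihop in cases:
        routes = routes_to_advertise(neighbor_type, in_region, multihop)
        print(neighbor_type, in_region, multihop, [str(r) for r in routes])
```

Note that the aggregate never appears in the list sent to a peer, which is
what keeps the provider from offering free transit to his competitors'
customers.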

> For him to provide connectivity to all the address range, he must
>
>   a) have full routing connectivity to all other providers that have
>      addresses in the same range; this implies that he connects to all IXs
>      within the region and maintains a full-mesh of routing information
>      (today, BGP sessions) to all of these providers

This is already the case for all transit-free networks (on a global basis,
not necessarily regionally).  Every transit-free network peers (somehow,
either at IXs or privately) with every other transit-free network.  So I
would restate your requirement as:

 a) have full routing connectivity to all other transit-free providers
that have addresses in the same range; this implies sufficient peering
density to provide at least one (and probably two) IX or private peering
sessions in each aggregated region.

It *does not* require connectivity to all IXs: private peering is still
sufficient, and peering with a transit-buying network's transit provider
is just as good (for reachability) as peering with that network at an IX.

> and
>
>   b) must be willing to provide connectivity to all sites within the region
>      to any place that he advertises the prefix 10.0.0.0/8 through routing
>      exchanges; if he advertises this prefix to non-customers, it implies
>      that he will provide free transit to his competitors' customers
>      which are numbered out of this block

True.  As stated above, I suspect providers will only advertise aggregated
routes to their transit customers in most cases.

> Both of these requirements defy business sense,

After my modifications above, I don't believe either requirement is
unreasonable.  One central theme that I think people miss is that not
everyone has to aggregate at the same level to ensure reachability.  If a
provider has insufficient peering density to aggregate at a metro level,
they should be able to aggregate at a regional or continental level.  As
long as we assign the PI blocks sensibly, this should be perfectly
feasible.  (Now I recognize that different levels of aggregation will have
TE implications, but I believe those implications are manageable within
the existing TE toolkit, and much more manageable than attempting to
reproduce your TE with something like shim6.)
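
To illustrate the point about mixed aggregation levels, here is a small
Python sketch.  Everything in it is an assumption for illustration: the
prefixes come from the 2001:db8::/32 documentation range, and the /44
metro, /40 regional and /36 continental boundaries are invented, not a
proposed allocation plan.  It only shows that a site stays reachable by
longest-prefix match whichever level a given provider aggregates at.

```python
import ipaddress

# Hypothetical PI assignment and hypothetical aggregation boundaries.
SITE = ipaddress.ip_network("2001:db8:12::/48")
LEVELS = {"metro": 44, "regional": 40, "continental": 36}


def aggregate_for(site, level):
    """Return the covering aggregate for a site at the given level."""
    return site.supernet(new_prefix=LEVELS[level])


def longest_match(table, address):
    """Return the most specific prefix in table covering address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda net: net.prefixlen)


if __name__ == "__main__":
    host = "2001:db8:12::1"
    # A provider with dense peering advertises the metro aggregate; one
    # with sparse peering advertises only the continental aggregate.
    dense = [aggregate_for(SITE, "metro")]
    sparse = [aggregate_for(SITE, "continental")]
    # A remote network hearing both prefers the more specific aggregate.
    both = dense + sparse
    for name, table in [("dense", dense), ("sparse", sparse), ("both", both)]:
        print(name, longest_match(table, host))
```

The last case is where the TE implications show up: a network hearing
aggregates of different lengths from different neighbors will prefer the
longer one regardless of other attributes.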

> I'm not sure how I can make this much more clear. It seems appropriate to
> re-state Dave's quote Yakov:
>
>   "Addressing can follow topology or topology can follow addressing.
>    Choose one."
>
> and I'd offer a corollary:
>
>   Transit relationships (i.e. money) must follow topological relationships
>   (and thus addressing); the alternative is some combination of inefficient
>   or non-scalable routing, black holes, settlements, regulation, or other
>   undesirable things.

So I guess in my case my alternative is "other undesirable things", which
I maintain are more desirable than non-scalable routing (allowing an
IPv6 PI swamp) and significantly easier to implement than yet another IP
re-design (though if we can achieve some degree of ID/locator split
incrementally, great).  The "undesirable things" I see in geographically
based PI aggregation are (to summarize):

 - In order to reach networks that have been geographically aggregated,
the remote network must either peer within the aggregated region, buy
transit from someone who does, or set up multihop or tunneled eBGP sessions
with a router in the aggregated region.
 - Networks wishing to begin aggregating a region's routes must ensure
that they have sufficient peering density within that region (or must make
alternate arrangements) to ensure their peers meet the above requirement,
or must decide on a case-by-case basis if providing free transit is a
superior alternative.  They must ensure that the benefits of aggregation
outweigh the costs of the peering changes required to maintain full
reachability.
 - Networks whose peers begin aggregating must consider the implications
of a possible switch from hot-potato to cold-potato routing, whereby they
carry the traffic to the destination region before handing it off to the
destination network.  This may actually drive networks to aggregate in
order to keep up with or get ahead of their peers and minimize the
fraction of traffic they must carry to/from the region.
 - Multihomed transit customers whose transit providers have different
aggregation policies will need to consider those policies when setting up
TE policies.

However, the benefits of this approach are considerable:

 - Not much must change in the near term.  The only requirement up front
is that RIRs who decide to allocate PI space do so in a manner amenable to
later aggregation.
 - Network operators can make their own decision whether to aggregate.
To take Jason's example, an NSP with 150k internal prefixes might decide
to aggregate sooner, while another NSP with 50k internal prefixes and a
more aggressive hardware refresh cycle might decide not to aggregate at
all, or to do so later on.
 - Operators need only aggregate when the scaling issues of carrying PI
space start to manifest themselves.  When they do, it becomes a business
decision for each operator how to deal with the scaling issues, either by
upgrading hardware or aggregating or both.  This ensures that operators
will make the decision based on the economics in place at the time, which
may be *much* different than they are now.
 - End-user networks can maintain provider independence and use
traditional TE tools to influence inbound traffic, even after their
providers start aggregating.  (The precise metrics may need to change, but
the overall methods needn't.)  If an end-user network needs/wants to use
their PI space somewhere far from the location it was mapped to on
assignment, they can still do so; it just won't get aggregated.

So in conclusion, I would say that we (particularly the RIR community)
should seriously consider in the near future whether to tweak our IPv6 PI
assignment policy to choose the specific prefix to allocate based on the
geographic location of the applicant.  Making such a simple change will
have absolutely no negative impact on the recipients of the PI netblocks,
and will set us up with the ability to aggregate as needed in the future
to deal with the scalability issues that will inevitably arise in any
system based on the current routing paradigm.
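
As a rough sketch of how small the change is, the Python below hands out
PI assignments from a per-region block instead of from a single pool.  The
region names, the /44 blocks (taken from the 2001:db8::/32 documentation
range), the /48 assignment size and the GeoAllocator class are all
hypothetical; this is not how any RIR actually allocates.

```python
import ipaddress

# Hypothetical geographic blocks, one per aggregation region.
REGION_BLOCKS = {
    "bay-area": ipaddress.ip_network("2001:db8:10::/44"),
    "dc-area": ipaddress.ip_network("2001:db8:20::/44"),
}


class GeoAllocator:
    """Assign the next free prefix from the applicant's regional block."""

    def __init__(self, region_blocks, assignment_prefixlen=48):
        self._blocks = dict(region_blocks)
        # One iterator of candidate assignments per region.
        self._pools = {
            region: block.subnets(new_prefix=assignment_prefixlen)
            for region, block in self._blocks.items()
        }

    def assign(self, region):
        # Raises KeyError for an unknown region and StopIteration once
        # the regional block is exhausted; a real registry would need a
        # policy for both cases.
        return next(self._pools[region])

    def block_for(self, region):
        return self._blocks[region]


if __name__ == "__main__":
    allocator = GeoAllocator(REGION_BLOCKS)
    for region in ["bay-area", "bay-area", "dc-area"]:
        prefix = allocator.assign(region)
        print(region, prefix, prefix.subnet_of(allocator.block_for(region)))
```

The recipient gets an ordinary PI prefix either way; the only difference
is that assignments in the same region share a covering block that
operators can aggregate on later if they choose to.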

-Scott