guideline for name-based web hosting justification

Alec H. Peterson ahp at hilander.com
Tue Sep 12 10:50:05 EDT 2000


Mury wrote:
> 
> I also find it interesting that in your presentation to the 11th NANOG
> meeting that you did with Avi Freedman (Isn't he working for Digital
> Island now?  Or one of the other distributed content providers) you are
> supporting a technology that not only assigns an IP address to a web site
> but assigns multiple IP addresses to a single site.  Perhaps I didn't
> decipher your presentation correctly, but it sure seems like you are
> supporting performance/service level issues above and beyond IP
> conservation.  Ah, I hear it coming, that each distributed node can handle
> multiple distributed sites off of a single IP.  Very true.  Do you know
> what the ratio of managed sites to in-service systems is?  How many
> locations is Akamai in?  I really don't know what the IP "waste" ratio is.
> But the point is you are supporting performance at the expense of IP
> addresses however large or small that may be.

I hardly see what a single presentation I did with Avi several years ago has
to do with the issue at hand.  As it happens, I can count on one hand the
number of conversations I've had with Avi this year.

> 
> In addition, you even argue against yourself.  You say, "For example,
> don't do all of the parsing at once at the end of the day; modify the
> server to keep a running tally of a customer's usage and have it write
> that alone to a file on the disk every time it changes.  Far more
> efficient.  That's just off the top of my head, and probably not a really
> efficient way to do it."
> 
> What?!  How can it be *far more efficient* and then in the next line it's
> *not a really efficient way*?  Can you see why I'm not very thrilled with your
> off the cuff and seemingly inexperienced comments?

I stand by what I said.  'Far more efficient' is (sometimes) quite different
from 'optimally efficient'.

The fact that I may not have experience with specifically parsing WWW log
files by no means implies that I have no experience doing that sort of thing
in other applications.  See, standard WWW transfer logs have tons of data in
them that does not relate to calculating bandwidth utilization.  That extra
data all has to be looked at before the bandwidth numbers can even be
retrieved.  Let's look at a line of a standard Apache transfer log:

128.220.221.16 - - [05/Mar/1998:18:20:32 -0500] "GET / HTTP/1.0" 200 1195

Now depending on how you count there are 6 fields on that one line of log
file, and the number of bytes transferred is the very last field.  So
that means that one way or another you need to look at each of the fields in
the file and check if it's the right one before you can even get the
appropriate data.  I have to agree that parsing that logfile for bandwidth
utilization is a major pain.
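
To make that concrete, here is a minimal sketch (in Python, purely for
illustration) of the field-by-field approach.  It assumes the log is in the
common format shown above; the regular expression and the function name
total_bytes are my own inventions, not anything Apache provides.

```python
import re

# Assumed layout (Common Log Format):
#   host ident authuser [date] "request" status bytes
CLF = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(.*)" (\d{3}) (\d+|-)$')


def total_bytes(lines):
    """Sum the bytes-transferred field over an iterable of log lines."""
    total = 0
    for line in lines:
        match = CLF.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        size = match.group(7)
        if size != "-":  # "-" is logged when no body was sent
            total += int(size)
    return total
```

Every field has to be matched before the byte count at the end of the line
can be read, which is the overhead I am complaining about.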

But what if we changed the log file format to just look like this:

128.220.221.16 1195

Or perhaps an even better way would be to write over the same line in the
file again and again every time, so your utilization program just has to
look at the file once to see how much has been used.  Granted you can't just
use Apache's mod_log_config for that, but it isn't a lot more work than
that.
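
A rough sketch of the "write over the same line" idea, again in Python and
again only an illustration: the class name UsageTally and the one-number
file layout are assumptions of mine, and a real server would hook this in
wherever it already knows how many bytes it just sent.

```python
import os
import tempfile


class UsageTally:
    """Keep one customer's running byte count in a one-line file."""

    def __init__(self, path):
        self.path = path
        self.total = self._load()

    def _load(self):
        # Resume from the previous total if the file already exists.
        try:
            with open(self.path) as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def add(self, nbytes):
        """Add nbytes to the tally and rewrite the file with the new total."""
        self.total += nbytes
        directory = os.path.dirname(os.path.abspath(self.path))
        # Write to a temporary file and rename it over the old one, so a
        # reader never sees a half-written number.
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            f.write(f"{self.total}\n")
        os.replace(tmp, self.path)
```

The utilization program then reads a single number per customer instead of
scanning a day's worth of log lines.  Rewriting the file on every request is
probably not optimal either; batching the writes would be an obvious next
step.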

My second statement about it not being a 'really efficient way to do it'
meant to say that the 30 seconds I spent thinking of how to make the parsing
process more efficient was probably not sufficient to come up with the
optimal solution.  Perhaps I should have said 'probably not optimally
efficient' instead.  Sorry about that.

> 
> By making light of some real issues that were brought up it sure seems
> like your statements are hypocritical.  Now like I said, I'm not the
> smartest guy out here, so if I've badly misrepresented things I apologize
> in advance.

I didn't mean to say it was no big deal.  Making the changes I proposed
would certainly take some work.  However, contrary to what some other people
said, the problem is not insoluble.

My point was that I can't stand excuses for not doing 'the right thing',
especially when people insist on working against an organization that is
only trying to help.  ARIN is not making these policy changes to make
everybody's lives more difficult.  ARIN is making the changes because it has
a responsibility to stretch IP space in its region of the world as far as
possible.

Also, as I tried to say before, people on the 'net have come up with some
truly brilliant ways to deal with the issues that face us when they need
to.  I really think it would be a far better use of our time here if we all
put our heads together to try and figure out a feasible way for everybody to
use name-based virtual hosts in as many applications as possible, rather than
arguing about how hard it is.  Then, if we as a group find that it is truly
not possible then we can state that (from experience, as opposed to just
from theoretical conjecture) at the next ARIN meeting and recommend an
appropriate policy change.

> 
> Bottom line, for every one out there saying it's no big deal to do single
> IP virtual hosting I would like to see a solution that does not sacrifice
> reliability, accountability, quality of service, and functionality.  I
> hate it when people (even smart people) start voicing opinions on things
> they don't understand.

You may think that just because I don't run a web hosting outfit today I
don't understand the issues, and you're welcome to think that.  It is true
that I don't know how every single web hosting outfit out there accounts for
usage, but I daresay you probably don't know that either.  You know how you
do things, and that's all you need to know.  This is the exact reason why
neither I nor you alone is responsible for creating ARIN policies.  It is done by
member participation in ARIN.  And the general idea is that ARIN and its
members benefit from having a hand in shaping what happens to IP allocation
policy.

Speaking to your request for a solution to your accounting woes, I really
don't think you want that from ARIN.  See, if that happens then people will
start screaming about how ARIN dictates the way people must do business,
which gets into another rat-hole that we really don't want to go down. 
There are many ways to skin this cat.

So I will say again, instead of arguing with me about how easy or not easy
this problem is to deal with, why don't we try actually solving the issues? 
And if they are not solvable then we will know we have tried our best and we
can report those findings at the next ARIN meeting in an effort to get the
recently adopted policy changed.

And FYI, demanding a solution to your specific problem without providing any
suggestions of your own is not the best way to engage help from others.

> 
> I'm also not stubborn.  I'm not running things the way I do because it's
> my way, but because they work, they are scalable, they are functionable,
> and we have zero down time.  I've tried Microsoft IIS.  It doesn't work.
> Well doh, of course it works, but not for a company that demands uptime
> and security and a fast and simple database.  I have to reboot co-located
> IIS machines all the time.  My BSDI/Apache/MySQL/Perl/PHP/Raven boxes have
> had zero downtime in the last 3 years.  That is not an invitation to hack
> or DOS my network.  But thanks for thinking about me.

Not a bad setup.  I don't really see why what I said before would not apply
to this setup.

> 
> And like I said before, when appropriate we have assigned multiple sites
> to a single IP.  We actually do it by sending all requests into a CGI
> script that grabs the HTTP_HOST env variable and creates the customized
> web site on the fly with MySQL.  So yes, we are trying to conserve IP
> addresses, we are not greedy, whiny bastards trying to screw the Internet
> up for everyone else.

Nobody ever said you were, and I truly resent having words put into my
mouth.  Please refrain from doing so in the future.

If you recall, I was addressing a specific post where a person was demanding
specific solutions to every problem that this policy change would be
causing.  I, for one, don't respond well to demands for help.  And as I also
said, other people have solved these problems, and even think that the
policy was a pretty good idea.  In fact some of them operate some of the
largest web farms in the world.  So regardless of how little or much I may
know about web hosting, there are people out there who know far more than
both of us about it who have managed to make things work.

> 
> Cool!  Now we all know how to do name based hosting... er, wait... what
> about all those HTTP/1.0 browsers!?  You don't think they exist any
> more?  Check this out.  In fairness I sampled all my virtual hosts off of
> one server from a selective time period.  All my log files are in the
> www.domain.com format.  Here are my commands and results:
> 
> webserver3: {17} % grep 'HTTP/1.1' www.*.com | wc -l
>   400441
> webserver3: {18} % grep 'HTTP/1.0' www.*.com | wc -l
>   375412
> 
> 48.4% of the browsers out there that accessed my customers' sites used
> HTTP/1.0.  For the uninitiated the 1.0 version of the HTTP protocol does
> NOT support name based hosting.

That's the first number I've seen on the subject that is greater than 2%,
and I will confess it does concern me a great deal.

Does anybody else have any numbers they'd like to share?
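
For anyone who wants to produce comparable numbers, here is a small sketch
in Python.  The function name protocol_share and the regular expression are
assumptions of mine, and it assumes transfer logs in the common format.
Note that it only counts the protocol version on the request line; a log in
that format does not record whether a Host: header was sent.

```python
import re
from collections import Counter

# Matches the protocol version at the end of the quoted request,
# e.g. the 1.0 in "GET / HTTP/1.0"
PROTO = re.compile(r'HTTP/(\d\.\d)"')


def protocol_share(lines):
    """Return {version: fraction of requests} for an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PROTO.search(line)
        if match:
            counts[match.group(1)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {version: n / total for version, n in counts.items()}
```

Run over the lines of each log file, this gives the same kind of ratio as
the two grep counts quoted above (375412 out of 775853 requests).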

> 
> Can I tell all my customers to call you when their online business drops
> by almost 50%.  By the way, can you use a shared IP for secure server
> certificates?

No, you can't, which is why there are exceptions to the policy.  Granted
there isn't a specific exception for SSL, which I think is one place where
the group (myself included) erred in Calgary last March.

> 
> I don't want to see any more comments that I should be doing things
> smarter and better.  I want to see explanations of how I can accomplish
> the things that you say are so easy.  Like I said I'm not stubborn... show
> me the way.  If you can't, then please refrain from making popular
> political statements that don't affect YOUR business and your customers'
> business.

I never meant to trivialize the changes.  I merely meant to point out to
those who said they were not only non-trivial but impossible that in
fact they are not impossible.

> 
> PS.  If you are such an advocate for IP conservation why do you have a
> whole block?  I can't tell how many IPs you are wasting because your
> provider has not swipped your block.  But you have multiple web sites
> running on multiple IPs!  What's your excuse?
> 
> Name:    gw1.hilander.com
> Address:  216.241.32.33

This is actually its own machine.

> 
> Name:    virthost.hilander.com
> Address:  216.241.32.35
> 
> Name:    ramirez.hilander.com
> Address:  216.241.32.34

Hey, nobody's perfect.  I'll have to look into changing that.  Thanks for
pointing it out.

> 
> Pretty interesting web sites I might add.

Thanks for looking around, I spent years writing it.

Look, bottom line is that name-based virtual hosts have the ability to
stretch our IP utilization even further (and the way IPv6 is looking means
we'll really need to do this).  Moreover, if you think the name-based
virtual hosting policy should be changed or repealed, then by all means
start participating in the process to make that happen.

And finally, there may well be some websites out there that cannot be
handled any way except for giving them their own IP address.  I don't know
this for sure, but I'd say it's a pretty good guess.

Similarly, there are some dial-up users out there who insist on having a
static IP address.  ISPs are free to do that, _JUST AS LONG AS MOST OF THEIR
LOW-END CLIENTS USE DYNAMIC IP ADDRESSES_.  This can easily be extended to
virtual hosting.  And I agree that this should be stated specifically, but I
really think ARIN's true intent was to change the default mom-and-pop
hosting account from a dedicated IP address to a name-based virtual host.

So perhaps the policy should be re-worded to state that providers who
sell 'cheap' web-hosting for domains that get relatively few hits per month
should use name-based virtual hosting for those clients?  From what I
recall from the discussion in Calgary those were the accounts the policy was
targeted at...

Alec

-- 
Alec H. Peterson - ahp at hilander.com
Staff Scientist
CenterGate Research Group - http://www.centergate.com
"Technology so advanced, even _we_ don't understand it!"



More information about the ARIN-PPML mailing list