Thursday, 16 April 2009

OneTel Mark II - some more comments

Let us be philosophical for a moment. When you use the internet, what is actually happening? Think about the following things that you may or may not do each week:

  • Send and receive emails
  • Watch some YouTube movies
  • Do a few Google searches
  • Read some blogs
  • Read a few deadtree newspapers online
  • Watch some snippets from the Channel 7 news
  • Listen to TripleJ online
  • Download some "pirated" movies and music
  • Upload photos to FacePlant
  • Chat via Skype
  • Twitter
What happens when you read a web page, even one as boringly static as mine, with no video, no dancing icons, no pop-up ads and very few photos?

Well, when you click on something, your browser makes a "request" to a far away "server" for information. The data that you are looking for, whether it be text, photos, porn, music, movies or pizza recipes will probably reside on something like this - a bank of servers in a rack in a data centre. Each of those black, slab-like things is a separate server. The ones at the bottom, due to their shape, are commonly called "pizza boxes", because they are as thick as a pizza box, and can be stacked up like empty pizza boxes.

[Photo: a bank of servers in a rack - the slim ones at the bottom are the "pizza boxes"]
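
If you are curious, the whole "request" business is less mysterious than it sounds. Here is a bare-bones sketch in Python (example.com is just a placeholder - point it at any web server you like):

    import socket

    # Connect to the web server (port 80 is plain, unencrypted HTTP)
    s = socket.create_connection(("example.com", 80))

    # This is the "request" - literally a few lines of plain text
    s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

    # The "server" replies with some headers, then the page itself
    print(s.recv(4096).decode("utf-8", "replace"))
    s.close()
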
If you put lots of these racks together, you get a "data centre", or a "server farm", or a very expensive, noisy, fairly cold room. The real estate that you are looking at below can rent for thousands of dollars per square metre. Per month. With electricity billed separately.

[Photo: rows of racks in a data centre]

Things don't have to be this fancy though. You can do it on the cheap. An associate of mine once saw how Fairfax delivers its web content, like the SMH. They described a room in the Fairfax offices with old PCs stacked up on shelves. We're talking about those old white PCs that were all the rage in 1999. There's nothing wrong with that approach - that's how Google does it. However, I think Google know what they are doing, whereas Fairfax were just being cheap.

In reality, most servers are little more than souped-up PCs. They might have 2 or 4 processors rather than 1, and they are quite likely to have a lot more memory than the thing sitting on your desk. They might also be attached to a lot more storage. But the main thing that generally differentiates a server from a PC is that it has redundancy built into it - if something fails at 2am, like a power supply, the second power supply takes over so your web site or mail server does not come to a screaming halt. That's it really.

Now, these servers have a network connection in the back, just like your PC. Whilst your PC or laptop will be connected to a dinky little router somewhere in your house, these servers are more likely to be attached to a big switch - like the one shown in my last post. Unlike your PC, which is running a 100Mbit connection, these things will have a 1Gbit connection. If the IT staff are fancy, they might even plug in two 1Gbit connections and bond them together to create a 2Gbit pipe into the server.
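
A quick aside before going further: links are rated in bits per second, while the files you download are measured in bytes (8 bits each). A little Python keeps the arithmetic honest:

    # Network links are rated in bits per second, but files are sized in
    # bytes, so divide by 8 for the best-case transfer speed.
    def best_case_mb_per_sec(link_mbits):
        return link_mbits / 8.0

    # 100Mbit PC, 1Gbit server, 2Gbit bonded pair
    for link_mbits in (100, 1000, 2000):
        print(link_mbits, "Mbit/s link ->", best_case_mb_per_sec(link_mbits), "MB/s tops")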

Wow.

So if I get a 100Mbit connection at home, I'll be able to download porn lots faster?

No. Not necessarily.

For starters, let's assume the server you are collecting your porn from has a 1Gbit connection to the data centre network. What if that data centre only has a 1Gbit connection to the Internet, and that link is being shared by 500 servers? Do you think that you are going to get a nice, clean 100Mbit link direct to your server?

Not a hope in hell.

Then you have to consider all the other hairy-palmed miscreants sitting at home doing the same thing. If 100 of them all decide to hit that server at the same time, you are all contesting for the 1Gbit network pipe that the server has (assuming all the other servers in the data centre are powered down). At most, you are going to be given a theoretical maximum slice of 10Mbit - and you won't get that, because 1Gbit links never run anywhere near 1Gbit. Past a certain point, congestion kicks in and that is that. Things won't go any faster.
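
You can do that depressing sum yourself. A toy calculation, with every number an assumption:

    # Toy fair-share arithmetic using the numbers above - all assumptions.
    server_link_mbits = 1000   # the server's 1Gbit connection
    simultaneous_users = 100   # fellow miscreants hammering it at once
    usable_fraction = 0.7      # links congest well before their rated speed

    per_user = server_link_mbits * usable_fraction / simultaneous_users
    print(per_user, "Mbit/s each")   # ~7 Mbit/s - your fat home link sits idle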

Then you have to consider things like file sizes, disk speeds and all sorts of other hardware issues. If you try to download lots of small files that add up to, say, 100MB, they will take a lot longer to download than one big file of 100MB, because every single file pays its own toll in request round-trips and disk seeks before the first byte arrives. Trust me on that.
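
If you don't trust me, here is a crude model of why. The overhead and speed figures are plucked from the air for illustration:

    # Crude model: time = per-file overhead + total bytes / bandwidth.
    # The overhead and speed figures are illustrative guesses.
    bandwidth_mb_per_s = 5.0      # effective download speed, MB/s
    per_file_overhead_s = 0.2     # request round-trips, seeks, etc.

    def download_time_s(num_files, total_mb):
        return num_files * per_file_overhead_s + total_mb / bandwidth_mb_per_s

    print(download_time_s(1, 100))     # one 100MB file:      ~20 seconds
    print(download_time_s(1000, 100))  # 1000 x 100KB files: ~220 seconds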

You also have to consider the hard disks that hold the file you are after. They can only throw off data so quickly - and if lots of you are asking for files scattered across different parts of the drive, the heads spend their time seeking back and forth instead of reading, and you are going to have to wait for those files to be delivered.
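
The same toll-booth arithmetic applies inside the disk. Rough, invented figures for a spinning disk of the era:

    # Roughly: ~10ms per seek, ~80 MB/s once the heads stop moving
    # and the disk can stream. Both numbers are guesses.
    seek_s = 0.010
    stream_mb_per_s = 80.0

    def read_time_s(num_seeks, total_mb):
        return num_seeks * seek_s + total_mb / stream_mb_per_s

    print(read_time_s(1, 100))      # one contiguous 100MB file:        ~1.3s
    print(read_time_s(10000, 100))  # same 100MB in scattered 10KB bits: ~101s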

So you see, kiddies, in many cases a faster network may do nothing to speed things up. The limitations may all be at the other end, where the data is being delivered from. The pipe in the middle may have little or no influence on how quickly your stuff is delivered.

Yes, there are lots of tricks that you can pull to speed up data delivery, and companies like Google and other huge concerns use them all the time. You can spread the data across lots of servers, you can use cache servers, you can copy that data to data centres spread all around the world so that your request doesn't have to travel to Timbuktu in order to get the file you want. But for many sites, such tricks are not possible.
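
The caching trick, stripped of all glamour, is about this complicated (a toy sketch, not anybody's real CDN):

    # Toy cache: the first request pays full price; everyone after
    # gets a copy from memory instead of bothering the far-away origin.
    def fetch_from_origin(url):
        # stand-in for a real, slow request to the origin server
        return "contents of " + url

    cache = {}

    def fetch(url):
        if url not in cache:
            cache[url] = fetch_from_origin(url)   # slow path, taken once
        return cache[url]                         # fast path for repeats

    fetch("http://example.com/big-video")   # slow: goes to the origin
    fetch("http://example.com/big-video")   # fast: straight from memory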

Why don't site owners just buy more bandwidth?

Because it is expensive. ISPs and data centres charge you a lot for bandwidth. Unless your site is producing a lot of revenue, it does not make economic sense to put in a huge pipe. You put in the bare minimum that makes your site acceptably fast for most users, and nothing more. Only government agencies, who are spending our money, are silly enough to buy excess bandwidth that may never be used. They get given a budget for bandwidth, and they spend it regardless of whether the traffic warrants it or not.
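
That "bare minimum" is a one-line sum. With invented numbers:

    # Back-of-envelope capacity planning - every number here is invented.
    peak_concurrent_users = 200
    mbits_per_user = 0.5              # enough for an acceptably snappy page

    needed = peak_concurrent_users * mbits_per_user
    print("buy about", needed, "Mbit/s")  # 100 Mbit/s - and not a scrap more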
