Introduction to Proxy Servers

Do you have a growing family at home slowly eating away at your bandwidth? Maybe you're a web surfing fanatic looking for a little more speed? If you answered yes to either, a caching proxy is for you. This simple addition to your home network can free up bandwidth by serving repeated requests locally instead of fetching them from the internet again. Normally these types of proxies are found in the commercial world, but they're just as useful at home. Below is an image of a traditional multi-computer home network.


Traditional Home Network

So what is a caching proxy server? The concept is pretty simple: when a request is made to a website, the returned content is saved on your local caching proxy server. When any machine on your network requests the same data again, it is retrieved from the proxy rather than from the internet. The content can be anything from regular website content to a file you downloaded. For households with multiple computers, the bandwidth savings really add up with patches and multi-computer driver updates. The change to the network configuration is really quite small:


Home Network with Proxy Server
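
The caching behavior described above can be illustrated with a toy model. This is only a sketch of the idea, not how Squid is implemented: the `CachingProxy` class, the `fake_origin` function, and the example URL are all hypothetical names made up for illustration, and a real proxy also handles expiry, cache-control headers, and disk storage.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy in-memory model of a caching proxy (illustration only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # stands in for "go out to the internet"
        self._cache: Dict[str, bytes] = {}   # URL -> saved content
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        # Cache hit: serve the saved copy without touching the internet link.
        if url in self._cache:
            self.hits += 1
            return self._cache[url]
        # Cache miss: fetch once from the origin, then keep a local copy.
        self.misses += 1
        body = self._fetch(url)
        self._cache[url] = body
        return body


def fake_origin(url: str) -> bytes:
    """Hypothetical stand-in for a real web server."""
    return ("content of " + url).encode()


proxy = CachingProxy(fake_origin)
first = proxy.get("http://example.com/driver-update.exe")   # first PC: miss
second = proxy.get("http://example.com/driver-update.exe")  # second PC: hit
print(proxy.hits, proxy.misses)
```

In this model the second computer's request never reaches the origin, which is where the bandwidth savings for shared patches and driver updates come from.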

At this point many are likely asking how much this costs. If you read my previous article, you know the answer right away: "It's free and it's on Linux". I should qualify that by noting that you need some old "junky but functional" hardware lying around. There are many different Linux solutions we can deploy to achieve this goal. For this article I have chosen a combination of Arch Linux, Shorewall, and Squid.

We selected Arch Linux because it is a rolling release and has the latest and greatest packages. If you are not familiar with the phrase "rolling release", in Linux it indicates a distribution that keeps you up to date with the latest software through the package manager. You never have to re-install or upgrade your server from one release version to the next with this style of distribution. The great part about a rolling release on a proxy/firewall setup is that once it's set up and working correctly, you will not have to go back and completely overhaul the server when a newer distribution release comes out.

Along with the different types of OS and application solutions, there are also multiple ways to set up a caching proxy. My preferred setup is a transparent caching proxy. A transparent proxy does not require you to make any additional changes to the client computers on your network. You use the proxy server as your home gateway, and the gateway automatically redirects web traffic to Squid. The second way to use Squid is to point your client machines at the proxy server via the proxy settings in your browser. Although this may be the easiest way to set up a proxy server, it requires you to make changes on every machine that attaches to your network. The table below shows what I selected for my transparent caching proxy server.
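
For readers curious what the second, per-client approach looks like outside a browser, here is a minimal Python sketch. The gateway address 192.168.1.1 is a hypothetical example, and 3128 is Squid's default listening port; neither value is taken from the setup in this article. With a transparent proxy, none of this client-side configuration is needed.

```python
import urllib.request

# Hypothetical values: substitute the address of your own proxy box.
PROXY_HOST = "192.168.1.1"
PROXY_PORT = 3128  # Squid's default http_port


def make_proxy_opener(host: str, port: int) -> urllib.request.OpenerDirector:
    """Build a URL opener that sends HTTP requests through the given proxy."""
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


opener = make_proxy_opener(PROXY_HOST, PROXY_PORT)

# Uncomment on a network where the proxy actually exists:
# with opener.open("http://example.com/") as response:
#     print(response.status)
```

Every program on every client would need an equivalent setting, which is the maintenance burden the transparent setup avoids.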

Test Proxy System
Component           Description
Processor           Intel Pentium 4 3.06GHz
                    (3.06GHz, 130nm, 512K cache, Single-core + Hyper-Threading, 70W)
Memory              2x256MB PC800 RDRAM
Motherboard         Asus P4T
Hard Drives         120GB Western Digital SATA
Video Card          ATI Radeon 7000
Operating Systems   Arch Linux (32-bit)
Network Cards       Onboard Intel Gigabit
                    PCI 100Mbit 3Com 3c905C-TX

I could have selected older equipment, but this is what I had lying around the house. As seen in the table, one of the hardware requirements for a transparent proxy is two network cards or a dual-port network card. We recommend against using wireless for either of the connections to the proxy server, and a Gigabit Ethernet connection from the proxy to the rest of the network is ideal. (The connection to your broadband link can be 100Mbit without imposing any bottleneck.) Another quick suggestion: if you download a fair number of files, it is wise to use at least a 120GB HDD. The idea is that the more space you have, the longer you can keep your files stored on your proxy server. With storage being so cheap, you could easily add a 500GB or larger drive for under $100.
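
The storage reasoning above is simple arithmetic, sketched here in Python. The figure of 2GB of new cacheable downloads per day is a made-up assumption for illustration, and the sketch also assumes the whole drive is given to the cache and that the oldest files are discarded first; a real system needs room for the OS and may evict differently.

```python
def retention_days(cache_gb: float, new_data_gb_per_day: float) -> float:
    """Rough number of days a file survives before the cache fills and evicts it."""
    if new_data_gb_per_day <= 0:
        raise ValueError("daily download volume must be positive")
    return cache_gb / new_data_gb_per_day


ASSUMED_DAILY_GB = 2.0  # hypothetical household download volume

for size_gb in (120, 500):
    days = retention_days(size_gb, ASSUMED_DAILY_GB)
    print(f"{size_gb}GB drive: about {days:.0f} days of cached downloads")
```

Under these assumptions, the larger drive keeps files around roughly four times as long, which is the whole argument for spending a little more on storage.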

Now that we have our hardware and a good idea of what we want to set up, it's time to get installing. I'll try to keep this portion simple and to the point, but if you have questions later, feel free to post a comment.

96 Comments

  • ChrisRice - Tuesday, May 11, 2010

    FreeBSD would certainly be my second choice for home firewall systems (and my first in the corporate scene). That being said, I've always been a fan of Arch's newer packages compared to Debian, since they bring many new features that you would be without in a Debian environment. As for bugs and security holes from a rolling release versus those on a distro with a slow-moving release system, I think both have their own downsides.
  • mfenn - Tuesday, May 11, 2010

    I agree with dezza that Arch should *not* be used in a "set it and forget it" box. The great thing about Debian or Red Hat is that you can choose to only receive security updates. The maintainers also backport security fixes for the supported life of the release (which for RHEL is 7 years!). Arch only provides the upstream package versions, so if you want the latest security fixes, you also get the latest functionality-killing bugs. Also, for somebody who isn't religiously running "pacman -Syu" every week or so, Arch will quickly fall into the dist-upgrade hell that you get with other distros. You've got to realize that rolling release doesn't eliminate the dist-upgrade problem, it just allows the user to spread the problems across a longer span of time (e.g. I can update every month for a year and encounter 1 problem each time, or I can upgrade every year and encounter 12 problems). For an infrequently updated system (i.e. one built by any reader of this article because, let's face it, if they were Linux geeks, they would have one already) you *will* have upgrade problems. In summary, a growing trend in the Linux community is to treat Arch as a panacea, which it most certainly is not. It's great for some things (desktops for tinkerers, development with the latest and greatest, supporting oddball hardware), but a server distro it isn't.
  • KaarlisK - Tuesday, May 11, 2010

    About x1 in x16 slots:
    Could you please test that? :D You have the motherboard.

    And why is the cache_mem line always half the RAM? Even if you have, for example, 8GB?
  • JarredWalton - Tuesday, May 11, 2010

    I think Chris assumes you don't have that much RAM. You probably only need to use half the RAM or leave 1GB free, or you can get by with just caching 512MB in RAM. I have my proxy set to 2GB of RAM, and so most of the data comes across from the proxy at GbE speeds. If it goes to the HDD then the speed will drop to around 50MB/s, which is still plenty fast.
  • KaarlisK - Tuesday, May 11, 2010

    Thanx for the explanation!
    So basically my usage pattern can determine the cache size, and there IS use from a large cache, as your 2GB example shows.
  • ChrisRice - Tuesday, May 11, 2010

    I would recommend starting with a smaller cache and then tweaking up. I run a 256MB RAM cache and that works just fine for me. That being said, if I had more RAM on the hardware I am using, I would run at least 1GB.
  • KaarlisK - Wednesday, May 12, 2010

    Thanks for the reply.
    And thanks again for the article! This might finally be the way to sneak... a kind-of replacement for WSUS on a certain network. I know it's horrible, but I do not really have a choice.
  • enterco - Tuesday, May 11, 2010

    Hi!

    I would like to make a few observations:
    - many owners of an old PC suitable for a caching proxy are using ATX motherboards and enclosures, making the proxy a 'big noisy box', only good to keep in the basement, if you have one.
    - an old P4 computer will add quite a few bucks to the electricity bill.
    - a typical computer user is not familiar with the requirements of configuring a Linux box, and will avoid this kind of setup.
    - the most bandwidth-hungry application in a home is not HTTP downloads, but P2P transfers.

    I, personally, don't have a basement, and I don't want the noise of such a box, nor do I want to waste space or money on electricity in my home. So, instead of bringing an old machine back to life, I would prefer to configure QoS on a wireless router.
  • pkoi - Sunday, May 16, 2010

    Go the VMware way.
  • medys - Tuesday, May 11, 2010

    These days virtualisation is the answer :)

    I know that most people do not need as much as I do and a lot of them do not care about backups, but in case you do, there is a great way to have everything you need in one box :)

    Get a semi-old PC with at least 2GB of RAM (4GB is recommended).
    Install a distribution of Linux on it that can run virtualisation software (VirtualBox, VMware Server, KVM).
    Configure the Linux box as a NAS server.
    Install the virtualisation software.
    Create virtual machines for anything you want :) router, proxy, LAMP, application server, etc.
