Proxy Server How To

Start by installing Arch Linux (or your chosen distribution) onto the hardware you selected. If you need a little assistance with the installation, I recommend following this wiki guide and then setting up yaourt. Once you have completed your standard Linux installation, you need to ensure your network is configured properly. In the case of my transparent proxy, I plugged one network port directly into my cable router and allowed it to grab an IP address via DHCP. The second adapter is then given a static IP address of your choice (I chose 10.4.20.1; other common private addresses would be in the 192.168.x.x range).

At this point you will want to test your network configuration. Start by checking that the proxy server itself can get out to the internet. If this works, plug your secondary network adapter into whatever switch/router you have available. Take a desktop or laptop that's plugged into the same switch and assign it an IP address in your 10.4.20.x range. (For DHCP setups, see below.) You should now be able to ping your new proxy server (10.4.20.1) from your desktop/laptop. As a quick note for users who only have a wireless cable modem, it is okay to have both interfaces of your proxy server, as well as your desktop, plugged into the same cable modem hub.

Now that we have the configuration of the network cards complete, we just need to do a quick installation and configuration of Shorewall and Squid. That may sound like a daunting task to the Linux initiate, but it is actually very simple. First go ahead and install both Squid and Shorewall. Arch has both readily available in the package repository (from a command prompt: yaourt -S shorewall squid). If you are not using Arch, you can download the packages manually from www.shorewall.net and www.squid-cache.org.

Whether you installed Arch Linux or another distribution as your base OS, Shorewall has one simple command to get it set up: cp /usr/share/shorewall/Samples/two-interfaces/* /etc/shorewall. (This copies the base two-NIC example to your live Shorewall directory, which saves a lot of manual work.) Make a quick edit to /etc/shorewall/shorewall.conf, change STARTUP_ENABLED to Yes, and you now have a functioning Shorewall. The only other thing you need to do for Shorewall at this point is add the following rule to the /etc/shorewall/rules file: REDIRECT loc 3128 tcp www. This sends web (port 80) traffic arriving from the local zone to port 3128 on the proxy, where Squid will listen. Start Shorewall by typing shorewall start at the command line, and add it to your boot process by putting shorewall into the DAEMONS section of /etc/rc.conf.
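
The two file edits described above can be sketched as small shell helpers. This is only a sketch under stated assumptions, not the author's script: the function names enable_shorewall_startup and add_squid_redirect are hypothetical, the sample shorewall.conf is assumed to already contain a STARTUP_ENABLED line, and appending the rule to the end of the rules file is assumed to be acceptable for your Shorewall version.

```shell
#!/bin/sh
# Hypothetical helpers for the two Shorewall edits described in the text.

# Set STARTUP_ENABLED=Yes in the given shorewall.conf.
# The file is rewritten in place so its existing permissions are kept.
enable_shorewall_startup() {
    conf=$1
    sed 's/^STARTUP_ENABLED=.*/STARTUP_ENABLED=Yes/' "$conf" > "$conf.tmp" &&
        cat "$conf.tmp" > "$conf" &&
        rm -f "$conf.tmp"
}

# Append the redirect rule from the article unless a matching rule exists.
add_squid_redirect() {
    rules=$1
    grep -q '^REDIRECT[[:space:]][[:space:]]*loc[[:space:]][[:space:]]*3128' "$rules" ||
        printf 'REDIRECT\tloc\t3128\ttcp\twww\n' >> "$rules"
}

# Typical use as root (left commented out so the sketch has no side effects):
#   cp /usr/share/shorewall/Samples/two-interfaces/* /etc/shorewall
#   enable_shorewall_startup /etc/shorewall/shorewall.conf
#   add_squid_redirect /etc/shorewall/rules
#   shorewall start
```

Running add_squid_redirect twice leaves a single rule in the file, so the helper is safe to re-run after a failed attempt.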

Now that Shorewall is fully functional and configured, we need to configure Squid. I found a short wiki guide that will assist with the initial setup of Squid. Once you have completed the configuration in the wiki guide, you need to pay close attention to a few settings located in /etc/squid/squid.conf. The cache_mem line should be set to half of the installed RAM on your proxy server. In my case I have 512MB of total memory, so I configured cache_mem to 256 MB. The other setting that you need to pay attention to is maximum_object_size. This is the largest file size your proxy will retain. I set my maximum size to 2048MB in order to retain everything up to a CD ISO. Be cautious about using 2048 if you have anything less than a 120GB drive, as your storage space could be gone in a matter of a few days. To get the caching proxy in place and running, the most important line to add is http_port 3128 transparent. The key here is the addition of "transparent", which turns Squid into a caching proxy that won't require any additional configuration on your client PCs.
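
As an illustration of the three settings discussed above, the following shell sketch prints the corresponding squid.conf lines for a given amount of installed RAM, applying the half-of-RAM rule for cache_mem. The function name squid_settings is hypothetical, and the output is meant to be merged by hand into the file produced by the wiki guide rather than used as a complete configuration. Note that Squid 3.1 and later spell the transparent option as intercept.

```shell
#!/bin/sh
# Hypothetical helper: print the squid.conf lines discussed in the text.
# Argument: total installed RAM in MB (for example 512).
squid_settings() {
    total_mb=$1
    cache_mb=$((total_mb / 2))   # the article's rule: half of installed RAM

    # Listen on the port that the Shorewall REDIRECT rule points at.
    printf 'http_port 3128 transparent\n'
    printf 'cache_mem %s MB\n' "$cache_mb"
    # Largest object kept in the cache; 2048 MB is the author's choice.
    printf 'maximum_object_size 2048 MB\n'
}

# Example: squid_settings 512
```

For the author's 512MB machine this yields a cache_mem of 256 MB; on a box with a small disk, lowering the maximum_object_size value is the safer choice.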

If you followed all of the directions correctly, you're now ready to configure all the machines on your network with a 10.4.20.x IP address and the gateway set to 10.4.20.1. Don't forget to configure your DNS as well (in /etc/resolv.conf on Linux clients). Now that you have everything fired up, give your new proxy a spin around the internet. If you would like to do a good test, download a decent-size file (e.g. larger than 1MB). Once the download is complete, you should be able to download it a second time and get LAN speeds, because the file is now served from the proxy's cache. If you have multiple computers, use another machine on your network to download the same file and you should again see LAN download speeds.
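
Besides watching the download speed, you can confirm on the proxy that the repeat download really came from the cache. Squid's access log records a result code for each request, and cached responses carry codes such as TCP_HIT or TCP_MEM_HIT. The sketch below counts those entries; the function name count_cache_hits is hypothetical, and the default log path /var/log/squid/access.log is an assumption that may differ on your distribution.

```shell
#!/bin/sh
# Hypothetical helper: count requests that Squid answered from its cache.
# Argument: path to Squid's access log (the default here is an assumption).
count_cache_hits() {
    log=${1:-/var/log/squid/access.log}
    # Matches result codes such as TCP_HIT/200 and TCP_MEM_HIT/200,
    # but not TCP_MISS/200.
    grep -c 'TCP_[A-Z_]*HIT/' "$log"
}

# Example on the proxy, after downloading the same file twice from a client:
#   count_cache_hits
```

If the count does not increase after the second download, the object was probably not cacheable or was larger than maximum_object_size.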

Proxy Server with DHCP

Although I wanted to keep this short and to the point, a common question inevitably comes up: what if you still want to use DHCP? There are a few ways to tackle this issue. If you're lucky enough to have a router/cable modem that will allow you to change what IP addresses it assigns to the network, simply change it over to your new 10.4.20.x subnet and have it hand out 10.4.20.1 as the gateway. If this is not the case, you will need to disable DHCP on your router and install the DHCP server package (in Arch: pacman -S dhcp). The configuration can be a bit of a hassle, so here's my /etc/dhcpd.conf.
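
For orientation, a minimal ISC dhcpd.conf for the 10.4.20.x subnet might look like the output of the sketch below. This is not the author's file: the function name print_dhcpd_conf is hypothetical, and the lease range (10.4.20.100 to 10.4.20.200), the lease times, and the DNS server passed as an argument are all assumptions to adjust for your network. Only the subnet and the 10.4.20.1 gateway come from the article.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal dhcpd.conf for the 10.4.20.x subnet.
# Argument: IP address of the DNS server clients should use (an assumption;
# use whatever resolver your ISP or router provides).
print_dhcpd_conf() {
    dns=$1
    cat <<EOF
# Hand out addresses on the proxy's second (LAN) interface.
subnet 10.4.20.0 netmask 255.255.255.0 {
  range 10.4.20.100 10.4.20.200;      # assumed lease pool
  option routers 10.4.20.1;           # the proxy is the gateway
  option domain-name-servers $dns;
  default-lease-time 600;             # assumed, in seconds
  max-lease-time 7200;                # assumed, in seconds
}
EOF
}

# Example as root (192.0.2.53 is a placeholder, not a real resolver):
#   print_dhcpd_conf 192.0.2.53 > /etc/dhcpd.conf
```

Keeping the pool away from 10.4.20.1 and from any addresses you assigned by hand avoids address conflicts with the proxy and statically configured machines.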

Start the DHCP service on your proxy (/etc/rc.d/dhcpd start) and test DHCP on your desktop/laptop. Assuming all goes well, add dhcpd to your DAEMONS in /etc/rc.conf. If you happen to reboot your Linux box, after a minute or so your proxy should be back up and running.


97 Comments


  • eleon - Tuesday, May 11, 2010

    I really encourage everyone to use or try linux, and to reuse old hardware. but this concept is the wrong solution in so many ways.

    The main advantage of caching proxies is not to save bandwidth, it is to reduce the downloaded data volume.
    If you have the problem that the bandwidth of your internet connection isn't shared fairly between your clients and/or applications, you need to think about QOS, not a proxy server.

    a rolling release distribution for a router???? I use archlinux myself on my laptop, and I like the rolling release cycle and the cutting edge packages on my Desktop. but it's really the wrong distribution for an infrastructure-box like a router. your argument that you never have to care about updating anymore is wrong. I would say you have to care/worry every time you are updating! The advantage of a distribution with stable releases is that you set up the box, and if it's up and running you have only security updates. this means only minor updates and there are no configuration changes. with a rolling release you have major version updates and there is a greater chance that your config isn't working after updating a package. so there are two scenarios: you update frequently and risk breaking the system (which provides your internet access) every time. or you don't update, and your router/firewall may have serious security issues. so using a rolling-release distro on a router isn't a good idea at all!

    use a pc that needs more than 100W for this? maybe you should think about investing these energy costs in a faster internet connection?

    I was thinking about a caching proxy myself, but for a shared 3G connection which has a data volume limitation of 6GB/Month. in this area a caching proxy can make sense, and you can add something like ziproxy to reduce the transmitted data by compressing the pictures. but one youtube video produces more traffic than 100 pictures. so what's the point, and squid doesn't cache dynamic content like flash videos.

    so for your problem/goal to have a "fast surfing experience" while your family is doing whatever on the internet, your solution is QOS, which can handle this very effectively. use embedded hardware to be energy efficient, and use a specialized router distribution ( openwrt, pfsense ,... http://en.wikipedia.org/wiki/List_of_router_or_fir... ) so that you don't spend lots of hours to get it running, which is really inefficient too.

    but if your goal is to learn something about linux, your family proxy project is the way to go! :)
  • Dravic - Tuesday, May 11, 2010

    My reply was similar to yours: a qos solution is what would fit best in this situation, unless you're dealing with usage caps or low bandwidth service. I've tried this several times over the past ~7 years at home and the browsing experience was noticeably slower when using a proxy. Even a hashed disk lookup of an object adds enough latency that it is slower than just getting the object over a broadband connection.

    But I was told this just wasn't "true" .. we'll see

    On a saturated link I can see where a proxy would help because you're not going over the link, but that is the job of qos. I'd like to see FULL page load metrics for both types of data retrieval (while the link is saturated and unencumbered).
  • JarredWalton - Tuesday, May 11, 2010

    I'm not sure some of you are on the same page as me. First, my particular setup was done purely for initial testing. As I comment (multiple times), it's complete overkill--both from a hardware performance as well as a power requirement perspective. From the conclusion:

    'Our only recommendation is that you consider the cost of electricity compared with the hardware. Sure, Linux will run fine on "free" old hardware, but a proxy server will generally need to be up and running 24/7, so you don't want to have a box sucking down 100W (or more) if you can avoid it.'

    We're not saying you need to do Arch, or you need high-end hardware. In fact I'm going to try setting up a proxy with a CULV and Atom laptop to see how that works.

    As far as QoS, we never even mentioned that. The point of a caching proxy is to avoid going out to the Internet multiple times for the same data. For me in particular, where I review lots of laptops that need frequent updates, and I have to get new video drivers regularly, the idea of a proxy means that I can speed up the process for quite a few things. I'm not worried about "saving bandwidth" in the way you're discussing, though if you had a plan that charged you for downloading over a certain amount it might be useful. I'm interested in speeding up patching and such.

    Hence, the comments about wishing Steam would work with my proxy... as it stands, I have to manually copy updated files from one PC to another, or else let each download the latest updates manually. L4D2 has had a few 200MB+ updates recently, and I'm sure I've downloaded that on various PCs/laptops at least four times. At 1.5MB/s, it can take a while, especially if I just wanted to play a quick game.

    Everything we discussed in this particular article can easily be applied to Red Hat, Debian, SuSE, Ubuntu, or whatever favorite distro you choose. As a typical non-Linux user, it amazes me how much time people spend arguing over the benefits of their chosen distribution. It's attitudes like that that frighten away potential converts more than anything. Instead of arguing about why one of our specific configurations was bad, why not point out the good?

    Linux can do all this and save time on downloading patches and updates for multiple computers, and you can even get a faster surfing experience on frequently visited sites. You can run it on old or new hardware, and in fact a nettop with a USB adapter might be the ideal way of doing this from a power perspective. And all of this is free, assuming you have the necessary hardware. Pointing out flaws we already list in the article (i.e. the power concern) is a waste of time. I put it as the last sentence figuring that if nothing else, people would read the conclusion and see our discussion of power concerns.
  • michal1980 - Tuesday, May 11, 2010

    I get your point, Anandtech gurus. And the article is fine. But it seems like you guys are deaf right now.

    For most users, even power users, the question remains, why? What REAL benefits will I see for all this new upkeep?

    IMHO, for a home user this proxy is equal to the Killer NIC. Might work, but the money at the end of the day is better spent elsewhere.
  • dezza - Wednesday, May 12, 2010

    With an Atom PC or any small form factor PC that has at least 1GHz or whatever depending on the services you will be running - You will be better left off with combining DHCP/Proxy so you have one connection open always to gateway/proxy .. And instead of auto-detect explicitly define it ..

    http://www.broadband-help.com/articles/networking/...

    I found this ..

    Brings a few points into the light once again .. Static content is the only thing that is affected, which is of course a big part, but since many big sites uses systems like imageGet()'ers etc. in PHP/ASP and thumbnail() functions - Your proxy can't touch this (MC Hammer) ..

    Again .. Chris, I respect your article and I agree that ArchLinux is a great distribution (In my case for bleeding-edge workstation) - I love reading anandtech's hardware articles as well and this is the main reason for having it in my feeder, but I will patiently wait while more of these articles get to the surface so we give feedback and maybe even come with suggestions or help you in forging them .. Would be lovely to extend this site with some killer articles on software/programming etc. I never doubt your quality of hardware articles and I think indeed you wrote a decent article. This is no bashing.
  • dezza - Wednesday, May 12, 2010

    http://tools.ietf.org/html/rfc3143

    another official rfc documenting problems with the proxy ..

    Not even on my work where we have 8000 clients connected to the internet and using BitTorrent heavily (We have BitTorrent shaping/filtering with encryption support) we would benefit anything from using a proxy.

    Also with a proxy you will have to scale your proxy tremendously with another 1000 users I/O performance of the proxy server drops incredibly ..
  • jamyryals - Tuesday, May 11, 2010

    They are linux experts. This means they know too much to actually read the article.

    Jarred, you are on point with the distro v distro comment.
  • eleon - Wednesday, May 12, 2010

    "Do you have a growing family at home slowly eating away at your bandwidth? Maybe you're a web surfing fanatic looking for a little more speed? If you answered yes to either, a caching proxy is for you."

    That's the first paragraph of this article and that's the first thing readers will see. and I really doubt that a caching proxy is the right solution. A caching proxy won't help if one client uses the whole bandwidth with BitTorrent. It will only have a benefit if you have multiple downloads of the same static (http or ftp) content, and that's not the scenario families are dealing with. And if you really have some big updates or service packs, why not download them only once and share them between the clients? So this may be a solution for your special needs, but obviously not for a "normal" family. So it's right that you didn't mention QOS, but if someone is eating away your bandwidth you need QOS!

    and you replied to my comment:
    "I'm not worried about "saving bandwidth" in the way you're discussing, though if you had a plan that charged you for downloading over a certain amount it might be useful. I'm interested in speeding up patching and such."
    PLEASE differentiate between "bandwidth" and "transfer-volume". I didn't talk about saving bandwidth, I said that a proxy server can be a solution for reducing the transfer-volume, which is something completely different. As long as you don't distinguish between these two things you will never understand what you can do with QOS, and what you can do with a caching proxy.

    and I didn't start a discussion about distribution X being better than distribution Y.
    I really love Archlinux.
    But you said Archlinux is a good choice for this proxy, because it has a rolling release cycle.
    My comment about archlinux only relates to this, because it shows that you have no idea about the advantages of distributions with stable releases (+security updates) compared with distributions with a rolling release cycle. And in my opinion it is really irresponsible to recommend a rolling release distro for a router/firewall/proxy. (the reasons for that are in my first post).

    So please don't get me wrong, if this is satisfying your needs, it's perfect and I'm happy for you.
    but if someone can answer your questions "Do you have a growing family at home slowly eating away at your bandwidth? Maybe you're a web surfing fanatic looking for a little more speed?" with Yes, he or she wouldn't be happy with a caching proxy. It isn't a direct solution for this, it will maybe help in an indirect way in some special situations if you download the same big files by http (ftp) multiple times.
    So if you answer these questions with "Yes" you should consider QOS.

    My main concern is that your solution is not effective! You can improve the efficiency with low-power or even embedded hardware and a special router distribution which will minimize the setup time, so it will be efficient on multiple levels, but that won't change the fact that it is really ineffective in solving the problems of "eaten bandwidth" and a slow web surfing experience.

    Running QOS on embedded hardware would be effective and efficient. (and many SOHO and even consumer routers support it out of the box, and if not, many are supported by alternative firmware distributions like openwrt, dd-wrt,... ) so if you have these problems/needs, this probably would be the way to go.
  • JarredWalton - Wednesday, May 12, 2010

    But QoS won't give you more speed, it will just prioritize bandwidth. A caching proxy, on the other hand, can actually boost page load speeds a lot (though not always). It's not for everyone, and I suppose part of the problem is I view things from my world while Chris has his own idea on things. Anyway, you're still getting caught up on what is essentially a hook to the article. Read that paragraph this way, and it's just a less dramatic restatement of Chris' paragraph:

    "Do you find your web surfing experience to be slower than you'd like? Do you have lots of PCs and do you frequently download the same file on multiple computers? If so, you might want to consider reading this article about proxy servers and what they can do, because it might be something that will help alleviate some of your bandwidth congestion."

    Call it over-exuberance on the part of the author or whatever. Just because someone likes the idea of proxies and writes an article -- OMG it's on AnandTech so it must be true! -- doesn't mean it's the right solution for every single situation. Given this is a Linux article, I personally thought it was more of an interesting idea that may be useful to some of our readership. I know the caching of Windows Updates is definitely useful for me, even though I have a relatively fast 16Mbit download speed.
  • epi 1:10,000 - Tuesday, May 11, 2010

    It would be nice if someone could review a realtime AV scanning proxy w/ caching. Has anyone tried SafeSquid, or DansGuardian + Squid w/ ClamAV?
