Bad, vile and meaningless: apt-proxy -- a staggering failure from Alan's clob

apt-proxy shames the very word proxy

I spent a good year or so using apt-proxy. The first version, written in Perl, had some issues with concurrency. For instance, if two computers were updating the same file, only one appeared to actually be downloading it. The other computer would get its chance at the file only after the first download completed, at which time it would move at LAN speed.

They said it was ugly. They said it was broken. They abandoned it. Enter apt-proxy-v2, written using the Twisted framework. It's Python, it's supposed to be better. Cleaner, leaner, faster. So, how come it turned out to be a crappile?

Firstly, apt-get updates through apt-proxy are hellishly slow because it "validates" everything, which in this case means it bunzip2s the Packages files, launching some 6 bunzip2s in parallel. I don't have beefy hardware on my gateway. It is a 450 MHz Celeron. The unnecessary bunzip2 takes seconds, and so apt-get update takes a good 30 seconds instead of, say, 5.

What's weirder, copying the data from local disk to network is also rather heavy. The Celeron was maxed out copying some 500 kB/s. That's some waste of CPU right there. I see lots of mmap() and munmap() action going on, so I guess this just proves that memory allocation is slow... While it moves the data, the process grows larger, sometimes up to 8 MB or so, but eventually it appears to free all that RAM.

But sometimes, it goes on a memory-eating rampage. It's a curious leak, too. Normally, long-running daemons lose memory slowly over time, a kilobyte here, a megabyte there. But this one keeps it all well in shape until some critical moment comes, at which point it eats some 150 MB of RAM instantly and then stops leaking. While leaking, it consumes 100% of CPU and does not respond to requests.

I think it says something about Twisted that no-one has been able to debug the memory leak in 2 years. In fact, the ChangeLog entries contain funny statements like "a memory leak has been fixed" or "new version of twisted corrects a memory leak" but here it is, two years later, still leaking memory.

So here's my recommendation -- if you are one of those hapless people using apt-proxy, switch to squid unless you actually need apt-proxy's spiffy multiple-sources-as-one aggregation features. I didn't, I just wanted to spare some bandwidth when updating the numerous machines on my LAN. Configure squid for some 2-3 GB of storage, change the cache replacement policy from lru to heap LFUDA, and set the maximum cacheable object size to 400 MB. That's your apt-proxy, and it doubles as a great webcache to boot!
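
For the record, the squid side of that boils down to roughly the following squid.conf fragment. Treat it as a sketch rather than a drop-in config: the cache directory path and the ufs storage type are assumptions (use whatever your install ships with), the 3000 MB figure is just one pick from the 2-3 GB range, and the heap policies are only available if your squid was built with heap removal policy support. In at least some squid versions maximum_object_size has to appear before cache_dir to apply to that cache directory, so it is listed first here.

```conf
# Allow objects up to 400 MB to be cached (large .deb files).
maximum_object_size 400 MB

# Disk cache: ufs storage, 3000 MB, 16 first-level and 256 second-level dirs.
# The path is an assumption; adjust it to your distribution's layout.
cache_dir ufs /var/spool/squid 3000 16 256

# Replace the default lru policy with heap LFUDA.
cache_replacement_policy heap LFUDA
```

The client machines then need to be told about the proxy, for instance with a line like Acquire::http::Proxy "http://gateway:3128/"; in apt.conf, where gateway is a hypothetical host name for the squid box and 3128 is squid's default port.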