What software do you use for mirroring repositories for your local network?
Posted by blingmuppet@reddit | sysadmin
Looking around to see what's good and what's not, and also to check that my thoughts so far are reasonable.
But basically, what do you use for mirroring remote repos?
Background: Some 200 EL- and Debian-based machines. The usual OS repos, plus some third-party ones (Grafana, MariaDB, Docker, etc.). We've had some patching failures recently because one or more repos have been down at the time of patching, or mirrors were blocked by geo-IP. We have good bandwidth, so speed isn't the major issue here; I'd like to mirror locally for reliability above all. I just want to be able to mirror remote repos and make them available to internal machines. Smart features like deduplication would be nice, but not essential. I'd like it to have a clear interface that is fairly self-explanatory so we don't need to spend much time learning to use it.
What I've looked at so far:
Pulp: Seems like the learning curve is very steep, and it doesn't provide a pretty web UI (I did see some third-party options are available, but some seem very out of date)
Repomanager: I'm liking this one the best so far, although it's been indexing Debian base for some 20 hours now, so I have some concerns about performance.
Foreman: Using it just for repo management seems like overkill. It's huge and complicated to install (requires 20 GB of RAM and 4 CPUs before the installer will even run!)
Uyuni: We use it already, but clients need to provide a token to access its repos. Uyuni, like Spacewalk before it, likes to manage subscriptions and push its own .repo files out. Historically we've had issues with these tokens expiring and blocking repo access, so I'm a little cautious about using it for this.
rsync & scripts: I think we want something a little more sophisticated than simply rsyncing remote repos.
picklednull@reddit
For Debian, a debmirror script. For others, rsync scripts. And those rsync scripts have never failed over years and years, but the piece of crap commercial offering known as Red Hat Satellite on the other hand…
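The Debian side is basically one debmirror call. Something like this (host, suites and paths are examples, not our exact script):

    debmirror /srv/mirror/debian \
        --host=deb.debian.org --root=debian --method=http \
        --dist=bookworm,bookworm-updates \
        --section=main,contrib,non-free-firmware \
        --arch=amd64 \
        --keyring=/usr/share/keyrings/debian-archive-keyring.gpg \
        --nosource --progress

Drop that in cron, serve /srv/mirror/debian with any web server, and you're done.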
blingmuppet@reddit (OP)
Thanks. I'm leaning towards debmirror and reposync (EL) / rsync now. My hopes for a solid mirror manager seem to be unfulfilled, probably because there's little real need for one.
picklednull@reddit
The issue is not really managing the mirror data but having that "single pane of glass" visibility into your estate's patching status...
If you have to prove yourself in an audit, I'm not sure how convinced auditors will be by your custom scripts and maybe random Ansible dumps of package state.
blingmuppet@reddit (OP)
Not sure I understand this - a comparison of the local mirror against the upstream should be sufficient to prove the packages haven't been interfered with - or at least, are no more vulnerable than fetching them directly from the distro's mirrors, or using any third-party repo manager.
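For example, both ecosystems sign their repo metadata, so a spot check against upstream is cheap. A rough sketch (paths and the upstream host are illustrative):

    # Debian: verify the mirrored Release file against the archive key
    # (assumes the archive signing key is in gpg's keyring)
    gpg --verify /srv/mirror/debian/dists/bookworm/Release.gpg \
        /srv/mirror/debian/dists/bookworm/Release

    # EL: dry-run checksum comparison of the mirrored tree against upstream;
    # any filename printed differs from the upstream copy
    rsync -rcn --out-format='%n' \
        rsync://mirror.example.org/rocky/9/BaseOS/x86_64/os/ \
        /srv/mirror/el/9/BaseOS/x86_64/os/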
Hotshot55@reddit
Same, it's annoying as fuck.
blingmuppet@reddit (OP)
Isn't it?
It's fixable by highstating the machine, and a scheduled highstate for all the machines before patching would probably solve it, but I just want a plain, unrestricted HTTPS repo...
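For anyone hitting the same thing, the workaround is just this from the Salt master (targeting and the cron expression are illustrative; the scheduler's cron support needs croniter on the minions):

    # ad hoc: re-apply state to everything, which refreshes the repo config
    salt '*' state.highstate

    # or pre-stage it before the patch window with Salt's scheduler
    salt '*' schedule.add pre-patch-highstate function='state.highstate' cron='0 2 * * 6'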
Hotshot55@reddit
We modified the part that handles the token check and changed the time from 30 minutes to 15 and it's helped some, but it's still a shitty bandaid fix which is par for the course with it.
hlamark@reddit
A lot of people use orcharhino, which is a downstream product of foreman/katello/pulp like Satellite, but it supports most enterprise Linux distributions, including RHEL and Debian. Additionally it provides errata-based patch management for Debian.
blingmuppet@reddit (OP)
Thanks for the suggestion - another thing I hadn't heard of.
roiki11@reddit
We use Satellite. It's a bit much, but it does the job quite nicely and allows us to do versioned and incremental exports. We did reposync + Apache for a while, but it tends to be a bit cumbersome in the long run. Also tried the container approach, but having to export the entire catalog every time you want to update gets old fast. So incremental exports are super nice to have.
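The exports are driven through hammer. From memory it's roughly this shape (org name made up; check hammer content-export --help for the real options):

    # one full export of the library, then cheap incrementals on top of it
    hammer content-export complete library --organization="Example Org"
    hammer content-export incremental library --organization="Example Org"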
blingmuppet@reddit (OP)
Thanks, good to understand what others are using.
We tried Satellite after it changed codebases but couldn't get on with it, and instead went to Uyuni, which forked from Spacewalk, the upstream of old Satellite.
That does do repo syncing, but it wants to manage every aspect, including restricting access to the repos, and historically we had failures where the token required to access Uyuni's repos expired. I want an unrestricted repo that "just works" via HTTPS.
roiki11@reddit
I looked into uyuni but didn't like what I saw. The dependency on Salt and having to install a client put me off. And working with Red Hat repos is just easier with Satellite.
blingmuppet@reddit (OP)
Fair enough on both points. The Salt bit is very transparent now, and very stable.
But yes, I can imagine that with subscriptions on RHEL, using their software, which was built for that, is easier. Uyuni does support them, but I've never tried it.
pdp10@reddit
Quite a long time ago, we used mrepo for CentOS and RHEL. In the recent era, we decided we could give up explicit mirroring in exchange for much less engineer attention and higher storage and bandwidth efficiency, and just use plain Squid caching of unencrypted upstream repo accesses. Squid uses maybe 50 MiB of memory and we give it a total of maybe 8 GB of high-endurance storage across peer caches. We do outbound access control from servers, within Squid.
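A cut-down squid.conf for that pattern looks roughly like this (sizes and ACL values are illustrative, not our exact config):

    # cache large package files, not just small web objects
    maximum_object_size 512 MB
    cache_dir ufs /var/spool/squid 8192 16 256

    # published packages are immutable, so keep them a long time;
    # everything else (repo metadata) still gets revalidated
    refresh_pattern -i \.(deb|rpm)$ 129600 100% 129600 refresh-ims override-expire
    refresh_pattern . 0 20% 4320

    # outbound access control: our servers may only reach known repo hosts
    acl servers src 10.0.0.0/8
    acl repos dstdomain .debian.org .centos.org mirrors.example.net
    http_access allow servers repos
    http_access deny all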
We still do store artifacts for reproducible builds.
blingmuppet@reddit (OP)
Thanks.
I do understand that approach and it definitely has its appeal, but we've had several missed patches recently because /one/ repo out of half a dozen has been down, often enough that it's become a bit of a thing for me. So I do want to try a local mirror, with weighted fall-through to internet mirrors.
Hopefully I can restrict what's mirrored enough not to use too much storage.
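On the EL side, the fall-through can be as simple as listing the local mirror first in the .repo file; dnf tries baseurls in order and moves on when one fails. A sketch (hostnames are placeholders, Rocky used as the example distro):

    # /etc/yum.repos.d/baseos.repo
    [baseos]
    name=BaseOS (local mirror, upstream fallback)
    baseurl=https://mirror.internal.example/rocky/$releasever/BaseOS/$basearch/os/
            https://dl.rockylinux.org/pub/rocky/$releasever/BaseOS/$basearch/os/
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Rocky-9

apt has no in-file equivalent, though newer apt versions have a mirror:// method that can read a prioritised mirror list.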
jaskij@reddit
I haven't done this myself, but have seen squid proxy recommended in the past. It is not repository mirroring software, but rather a plain HTTP/HTTPS/FTP caching proxy.
Ssakaa@reddit
On the redhat side, you can't beat satellite (which is just foreman, pulp, candlepin, et al.). On the debian side... do you really need a full mirror? I've used apt-cacher-ng a fair bit at home to good effect. Won't save you from the remote end being completely down for a long period, but will hide interruptions past the first pull.
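Pointing clients at it is one line; 3142 is apt-cacher-ng's default port, and the cache hostname is a placeholder:

    # /etc/apt/apt.conf.d/00aptproxy
    Acquire::http::Proxy "http://apt-cache.internal.example:3142";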
ThinkMarket7640@reddit
Satellite or a generic artifact manager like Sonatype Nexus / Artifactory.
blingmuppet@reddit (OP)
Modern Satellite is downstream of Foreman & Katello, which I have tried, but it's far too large for just a repo mirror (20 GB + 4 CPU minimum requirements for the installer to run!). Uyuni, which I mentioned, is a fork of old Spacewalk, which was upstream of Satellite. But I don't want to use that for this, for the reasons given.
I did look at Sonatype but again it's pretty heavy for just a repo manager.
We did use Artifactory historically, but for internal artifacts rather than as a repo mirror. There wasn't a lot of love for it when we dumped it, though, so I'm not keen to pick it up again.
Thanks for the suggestions though. I'm currently leaning more towards rolling our own using debmirror and reposync.
BalbusNihil496@reddit
Try using Ansible with a custom module for mirroring. It's flexible and scalable.
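A sketch of what that could look like, just wrapping the stock mirror tools in a playbook (host, repo IDs and paths are made up; a real custom module would add proper changed/failed reporting on top of this):

    # mirror-sync.yml
    - hosts: mirrorhost
      become: true
      tasks:
        - name: Sync EL repos into the local mirror tree
          ansible.builtin.command:
            cmd: >
              dnf reposync --repoid={{ item }}
              --download-path=/srv/mirror/el
              --download-metadata --newest-only
          loop:
            - baseos
            - appstream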
blingmuppet@reddit (OP)
Good suggestion. Thanks, I'll look into that.
jtwyrrpirate@reddit
I have used the reposync utility in the past with good results, RHEL example here: https://access.redhat.com/solutions/7019225
(If you don't have a login, you can make a free red hat developer acct)
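From memory, the usual shape is something like this (repo ID and path are examples; --newest-only keeps the footprint down):

    dnf reposync --repoid=rhel-9-for-x86_64-baseos-rpms \
        --download-metadata --download-path=/srv/mirror/rhel9 \
        --newest-only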
blingmuppet@reddit (OP)
Thank you - I'd overlooked that entirely and it was useful to read up on it. Unfortunately, it does seem to be RPM-based only, though, and I need to be able to support Debian and potentially other repos as well.