how do you handle user management on a large number of linux boxes?
Posted by baconwrappedapple@reddit | linuxadmin | 65 comments
I'm looking for more detailed answers than "we use AD"
Do you bind to AD? How do you handle SSH keys? Right now we're using our config management tool to push out accounts and SSH keys to 500+ linux machines instead of a directory service. It's bonkers.
dewyke@reddit
We have a slightly unusual setup. We use packages to distribute users, with pre and post-install scripts doing a lot of the setup and the package carrying the user’s public keys (SSH, and GPG), setting their passwords (for sudo) etc.
The user packages are dependencies of a main package, so to revoke a user you reconfigure that main package to conflict with the user, cut a new version and the orchestration tool pushes it out to all the servers (>2,000).
It’s weird, but it works surprisingly well. It has the advantage of meaning admin logins work independently of the machine’s ability to contact any central server, be that a domain controller or whatever.
SuperQue@reddit
I know a company like that. Most of the config management was done via deb packages. Weird, but not the worst thing I've seen.
Admin logins that work without contacting any central server: this is the primary reason to do it this way. When your "users" are critical for service debugging, you don't want to first be debugging auth flows.
Having AD/LDAP for a source of truth for users and groups is fine. But for direct server login it is something left to the weird world of Windows admins.
altodor@reddit
I have a weird Windows admin hat in my pile. I still support the use of Ansible playbooks for this. The SSSD config for servers is delicate, and I've had it break so badly that I now have a dedicated user in AD, set up so weirdly that it can only be used for unfucking the Linux boxes where SSSD breaks.
SuperQue@reddit
Yup. This is why we don't use SSSD in any of the production, five-nines service environments I've worked in.
I put the Windows hat in the corner in 1999, and I haven't seen it since 2002 or so.
segagamer@reddit
WinBind seems to work far more reliably than SSSD. I had SSSD randomly lose domain connection far too regularly and for seemingly no reason, yet I've not had to reauth WinBind once.
picklednull@reddit
wtf that sounds like a really weird setup. I don't even see any real benefit to it compared to the other "standard" options... If not bind to a central directory, just directly managing these via Ansible (or whatever) is simple enough.
Is there any benefit to this?
dewyke@reddit
We don’t use domain controllers (for a bunch of reasons) and this works well in our environment. It’s not perfect, for sure, and it struck me as odd at first too but it’s surprisingly effective.
videoman2@reddit
Don’t forget you can set up a CA for SSH so that servers trust keys signed by that CA. Set it up so a user's key is signed and only valid for 7 days at a time. A few technologies support automating this: Teleport, HashiCorp, etc. You should also be able to create a CRL (in OpenSSH terms, a KRL) that can be pushed out in the event of an emergency.
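As a rough illustration of the signing step, here is a minimal Python sketch that builds the `ssh-keygen` command for signing a user's public key with a CA key for 7 days. The paths, key ID, and principal names are placeholders for illustration, not anything from a real setup; tools like Teleport wrap this step in their own way.

```python
import shlex


def build_sign_command(ca_key, user_pubkey, key_id, principals, validity="+7d"):
    """Return the ssh-keygen argv that signs user_pubkey with ca_key.

    -s: CA private key
    -I: certificate identity (appears in sshd logs)
    -n: comma-separated principals (login names the cert is valid for)
    -V: validity interval; "+7d" means from now until seven days from now
    """
    # This sketch insists on at least one principal so a certificate
    # is always tied to named logins.
    if not principals:
        raise ValueError("at least one principal is required")
    return [
        "ssh-keygen",
        "-s", ca_key,
        "-I", key_id,
        "-n", ",".join(principals),
        "-V", validity,
        user_pubkey,
    ]


if __name__ == "__main__":
    # Hypothetical paths and names, for illustration only.
    cmd = build_sign_command(
        "/etc/ssh/user_ca", "alice_id_ed25519.pub", "alice@example", ["alice"]
    )
    print(shlex.join(cmd))
```

Running the resulting command writes the certificate next to the public key (with a `-cert.pub` suffix). Servers would trust the CA through the `TrustedUserCAKeys` option in `sshd_config`.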
Z3t4@reddit
Can't you use a CRL to revoke keys instead?
kdiffily@reddit
What is crl?
Z3t4@reddit
Certificate revocation list; its URL is usually included in the user cert itself. A CA can revoke a certificate it has generated. To do so it publishes a CRL, which is a list of revoked certificates signed by the CA.
os400@reddit
Certificate lifetime should be measured in hours at most, with minutes being even better. Gate certificate issuance behind SSO with strong authentication. That way you can rely on expiration as a passive revocation mechanism while avoiding the massive problems that come with CRLs.
Yupsec@reddit
This is fact.
robin-thoni@reddit
Smallstep's step-ca is also a great tool for this purpose
Carvtographer@reddit
We do simple SSSD configs, using direct AD bind instead of LDAP. It seems to be working fairly well, and we haven't had any major issues in the year or so since we implemented it. We then use AD groups to map sudoers for the IT team and the local user (if needed), folder access, etc. We haven't touched any GPO-related things through AD, so I can't comment on that.
We have a mix of local scripts and Ansible playbook pushes that are run on newly "imaged" Linux boxes. The scripts manually add the public SSH key for our Ansible accounts and prompt for the passwords needed. Then, depending on the usage of the machine, we have Task Templates (using Semaphore) to run different playbooks on a schedule.
Probably not the most straightforward, automated way, but it's been working for our deployments. We're also not running 500+ machines; we're currently at around 50-100, but I assume we'll get there in no time now that management is seeing how well we can configure these.
Utilizing AD was the single most important part of our deployments, since with local accounts we technically had no way to remove a user's access after they had been terminated.
baconwrappedapple@reddit (OP)
Do you distribute SSH keys for your users? How do you make sure those get revoked? If someone's password no longer works but their SSH key is still there, you haven't actually locked them out.
kdiffily@reddit
To revoke you’d have to run a script removing their public keys from all machines they could access.
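A minimal sketch of what such a removal script could do on each machine, assuming you track the revoked keys by their base64 blob (the long middle field of a public key line). The function name and the idea of passing a set of blobs are illustrative choices, not a description of anyone's actual tooling.

```python
def strip_revoked_keys(authorized_keys_text, revoked_blobs):
    """Return authorized_keys content with revoked keys removed.

    revoked_blobs is a set of base64 key blobs. Matching on the blob
    rather than the trailing comment means an edited comment cannot
    hide a key.
    """
    kept = []
    for line in authorized_keys_text.splitlines():
        stripped = line.strip()
        # Keep blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        # A key line may start with options, so check every field.
        if any(field in revoked_blobs for field in stripped.split()):
            continue
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

The caller would read each user's `~/.ssh/authorized_keys`, pass the text through this function, and write the result back; a config management tool could then run that across every host the user could access.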
UsedToLikeThisStuff@reddit
I’m not sure if others will mention it, but if you have it bound to AD you can use GSSAPI to connect with a valid Kerberos ticket, which makes it easy to invalidate rather than pushing around SSH keys. You can also store ssh keys in your AD schema.
sudoRooten@reddit
I tried setting this up awhile ago but was having trouble. Like the Kerberos ticket wasn't presenting the correct info or format that AD needed. To troubleshoot, I installed ssh server on a windows machine and Kerberos auth worked fine. So it's something specific with the Linux machines joined to AD. Need to try setting this up again.
UsedToLikeThisStuff@reddit
I believe you need a host keytab in the right location.
Master_of_Disguises@reddit
If they all mount the same (network) home directory, you only need to add ssh key to that user's authorized keys file and they'll be able to move around the entire network with that one key
Carvtographer@reddit
We normally don't manage SSH keys for users needing access to these machines, nor are there any really other machines they could need to get access to on our network from their desktop.
Thankfully our environment is a regulated one, so we don't have people SSH'ing into or out of the devices. By default, they are also not given sudo access to make changes, they put in a ticket with us for us to push libraries/changes/packages to their machines, or are given "temp" sudo with very basic perms, which are then revoked automatically after X hours. These machines are for users that are on-site and have to badge into the building to get to the desktops.
One of our Ansible playbooks queries information about SSH keys, etc., so that we have it logged in case we need it.
rautenkranzmt@reddit
You can use AD to distribute ssh keys.
Independent-Mail1493@reddit
I used to use LDAP with the OpenSSH LDAP public key schema extension. This allows you to store public keys in the LDAP schema. Once you have this set up you have to add a script to each system to look up the user's public key and configure OpenSSH to use it.
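A rough sketch of such a lookup script in Python, assuming keys live in the `sshPublicKey` attribute and that the OpenLDAP `ldapsearch` client is installed. The server URI and search base are placeholders, and the username check is an extra precaution added here, not part of the schema extension.

```python
import base64
import re
import subprocess
import sys

# Conservative pattern so the login name cannot alter the LDAP filter.
SAFE_USERNAME = re.compile(r"^[a-z_][a-z0-9_.-]*$")


def parse_ssh_keys(ldif_text):
    """Extract sshPublicKey values from unwrapped LDIF output."""
    keys = []
    for line in ldif_text.splitlines():
        if line.startswith("sshPublicKey:: "):
            # A double colon means the value is base64-encoded.
            raw = base64.b64decode(line.split(":: ", 1)[1])
            keys.append(raw.decode("utf-8").strip())
        elif line.startswith("sshPublicKey: "):
            keys.append(line.split(": ", 1)[1].strip())
    return keys


def lookup_keys(username,
                uri="ldaps://ldap.example.com",          # placeholder
                base="ou=people,dc=example,dc=com"):     # placeholder
    """Run ldapsearch for one user and return their public keys."""
    if not SAFE_USERNAME.match(username):
        raise ValueError(f"refusing unsafe username: {username!r}")
    result = subprocess.run(
        ["ldapsearch", "-x", "-LLL", "-o", "ldif-wrap=no",
         "-H", uri, "-b", base, f"(uid={username})", "sshPublicKey"],
        capture_output=True, text=True, check=True,
    )
    return parse_ssh_keys(result.stdout)


if __name__ == "__main__" and len(sys.argv) == 2:
    # sshd supplies the login name as the argument (e.g. via %u).
    for key in lookup_keys(sys.argv[1]):
        print(key)
```

On the OpenSSH side this would be wired up with `AuthorizedKeysCommand` (pointing at the script, with `%u` as its argument) and `AuthorizedKeysCommandUser` in `sshd_config`.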
TasksRandom@reddit
For system-level accounts (bin, sync, lp, proxy, www-data, nobody, backup, etc.), keeping them in /etc/{passwd,shadow,group} is a no-brainer. Use your configuration management to push them out as needed.
For generic, non-privileged, interactive users, if you have access to LDAP, or the means to set up an LDAP server, it's the natural answer. Whether that LDAP server is openldap, part of AD, ipa, etc. is left as an exercise for you.
If you don't have LDAP for whatever reason, or it's not viable due to network instability, machine mobility, etc., I've had success with libnss-extrausers on Debian-ish systems. It allows you to set up a separate passwd/shadow/group under /var/lib/extrausers, which is stacked to be checked (normally) after the standard flat-file databases. Your config management tool can push out these additional files as needed to stay up to date.
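As a sketch of the generation side, the following Python renders a passwd-format file from a list of user records, which a config management tool could then push to `/var/lib/extrausers/passwd`. The record layout and function names are made up for illustration; the host would also need `extrausers` added to the relevant lines in `/etc/nsswitch.conf`.

```python
def passwd_line(name, uid, gid, gecos="", home=None, shell="/bin/bash"):
    """Format one passwd(5) entry: name:password:UID:GID:GECOS:home:shell."""
    home = home or f"/home/{name}"
    # "x" in the password field means the hash lives in the shadow file.
    fields = [name, "x", str(uid), str(gid), gecos, home, shell]
    for field in fields:
        if ":" in field or "\n" in field:
            raise ValueError(f"illegal character in field: {field!r}")
    return ":".join(fields)


def render_passwd(users):
    """Render a whole file, sorted by UID so diffs between pushes stay small."""
    ordered = sorted(users, key=lambda u: u["uid"])
    return "\n".join(passwd_line(**u) for u in ordered) + "\n"


if __name__ == "__main__":
    # Hypothetical users, for illustration only.
    print(render_passwd([
        {"name": "bob", "uid": 5002, "gid": 5002},
        {"name": "alice", "uid": 5001, "gid": 5001, "gecos": "Alice A"},
    ]), end="")
```

The same approach would apply to the group and shadow files, with their own field layouts.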
Souper_User_Do@reddit
Just gonna save this post for later.. <3
miksu103@reddit
If you use Entra ID, check out Azure Arc and/or just bare SSH authentication with Microsoft Entra ID. I'm just implementing it for our setup, although at a much smaller scale. In short, a user uses the Azure CLI on their workstation to request an SSH key. Azure generates a signed SSH key, valid for one hour, for the user to use with that specific machine. This can be used with normal SSH connectivity, or combined with Azure Arc to tunnel without exposing any ports.
crankysysadmin@reddit
can you do bare ssh authentication with entra id with on-prem servers without azure arc? we are not a big azure shop, but we do use entra id for a lot of stuff internally.
jrandom_42@reddit
The comment you're responding to already explained how to do that: configure SSH certificate auth on your Linux boxes with Azure as a CA, then your Entra ID users can generate an Azure-signed keypair to log in with that's valid for 60 minutes.
crankysysadmin@reddit
I'd love to find a recipe for this. Googling didn't help; everything assumes Arc.
miksu103@reddit
https://learn.microsoft.com/en-us/entra/identity/devices/howto-vm-sign-in-azure-ad-linux
Just apt install aadsshlogin. Then that gets enrolled with a credential that you can get in your Azure portal. I did it last week as a proof of concept, but cannot find the exact command. My memory still says it was not a full Arc installation, just logging in with the aadsshlogin package.
jrandom_42@reddit
I just looked this up and I didn't realize it was free. We ditched Arc for patch management last year on account of pricing, but AAD SSH login (I presume with all the usual MFA options?) would be chef's kiss. I gotta forward this to some guys in the morning.
u/crankysysadmin did you know Arc SSH was free when you said you were looking for a solution without it?
jrandom_42@reddit
I'm guessing users have a PowerShell script that generates a fresh keypair and sends the public key to Trusted Signing to turn it into a certificate that allows Linux host login.
I guess there might not be a copy-pastable example of that out there, but you could probably 'vibe code' something to get you started.
master_reboot@reddit
Check out IPA. It's what I use. Not the best, but better than nothing.
Specific-Local6073@reddit
LDAP was invented exactly for use cases like this.
rankinrez@reddit
SSH certificates are worth a look.
I hear good things about both of these:
https://smallstep.com/docs/step-ca/
https://goteleport.com/ssh-server-access/
ISortaStudyHistory@reddit
MicroFocus (formerly Novell) has a product called ASAM. Requires eDir.
Samantha_Cruz@reddit
They are called OpenText now.
_mick_s@reddit
If you don't want to use AD, there's also FreeIPA/Red Hat IdM.
It works pretty well and has better support for Linux access and sudo rule management than AD.
jigga_wutt@reddit
Yup, we use FreeIPA to manage devs and contractors that need various levels of SSH access. We also use it as a DNS server. I don't love it, but it's an option, and it works.
YOLO4JESUS420SWAG@reddit
This is what we do because we have separation of duties. We collaborated with our AD team to set up a trust, then centrally manage sudo/HBAC via AD groups and IPA external groups.
If I could daily drive our AD then I would bind directly personally, instead centrally managing sssd via puppet/chef. Oh well.
dahid@reddit
This
idkau@reddit
SSSD and Ansible.
If a user is let go, they are blocked from even accessing the infrastructure so their SSH keys would be useless.
Chewbakka-Wakka@reddit
Look into Kerberos auth... AD is based on that.
Thamagorian@reddit
We have user account database software, which I think we created ourselves, which is then synced to a Microsoft AD and an OpenLDAP. The Linux workstations I help manage all use Kerberos (via PAM, not SSSD) for login and for accessing remote storage or servers. One part of the organization has its own domain, which uses Samba AD and SSH keys; that has been managed by Puppet, but they are working on moving it over to Ansible. Our Kerberos solution is probably not a great one, as it doesn't seem to work with Linux distro versions newer than the ones we currently use on our workstations.
fr3nchP1ckler@reddit
We use Centrify by Delinea Software, not sure what it costs but we just have to install their package on our hosts and it binds seamlessly with our AD environment. It has some cool features as well like being able to set crons for root, copy files onto the systems, configure access management, and dynamically register its IP with DNS all through GPOs.
We don’t manage SSH keys for users though so not sure if they have a module that can assist with that or not.
TheTomCorp@reddit
Our use case is an HPC service; we have one datacenter with all of our stuff in it. A user gets an AD account from IT. All of our machines point to our OpenLDAP servers using SSSD, and those do pass-through authentication to the AD servers over LDAPS. All of our /home is an NFS share, so we have a "new user script" that makes an account, makes keys, and sets quotas. No need to distribute keys if it's a shared file system.
michaelpaoli@reddit
One can integrate AD into LDAP, so can then leverage that to do single-sign-on for most platforms (most *nix/Microsoft/Apple), and the AD can be hosted on Microsoft or Linux.
Yes, there very much are ways to do that. Notably also AD can accommodate additional data to well handle *nix, and then LDAP can leverage that. E.g. on the AD side, mostly supplement AD login name with *nix UID. Additionally, probably also group memberships (primary and supplemental) - but that could be on the AD side or the LDAP side. Likewise UID/GID name mapping, etc. But use AD for anything requiring user's password authentication. Also, MFA can be added/enforced with AD (or per account) (or on the LDAP side)
Various possible ways, e.g. have policy, monitor, enforce. Can also (dis)allow use of ssh keys on a per-user (or per-group) basis. One can also do ssh certs, issue those to users when they authenticate to AD or LDAP, and set the certs with expiration times. These can be very short (e.g. 30s, just to allow a single login from a fresh (re)authentication), or can be longer periods, e.g. an hour, 8 hours, 10 hours, 12 hours, 24 hours, a week, a month, etc. Can also do ssh certs for applications, notably to better manage and enforce their rotations, though that's not the only possible way.
Not necessarily bonkers if it's sufficiently well done and automated, but typically preferable to use centrally managed authentication, e.g. LDAP or AD via LDAP.
And of course, for "LDAP", everything there should actually be using ldaps on the wire, with proper certs and management thereof. None of that in the clear across networks (and preferably not even if local/internal and not using any physical network).
Anyway, been in environments where this has been highly well done, and including going back decades.
Alas, beware that some Linux distros have dropped support of LDAP (most notably so they can sell you their own commercial proprietary licensed sh*t instead).
Remember also, PAM is your friend - much can also be done and/or customized there as may be appropriate.
UsedToLikeThisStuff@reddit
I assume you’re talking about RHEL deprecating openldap-server? That’s a bit misleading, RHEL continues to work with LDAP as a client, you just need to use the open source FreeIPA for server (or RH identity server if you want to pay for support).
PE1NUT@reddit
We run a pair of redundant, replicating OpenLDAP servers serving secure LDAP (ldaps). We created our own CA, use it to sign the certificates on the LDAP servers, and push the CA certificate to all the servers using Ansible. This dates back to the days when SSL certificates were pretty expensive. This setup has been running for nearly two decades without issues.
User administration is usually done through LAM (ldap account manager).
Users can change their own password through the EXOP option.
Fortunately, no AD involved here at all.
linuxfighter_haea@reddit
With a jump host that can access all the others, using ssh-agent and principals.
chock-a-block@reddit
No one uses AD directly. You can use the weird, broken ldap Microsoft has.
FreeIPA is a good choice. It is not without its flaws, though.
The other choice is Kerberos with an LDAP backend. Very reliable. Scaling will never be a problem.
hselomein@reddit
I have 137 Linux machines using ad directly.
CombJelliesAreCool@reddit
Can you elaborate on hunting ghosts?
chock-a-block@reddit
Among other things, sssd caching is an enigma wrapped in a mystery.
LOTS of interactive servers make the freeipa magic happen. Sometimes they fall over.
gordonmessmer@reddit
Very many sites use AD directly. I've worked in some.
SuperQue@reddit
I'm sorry, that sounds horrible.
a_cc_a@reddit
We are looking into https://github.com/himmelblau-idm/himmelblau.
myownalias@reddit
Jumpcloud is another option.
rottgrub@reddit
Look into FreeIPA. It's easy to roll out, easy to use, and stable. I've been using it to manage around 150 machines for the last 5 years with zero outages or issues.
The only caveat is to run the IPA server on a RHEL-style Linux, like Alma or Rocky. It's not well supported on Debian-based distros like Ubuntu or Mint. Debian-based clients are fine.
GertVanAntwerpen@reddit
SSSD, directly coupled to AD, works well and is stable (although it’s a bit slow). Handling SSH keys is the user's personal responsibility.
SuperQue@reddit
The question is more: what do users log in for?
For a very long time now, at the places I've worked, the "Linux boxes" have been servers. Users are software engineers logging in to debug.
We did essentially "config management tool to push out accounts and SSH keys" for thousands and thousands of servers.
The main reason is we never want a network service in the critical path for debugging. It's always the first thing to go when there is a network or server issue that can degrade the quality of the system. By having everything pre-populated, we have the best chance of being able to access systems when things are degraded.
However, the modern thing to do is use SSH certificates, not keys. Instead of pre-distributing keys, the server authenticates the user based on a short-to-medium term certificate, minted by a high quality directory service. Tools like Cashier can be used.
kyleh0@reddit
In smaller environments I've typically just used ssh keys, depends on exactly what I'm trying to secure. There are a ton of potential use cases that can't be blanket answered I don't think.
NL_Gray-Fox@reddit
Kerberos, LDAP, sssd.
SSH public keys are stored in LDAP (sshd has a setting, AuthorizedKeysCommand, to fetch the keys through a script; the script is an LDAP query).
Bebop-n-Rocksteady@reddit
LDAP for users then sync users to platforms such as Foxpass or Teleport for SSH key management.