Please don't touch DNS
Posted by Nstraclassic@reddit | talesfromtechsupport | 78 comments
This is more of a rant but maybe someone will find comedy in my pain.
Quick background: We hired a new L1 tech a couple weeks ago. He's super green so needs a lot of handholding but other than that he's been great at absorbing lower level tickets and he's been catching on quick. I've been working on a DC migration for a couple weeks and today at noon we had the final cutover scheduled after decommissioning 1 of the 3 DCs on Monday.
This morning one of their users called in reporting a few users having connection issues. Our new L1 took the call and started troubleshooting. He grabbed me a couple times asking about how their DNS and DHCP are set up so I gave him the IP for their new server but after an hour of them being on the phone I started getting a little nervous..
I checked in again and apparently at some point the end user decided he was going to start setting static IPs and DNS on workstations per some ancient internal doc he found. I told my L1 to get him to fucking stop because he doesn't know what he's doing and then got pulled to put out another fire. Didn't hear any more so assumed (big mistake) the message got through because no more issues got reported.
I called their PoC to confirm the cutover and server reboots and started transferring roles, removing services etc. from the old server. I called them back after the final reboot, did some checks and was ready to say the project was done until 10 minutes later the PoC called back frantic saying everything is down. I walked her through checking the adapter settings on one of the workstations and sure enough it had a static IP within the DHCP scope and DNS was set to the server I had just decommissioned....
I asked my L1 what the fuck happened this morning and he said Johnny ran around to every single workstation and "fixed" the issue. I told our PoC and said I'm on my way over... 3 hours later the 2 of us finished unfucking the entire building of ~20 users, I apologized for not being more aware of what the 2 of them were up to and contemplated driving my car off a bridge.
Please, for the love of god don't touch DNS settings
RenderedKnave@reddit
to his credit, he did RTFM, it's just that the FM was F'n wrong
OldGeekWeirdo@reddit
Let this be a lesson - purge outdated docs.
rickbb80@reddit
Doc's? What doc's?
Honest_Relation4095@reddit
That's almost impossible. You may purge them from known locations, but that doesn't mean someone doesn't still have a local copy or even a printout, and they may even circulate them. Even announcing document updates through company-wide emails doesn't always work.
faithfulheresy@reddit
As someone who tried to get some form of document control in place at a SME, it's utterly fucking impossible unless you (re)build everything from the ground up and literally don't allow personal storage.
Some of the dumbest shit I have ever seen.
Honest_Relation4095@reddit
The other attempted way is mandatory trainings, that let employees know about the official storage location of the latest document version.
Puzzleheaded-Joke-97@reddit
Don't forget all those "This one trick" videos that can bypass written docs!
Rathmun@reddit
Start scheduling company wide meetings about them. When someone inevitably complains that the meeting should be an email, respond with "They used to be. No one read them."
soberdude@reddit
Should I follow the DOS or Acorn instructions?
peterdeg@reddit
Copilot will still go and find every instance you missed though.
VexingRaven@reddit
*SharePoint Search will find it. You can find old docs just fine without copilot.
decreed_it@reddit
Which is one positive use case, actually. Hunt and kill.
dreaminginteal@reddit
With most docs, that means all docs. Almost everything is obsolete the moment it is written down...
bemenaker@reddit
Yet ITIL demands everything be written down
Particular-Way8801@reddit
Printed doc from '97 hanging around and being the bible
TheFluffiestRedditor@reddit
Hey, it was only out of date by an hour, give them some credit
OldGeekWeirdo@reddit
Out of date in the sense it wouldn't work, but it sounds like it was out of date from how the customer was intended to run by quite a bit.
ttlanhil@reddit
That's assuming you have control of them
If it's a document you don't want anymore, that means a few random employees have already downloaded it, made their own notes on what everything really means, and shared those notes with a few colleagues...
OldGeekWeirdo@reddit
You still have to make the effort, or else someone will find it at the wrong time. It's not an absolute fix, but you can tilt the odds in your favor.
ttlanhil@reddit
Yep. Need to do it, just assume a user has usered at any possible point!
Ackapus@reddit
It's not DNS.
It's never DNS.
It was DNS.
syntaxerror53@reddit
Darn Network Settings.
sqfreak@reddit
It's not DNS
There's no way it's DNS
It was DNS
syntaxerror53@reddit
Is that Damn Numpty Son? /s
faithfulheresy@reddit
So many times I have had this exact discussion.
Literally the first 15 seconds of fault finding and I'm going "It's DNS", and everyone looks at me like I'm a madman, so I excuse myself and find other work to do.
Two days later (yes, seriously!) they figure out that it was DNS and do exactly what I had suggested nearly 50 hours earlier.
Stryker_One@reddit
It may never be Lupus, but it's always DNS.
cactuarknight@reddit
Except that 1 time that it was actually Lupus.
Stryker_One@reddit
And that one time that it wasn't actually DNS. The exceptions that prove the rules.
Robbins-Min313@reddit
Oh no, this sounds like it's heading toward disaster! What exactly did the L1 end up doing to the DNS during your DC migration - did he try to "fix" something that wasn't actually broken?
Polenicus@reddit
I work in support for IP camera security systems. We fix cameras, software, and servers. What we don’t fix, SPECIFICALLY, are networks. We tell them, “Your network must be good, pings must be consistent, and these ranges need to be open.”
That’s IT. No magic, just a handful of ports, and the damn thing can manage 4 sent and 4 received.
The amount of network fuckery I’ve seen where they scream that’s unreasonable. From a wired Cat5e network. “You need to adjust your software to make it work!”
Dude, your pings are failing 99% of the packets. You can’t run a goddamned high-resolution security cam on a connection that can’t even load Google!
THEN they demand we fix it.
We don’t set up networks. We don’t troubleshoot networks. We don’t fix networks. Our software doesn’t do any networking, it just runs on a Windows server with a network connection.
There is no fight you will get from an end user like a network fight. As far as they are concerned, they are GOING to do it wrong, and it’s YOUR job to make it work.
Oddly enough it has never once gone that way, no matter the stink they raise.
LeomundsTinyButt_@reddit
I would kill for IT at my employer to do just that. My VPN connection drops all the damn time, which sucks extra hard when you're running long-lived processes on SSH terminals. I've asked them to please just tell me what they need. They don't need to mess with my home network, I can do that. I just need to know what the hell it is this custom VPN software wants... The answer? "We don't support employees' home networks" sigh.
Looks like I'll have to reverse-engineer the damn requirements. But I will die on this hill: if I find the problem and it's just some firewall/NAT thing IT could have told me about, the time I waste on it is getting added to my work hours.
fresh-dork@reddit
screen and you just have to reconnect
LeomundsTinyButt_@reddit
I know... I never seem to get around to setting it up, and I should. But in this case it's not a full solution, because I also use vscode in remote SSH mode, and I don't think there's a way around that one. It tries to reconnect for a while, then just gives up and asks to reload the screen. So off I go to copy the whole file I'm working on, reload screen, paste back the changes. And if, god forbid, I've changed multiple files since the last save, off to the "Notepad transfer area" I go.
GetSecure@reddit
Sounds like you should start installing your own network and charge more. What you described is entirely predictable and exactly what I expect would happen when you piggy back on their own network.
Mr_ToDo@reddit
Oh god. That way lies IOT all running off wireless
I get the idea, but how many businesses are going to OK putting up a second physical network just to get their IOT of the day running?
nobjangler@reddit
We do this in the POS world. We require every merchant to use our router/switches/cell backup and if they don't we have a nice long agreement with multiple initial sections that says how we need it to operate and if it doesn't we can't guarantee it (we mainly need this when dealing with things like cafes inside banks where we aren't allowed to replace their network and such).
Ich_mag_Kartoffeln@reddit
A whole separate network?!? We can't afford that! Just add it to our existing network.
What do you mean your cameras don't support 10BASE2?
Roguefem-76@reddit
Well, if the appliance you sell them doesn't work when they plug it in then clearly it's your job to rewire their house, duh! cUsToMeR sErViCe!!
EkriirkE@reddit
Who is Johnny in this story?
commentsrnice2@reddit
Boss’ son aka Mr I know better than the expert
ImedgeQc@reddit
He got a silver hand.
Ibe_Lost@reddit
Yeah, had something similar. Set the kids' new laptops up during covid, locked down IPs etc. on the home network. They went to school and it took the IT a while to realize why they wouldn't connect 100% to their almost fully open high school network.
SemtaCert@reddit
"the end user decided he was going to start setting static IPs and DNS on workstations per some ancient internal doc he found"
How does the end user have access to change IP and DNS settings?
Nstraclassic@reddit (OP)
It was the owner's son who's also an employee so he had an admin password..
Money4Nothing2000@reddit
He altered the DNS lookup, pray that he doesn't alter it further.
GuessSecure4640@reddit
Why not give him a local admin on his PC instead of domain admin?
SemtaCert@reddit
Well he shouldn't have an admin password.
Nstraclassic@reddit (OP)
Their network is self managed. We just do projects and help maintain the equipment for the most part.
markus_b@reddit
Why was he not called back to fix the mess he created?
Nstraclassic@reddit (OP)
Well he had left for the day and do you think he was capable of fixing it?
handlebartender@reddit
Capable or not, it sounds like the only way he’ll learn is through personal suffering. His, not yours.
That said, if pulling him back into the fray is likely to be a pain multiplier for you personally, then I can see why you would want to avoid that.
azama14@reddit
u/Nstraclassic I have a gentle suggestion; just leave the son's workstation set to static. He can discover his 'fix' didn't work and unfuck it himself when he learns the rest are fine.
markus_b@reddit
Great suggestion!
JaschaE@reddit
Did you skip "Owner's son" in his job description?
markus_b@reddit
Especially because he was the owner's son, going against explicit instructions.
88theylive88@reddit
Maybe they were using a hostfile mod?
Glitch-v0@reddit
Truly RBAC was lacking
JaschaE@reddit
Good question, also: Hey, at least there is documentation a user can follow, next step: Keep it up to date!
Harry_Smutter@reddit
This was a whole mess. Also, 3 hours to reset DHCP settings on 20 computers?? What??
Nstraclassic@reddit (OP)
I mean I had to drive there and there was a lot more broken than workstation adapter configs. IP conflicts, printing was fucked, internal lookups fucked, one workstation network stack became completely corrupted somehow, they have some obscure version of linux on a shop PC that didn't accept typical commands. I also don't work in the building so needed someone to show me to each computer. But hey if you have a magic wand to fix all that in one go send it over
Harry_Smutter@reddit
Context matters. You left all of this out except fixing 20 PCs.
Nstraclassic@reddit (OP)
I mean none of it was relevant and tbh most experienced IT people would know screwing with adapter settings across an entire network impacts more than just basic hostname resolution
st33p@reddit
I believe that someone will be selling puppies in the near future.
GuessSecure4640@reddit
That'd take me about 15-20 minutes tops?
Transmutagen@reddit
"DHCP wasn't working properly for 5 seconds so I decided to personally fuck up every single workstation until someone who knows more than me can fix them"
caraar12345@reddit
Genuine question: would it not have been a good idea to add the decommissioned DC IP as a secondary IP address on the new one? Then anything set up to access the old one directly would be re-routed to the new one.
I am not super well versed in AD networking though so I imagine there are a number of footguns there
thevoidhearsyou@reddit
This is where privilege levels come in handy. Had that one guy who loved to change everything, to the level that it took hours to change things back so everything worked, only for him to change it back and repeat. Eventually got the go-ahead to change the privilege level of everyone who wasn't IT or management. Email goes out and after the change Mr I knows better screams he can't change anything. Fresh copy of email is sent and HR is notified per protocol. Guy still is pissed but keeps the ticket volume low.
savevicleo@reddit
sorry i'm gonna need some acronym explainers, because i only know DC as direct current and PoC as people of color...
Maleficent-Pin6798@reddit
In this instance, PoC is point of contact. DC is indeed Domain Controller; windows networking server, in essence.
Stryker_One@reddit
Domain Controller
Point of Contact
harrywwc@reddit
in this context - "Domain Controller"
Tegumentario@reddit
He read it in the docs though.
nmrk@reddit
Screwing up DNS? Hey that's MY job!
cofclabman@reddit
Working in higher ed with students using personally owned devices, it's not at all uncommon for them to be set using Google DNS or Cloudflare DNS because their friend told them it was faster. Works great until you want to print your homework 10 minutes before class and all our print servers are on the internal network that doesn't route to the outside world.
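For anyone who wants to spot this on a device before the print job fails, here is a rough Python sketch of the idea: compare the configured resolvers against the ranges you consider internal. The function name, the sample addresses and the `10.0.0.0/8` range are all made up for illustration, and how you collect the configured resolvers from a device is left out.

```python
import ipaddress

def resolvers_outside(configured, internal_networks):
    """Return the configured DNS servers that are not inside any internal network."""
    nets = [ipaddress.ip_network(n) for n in internal_networks]
    return [
        r for r in configured
        if not any(ipaddress.ip_address(r) in net for net in nets)
    ]

# Hypothetical example: one public resolver, one internal resolver.
outside = resolvers_outside(["8.8.8.8", "10.0.0.53"], ["10.0.0.0/8"])
print(outside)
```

Anything this returns is a resolver that will never know about internal-only names like a print server.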
TinyTC1992@reddit
Should have just spun up another dns server with the old ip, and when you got access back via your RMM platform could have just one shot pushed a command to change the adapters back to dhcp.
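As a rough idea of what such a pushed command could look like on Windows, here is a small Python sketch that only builds the two `netsh` command lines for switching an adapter's address and DNS back to DHCP. It does not run anything. The helper name and the adapter name "Ethernet" are assumptions, real adapter names vary per machine, and the commands need an elevated prompt.

```python
def dhcp_reset_commands(adapter_name):
    """Build netsh command lines that set an adapter's IP and DNS back to DHCP."""
    return [
        f'netsh interface ipv4 set address name="{adapter_name}" source=dhcp',
        f'netsh interface ipv4 set dnsservers name="{adapter_name}" source=dhcp',
    ]

# "Ethernet" is a placeholder adapter name, not taken from the story.
for command in dhcp_reset_commands("Ethernet"):
    print(command)
```

Generating the strings first and reviewing them before pushing anything seems like the safer order of operations, given how this story started.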
Nstraclassic@reddit (OP)
If only. We don't have our RMM installed on their workstations. It's a co-managed scenario and we only help manage the infrastructure
TinyTC1992@reddit
Oooof that adds an extra flavour of fuckery!
ponakka@reddit
Or rather don't set the static ips to dhcp range?
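A quick way to sanity-check that, sketched in Python under made-up values: given a list of static addresses and the start and end of the DHCP scope, list the statics that land inside the scope. The function name and all addresses below are hypothetical, not from the post.

```python
import ipaddress

def statics_inside_scope(static_ips, scope_start, scope_end):
    """Return the static addresses that fall inside the DHCP scope (inclusive)."""
    start = ipaddress.ip_address(scope_start)
    end = ipaddress.ip_address(scope_end)
    return [ip for ip in static_ips if start <= ipaddress.ip_address(ip) <= end]

# Hypothetical scope of .100 to .200 and three statically configured hosts.
conflicts = statics_inside_scope(
    ["192.168.1.10", "192.168.1.120", "192.168.1.250"],
    "192.168.1.100",
    "192.168.1.200",
)
print(conflicts)
```

Any address it returns is one the DHCP server may also hand out to another machine, which is how you end up with IP conflicts.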