When a problem is a priority for the customer... but not enough to assign capable IT personnel to it.
Posted by bagofwisdom@reddit | talesfromtechsupport | 31 comments
So my company made a hardware revision to one of our products. Unfortunately that revision revealed a nasty bug in the embedded software for the device. Let's just say a handful of units shipped in a state where they won't connect to a network after being left on for more than 5 minutes. It took a few returned units to our reliability engineer, but we found the root cause. The fix was a firmware update of less than 500 kilobytes. Easy peasy to upgrade.
So I get roped into an escalation call. I get called in, managers above my paygrade are called in, this is a five-alarm fire to a bunch of non-technical people. Customer is fuming they have a bunch of these devices that don't work. I'm in this call for less than 2 minutes and say "What firmware are these units on?" Customer comes back with the older version with this bug. I say "We fixed this in the latest firmware upgrade. I apologize my field tech didn't catch this when they did your implementation. Let me get that firmware file to you with instructions on how to install it." I make a note to myself that later on I need to lecture my team (again) to ALWAYS upgrade the firmware.
Customer successfully gets 8 out of the 9 affected devices upgraded. Number 7 is giving us a little difficulty. The IT person assigned to this task couldn't connect to the WebUI to complete the upgrade. It happens, sometimes the IP address isn't what we think it is, and this customer opted for DHCP with no reservation. I told them to just reset the device to factory defaults using the reset button. I provided the default static IP that comes up after reset.
I then get an e-mail from the IT person doing this project. "I can't connect to that default IP either." Since this is the customer acting as remote hands for me, I make sure I'm dealing with someone that can at least spell TCP/IP. "Are you on the same Layer 2 network as the device and assigned an address on your PC that's in the same subnet as the device?" Customer comes back that they are remoted into their laptop (???) which should be on that network. Then proceeds to whinge about the other stations working fine.
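For anyone who wants the subnet half of that question spelled out, here is a minimal sketch using Python's standard `ipaddress` module. The addresses are made up for illustration (the post never gives the device's real default IP), and this only checks Layer 3 addressing. It says nothing about whether the PC is actually on the same Layer 2 segment.

```python
import ipaddress

# Hypothetical default address, NOT the real product's default.
DEVICE_DEFAULT_IP = "192.168.1.100"

def same_subnet(pc_interface: str, device_ip: str) -> bool:
    """Return True if device_ip falls inside the network configured on the PC.

    pc_interface is the PC's address with its prefix length, e.g. "192.168.1.50/24".
    """
    iface = ipaddress.ip_interface(pc_interface)
    return ipaddress.ip_address(device_ip) in iface.network

# A laptop patched in with a matching static address can reach the device.
print(same_subnet("192.168.1.50/24", DEVICE_DEFAULT_IP))  # True
# A laptop still holding its DHCP lease from the customer's network cannot.
print(same_subnet("10.20.30.40/24", DEVICE_DEFAULT_IP))   # False
```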
Great... I got the intern. I explain why they could get to the others but not this one that we've reset to factory defaults. "Then I can't do it." I further explain that someone will physically need to be at the location to connect to that device to bring it back online.
Could our product team have made DHCP the default? Yes, but I'll tell you why a static default IP is easier. When I have 30+ of these devices that need configuration (usually a simple set it once and never touch it again) it is more convenient for me to just patch straight in with my laptop and just keep the same IP address in my browser window. The WebUI does firmware upgrades and configures the device along with assigning it to the customer. Once they're set up, they rarely get touched again until something mechanically breaks down.
At this point, I'm ready to tell the project manager to just send the customer another unit. I have 20 people I have to ride herd on and I don't have time to train customer interns.
tl;dr Customer had a problem that was such a priority they assigned their best intern to it.
FluxMango@reddit
In this case replacing the device with a patched one and having them RMA the "defective" one seems to be the most sensible option, short of going on-site.
If they were able to configure all but one device, chances are pretty good the problem is not on their end. I once made the mistake of underestimating a customer, and it was quite the lesson in humility.
ParpticularCicada4@reddit
Classic case of 'my emergency is your inconvenience'—gotta love tech support!
Geminii27@reddit
Hmm. Purely speculatively... if I was designing it, I'd have the device look for a DHCP server by default on boot, but allow someone physically near it to press a button (probably a membrane switch) that makes the device respond on a fixed IP for 90 seconds or thereabouts, and locks that in until whatever was plugged into it (laptop etc) stops generating traffic or responding to packets for 30 seconds (or the switch is toggled again).
Also, external visual indicators. Is the device currently using a DHCP address (LED #1), a static IP (LED #2), or searching for a DHCP address after boot (flashing slowly between the two)?
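One way to read that design is as a small state machine. The sketch below is only an illustration of the idea: the class, method names, and LED strings are all hypothetical, and it assumes the 90 seconds is the window for a client to start talking, after which the fixed IP stays locked in until 30 seconds of silence or another button press.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class Mode(Enum):
    DHCP_SEARCHING = auto()  # booted, no lease yet
    DHCP = auto()            # running on a leased address
    STATIC = auto()          # answering on the fixed IP

FIXED_IP_WINDOW = 90.0  # seconds the fixed IP is offered after a button press
IDLE_TIMEOUT = 30.0     # seconds of client silence before giving up the fixed IP

# Hypothetical indicator mapping for the three states.
LEDS = {
    Mode.DHCP: "LED #1 on",
    Mode.STATIC: "LED #2 on",
    Mode.DHCP_SEARCHING: "flashing slowly between LED #1 and LED #2",
}

@dataclass
class NetState:
    mode: Mode = Mode.DHCP_SEARCHING
    pressed_at: float = 0.0
    last_traffic: Optional[float] = None  # None until a client has talked to us

    def _back_to_dhcp(self) -> None:
        self.mode = Mode.DHCP_SEARCHING
        self.last_traffic = None

    def on_dhcp_lease(self) -> None:
        if self.mode is Mode.DHCP_SEARCHING:
            self.mode = Mode.DHCP

    def on_button(self, now: float) -> None:
        # Pressing again while on the fixed IP toggles back to DHCP.
        if self.mode is Mode.STATIC:
            self._back_to_dhcp()
        else:
            self.mode = Mode.STATIC
            self.pressed_at = now
            self.last_traffic = None

    def on_traffic(self, now: float) -> None:
        if self.mode is Mode.STATIC:
            self.last_traffic = now

    def tick(self, now: float) -> None:
        """Call periodically; drops the fixed IP when a timer expires."""
        if self.mode is not Mode.STATIC:
            return
        if self.last_traffic is None:
            if now - self.pressed_at >= FIXED_IP_WINDOW:
                self._back_to_dhcp()
        elif now - self.last_traffic >= IDLE_TIMEOUT:
            self._back_to_dhcp()
```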
I guess there's always the option of having the laptop run a scan for any IPs responding to a particular packet on a particular port, and then dumping that IP to a browser, but admittedly it's not quite as elegant as just knowing what the IP is going to be.
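A rough sketch of that scan, using only the Python standard library. It uses a plain TCP connect as the probe, which is my assumption rather than whatever "particular packet" a real device would answer, and the function names are made up for illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def responds(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all of them as "not there".
        return False

def scan(hosts, port, timeout=0.5, workers=64):
    """Probe every host on one port in parallel; return the ones that answered."""
    hosts = list(hosts)
    if not hosts:
        return []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda h: responds(h, port, timeout), hosts)
        return [h for h, ok in zip(hosts, results) if ok]
```

To sweep a whole subnet for a WebUI, something like `scan((str(ip) for ip in ipaddress.ip_network("192.168.1.0/24").hosts()), 80)` would do it (with `import ipaddress`, and with the subnet and port swapped for the real ones). The first hit could then be handed to `webbrowser.open`.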
Engineer_on_skis@reddit
It depends on how useful the thing is out of the box with no config.
If it's completely useless until it's been configured, what's the point of having it get a random IP address that you would then have to search for with a network scan or by looking at the router's client list?
If it can do some useful things out of the box, then that sounds like a good solution (if slightly more expensive and complex).
mercurygreen@reddit
They probably assigned their smartest "computer guy" instead of hiring an ACTUAL "computer guy."
I approve of devices having default IP addresses (and passwords) after a hard reset. Makes it easier to find it without having to trace down what address a headless client is.
OgdruJahad@reddit
Also it's a good idea to have a network scanner on your phone just in case. Then you would be able to scan the network to find the device as long as you're connected to the same WiFi.
mercurygreen@reddit
Oh, every one of us probably has an IP scanner and a WiFi scanner on our phone, and a batch file that will extract all the WiFi passwords that are stored on a laptop. I'd rather not spend the time just finding it.
OgdruJahad@reddit
What other tools do you use on your phone? There are some I want to try but have no clue how to use, and others I just haven't needed to use.
mercurygreen@reddit
Some of the Heatmap apps are pretty good, as they can show problem areas.
We're an M365 shop, so all of the apps that go with it, as well as either user/console apps for ANY product we have. Turns out that there are a LOT of things that people write apps for.
OgdruJahad@reddit
Have you ever used Servers Ultimate on Android? I want to use the PXE server option, but I don't know how to get it to work.
mercurygreen@reddit
I haven't - it's not compatible with any of my devices, which is odd as I have a pretty up-to-date Galaxy phone and tablet.
OgdruJahad@reddit
You might be able to sideload it then. Weird as I only had to sideload the expansions not the original Servers Ultimate app.
mercurygreen@reddit
I'm sure I COULD sideload it - and it looks like it would work on an old AmazonFire I keep. But we block things from our WiFi to our servers pretty seriously. (We're a school; believe me, they're ALL trying to do stupid things!)
OgdruJahad@reddit
Ok nevermind that one. Have you ever tried Termux? It's like a mini Linux on Android! It even supports common terminal apps like nmap!
OgdruJahad@reddit
Ok that's fine. It's such a cool app, and it can support so many servers!
pythbit@reddit
That's fine when the device isn't 10 hours away with no on site technical staff, but I can see that being ok in this case.
mercurygreen@reddit
Eh, we could both make arguments on both sides of this.
bagofwisdom@reddit (OP)
Cases can be made either way. Since my techs have to touch them to assign them to the customer and upgrade firmware, static default IP works best for us. When I was still going in the field I could get a device ready to go in about 3 minutes. Once a customer is deployed they get any warranty replacements ready to go out of the box.
Anonymous_user_2022@reddit
You could be one of my colleagues. We have a product that is so dependent on the ancient IBM RIC serial card that we ended up making our own version when the official one was discontinued. The present iteration runs a somewhat flaky IP stack that tends to throw a hissy fit when something else dares to be on the same class C subnet.
bagofwisdom@reddit (OP)
In our case one of our suppliers discontinued a part and offered us a substitute. Our engineers ass-u-me'd that it was a drop-in replacement with no testing required.
Stryker_One@reddit
We had a supplier deliver us a part from a different manufacturer and assure us they were equivalent. They were not.
PantaRheiExpress@reddit
They don’t have any mobile smart hands that can be moved around to sites in an emergency? Let’s say something like the Crowdstrike outage happens to this company. They’ve got a BSOD on 70% of their devices, and it can’t be remotely fixed by senior architects. What’s the plan? Throw interns at it? Pray to whatever deity they believe in?
beboshoulddie@reddit
We created a bootable USB image and then threw interns at it with a fleet of USB drives :)
hicow@reddit
Couple months ago, we had a Windows update bork a half-dozen desktops somehow. The contracted, out-of-state MSP tried nothing and ran out of ideas, and was talking about contracting a tech to physically go out to fix them. They all just needed a BIOS update. It took all of twenty minutes to fix.
Not looking forward to the next time we get hit with ransomware. And the MSP has proven themselves a little lacking in ensuring backups are even being made on the machines that need them, let alone that any backups that do exist have been tested.
Stick-Man_Smith@reddit
IT is a cost, you see. Meanwhile, emergencies are covered by insurance.
dreaminginteal@reddit
Even when it’s not covered by insurance, it still comes out of Someone Else’s Budget.
Techn0ght@reddit
When a customer is so cheap they give an important project to an intern. If these things aren't 10k profit each it's not worth the time to coddle this customer.
HINDBRAIN@reddit
Yeah that's commonly done.
GodOfUtopiaPlenitia@reddit
"Best," or "Most Expendable?"
Chocolate_Bourbon@reddit
I’ve had jobs where the intern is the most technically capable person in the building.
dmills_00@reddit
Disturbingly often that intern really is the best they have when it comes to networking...
Could be worse, you could be trying to explain multicast to a customer's outsourced "IT team". FML, that was an annoying conversation.