Logic Monitor - Good or Hype?

Posted by Inquisitor_ForHire@reddit | sysadmin | 19 comments

Good Morning all!

Is anyone currently transitioning, or has anyone recently transitioned, to Logic Monitor from other platforms? My org has a very disjointed monitoring setup. We have Nagios, Icinga, SCOM, SolarWinds, vRealize, Oracle Database WhateverItsCalled, and a slew of others - all implemented over the years to solve various problems.

We want to bolt some level of observability on top of what we've got, but we wonder if it would be better to start consolidating the various disparate platforms into one central platform to make the implementation easier.

We can take Grafana and hook it into everything we've got, but that's obviously a lot of work. We also don't really have anything that's currently correlating and reducing noise across the environment. Logic Monitor integrates with our CMDB and change processes in ServiceNow and looks like it would massively improve this.

We haven't talked to Logic Monitor yet... right now we're looking it over and evaluating and I want to find out what other folks think of it.

We're a largish enterprise - about 18,000 servers and stuff in pretty much every cloud platform imaginable - Amazon, Azure, Google, Tencent, et al.

Thoughts?

iano128@reddit

Having recently done a migration from SolarWinds to LogicMonitor, I can say it is comparable to or better than the SolarWinds application monitor, but it does have some notable differences.

The polling setup is similar, but all the dashboards/graphs are cloud hosted on their side and kept up to date for you. You still have to maintain the pollers on your side, but they are simple enough and can be rebuilt from scratch in minutes, as the actual config is all cloud based.

One note from my migration: the pricing was way higher than all the other options we reviewed. This environment was on the smaller side of things, think thousands of assets. Even worse, the pricing model was very confusing, as they have sub-licenses for different types of kit, like cloud, SaaS, Wi-Fi, and server. And worse again, they use some of these to calculate others, so 100 servers gets 1000 Azure resources, etc.
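
To make the "one license type feeds another" point concrete, here is a minimal sketch of the arithmetic. The 10:1 ratio is taken only from the example in the comment above (100 servers -> 1000 Azure resources); the function names are hypothetical and this is not LogicMonitor's actual pricing model.

```python
# Illustrative only: ratio assumed from the "100 servers gets 1000 Azure
# resources" example in the comment; real license terms may differ.
CLOUD_RESOURCES_PER_SERVER_LICENSE = 10


def included_cloud_resources(server_licenses: int) -> int:
    """Cloud resources covered by the allowance that server licenses grant."""
    return server_licenses * CLOUD_RESOURCES_PER_SERVER_LICENSE


def extra_cloud_licenses_needed(server_licenses: int, cloud_resources: int) -> int:
    """Cloud resources that exceed the allowance and would need their own licenses."""
    return max(0, cloud_resources - included_cloud_resources(server_licenses))


if __name__ == "__main__":
    print(included_cloud_resources(100))
    print(extra_cloud_licenses_needed(100, 1500))
```

Running the numbers like this for each sub-license type before talking to sales makes it easier to see which asset class actually drives the quote.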

I will say their pro services team were very good: they made the transition a lot easier and could answer any questions I had, so that was a real plus. It took a few months, as we ran SolarWinds and LogicMonitor side by side during the migration.

Since you mention ServiceNow, we had just rolled it out previously and did an integration for tickets that worked well and allowed auto opening and closing, with LogicMonitor deciding the relevant team. This was a big help, as the ServiceNow admins and LogicMonitor admins were separate teams.

From a pure monitoring standpoint, if there is a built-in module already in LogicMonitor's library, it works very well. Some community modules are good as well, but you have to review the code before blindly trusting them. Alerts can take a while to tweak, and you need to organise your resources with team ownership and alerting profiles in mind to make it easier down the line. Something I wish we had not done was rely on custom properties for resources. It seemed like a good idea at the time, but LogicMonitor treats custom properties differently from built-ins, and this caused issues later when we wanted to do more complicated dynamic groups with inheritance. Stick to the closest built-in property that matches and everything is much easier.

Some limits we hit: minimum poll times are on the order of 60 seconds, and all monitoring now depends on the LogicMonitor cloud being up and available. We had a few hiccups over the first year, usually for a minute or two. The root cause was either our sites' internet having a blip or failover, or possibly LogicMonitor's side; they were so short that proving which side was at fault took longer than it was worth most of the time.

All in all, if I wasn't paying I would buy it again, but the price and the complicated nature of the licenses would make me choose very carefully.

Alive_Ad7609@reddit

Consolidating 18k servers across multiple clouds is a nightmare with that many legacy tools. LogicMonitor is decent, but you're going to get hit with massive per-device licensing costs at your scale.

Before you lock into another expensive enterprise vendor, look at OpenObserve (O2).

It’s a lightweight, open-source alternative to the ELK/LogicMonitor/Datadog route. It handles logs, metrics, and traces in one spot and uses S3 for storage, which keeps costs way lower than indexed platforms. Since you’re worried about noise, it has built-in alerting and dashboards that can sit on top of your existing telemetry without the "bolt-on" complexity of a massive Grafana project.

Check it out: https://openobserve.ai/

Burge_AU@reddit

Checkmk would handle this well for you, and given that you are already running Nagios, "some" of the concepts would be familiar. Checkmk can also natively monitor AWS, Azure, and GCP, along with vSphere.

I'm assuming you are using Oracle Enterprise Manager for the Oracle DB stack. Checkmk will provide good visibility into the Oracle Database layer and monitor the critical services but it won't drill down into the ASH data like OEM does.

You can use Grafana to supplement Checkmk's graphing and dashboarding - Checkmk operates as a native datasource to Grafana.

Checkmk will natively integrate with ServiceNow via a notification plugin.

Feel free to ping me any questions if you like.

Sliverdraconis@reddit

Ok. So much to unpack here, good lord. You do know SolarWinds Observability Self-Hosted can do all of it? It even hooks into Azure, AWS and GCP, plus it has database monitoring solutions.

Yes, they're expensive, but you can literally unify and get rid of all the other stuff.

We use it at my job for around 4k nodes with expectations to grow to around 8-10k over next 3-5 years.

There's a lot of hate, but honestly the newer versions run well and have nice features. They are also still putting a lot of effort into both cloud and on-prem monitoring.

Confident_Guide_3866@reddit

We went through the effort of integrating all of our monitoring through Prometheus and Grafana, and it’s been pretty good.

chickibumbum_byomde@reddit

A solid setup and works well. Prometheus/Grafana gives great flexibility and visibility once it’s running.

The downside is you have to maintain everything yourself over time. I'm using Checkmk at the moment: it's flexible, I can't complain, and all the configs are under one hood.

pahampl@reddit

XorMon

junpei@reddit

I used it a few jobs ago in 2017; it was implemented to replace Nagios shortly before I arrived at the job. It's powerful and can do everything you want, but it will take time to set everything up. Their team was nice to work with as well and will help set you up. I actually think one of my coworkers went to work for LM at one point (they were a local company to us). I wasn't privy to the costs, but we were an under-50-person ISP/MSP shop that decided the costs were worth it to integrate all of our services into one monitoring platform. We were entirely on-prem services though, nothing cloud based.

At my current gig, I set up a Grafana/Loki/Prometheus stack as they didn't have any sort of monitoring under our own control.

Sounds like a similar journey to mine. I also used Nagios for a good portion of time, then found out about Checkmk (which can use the same Nagios core) and later switched to the Checkmk core. One big advantage is that you get everything in one place under the hood, on the same core, instead of building and maintaining multiple components or stacks, which I also tried and got fatigued maintaining all at once.

The Grafana/Loki/Prometheus stack works too, but as you’ve probably seen, you end up stitching together metrics (Prometheus), logs (Loki), dashboards (Grafana), and alerting (Alertmanager), which, as mentioned, is a big hassle.

With one centralised tool, the setup effort is more upfront, but you get monitoring, alerting, inventory, and dashboards in one system, which is why a lot of smaller MSPs/ISPs go that route, especially on-prem.

Both work, just depends if you prefer building the stack yourself or having it more “ready out of the box.”

ifpfi@reddit

We bought this back in 2010 at my previous job, and the pricing was reasonable. In my opinion it is the gold standard of monitoring. Every level in the infrastructure up to LogicMonitor itself is monitored and it will text you if something goes haywire. They steadily raised their prices until I left.

I tried to get a quote for LogicMonitor at my new job, but their minimum device count and prices were outrageous. They no longer have a form to sign up as a new customer and the pricing is always hidden. For 18,000 servers, expect to pay a pretty penny, but it will be a reliable monitoring solution.

Ma7h1@reddit

That situation sounds very familiar — a lot of environments grow exactly like that over the years.

From my experience, the bigger challenge isn’t adding “another layer” on top (like Grafana), but actually reducing complexity and consolidating where it makes sense. Otherwise you’re just aggregating chaos instead of solving it.

We’ve been using Checkmk for quite a long time, and one thing that really helped us was exactly that consolidation aspect. Instead of running multiple tools for different layers, we brought a lot of it into one platform — infrastructure, network, applications, logs, etc.

Especially in larger environments, the key is not just collecting data, but correlating it and reducing noise. Features like built-in checks, service discovery and dependencies make a big difference there, because you don’t have to glue everything together yourself.

Another nice aspect is that you can build really useful custom dashboards directly in Checkmk, so you don’t necessarily need an extra layer like Grafana just to visualize things.

What I’d be careful with is underestimating the effort of “bolting observability on top” of an already fragmented setup. In many cases, starting to standardize on a central platform actually pays off more in the long run.

I’m also running a smaller version of this in my homelab, and it really highlights the difference: one system, consistent data model, less overhead — and much easier to reason about when something breaks.

Since there is a free version around, I would give it a try.

gheyname@reddit

I’ve used it at an SMB, ~500 resources in 3 colos. For network, compute and storage I think it’s too expensive, and they have at times pulled features from the base SKU and made them a value add.

I’m sure there is a point where it makes sense, but at that size I’d prefer Zabbix.

nowtryreboot@reddit

If it were up to just me and my manager, we’d happily go with Zabbix+Grafana.

We evaluated LogicMonitor as well (in addition to Splunk and others), and it was good for most cases. We were almost in your boat (10K-ish hosts spread across two clouds and on-prem).

I see a lot of repetition there: SCOM, Icinga, SolarWinds... why?

Inquisitor_ForHire@reddit (OP)

The repetition comes from mostly internal stupidity over the years. Team A is doing something and doesn't talk to Team B. Solarwinds was brought in during an acquisition and is mostly legacy. SCOM is used for Database monitoring. I'm not convinced it actually does anything. Icinga is the server team who wouldn't be caught dead using Nagios which is the Network team's pile of crap.

Yeah... inertia and "I want to do my own thing" is real!

Apachez@reddit

I don't have first-hand experience with current versions of Logic Monitor, but I have some with Unomaly, which they acquired a few years ago (Jan 2020) to do anomaly analysis of logs.

Unomaly was a great product but very niche.

In short, you just send all your logs to it (and there was very little management to deal with over time - "it just works"), and then for each source it will first start in training mode.

After some hours or days it would automatically switch to regular mode.

The point of this training mode is to find out what the baseline is for this particular source. The purpose is to then flag for you later on when something new that has not been seen previously shows up - either as a full line or just thresholds of existing lines.

As in finding the needle in a haystack.

So the product worked and did its thing.

The drawback with the whole concept is of course: what will your baseline be?

For example, if you already have log entries about malware, these become part of your baseline, so it won't react when some server reports that malware is running.
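
A rough sketch of that train-then-detect idea, and of the baseline drawback, might look like the following. This is not Unomaly's implementation: the class, the method names, and the digit-masking normalisation are all assumptions made purely for illustration.

```python
import re


def normalize(line: str) -> str:
    """Mask digits so lines that differ only in numbers share one pattern."""
    return re.sub(r"\d+", "<N>", line)


class BaselineDetector:
    """Hypothetical per-source baseline: learn patterns first, then flag new ones."""

    def __init__(self):
        self.known = {}        # source -> set of patterns seen during training
        self.trained = set()   # sources that have left training mode

    def finish_training(self, source: str) -> None:
        """Switch a source from training mode to regular mode."""
        self.trained.add(source)

    def observe(self, source: str, line: str) -> bool:
        """Return True if the line is new relative to the source's baseline."""
        pattern = normalize(line)
        seen = self.known.setdefault(source, set())
        if source not in self.trained:
            seen.add(pattern)  # training mode: everything becomes baseline
            return False
        return pattern not in seen


if __name__ == "__main__":
    d = BaselineDetector()
    d.observe("web01", "user 42 logged in")
    d.observe("web01", "malware detected in /tmp/x1")  # learned as "normal"
    d.finish_training("web01")
    print(d.observe("web01", "malware detected in /tmp/x9"))  # not flagged
    print(d.observe("web01", "disk failure on sda"))          # flagged as new
```

The second-to-last call is the drawback in miniature: because a malware line was present during training, a later malware report matches the baseline and is not flagged.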

Today I would expect most SIEMs to have something similar, so it boils down to what your needs are, how well you click with the product (which only you can tell after actually trying it out; I prefer systems that don't need weeks of training to figure out how they work or how to configure them), and also the price range. Some charge per TB, per EPS, per collecting server, or per source; others just sell a site license (which I generally prefer), so you can install as many instances as you need/want and the capabilities are limited not by the license but by the hardware it runs on (either bare metal or as a VM).

Other than that, you should also consider scaling down the number of products.

Normally you need something that can do SNMP to get realtime data, mainly from network equipment (it also exists for servers), something that can collect syslog (both for archiving and for analysing), and perhaps, to top it off, a log collector running locally on the server (to get logs from Windows servers and whatever else; the drawback is that you have to install additional software on the server).

Otherwise you end up with something like:

https://xkcd.com/927/

_araqiel@reddit

Zabbix.

jstuart-tech@reddit

I don't manage it nor work with it day to day. But that Edwin AI stuff they have actually seems to work pretty well from what I've seen.

gixxer-kid@reddit

It works great for on-prem stuff, with APIs, SNMP and WMI. Lots of great stuff out of the box, and then the option for custom-made modules, thresholds and routing.

Not so great for the cloud stuff; that’s where you’re gonna want a tool like Datadog or another log ingestion tool. Having said that, Datadog isn’t great on prem. So it really depends on how much cloud you have vs. on prem.

themastermatt@reddit

IMHO - Not great. We currently have it and are looking for options too. They all have things that suck about them, but LM seems to lack many features and makes tuning a PIA. The interface is dated and clunky. With 18K servers, I'd consider expanding SolarWinds and/or Nagios since they are already there.