Migrated from VMware to Hyper-V, what do you use for monitoring?

Posted by Jirobaye@reddit | sysadmin | 34 comments

Hi everyone,

I've recently migrated from VMware to Hyper-V for cost reasons, like many others.

I’d like to know if there’s a good way to monitor both the hardware and the status of the VMs, something similar to what vCenter provides.

I have a small 2-node Failover Cluster running on Windows Server 2025.

The hardware is Lenovo ThinkSystem, with a dedicated Lenovo SAN as well.

At the moment, I’m managing the VMs through Failover Cluster Manager.

Would it make sense to use a dedicated VM outside the cluster with Windows Admin Center, Lenovo XClarity Integrator, and Zabbix for alerting?

I’m curious to know what others are running in similar setups.

What’s your stack?

[-]

DarkAlman@reddit

PRTG is reasonable, and you can get 100 sensors free

[-]

I did it with Windows Admin Center and Hyper-V Manager itself, plus a Prometheus instance with node_exporter running on each host. Grafana gives me the dashboards and Alertmanager handles the notifications. I don't need a separate VM; everything runs on the node and gives me the vCenter feel without the extra overhead.

In my case this works well and saves me a lot of licenses. Has anyone else tried something similar?
I came across a post that covers exactly this; I think it's worth a look:
https://sysadmincore.com/c%C3%B3mo-monitorizar-hardware-y-vms-en-entornos-hyperv-con-windows-admin-center-y-herramientas-open-source/?utm_source=reddit

[-]

pahampl@reddit

Use XorMon for storage, SAN and Hyper-V; it does a very good job.

[-]

mat-ferland@reddit

For a 2-node Hyper-V cluster I’d keep monitoring outside the cluster. Windows Admin Center is fine for day-to-day visibility, but I would not make it the only alerting path.

A common split is: Lenovo XClarity for hardware, Failover Cluster events/perf counters for cluster health, and Zabbix/PRTG/Checkmk for alerting and history. The big thing vCenter gave you was one mental model. With Hyper-V you usually have to build that by deciding what is authoritative for hardware, cluster, storage, and VM state.

I’d also test one boring failure on purpose: host reboot, path down, VM failover, low SAN capacity. If the alert tells you the actual problem instead of five symptoms, the stack is probably good enough.
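One way to picture the "actual problem instead of five symptoms" test is a small dependency map that hides alerts whose upstream component is also failing. This is only a rough sketch: the component names and the `DEPENDS_ON` map are made up for illustration, and real tools (Zabbix trigger dependencies, PRTG sensor dependencies, etc.) have their own mechanisms for this.

```python
# Hypothetical dependency map: each component lists what it directly depends on.
DEPENDS_ON = {
    "vm:sql01": ["host:hv-node1"],
    "vm:web01": ["host:hv-node1"],
    "host:hv-node1": ["san:path-a"],
    "host:hv-node2": ["san:path-a"],
}


def root_causes(failing, depends_on):
    """Return the failing components that have no failing dependency upstream."""

    def has_failing_upstream(component, seen):
        # Walk the dependency chain; 'seen' guards against cycles in the map.
        for parent in depends_on.get(component, []):
            if parent in seen:
                continue
            seen.add(parent)
            if parent in failing or has_failing_upstream(parent, seen):
                return True
        return False

    return {c for c in failing if not has_failing_upstream(c, set())}
```

If a SAN path drops and takes a host and two VMs with it, `root_causes({"vm:sql01", "vm:web01", "host:hv-node1", "san:path-a"}, DEPENDS_ON)` leaves only `san:path-a`, which is the single alert you would want to be paged for.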

[-]

helpfourm@reddit

On another note, what did you use to migrate your servers?

[-]

DarkAlman@reddit

I do a lot of VMware to Hyper-V conversions.

Veeam is the best solution. Leverage Instant Recovery to do them quickly: 15-20 minutes of downtime per VM.

StarWind has a good free converter tool, but it can take hours to copy over the VM data.

[-]

Arudinne@reddit

"StarWind has a good free converter tool but it can take hours to copy over the VM data."

It's a useful tool, but it's not always successful. I had more than a few converted VMs that wouldn't boot properly the last time I used that tool (2022-ish).

I think I had one VM that took like a day to convert (~10TB File server) and thankfully that worked without issues.

[-]

whatsforsupa@reddit

We've been using Check_MK for quite a while and are slowly transitioning to Zabbix.

Both good at different things, seems like Zabbix is more powerful and has better community / documentation at this point.

For server/iDRAC monitoring we use Dell OME. I've been learning it more and more and really like it.

[-]

Arudinne@reddit

Interesting. We're switching from Zabbix to CheckMK's cloud because Zabbix has been too much of a pain to get working reliably.

We kept having issues where the proxy VMs would silently fail and we had to hard reboot them because they kept getting stuck during a normal reboot.

[-]

absence09_@reddit

Was about to mention CMK. It’s working for what we need, but I might look into Zabbix after reading through some of these comments.

[-]

merlin_infosec@reddit

Icinga2

[-]

Key-Brilliant9376@reddit

I monitor everything with Zabbix. Once you learn it, it's probably the best monitoring system out there. If there isn't a template to monitor what you want, you can create a new one. Using AI to help create them has made it a little easier as well.

[-]

Jirobaye@reddit (OP)

Does Zabbix now have its own MCP?

[-]

InvisibleTextArea@reddit

Here you go

https://github.com/mpeirone/zabbix-mcp-server

[-]

Key-Brilliant9376@reddit

Not that I am aware of. I wasn't talking about plugging it into AI, but rather using AI to help you parse your templates.

[-]

zakabog@reddit

Not that I'm aware of but Google has access to the Zabbix documentation and can provide some fairly decent help, otherwise there's a subreddit and forum.

[-]

_Blank-IT@reddit

I haven't used Zabbix in like 10 years. How is it now?

[-]

Key-Brilliant9376@reddit

Oh man, it's much much better. It does have a learning curve but it's easy to pick up for a semi-intelligent person.

[-]

Skyhound555@reddit

It is probably the standard for monitoring at this point, especially now that Paessler has quadrupled the price of their support subscription.

[-]

PixelSage-001@reddit

Zabbix is an excellent choice for a Failover Cluster of this size. It has native Windows performance counter monitoring and templates for Hyper-V. Windows Admin Center is great for ad-hoc management and Lenovo hardware checks, but it's not a real-time alerting platform.

If you want to monitor the cluster's high availability without the overhead of massive enterprise suites, a good setup is pairing Zabbix (for SNMP/hardware metrics) with a lightweight script to query the Failover Cluster PowerShell module (specifically looking at VM cluster resource states and CSV disk space).

In our Hyper-V environments, we run these cluster health checks as a pipeline on Runable. The runner triggers a PowerShell script to audit the VM states, checks if any volumes are running low on space, and formats a Slack/Teams alert if a node is offline. Offloading the active validation checks to an external runner keeps your monitoring lightweight and ensures you get alert notifications even if one of your cluster hosts goes down.
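For anyone wondering what such a check could look like, here is a minimal sketch, not our actual pipeline. It assumes a Windows machine outside the cluster with the FailoverClusters PowerShell module installed and rights to query the cluster. The `CLUSTER_NAME` environment variable, the 15% free-space threshold, and the function names are placeholders for illustration. Python only wraps the PowerShell call and turns the result into alert lines; posting to Slack/Teams is left out.

```python
import json
import os
import subprocess

# PowerShell that gathers node, VM group, and CSV state as JSON.
# Enum values are converted to strings so the output does not depend on
# how a given PowerShell version serializes enums.
PS_SCRIPT = r"""
Import-Module FailoverClusters
$c = $env:CLUSTER_NAME
$nodes = @(Get-ClusterNode -Cluster $c | ForEach-Object {
    @{ Name = $_.Name; State = $_.State.ToString() } })
$groups = @(Get-ClusterGroup -Cluster $c |
    Where-Object { $_.GroupType.ToString() -eq 'VirtualMachine' } |
    ForEach-Object { @{ Name = $_.Name; State = $_.State.ToString() } })
$vols = @(Get-ClusterSharedVolume -Cluster $c | ForEach-Object {
    $free = $_.SharedVolumeInfo.Partition.PercentFree | Select-Object -First 1
    @{ Name = $_.Name; PercentFree = [double]$free } })
@{ Nodes = $nodes; Groups = $groups; Volumes = $vols } | ConvertTo-Json -Depth 4
"""


def collect(cluster):
    """Run the PowerShell snippet and return its JSON output as a dict."""
    env = dict(os.environ, CLUSTER_NAME=cluster)
    result = subprocess.run(
        ["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", PS_SCRIPT],
        capture_output=True, text=True, check=True, env=env,
    )
    return json.loads(result.stdout)


def evaluate(snapshot, min_free_pct=15.0):
    """Turn a snapshot into a list of human-readable problems (empty = healthy)."""
    problems = []
    for node in snapshot.get("Nodes", []):
        if node["State"] != "Up":
            problems.append(f"Node {node['Name']} is {node['State']}")
    for group in snapshot.get("Groups", []):
        # Note: VMs that are intentionally shut down also show up as Offline.
        if group["State"] != "Online":
            problems.append(f"VM group {group['Name']} is {group['State']}")
    for vol in snapshot.get("Volumes", []):
        if vol["PercentFree"] < min_free_pct:
            problems.append(f"CSV {vol['Name']} has {vol['PercentFree']:.1f}% free")
    return problems
```

Usage would be something like `problems = evaluate(collect("hv-cluster01"))` (the cluster name is made up), then send the list to your chat webhook or monitoring tool if it is non-empty. Keeping `evaluate` separate from the PowerShell call makes it easy to test the thresholds with fake data before you trust it.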

[-]

dire-wabbit@reddit

The Microsoft answer would be System Center Operations Manager (SCOM).

[-]

bill696@reddit

They need to have System Center licenses for that. They would also use VMM and not just Hyper-V.
But I think the modern Microsoft answer would be Azure Arc and not OM.

[-]

Jirobaye@reddit (OP)

Too much for my environment.

[-]

bill696@reddit

Yeah, I get that. It was more an answer to the comment on Microsoft's official answer.

[-]

1FFin@reddit

Custom PowerShell scripts with the RMM of your choice (like Ninja, N-able, ...).
Just make sure you add and verify checks for every important aspect, so that when your RMM is all green, everything should be fine. If you hit an issue that was not detected before, check whether you could or should create an additional check for that case.

[-]

Jirobaye@reddit (OP)

A free one?

[-]

anonjohnsc@reddit

We use PRTG with custom PS scripts. Free and works well!
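To give an idea of how a custom script feeds PRTG: as far as I know, the EXE/Script Advanced sensor type reads a JSON (or XML) document from the script's standard output, with one entry per channel. Below is a rough sketch of just that output format. Our real scripts are PowerShell; this is written in Python purely for illustration, and the channel names and helper functions are made up.

```python
import json


def prtg_result(channels, text="OK"):
    """Build the JSON document a PRTG EXE/Script Advanced sensor reads from stdout.

    'channels' maps a channel name to a numeric value.
    """
    return json.dumps({
        "prtg": {
            "result": [
                {"channel": name, "value": value} for name, value in channels.items()
            ],
            "text": text,
        }
    })


def prtg_error(text):
    """Build the JSON document that puts the sensor into an error state."""
    return json.dumps({"prtg": {"error": 1, "text": text}})
```

A script would end with something like `print(prtg_result({"Nodes down": 0, "VMs not running": 1}))`, and you then set limits on each channel in PRTG to decide when it alerts.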

[-]

1FFin@reddit

Stick with the one you already use and are familiar with? Just replace the VMware part with new checks for Hyper-V.
Surely you already have an RMM in place for VM guests and all other endpoints/workstations.

[-]

4wheels6pack@reddit

Monitor through your RMM. It can monitor uptime and hardware state, and set up conditional automations.

[-]

zakcobb@reddit

We use Azure Arc and Azure Monitor for our local Hyper-V servers. Works great.

[-]

GullibleDetective@reddit

RMM like N-central

Zabbix

Domotz

Veeam ONE

[-]

AmazingHand9603@reddit

I would recommend this:

WAC for Hyper-V and cluster management
Lenovo XClarity for hardware, firmware, RAID/SAN, and host health
Zabbix or another alerting layer for notifications

The main thing to watch after leaving vCenter is correlation. It’s easy to have hardware health in one tool, VM/cluster state in another, and application issues somewhere else. That works, but troubleshooting will be slow when storage latency, host pressure, and a VM issue are connected.

I'd still keep WAC and XClarity for management but use CubeAPM if you want one place for infra metrics, logs, application health, and alerts without building a large observability setup. Otherwise, Zabbix is a solid fit if you’re okay maintaining templates and checks.