This post expands on Sunny Dua’s “Automatically place ESXi hosts in maintenance in vROps using properties” blog post.
I have been using a similar method in my own environment for some time, but with a slight twist: I also wanted to include some parent and child ResourceKinds to block alerts from.
Why would I want to do this, you ask? Anyone who manages a vSAN environment knows how noisy the vROps alerts can be from the related ResourceKinds (vSAN Cluster, vSAN Disk Group, vSAN Cache Disk, vSAN Capacity Disk, etc.) when working on a single host in the cluster.
This helps suppress alerts on related objects automatically within vROps, and the great thing is that vROps will continue collecting data where possible while you perform maintenance.
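How can it be done? With super metrics, silly!
I have two separate super metrics for child objects (one per depth), or a single combined one, plus another for parent objects. A negative depth looks up from the object to its parent host; a positive depth looks down from the cluster to its child hosts.
Child ResourceKinds: (Disk Group)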
Count(${adaptertype=VMWARE, objecttype=Hostsystem, depth=-1, attribute=runtime|maintenanceState, where = "!contains notInMaintenance"})
Child ResourceKinds: (Capacity & Cache disks)
Count(${adaptertype=VMWARE, objecttype=Hostsystem, depth=-2, attribute=runtime|maintenanceState, where = "!contains notInMaintenance"})
Combined ResourceKinds: (Disk Group, Capacity and Cache disks)
Count(${adaptertype=VMWARE, objecttype=Hostsystem, depth=-2, attribute=runtime|maintenanceState, where = "!contains notInMaintenance"}) || Count(${adaptertype=VMWARE, objecttype=Hostsystem, depth=-1, attribute=runtime|maintenanceState, where = "!contains notInMaintenance"})

Parent ResourceKinds: (vSAN Cluster)
Count(${adaptertype=VMWARE, objecttype=Hostsystem, depth=1, attribute=runtime|maintenanceState, where = "!contains notInMaintenance"})
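
To make the logic of these formulas easier to follow, here is a minimal Python sketch of what the Count with the "!contains notInMaintenance" filter works out to. It is only a model, not anything vROps runs: the Host class, the host names and the example state strings are made up for illustration, and treating "count greater than zero" as the blackout trigger is my assumption about how the group criteria would use the value.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str               # hypothetical host name
    maintenance_state: str  # stands in for the runtime|maintenanceState property


def count_hosts_in_maintenance(hosts):
    """Count hosts whose state string does NOT contain 'notInMaintenance'.

    This mirrors the where = "!contains notInMaintenance" filter in the formulas.
    """
    return sum(1 for h in hosts if "notInMaintenance" not in h.maintenance_state)


# Example values only; the real property strings come from the vCenter adapter.
cluster_hosts = [
    Host("esx01", "notInMaintenance"),
    Host("esx02", "inMaintenance"),
    Host("esx03", "notInMaintenance"),
]

count = count_hosts_in_maintenance(cluster_hosts)

# Assumed rule: any related host out of normal state puts the object in blackout.
blackout = count > 0
```

With one of the three hosts in maintenance the count is non-zero, so the cluster (or disk group, or disk) that sees that host through its depth setting would qualify for the blackout group.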

Enable the super metrics in your “BlackOut” policy:

Create a custom group, link it to your BlackOut policy, and specify the criteria…

Preview the group, woohoo!

Hope this was helpful
vMan
If you put the vSAN cluster in maintenance mode in vROps, doesn’t that accomplish the same thing?
This puts it into maintenance mode without anyone having to interact with vROps. When you have multiple teams running a large environment, people forget about vROps and then generate unwanted auto tickets, etc. So I came up with this as a solution for us.
Yes, this is awesome and I totally get it. I might be able to use this. I was just wondering if you only included the vSAN cluster object in maintenance mode wouldn’t that take care of all the children objects? I’m testing this now but I wanted to see what you thought first.
Yes, I have another super metric for the children. There are two super metrics that need to be configured. You can then apply them to any kind of object you want in the super metric config.
Great article. I’ve had issues, though, getting it to be foolproof. In vROps 7.5, custom group membership is updated every 20 minutes while object collection is every 5. The scenario exists where you put a vSAN host in MM in VC but vROps alerts are still generated because group membership hasn’t updated yet to include the associated cluster/disk objects, so the default policy is in play. Once the group is updated, this works like a charm. Have you run into the same issue?
Hello Dan,
Thanks for the feedback! Yes, this is a problem if you perform the maintenance within the 20 minutes before the group membership updates. I usually wait for some time once the host is in MM before performing maintenance. But I will see if I can find a solution…
Awesome!
You can change the group membership update interval by editing the /usr/lib/vmware-vcops/user/conf/controller/controller.properties file in vROps.
Look for the line “groupMembershipEvalPeriod=20”.
Replace 20 with whatever you need.
Restart vROps with “service vmware-vcops restart”.
I’ve tested it at 5 minutes and it seems to work.
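
If you would rather script the edit than do it by hand, a rough Python sketch of the text change could look like this. The key name groupMembershipEvalPeriod comes from the steps above; the set_property helper and the sample content (including otherSetting) are made up for illustration. It only works on a string, so reading the real file, keeping a backup copy and restarting the service are left to you.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`.

    Comment lines and every other setting are left untouched.
    Raises KeyError if the key is not present, rather than silently adding it.
    """
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("#")
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        raise KeyError(key)
    return "\n".join(out) + "\n"


# Hypothetical file content; the real controller.properties has many more lines.
sample = "# controller settings\ngroupMembershipEvalPeriod=20\notherSetting=1\n"
updated = set_property(sample, "groupMembershipEvalPeriod", "5")
```

Failing loudly when the key is missing is deliberate: if the line is not there, the file layout is probably not what you expected and it is safer to stop and look.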
Thanks for sharing!!
I reached out to our company virtualization engineer to see if this setting change would be considered “. He came back with the following. Has anyone made this change in a large enterprise environment and do you agree with the following statement?
“After discussing this with VMware, I do not feel that we would want to make this change. Making that change would cause vROps to re-evaluate every VM every 5 minutes to check its inclusion in the custom group. This has the potential to cause major impact to the performance of vROps and is not recommended.”