It won’t come as a surprise, but vRealize Operations Manager, also called vROPS, does exactly what it says on the tin: it manages operations. Now, although you probably don’t need to, let’s ponder the term “Operations” for a bit and what it means for the sake of this blog.
Operations are often referred to as “RUN” while Transformation is called “BUILD”, two terms that pop up all over the place in the IT world. BUILD teams aim at driving innovation and implementation of new projects while the RUN department ensures that the existing environment runs smoothly according to agreed SLAs. As you probably figured, vROPS falls under the umbrella of the latter.
The boundary between BUILD and RUN doesn’t always fall in the same place; it depends on the organization’s setup (semantics also get in the way). For instance, some RUN teams will install and configure infrastructure components such as vRealize Operations, vCenter, or vSphere, while in another organization they may only deal with level 2/level 3 support and capacity planning.
However, it is still common for SMBs and some medium-sized businesses not to differentiate between RUN and BUILD, in which case the IT department splits its time between project work and day-to-day operations. While organizations of all sizes leverage vRealize Operations Manager, those smaller organizations will benefit greatly from vROPS as it takes some of the heavy lifting of infrastructure operations off their hands!
What is vROPS?
vRealize Operations Manager comes as a virtual appliance that is to be deployed in your management cluster if you have one. It can be installed in a number of ways, tailored to your environment’s size and complexity. The easiest scenario consists of embedding all the components in a single virtual appliance, while more complex architectures require you to deploy the components independently as separate VMs, which opens the door to HA implementations and larger collection sets.
The vROPS components can be deployed in separate appliances to account for large environments and facilitate scalability
vRealize Operations Manager collects data from the environment and processes it to make recommendations, identify issues, and trigger policy-based automation, along with a whole lot of analytical goodness to improve operational efficiency.
vROPS also offers a pluggable architecture to extend the monitoring to third-party products through what are called management packs. More on that later.
My Top 10 vROPS Features
My top 10 will probably differ from yours as each environment has its quirks and specifics. So, let’s say we will cover 10 features of vRealize Operations Manager that we deemed worthy of making this list. There is obviously a plethora of other high-value features in vROPS that we didn’t mention here; you can find them all in the official VMware documentation.
Feel free to leave a comment with the features that are most interesting to your organization!
Policy Creation and Management
Policies are applied to objects or groups and let you configure which metrics and properties are gathered, which alerts and symptoms are enabled, capacity and compliance settings as well as workload automation.
A default policy is created when you connect an endpoint, from which you can create inherited policies. You then get to tailor each policy to the population of objects it will be applied to.
For instance, you may want to apply rather aggressive thresholds to your dev and test workloads as you don’t really care if it gets toasty there. Production VMs, however, will get more conservative settings to ensure the best possible SLA.
You may also want to ensure the environment associated with a specific customer complies with whatever industry standard they are contractually bound to, such as ISO, PCI, or HIPAA.
You also use policies to control what data vROPS will collect and report on for specific objects to avoid wasting storage, bandwidth and compute on useless data.
Create inherited policies that you can modify and then apply to specific groups of objects
Note that a fair number of default policies are already baked into vRealize Operations Manager when you deploy the appliance. Those policies were designed by VMware to fit most environments and offer a good level of visibility to get started without in-depth knowledge of the product.
By creating inherited policies, you can change the state of inherited symptoms and alerts or even disable them on a subset of objects.
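One way to picture policy inheritance is as a chain of lookups: a child policy stores only what it overrides and falls back to its parent for everything else. The Python sketch below illustrates that idea only. It is not how vROPS stores or evaluates policies, and the setting names are made up for the example.

```python
from collections import ChainMap

# Hypothetical settings: the keys are illustrative, not real vROPS policy fields.
default_policy = {
    "cpu_usage_alert_pct": 80,
    "memory_usage_alert_pct": 85,
    "snapshot_age_alert_enabled": True,
}

# An inherited policy only records what differs from its parent.
dev_test_overrides = {
    "cpu_usage_alert_pct": 95,            # tolerate hotter dev/test workloads
    "snapshot_age_alert_enabled": False,  # alert disabled for this group
}

# Lookups hit the overrides first, then fall back to the parent policy.
dev_test_policy = ChainMap(dev_test_overrides, default_policy)


def effective_settings(policy):
    """Flatten a policy chain into a plain dict of effective values."""
    return dict(policy)


print(effective_settings(dev_test_policy))
```

Anything the child policy does not override, such as the memory threshold here, keeps following the parent, which is the behavior you rely on when you tweak only a handful of alerts for a group of objects.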
vROPS Workload Optimization
If you work with VMware products, chances are you rely heavily on vSphere DRS and Storage DRS (Distributed Resource Scheduler) to make sure the demand of your virtual machines is met. Although it may look like one, DRS is not a load-balancing feature. Its goal isn’t to bring all hosts to the same resource utilization level; its objective is to ensure that the virtual machines have enough resources to run. For instance, if one host is running at 50% with 30 VMs while others are cruising at 5%, DRS won’t make a move as long as the VMs are fine.
You can get closer to actual load balancing with vRealize Operations Manager thanks to a feature called Workload Optimization. It works in concert with vSphere DRS to optimize VM placement in your environment according to a threshold.
The management pane shows the current optimization status, the operation, and business intents
As with DRS, you get to set a threshold that will either balance the workloads across all hosts or consolidate them on as few hosts as possible, for instance to reduce licensing or electricity bills. Where it gets interesting is that you can set a cluster headroom value to keep a resource buffer in reserve and account for demand spikes.
Just like with DRS, a cursor lets you select an optimization profile
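The effect of a headroom value comes down to simple arithmetic: the buffer is subtracted from total capacity before any placement decision is made. Here is a minimal Python sketch of that reasoning, with made-up function names and numbers. It does not reproduce the actual vROPS calculation.

```python
def usable_capacity(total_capacity, headroom_pct):
    """Capacity left for workloads once the headroom buffer is set aside."""
    return total_capacity * (100 - headroom_pct) / 100


def fits_with_headroom(total_capacity, demand, headroom_pct):
    """True if the demand fits without eating into the buffer."""
    return demand <= usable_capacity(total_capacity, headroom_pct)


# Hypothetical cluster: 512 GB of RAM with a 20% headroom.
print(usable_capacity(512, 20))          # 409.6
print(fits_with_headroom(512, 400, 20))  # True
print(fits_with_headroom(512, 420, 20))  # False
```

With a 20% headroom, a demand of 420 GB is treated as too much even though the cluster physically has 512 GB, which is exactly the point of keeping a buffer for spikes.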
On top of that, Workload Optimization works with tags to let you enforce VM placement on hosts or clusters through the “Business Intent” pane. For instance, all VMs with the tag “MSFT” are placed on the cluster assigned the same “MSFT” tag. This comes in handy for various purposes such as licensing, geographical location, or hardware type. Be aware that this means vRealize Operations will automatically create and manage DRS rules, and that all conflicting user-created DRS rules will be disabled.
VM placement is achieved by assigning the same tag to VMs and hosts. Categories can be customized as well
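Conceptually, tag-based placement is a matching exercise between the tags on VMs and the tags on clusters or hosts. The Python sketch below shows that matching on plain dictionaries. The inventory and the function name are hypothetical, and the real feature enforces the result through DRS rules rather than returning a mapping.

```python
def clusters_for_vms(vm_tags, cluster_tags):
    """Map each VM to the clusters carrying the same tag.

    Both arguments are plain dicts of object name -> tag value,
    a stand-in for the tags that would be read from vCenter.
    """
    return {
        vm: sorted(name for name, tag in cluster_tags.items() if tag == vm_tag)
        for vm, vm_tag in vm_tags.items()
    }


# Hypothetical inventory
vm_tags = {"sql01": "MSFT", "web01": "LINUX"}
cluster_tags = {"cluster-a": "MSFT", "cluster-b": "LINUX", "cluster-c": "MSFT"}

placement = clusters_for_vms(vm_tags, cluster_tags)
print(placement)
```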
Note that you can obviously choose to run it manually with the “Optimize Now” button or automatically either following a schedule or in real-time when an alert pops up. You can go even further and tie it with predictive DRS to get a tight resource management automation system.
Note that all the clusters in the datacenter must be configured with DRS in fully automated mode.
Management Packs
You can extend vROPS’ monitoring capabilities to other VMware products or third-party products thanks to “Management Packs”. These are like plug-ins you install in vRealize Operations Manager that open an interface to new endpoints. A number of packs are already installed in vROPS; some of them, such as standards compliance, ping, and service monitoring, are deactivated by default.
There is also a plethora of management packs available for download in the VMware Marketplace, which offers plug-ins for other products such as vRealize Log Insight or vRealize Automation. They are distributed either by VMware or by the vendor of the product itself.
Filter out the display to get vRealize Operations packs only in the left pane
Note that Management Packs can be free or subject to licensing by the vendor. Refer to their website for additional information.
Management Packs also let you extend the capabilities of vROPS beyond the virtual environment to the likes of PostgreSQL, SAP, Exchange, physical servers, storage arrays, you name it.
Example of a Pure Storage FlashArray dashboard included in the management pack
Once you’ve downloaded a management pack you get a *.pak file that you need to upload to the vROPS appliance in Administration > Solutions > Repository.
Depending on how the plug-in is made, new resources such as dashboards, views, symptoms, and alerts will be made available to manage this environment. In the example below, you can see all the new dashboards brought by a DellEMC management pack I installed. I don’t have a DellEMC system at home to show you what it looks like, but you get the gist.
Management Packs usually bring valuable dashboards, reports, alerts, and symptom definitions
Cloud providers integration
Most companies nowadays have integrated cloud services in their infrastructure to some degree. Whether you leverage SaaS workloads or pay for IaaS capacity such as VMware Cloud on AWS, chances are you will want to monitor whatever you are running in there.
vRealize Operations offers management packs for the biggest cloud providers:
- Google Cloud Platform (GCP)
- Microsoft Azure
- Amazon Web Services (AWS)
- VMware Cloud on AWS (VMC on AWS)
vROPS can collect metrics from the main cloud providers and display the information in the dashboards included in the associated management packs
These are incredibly easy to set up. For instance, in order to monitor AWS services, simply go to IAM in the management console and create a user with Programmatic access which will provide you with an access key ID and secret access key pair. You will then use this pair to connect your AWS account in vROPS. You should start seeing data coming in after a few minutes of collection.
AWS dashboards provide a holistic view of your instances with objects relationships
The screenshot above depicts a t2.micro EC2 instance (free tier) that I run in AWS. As you can see, just as with your on-premises components, you get the relationships between the objects (subnet, NIC, EBS volume…) as well as usage metrics such as CPU, RAM, disk, and network.
Rightsizing Virtual Machines
IT pros who aren’t well versed in virtualization usually benefit a great deal from running vROPS in their environment for a few weeks and analyzing the results with a consultant. They are often surprised by the outcome, as it may sometimes seem counter-intuitive. A common recommendation made by vROPS is to downsize virtual machines, not only to save capacity but also to improve overall performance. However, you also get valuable recommendations on how to efficiently scale up your workloads.
While you may figure out by yourself that a bunch of VMs are running hot and struggling to keep up with the demand, it is not always obvious whether you should actually add resources, and how much, once trends, spikes, maintenance windows, and so on are taken into account.
vROPS will help you with that as it will tell you which VMs would benefit from an increase, and more importantly how much to add. There is no point in throwing 20GB of RAM at the VM if it’s not likely to use more than 10GB.
Virtual machine sizing is often done on a generic basis from a template, and the VMs are scaled up when the demand increases. However, more often than not, people will bump a VM from 2 vCPUs to 8 “because it’ll run better” when it would only need 4.
Put it this way: how tricky would it be to get 8 seats next to each other on a Saturday night at the movies when there are plenty of groups of 2 or 4 empty seats? The problem is the same with oversized VM CPUs. Oversizing can actually harm a VM’s performance, as the host’s VMkernel will have a hard time scheduling it on the physical cores of the CPU(s) while smaller VMs easily get a free spot. If you want to learn more about this phenomenon, check out the co-stop CPU metric (%CSTP) in esxtop.
Downsizing virtual machines will usually improve overall performance
vRealize Operations Manager will help you a great deal with sizing your VMs as it will make recommendations that you can choose to follow or disregard. It will tell you how many resources you can save. In the example above, these recommendations would reclaim 104GB of allocated RAM and 22 provisioned vCPUs.
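To make the sizing reasoning concrete, here is a rough Python sketch of how a vCPU recommendation and the matching reclaimable amount could be derived from peak demand. The formula, the 20% buffer, and the sample numbers are assumptions for illustration. The vROPS analytics engine looks at far more than a single peak value.

```python
import math


def recommend_vcpus(peak_demand_mhz, core_speed_mhz, buffer_pct=20):
    """Smallest vCPU count covering peak demand plus a safety buffer."""
    needed_mhz = peak_demand_mhz * (100 + buffer_pct) / 100
    return max(1, math.ceil(needed_mhz / core_speed_mhz))


def reclaimable(current, recommended):
    """Resources freed by downsizing; never negative."""
    return max(0, current - recommended)


# Hypothetical VM: 8 vCPUs on 2600 MHz cores, peaking at 7000 MHz of demand.
recommended = recommend_vcpus(7000, 2600)
freed = reclaimable(8, recommended)
print(recommended, freed)  # 4 4
```

In this made-up case the VM was given 8 vCPUs but 4 would cover its peak with room to spare, so 4 vCPUs could be handed back to the cluster.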
Now, don’t power through and apply everything blindly. Environment-specific reasons may dictate otherwise. For instance, this is a lab I run at home and I know I underutilize pretty much everything. However, I want to keep it as is to meet hardware requirements.
On top of making recommendations, vROPS also offers the possibility to resize the virtual machines for you. It can be triggered instantly or scheduled to run at a later time. It will initiate a guest OS shutdown using the VMware Tools, reconfigure the VM’s hardware, and power it back on.
vROPS can initiate the resize operations on virtual machines instantly or on a schedule.
Automation and Actions
vRealize Operations is primarily a monitoring and capacity tool, but you can very well automate tasks from within vROPS so you don’t have to switch between management consoles.
You can execute actions on most objects in the inventory. The available set will obviously change according to the object type. Below are actions for a cluster and a virtual machine.
Actions can be initiated on an object from within vROPS
Execute a script
Note the “Execute Script” choice in the section above. It executes a script inside the guest OS, much like you would from PowerCLI. If you click Execute Script, you have to type valid OS credentials, and you then get the choice to type commands manually or upload a script to run.
You can run commands or scripts on virtual machine objects
Automation Central, available in the “Home” pane, lets you automate tasks on a schedule and displays them in an easy-to-use calendar. It works by selecting an action, a scope, and a schedule. Only a limited set of actions is available for now, but it covers the most common operational tasks.
Schedule your common operational tasks in an easy-to-use calendar
On top of what we’ve seen so far, you can also run actions based on triggered alerts. A fair number of actions are built into vROPS. If you want to build on this feature to achieve a greater level of automation, you can leverage vRealize Orchestrator to create custom recommendations thanks to the vRealize Orchestrator Management Pack. You will need to download it from the VMware Marketplace, upload it to the appliance, and configure an account to connect to it.
Once this is done you can configure vRealize Orchestrator workflows as remediation to a vROPS alert. This can be valuable if your workflows are tightly integrated with your IT organization such as a ticketing system.
Compliance
Ensuring environments comply with a given policy is a critical part of an IT department’s job. There are various industry standards, and making sure all the requirements are applied is far from straightforward and can be time-consuming.
Implementing the recommendations may actually be the easy part here; the tricky part is ensuring that it stays that way. Environments and configurations tend to drift from their original baseline as time goes by and operations get in the way.
vROPS offers incredible value in compliance enforcement through the dashboards, views, symptoms, and alerts that you get from industry-standard compliance management packs (mostly U.S. ones), which aren’t embedded or activated out of the box.
“The major industry standards are covered by management packs provided by VMware.”
Here are the industry standards that have management packs in vRealize Operations:
- PCI: The Payment Card Industry security standards address the growing threat to consumer payment information. PCI is important to companies that accept, process, or receive payments, as it helps them prevent, detect, and respond to cyber-attacks that can lead to breaches.
- DISA: The Defense Information Systems Agency is a part of the Department of Defense (DoD), and is a combat support agency. Failure to stay compliant with guidelines issued by DISA can result in an organization being denied access to DoD networks.
- FISMA: The Federal Information Security Management Act is United States legislation that defines a comprehensive framework to protect government information, operations, and assets against natural or man-made threats.
- ISO 27001: ISO/IEC 27001 is the best-known standard in the ISO/IEC 27000 family of standards providing requirements for an information security management system (ISMS).
- HIPAA: The Health Insurance Portability and Accountability Act of 1996 provides data privacy and security provisions for safeguarding medical information.
- CIS: CIS Controls and CIS Benchmarks are globally recognized standards and best practices for securing IT systems and data against attacks.
- vSphere Hardening Guide: Now called Security Configuration Guide, it provides prescriptive guidance for customers on how to deploy and operate VMware products in a secure manner.
Once you activate one of these management packs, you can enable it in the compliance view and get the state of your environment. As you can tell, my lab does not comply with ISO 27001 recommendations.
You can enable several compliance reports to check your infrastructure against
You then get the list of alerts about objects that aren’t compliant in the bottom right pane so you can start working on the remediation.
Alerts and Troubleshooting
Labeling this as a feature might be subject to interpretation, but I find the alert management system useful enough to include it here.
We already talked a little bit about this in another article. However, I still wanted to touch on this topic as it remains the bread and butter of vROPS and ties directly into the discussions around monitoring and visibility of the environment.
What happens in too many cases is that the monitoring throws so many false positives that admins end up tuning them out, rendering the whole thing useless as the approach becomes reactive rather than proactive.
Alerts provide all the relevant information and recommendations in an easy-to-read display
vRealize Operations brings a lot of value here: the pre-defined symptom and alert definitions are relevant, and they come with a level of importance and recommendations on how to fix the issue.
Take this alert for instance. Its name makes it obvious what rule was violated. You easily find exactly which symptoms were triggered and get recommendations on how to fix them. In the case of this alert, you can trigger the deletion of the snapshots directly from this page. You also get some background info on the “why” in the second recommendations pane.
Shortened screenshot of the ‘Potential Evidence’ tab of an alert
When doing troubleshooting of any issue in any environment, the first question I ask myself is “what was changed at that time?”, which often helps to identify the culprit. For that reason, the “Potential Evidence” tab is a personal favorite of mine as it will tell you what happened around the same time that could be related to the issue at hand with a blend of events, property changes, and anomalous metrics. Pretty sweet if you ask me.
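The “what changed around that time?” question boils down to filtering a timeline on a window around the alert. The short Python sketch below shows that idea on a hand-written event list. The event structure and the 30-minute window are assumptions, and the real tab correlates far richer data (events, property changes, and anomalous metrics).

```python
from datetime import datetime, timedelta


def events_near(events, alert_time, window_minutes=30):
    """Return events within +/- window_minutes of the alert, oldest first."""
    window = timedelta(minutes=window_minutes)
    nearby = [e for e in events if abs(e["time"] - alert_time) <= window]
    return sorted(nearby, key=lambda e: e["time"])


# Hypothetical change log
events = [
    {"time": datetime(2021, 3, 1, 9, 0), "what": "snapshot created"},
    {"time": datetime(2021, 3, 1, 13, 50), "what": "vCPU count changed"},
    {"time": datetime(2021, 3, 1, 14, 10), "what": "CPU ready spike"},
]
alert_time = datetime(2021, 3, 1, 14, 0)

suspects = events_near(events, alert_time)
for event in suspects:
    print(event["time"], event["what"])
```

The morning snapshot falls outside the window and is dropped, leaving the two changes closest to the alert as the first things to look at.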
There are also a few things you can do from within vROPS to narrow down the search like displaying the list of processes after typing in your credentials. This will come in handy to quickly troubleshoot a heavy hitter for instance.
The Get Top Processes tool is useful for quick troubleshooting on VMs with an abnormal demand
To wrap up this section, I will quickly finish with notifications. On top of the more common emails and SNMP traps, you can push alert notifications to various destination types through plug-ins such as Slack, ServiceNow, or a webhook if you integrate with a third-party app.
The plugins in the following screenshot are included with vROPS but others may be added when you add a management pack as a solution.
Additional outbound plugins can be added with management packs
Trending and Capacity planning
One of the pain points of any infrastructure is accounting for future growth, also called capacity planning. Although it’s also true for cloud workloads to some degree, it mostly applies to on-premises SDDCs as you can’t scale the capacity as flexibly as in the cloud.
vRealize Operations helps in that respect by analyzing trends of resource consumption in your environments and making predictions as to where they are going. It will obviously need at least several months’ worth of data to produce somewhat reliable recommendations. Refer to the documentation for more details on the analytics.
The engine uses a combination of usable capacity and demand to calculate the time and capacity remaining, hence deriving recommendations from these
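As a rough illustration of the “time remaining” idea, the Python sketch below fits a straight line to daily usage samples and projects when that line crosses the usable capacity. This is a deliberately naive linear model with made-up numbers. It is not the vROPS analytics engine, which handles seasonality, spikes, and confidence ranges.

```python
def days_until_full(daily_usage, usable_capacity):
    """Project how many days remain before usage reaches usable capacity.

    Fits a least-squares line to the samples (one per day, oldest first).
    Returns None when usage is flat or shrinking.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (usable_capacity - intercept) / slope
    # Count from the most recent sample, never going below zero.
    return max(0.0, day_full - (n - 1))


# Hypothetical datastore usage in GB, growing by 10 GB a day.
print(days_until_full([100, 110, 120, 130, 140], 200))  # 6.0
```

A flat or erratic series makes the fitted slope meaningless, which is one way to see why trend-based planning needs a reasonably steady usage pattern and enough history.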
Although obvious, I will also point out that it depends on the business. If your organization signs a big customer that will require a lot more resources by the end of the year than you currently have at hand, good for you. However, this is purely business-related and there is no way to predict it with monitoring.
Capacity planning will be most accurate in environments where the overall resource usage is somewhat steady (spikes excluded). If your resource consumption is completely random and goes up and down all over the place, vROPS won’t do a good job of planning future growth. Such patterns may make it worth looking into moving workloads to the cloud if possible, as it may save you some cash.
On the Homepage, you get an overview of the capacity in each data center that will display the time remaining until a resource runs out and recommendations on how to avoid getting there. I used a screenshot from VMware which is more interesting than what my lab environment shows.
Trend predictions will help you understand when you need to scale up
Note that you can obtain the same kind of capacity predictions on specific objects such as VMs or hosts, in which case you get capacity and time remaining. In the following screenshot, my vROPS appliance is showing a CPU usage trend that would be concerning in a production environment.
Individual objects also benefit from time and capacity remaining calculations
I wanted to tie that up with the “What-if Analysis” feature. While I get why some may find it a bit gimmicky, it fits right into the discussion around capacity planning, especially the use case we mentioned earlier where the business signed a big customer that will bring a large number of extra workloads. With the What-if Analysis, you can simulate adding workloads to your environment to see if it would hold up or if you need to add capacity.
I went a bit crazy with my simulation; it turns out it wouldn’t be a good idea to provision 100 virtual machines in my lab. The output also gives you an estimate of how much it would set you back to run them in various cloud providers (not cheap).
What-if Analysis lets you simulate a scenario where you add a number of workloads in the environment
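At its simplest, a workload what-if scenario adds the projected demand of the new VMs to the current demand and compares the total with the available capacity. The Python sketch below does just that with invented lab numbers. It ignores headroom, HA reservations, and everything else the real feature accounts for.

```python
def what_if_add_vms(capacity, demand, vm_profile, count):
    """Shortfall per resource after adding `count` VMs (0 means it fits)."""
    shortfall = {}
    for resource, total in capacity.items():
        projected = demand[resource] + count * vm_profile[resource]
        shortfall[resource] = max(0, projected - total)
    return shortfall


# Hypothetical small lab
capacity = {"vcpu": 64, "ram_gb": 256}
demand = {"vcpu": 30, "ram_gb": 120}
vm_profile = {"vcpu": 2, "ram_gb": 4}

result = what_if_add_vms(capacity, demand, vm_profile, 100)
print(result)  # {'vcpu': 166, 'ram_gb': 264}
```

Ten such VMs would fit in this imaginary lab, while a hundred would leave it short on both vCPUs and RAM.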
Note that there are several What-if scenarios you can run, not only adding VMs:
- Workload Planning: Traditional
- Workload Planning: Hyperconverged and VMC on AWS
- Infrastructure Planning: Traditional
- Infrastructure Planning: Hyperconverged
- Migration Planning: VMware Cloud
- Migration Planning: Public Cloud
- Datacenter Comparison: Private Cloud
Application Monitoring
While vRealize Operations is the best tool to monitor VMware environments, it is also a great contender when it comes to application monitoring. Long-established software products such as Nagios and Zabbix are very powerful in that regard, but vROPS holds its ground as it can offer a clear visualization of your application flows if you put in the time and effort to set it up.
Straight away, you can start by using the Service Discovery feature, which works by querying VMware Tools to identify a set of supported services. You can then enable monitoring of these services on the objects themselves. While this is a great start, you can achieve more in-depth visibility with Application monitoring.
25 Applications are available out of the box in vROPS
The feature leverages the Telegraf agent for Windows Server, Linux (rpm), AIX, Solaris, Oracle Linux, and Photon. It can be installed on virtual or physical machines. It supports a number of applications out of the box and is equipped with sets of metrics that you can expand with custom script monitoring.
Application monitoring gives access to relevant metrics and displays the flows visually
As you can tell, vRealize Operations Manager is a very powerful product built on VMware’s experience acquired over the years since the first release of vCenter Operations Manager back in 2013. Although this blog was pretty lengthy, we barely scratched the surface of what vROPS can do and how it can help any organization get more proactive and achieve a better SLA.
If you are interested in giving vRealize Operations a shot, keep in mind that it is available in 3 license levels that will give you access to more or fewer features.
Consider the different licensing levels and their price points before getting started
I will finish by saying this: don’t expect to configure everything perfectly straight away. vROPS is a very complicated product and some things may be confusing at first. We suggest you take it slowly. Start by deploying the appliance, connect your vCenter, and browse the menus to see what you get out of the box. Then, when you feel that you are picking up how it works and are getting comfortable with it, sit down with your colleagues and identify what must be monitored, when, with which thresholds, and with which actions.