Yesterday we received a rash of high CPU utilization alerts from a server we monitor. Logging on to the server, we found performance to be painfully slow. We also had several complaints from end users who were experiencing similar slowness when using network applications. We quickly found that the Microsoft.Online.Reporting.MonitoringAgent.Startup process was using anywhere from 75% to 90% of the CPU. This process is part of Azure AD Connect and runs under the Azure AD Connect Health Sync Monitoring Service.
We first stopped the Azure AD Connect Health Sync Monitoring Service. This instantly relieved the CPU and the numbers went back to normal, indicating that the service was the likely culprit. In an attempt to replicate the symptoms, we restarted the service. Initially things were still fine, but over the next 15 minutes or so, CPU utilization from Microsoft.Online.Reporting.MonitoringAgent.Startup slowly crept back up to the 75%-90% range.
The quick and dirty workaround for this was to simply stop the service and set its Startup type to Manual or Disabled. This will allow your users to work normally until the root problem is fixed.
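If you would rather script the workaround than click through the Services console, here is a minimal Python sketch that wraps the standard sc.exe commands. Treat the service key name below as an assumption: it is what we believe sits behind the "Azure AD Connect Health Sync Monitoring Service" display name, so confirm it on your own server first (for example with sc getkeyname). The function names are ours, purely for illustration.

```python
import subprocess
import sys

# Assumption: the service key name behind the display name
# "Azure AD Connect Health Sync Monitoring Service". Verify it on your server.
SERVICE_NAME = "AzureADConnectHealthSyncMonitor"

# Values that `sc config <service> start= <value>` accepts for each startup type.
START_TYPES = {"manual": "demand", "disabled": "disabled"}


def build_workaround_commands(service=SERVICE_NAME, startup="manual"):
    """Return the sc.exe commands that stop the service and change its startup type."""
    if startup not in START_TYPES:
        raise ValueError(f"startup must be one of {sorted(START_TYPES)}")
    return [
        ["sc", "stop", service],
        ["sc", "config", service, "start=", START_TYPES[startup]],
    ]


def apply_workaround(service=SERVICE_NAME, startup="manual"):
    """Run each command and report its exit code (needs an elevated prompt)."""
    for cmd in build_workaround_commands(service, startup):
        result = subprocess.run(cmd)
        print(" ".join(cmd), "-> exit code", result.returncode)


if __name__ == "__main__":
    if sys.platform == "win32":
        apply_workaround()
    else:
        # Not on Windows: just show what would be run.
        for cmd in build_workaround_commands():
            print(" ".join(cmd))
```

Pass startup="disabled" if you want the service to stay off until you re-enable it by hand. A non-zero exit code from sc stop usually just means the service was already stopped.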
It looks like this one is caused by some of the latest Windows Updates. There are many reports across the internet about which KBs are causing the problem. It seems to affect everything from Server 2008 to Server 2016. In our scenario, we're running Server 2012 R2. The KB that seems to be responsible is KB4338815, 2018-07 Cumulative Update for Windows Server.
Uninstalling KB4338815 fixed the problem for us. There are several other updates that others have found to fix the problem on different OSes and in different environments, including KB4103725, KB4096417, KB4095875, KB4054566, KB4338814, KB4338419, KB4340558, KB4338605, KB4339093, KB4340006, and KB4338824. Microsoft says they will be releasing an auto-update for Azure AD Connect to solve the problem once and for all (no ETA at this time).
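To check whether KB4338815 is actually installed before removing it, something along these lines should work. It is only a sketch: it assumes wmic is available on the server (it was on our Server 2012 R2 box), it only covers the one KB that we removed ourselves, and the helper names are made up for this post. It prints the wusa command rather than running it, so nothing gets uninstalled behind your back.

```python
import re
import subprocess
import sys

SUSPECT_KB = "KB4338815"  # the update we removed on Server 2012 R2


def installed_kbs(hotfix_listing):
    """Extract KB identifiers from text such as the output of `wmic qfe get HotFixID`."""
    found = re.findall(r"KB\d+", hotfix_listing, flags=re.IGNORECASE)
    return {kb.upper() for kb in found}


def build_uninstall_command(kb=SUSPECT_KB):
    """Return the wusa.exe command that removes the given update (wusa asks for confirmation)."""
    match = re.fullmatch(r"KB(\d+)", kb.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a KB identifier: {kb!r}")
    return ["wusa", "/uninstall", f"/kb:{match.group(1)}"]


if __name__ == "__main__" and sys.platform == "win32":
    # List installed hotfixes and look for the suspect update.
    listing = subprocess.run(
        ["wmic", "qfe", "get", "HotFixID"], capture_output=True, text=True
    ).stdout
    if SUSPECT_KB in installed_kbs(listing):
        print(SUSPECT_KB, "is installed. To remove it, run:")
        print(" ".join(build_uninstall_command()))
    else:
        print(SUSPECT_KB, "is not installed on this server.")
```

Expect wusa to ask for a reboot once the update is removed, so plan for a maintenance window.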