According to Chris Dougherty, a solutions architect at Langan Engineering, the firm has grown from providing hundreds of web services to providing thousands, ranging from high-demand to low-demand services. These services need to be accessible to staff and clients, which led to scalability and performance challenges. He explains that the organization's enterprise GIS was always designed with excess capacity to counter these issues, but with limited metrics available, the team had difficulty knowing what to scale and what the actual usage was.
Langan is an Esri partner that leverages Esri technology, and it needed more specific performance metrics than classic IT monitoring tools could offer. Dougherty adds that detailed metrics help the Langan team understand which parts of the system to scale, for example, by increasing memory or storage. Given the intricacies of scaling an ArcGIS Enterprise architecture, the team wanted more specific metrics to better optimize its enterprise GIS.
"Statistics and metrics on what we have are extremely important to us to help us get a better picture. Being Esri specific is necessary because standard IT system monitoring tools don't show the whole picture," says Dougherty. "They'll show you throughput and RAM usage and things like that. But we really need service-by-service statistics."
Dougherty says the team has had ongoing technical issues, including exceeding the limit of ArcSOC processes. The team has 700 dynamic ArcSOC processes, but the server has a limit of around 250. Exceeding that limit often caused the server to crash because it could not handle the memory consumption.
Alex Bakhtin, senior solutions engineer at Langan Engineering, says that when a technical problem occurred, such as a failed multimachine ArcGIS Server site upgrade, the team had to check each server individually and review log files stored in different locations across the machines. The new solution would need to aggregate information from logs and provide notifications on system issues and health checks to streamline troubleshooting.
"We basically needed a product that allowed us to administer and manage a widespread enterprise GIS system like this one, that gave us more than what our internal software currently does, where it just lets us know when we have high CPU usage or when a service isn't running," says Bakhtin. "We needed more metrics and more insight."