Across the United States, state and local agencies work hard to maintain safe water supplies by treating water to remove contaminants and distributing it via millions of miles of pipelines. The East Bay Municipal Utility District (EBMUD), based in Oakland, California, provides local residents with reliable, high-quality water and wastewater services. The utility’s water system serves 1.4 million customers east of San Francisco Bay, and its award-winning wastewater treatment serves 685,000 customers while protecting the bay.
EBMUD’s infrastructure includes 4,300 miles of pipeline with fixtures, such as valves, fittings, and pipe junctions, that are mapped and modeled in GIS. The utility’s mapping group creates and maintains the water network data in an enterprise geodatabase that relies on a geometric network. Many stakeholders within the district use the GIS data. The water distribution planning group, for example, is building an enterprise hydraulic model that can estimate the pressure and flow at any point in the water distribution system, conduct hydraulic studies to size facilities, recommend facility outages, and identify distribution system improvements and critical locations. The operations and maintenance group also uses GIS data for dispatching, responding to leak investigations, capturing valve exercise data, and creating outage plans for emergency repairs.
For these hydraulic modeling and operational exercises to be successful, EBMUD’s water distribution system data must be accurate. But this is challenging not only because the utility’s coverage area is so vast but also because much of its data had to be converted to GIS from legacy systems. To address this, EBMUD implemented a new set of tools, including ArcGIS Data Reviewer, that make the data review process more efficient and help ensure that the data is of the highest quality.
When Data Quality Is Less Than Ideal
To enable other groups within EBMUD to do hydraulic and pipeline modeling and carry out operational and maintenance exercises, editors in the mapping group export feature classes into a file geodatabase. They then correct any errors found in the local copy of the file geodatabase. In the past, these fixes weren’t copied back into the default version of the geodatabase. As a result, editors ended up having to fix the same errors every time they downloaded the latest dataset locally.
Although the geometric network provides basic quality control methods, the quality of EBMUD’s GIS data was less than ideal. The geometric network doesn’t notify users when a rule is violated, making it difficult to pinpoint errors. Because editors had to manually sort through data to find errors, the quality assurance and quality control (QA/QC) process was time-consuming. And sometimes, users had to make manual corrections to the data to get accurate results for their models and exercises.
“From [doing] exercises on patching leaks to identifying the pipes most likely to fail in the future, GIS data is used for a lot of critical operations,” said Rachel Wong, GISP, GIS software engineer II for EBMUD. “If the GIS data quality is not good enough for a particular job, it can impact their analysis and work.”
To remedy these issues, EBMUD began looking for a more comprehensive way to improve quality assurance in its GIS so the mapping group could proactively identify errors, clean up the database, and provide higher-quality data to the rest of the organization.
A Solution That Does Out-of-the-Box Data Checks
When some members of the EBMUD mapping group attended the Esri User Conference a few years ago, they met with a technical consultant from Esri Services to get a Data Health Check. Using ArcGIS Data Reviewer, an extension for ArcGIS Desktop and ArcGIS Enterprise, the consultant ran automated checks against a copy of EBMUD’s water network data. This gave the group an overall assessment of the quality of EBMUD’s data, and the utility decided to implement the solution.
According to Wong, the mapping group liked ArcGIS Data Reviewer because it provides extensive data validation checks that call attention to integrity, attribute, and relationship errors. If not for these out-of-the-box checks, Wong’s team would have had to build complicated models using ArcGIS Desktop geoprocessing tools or write custom Python scripts.
ArcGIS Data Reviewer also allowed EBMUD to store and manage data errors in a centralized location: a Reviewer Table within the utility’s geodatabase. Editors use this feature to view data errors, navigate to their location, group errors by category, and gauge their severity so they can prioritize the data cleanup work.
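To make the idea of grouping and prioritizing errors concrete, here is a minimal sketch in plain Python. It does not use the Reviewer Table or any ArcGIS API. The record fields (`id`, `category`, `severity`, `note`), the sample rows, and the convention that severity 1 is the most urgent are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical error records; the field names and values are illustrative
# and do not reflect the actual Reviewer Table schema.
ERRORS = [
    {"id": 1, "category": "Connectivity", "severity": 1, "note": "Pipe undershoot"},
    {"id": 2, "category": "Attribute", "severity": 3, "note": "Missing diameter"},
    {"id": 3, "category": "Connectivity", "severity": 2, "note": "Orphan pipe"},
    {"id": 4, "category": "Geometry", "severity": 1, "note": "Self-intersecting pipe"},
]


def prioritize(errors):
    """Group error ids by category, most urgent first within each group.

    Assumes a lower severity number means a more urgent error.
    """
    grouped = defaultdict(list)
    for error in sorted(errors, key=lambda e: (e["severity"], e["id"])):
        grouped[error["category"]].append(error["id"])
    return dict(grouped)


worklist = prioritize(ERRORS)
for category, error_ids in worklist.items():
    print(category, error_ids)
```

Because the records are sorted before grouping, categories also appear in the order of their most urgent error, which gives editors a natural place to start.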
After EBMUD implemented the extension, the Esri technical consultant conducted an on-site workshop to train the engineering and mapping teams on how to use the solution. He also helped them configure the types of data checks they needed. EBMUD then expanded the configurations internally.
The team started with a list of high-priority data errors to check in the geodatabase. These included disconnected features, such as overshoots or undershoots at pipe intersections and orphan pipes; pipe ends without proper devices on them; overlapping pipes that have coincident geographic locations; and pipe geometry that cuts back in on itself. Wong validated the features in the geodatabase using the configured QC checks and stored the resultant errors for editors to use when correcting features. She also trained editors so they could use the new configurations to find errors on their own.
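As a rough illustration of what a connectivity check looks for, the sketch below flags two of the error types listed above: pipe ends that almost meet (undershoot or overshoot candidates) and orphan pipes that touch nothing else. It is plain Python written for this article, not the ArcGIS Data Reviewer implementation. The sample pipes, the `check_endpoints` function, and the 0.5-unit tolerance are hypothetical, and the check compares endpoints only, so it would miss a pipe that ends near the middle of another pipe.

```python
from math import hypot

# Hypothetical sample data: each pipe is (id, start_xy, end_xy).
PIPES = [
    ("P1", (0.0, 0.0), (10.0, 0.0)),
    ("P2", (10.0, 0.0), (10.0, 8.0)),
    ("P3", (10.2, 8.0), (20.0, 8.0)),    # starts 0.2 units away from P2's end
    ("P4", (50.0, 50.0), (60.0, 50.0)),  # touches no other pipe
]


def check_endpoints(pipes, tolerance=0.5):
    """Return (near_misses, orphans) based on endpoint-to-endpoint gaps.

    near_misses: (pipe_id, other_id, gap) where two endpoints are close
                 but not coincident.
    orphans:     ids of pipes with no endpoint within tolerance of any
                 other pipe's endpoint.
    """
    near_misses = []
    orphans = []
    for pipe_id, start, end in pipes:
        touches_network = False
        for point in (start, end):
            for other_id, other_start, other_end in pipes:
                if other_id == pipe_id:
                    continue
                for other_point in (other_start, other_end):
                    gap = hypot(point[0] - other_point[0],
                                point[1] - other_point[1])
                    if gap <= tolerance:
                        touches_network = True
                        # Report each near miss once, from the lower id.
                        if gap > 0 and pipe_id < other_id:
                            near_misses.append((pipe_id, other_id, round(gap, 3)))
        if not touches_network:
            orphans.append(pipe_id)
    return near_misses, orphans


near_misses, orphans = check_endpoints(PIPES)
print("Near misses:", near_misses)
print("Orphan pipes:", orphans)
```

The nested loops compare every pipe with every other pipe, which is fine for a toy dataset but would need a spatial index to scale to a network with thousands of miles of pipeline.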
Identifying Errors More Efficiently
Using ArcGIS Data Reviewer has helped EBMUD improve the quality of its data, and that’s had a positive impact on day-to-day operations that involve GIS. The new QC configurations make it easier to edit data efficiently. Editors in the mapping group can now validate their work daily while making map edits, ensuring a complete QC check before the data gets sent to other groups within EBMUD. Additionally, editors no longer have to sort through the data manually to find and fix issues, since ArcGIS Data Reviewer manages the error life cycle and streamlines the correction process.
Being able to find errors in an automated way has enabled EBMUD’s mapping group to focus its efforts on fixing them, according to Wong.
“With ArcGIS Data Reviewer added to our workflow, we now have comprehensive, thorough data quality control audits that validate GIS data and help us identify invalid data more efficiently,” said Wong.
Within six months of implementing the extension, the team was able to correct 80 percent of data errors. That has helped speed up error cleanup as well. When the team recently reran the same checks, pipe undershoot/overshoot errors had decreased from 2,400 to 400, pipes with errors at crossings had plummeted from 490 to 6, and errors showing pipes that don’t split properly at the tee had dropped from 2,100 to 30.
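The before-and-after counts above can be converted into percentage reductions with a few lines of Python. The counts come from the article. The `percent_reduction` helper and the category labels are just names chosen for this sketch.

```python
# Error counts reported in the article: (before, after) for each check.
COUNTS = {
    "Pipe undershoot/overshoot": (2400, 400),
    "Errors at crossings": (490, 6),
    "Pipes not split at the tee": (2100, 30),
}


def percent_reduction(before, after):
    """Return the drop from before to after as a percentage of before."""
    return (before - after) / before * 100


reductions = {
    name: round(percent_reduction(before, after), 1)
    for name, (before, after) in COUNTS.items()
}

for name, value in reductions.items():
    print(f"{name}: {value}% fewer errors")
```

By this calculation, undershoot/overshoot errors fell by about 83 percent, while the other two categories each fell by more than 98 percent.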
“We did not have a comprehensive way to identify errors [before]. We could [have used] geoprocessing tools, but the tools could not cover everything,” said Wong. “ArcGIS Data Reviewer has really helped us identify errors more efficiently. We can look for the exact type of geometric or attribute error and then go right to the error location in our data.”
Additionally, because QA/QC results are saved in the geodatabase, there’s a certain degree of transparency in error tracking. “We can easily go into the Reviewer Table to see how editors have verified data and view their progress,” said Wong.
Other features the mapping group enjoys include the ability to run multiple data checks at once and the QC grid functionality in the Reviewer Table, which links errors to water pressure zone polygons and helps editors assign and prioritize tasks. Focusing on specific pressure zones when cleaning up data reduces the risk of conflicts and, ultimately, increases productivity.
“With ArcGIS Data Reviewer, we’ve gained efficiency and a cleaner workflow,” Wong concluded.