In February, we released ArcGIS Data Pipelines, a powerful tool that streamlines data integration in ArcGIS Online. This innovative app offers a fast and efficient way to bring in datasets, geospatially enable them, clean and prepare them, and write the processed data out to a feature layer ready for use in mapping, analysis, and reporting. Available to all Creators and above in ArcGIS Online, Data Pipelines provides you with a low-code, drag-and-drop approach to data engineering, making it easier than ever to design and automate your data integration workflows. In the June update of Data Pipelines, we added support for new input file formats, enhanced the Calculate field tool to support Arcade geometry functions, and made a few enhancements to scheduling, including the ability to schedule more frequent updates.
What’s New in November?
The November 2024 update of Data Pipelines includes new tools, the ability to document your workflow using notes, a new output method to overwrite existing layers, and more. Keep reading for more information.
New data engineering tools
Use two new tools, Pivot and Dissolve, to distill your data into summarized and transformed datasets.
With the Pivot tool, you can categorize and transform your long, tabular datasets with many records into wide datasets with many fields. This tool can help you explore trends, limit your data to values and attributes of interest, or group your data into summarized categories. Additionally, summarized pivot results can be visualized in dashboards or used for charting. For example, you can create a pivot table using election data to capture voter turnout by geographic region and age group, or any other demographic.
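To make the long-to-wide idea concrete, here is a minimal Python sketch of what a pivot does conceptually. It is not how the Pivot tool is implemented, and the records, field names, and the `pivot` helper are hypothetical values made up for illustration.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per (region, age group) count.
records = [
    {"region": "North", "age_group": "18-29", "turnout": 1200},
    {"region": "North", "age_group": "30-49", "turnout": 2100},
    {"region": "South", "age_group": "18-29", "turnout": 900},
    {"region": "South", "age_group": "30-49", "turnout": 1500},
    {"region": "South", "age_group": "30-49", "turnout": 300},
]


def pivot(rows, index, columns, values):
    """Return one wide row per `index` value, with one field per distinct
    `columns` value holding the sum of `values`."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row[index]][row[columns]] += row[values]

    # Every output row gets the same set of fields, in a stable order.
    column_names = sorted({row[columns] for row in rows})
    return [
        {index: key, **{name: cells.get(name, 0) for name in column_names}}
        for key, cells in sorted(totals.items())
    ]


wide = pivot(records, index="region", columns="age_group", values="turnout")
for row in wide:
    print(row)
```

Five long records become two wide records, one per region, with a field for each age group. Sum is used as the summary statistic here only as an example.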
Dissolve can complete similar aggregations, but instead of acting on attributes alone, it also considers polygon or polyline geometries to identify relationships between records. For example, you can use Dissolve if you want to merge California county polygons into a single state polygon. If each county has population counts or other demographic information, you can also calculate summary statistics for the merged result.
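The attribute side of a dissolve can be sketched in a few lines of Python. This is a conceptual illustration only, not the Dissolve tool's implementation: the county records, field names, and the `dissolve` helper are hypothetical, and the sketch simply collects the member polygons into a multipart geometry instead of computing a true union that removes shared boundaries.

```python
from collections import defaultdict

# Hypothetical county records; each polygon is a list of (x, y) vertices
# and the population values are made up.
counties = [
    {"state": "CA", "name": "County A", "population": 100,
     "polygon": [(0, 0), (1, 0), (1, 1), (0, 1)]},
    {"state": "CA", "name": "County B", "population": 200,
     "polygon": [(1, 0), (2, 0), (2, 1), (1, 1)]},
    {"state": "CA", "name": "County C", "population": 300,
     "polygon": [(0, 1), (2, 1), (2, 2), (0, 2)]},
]


def dissolve(features, by, sum_field):
    """Group features by the `by` field, gather their polygons into one
    multipart geometry, and sum `sum_field` across each group."""
    groups = defaultdict(lambda: {"parts": [], "total": 0, "count": 0})
    for feature in features:
        group = groups[feature[by]]
        group["parts"].append(feature["polygon"])
        group["total"] += feature[sum_field]
        group["count"] += 1

    return [
        {
            by: key,
            "multipolygon": group["parts"],
            "sum_" + sum_field: group["total"],
            "count": group["count"],
        }
        for key, group in sorted(groups.items())
    ]


dissolved = dissolve(counties, by="state", sum_field="population")
print(dissolved[0]["state"], dissolved[0]["sum_population"], dissolved[0]["count"])
```

Three county records collapse into one state record that carries the summed population and the number of counties that were merged.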
Document your workflow using notes
You asked, and we listened! We’re excited to introduce the ability to add notes to your data pipeline, allowing you to document your process. Use notes to add context to individual workflow elements or provide an overview of the entire data pipeline. You can use notes however you want, but here are some examples of how notes could be used based on feedback from other Data Pipelines users:
- You work on the same data pipeline as a colleague, and you want to track the change you made so they are aware of it.
- You want to share the data pipeline with your peers so they can complete the same workflow. You can use notes to outline what each step does and why it’s used.
- You maintain multiple data pipelines that are used for various purposes. You can add a general note to add details about the overall workflow, how the results are used, where the source data comes from, etc.
- You invested a lot of time in creating an Arcade expression used to calculate a new field, and you want to take note of your findings throughout the process.
See the documentation to learn more about adding notes to a data pipeline.
Overwrite existing feature layers
Exploring data and building data prep workflows is rarely a linear process. Sometimes the source data updates and you now have more fields to wrangle, or you realize you need to configure another tool to make a dataset spatial. And sometimes these changes happen after you’ve already created an output layer. So how do you update a layer to pick up new fields or geometry types? Using the new overwrite output option! Unlike the replace method or the add and update method, overwrite lets you add or remove fields from the layer. You can also use overwrite to change the geometry type of a feature layer. This means that if you accidentally output a table before creating a geometry, you can use overwrite to update the layer.
There are some limitations to overwrite that you should be aware of. For example, you cannot use overwrite to change the spatial reference of an existing layer, and you can’t overwrite a layer that was not created by Data Pipelines. There are other considerations to make when using overwrite that you can learn about in the output feature layer documentation.
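The rules described above can be summarized as a small decision sketch in Python. Everything here is hypothetical: the `LayerSchema` class and both functions are invented for illustration and are not part of any ArcGIS API. The sketch also assumes that going from a table (no spatial reference) to a spatial layer does not count as changing the spatial reference; check the output feature layer documentation for the actual behavior.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class LayerSchema:
    """Hypothetical description of a layer, for illustration only."""
    fields: FrozenSet[str]
    geometry_type: Optional[str]       # None means a table with no geometry
    spatial_reference: Optional[int]   # well-known ID, None for a table
    created_by_data_pipelines: bool = True


def needs_overwrite(existing: LayerSchema, new: LayerSchema) -> bool:
    """Field or geometry type changes call for the overwrite method."""
    return (existing.fields != new.fields
            or existing.geometry_type != new.geometry_type)


def overwrite_blockers(existing: LayerSchema, new: LayerSchema) -> List[str]:
    """List the limitations from the text that would prevent an overwrite."""
    blockers = []
    if not existing.created_by_data_pipelines:
        blockers.append("layer was not created by Data Pipelines")
    # Assumption: only a change between two defined spatial references blocks.
    if (existing.spatial_reference is not None
            and new.spatial_reference is not None
            and existing.spatial_reference != new.spatial_reference):
        blockers.append("spatial reference would change")
    return blockers


# A table that was written before the geometry was created.
table = LayerSchema(frozenset({"name"}), None, None)
points = LayerSchema(frozenset({"name", "category"}), "point", 4326)
reprojected = LayerSchema(frozenset({"name", "category"}), "point", 3857)

print(needs_overwrite(table, points), overwrite_blockers(table, points))
print(overwrite_blockers(points, reprojected))
```

The first case is the accidental-table scenario, where overwrite applies and nothing blocks it. The second shows a spatial reference change being flagged.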
Other new features
In addition to the features above, we’ve also added the following:
- Credit estimation—While using the editor, you can now view a credit estimation in the connection details dialog. Click on the connected status to expand the dialog and there you’ll find the estimated number of credits that have been consumed while you’ve been engineering your data prep workflow. This feature was highly requested by you all!
- Standalone run—From the Data Pipelines gallery page, you can now run a data pipeline. You can find the new Run option by clicking the ellipsis (…) on the card for the data pipeline you want to run. After choosing to run the data pipeline, you’ll be taken to a page where you can view the status of the run. When the run is complete, you can access results and messages just like any other data pipeline run. This option is helpful for those who have data pipelines that only need to be run on occasion, rather than on a schedule. Now, this can be done without the need to open the editor.
More information
More details on new enhancements and improvements with the November update can be found in the What’s new documentation. If you’re new to Data Pipelines or just want to learn more, here are some additional resources to get started:
- Watch this webinar recording to learn more about what Data Pipelines can do, plus some demonstrations of how to use the application.
- Check out some other ArcGIS Blogs about Data Pipelines.
- Follow this tutorial to get started building your own data integration workflow.
- If you have questions, check out the FAQ documentation or post in the Data Pipelines Community, where the Data Pipelines team will respond.
Read the What’s New in ArcGIS Online (November 2024) blog for news on other exciting updates.
We want to hear from you! There is more to come for ArcGIS Data Pipelines in the future, and we value your opinion on what new features and enhancements we can add to help you with your data preparation workflows. Share your ideas or ask us a question in the Data Pipelines Community.