The Esri User Conference consistently attracts GIS enthusiasts from around the world, showcasing major technological breakthroughs and motivating attendees to contribute to a shared goal of creating a better future through mapping. The 2024 conference was no exception, celebrating a significant milestone as ArcGIS Data Pipelines moved out of beta, an innovation set to revolutionize data integration for ArcGIS Online users. The enthusiasm of the attendees sparked numerous discussions, and we’re thrilled to relay the insights gained from these interactions.
Dive into this blog for answers to the most common questions about Data Pipelines and start exploring the latest no-code data integration capability from Esri!
What is ArcGIS Data Pipelines?
ArcGIS Data Pipelines is a new, native data integration capability in ArcGIS Online. It offers a fast and efficient way to integrate, prepare, and engineer data from a variety of sources, including cloud-based data stores such as Amazon S3, Microsoft Azure Storage, Google BigQuery, and Snowflake. Its familiar drag-and-drop user interface allows organizations to create powerful, visual data integration workflows without the need for a Python script or additional software.
Who can use Data Pipelines?
Data Pipelines is available to all ArcGIS Online organizations. To access the app, a user needs a Creator user type or above, together with the Publisher role or an equivalent role.
Please note, Data Pipelines is not available today for users of ArcGIS Location Platform or Personal Use and Trial subscriptions.
How does Data Pipelines handle security for data store connections and sharing, especially when dealing with password-protected data sources?
All connection information to secured data sources is encrypted and stored within a data store item. This data store item is saved to your content in ArcGIS Online, and can be used in the data pipeline, but cannot be shared with other users.
To learn more about managing connections to supported cloud-based data stores, please refer to this product documentation: Work with input data.
Can I connect to an on-premises database (Microsoft SQL Server, Oracle, PostgreSQL)?
Not at this time. Databases that are set up and run on your company’s own systems, like Microsoft SQL Server, Oracle, and PostgreSQL, rarely accept connections from outside their own network, such as over the internet, and that is what Data Pipelines would need. For this reason, our support for connecting to external databases is generally limited to cloud-based data sources. If your organization copies the data into one of the supported data sources, you can then use it in a data pipeline.
Can I schedule data pipelines?
Yes! To automate your workflows, you can schedule data pipelines to run at regular intervals using tasks. Tasks can be scheduled to run from every 15 minutes to every few months.
Scheduled tasks can be managed either centrally from the scheduled tasks page or, for a single data pipeline, from within the editor.
To learn more about scheduling, see the help topic Schedule a data pipeline task.
What are the differences between Data Pipelines and Data Interoperability or FME by Safe Software?
Data Pipelines is a native data integration capability of ArcGIS Online. It runs within ArcGIS Online and is included with ArcGIS Online rather than licensed separately. The focus of Data Pipelines is purely on streamlining data integration for ArcGIS Online, with the resulting data being written to hosted feature layers. ArcGIS Data Interoperability (and FME) supports a larger set of inputs and transformations and can write results back to the source.
How is Data Pipelines different from ArcGIS Velocity?
Both products let you connect to external data sources and import the data into ArcGIS Online, but they serve distinct purposes. ArcGIS Velocity is specifically designed for real-time and big data processing, efficiently handling high-speed data streams from sensors and similar sources. It also focuses on enabling analytics such as device tracking, incident detection, and pattern analysis. Data Pipelines is primarily a data integration application that focuses on data engineering tasks, particularly for data that does not come from sensor-based streams.
If you are interested in learning about the different ways data can be integrated with ArcGIS Online, check out this ArcGIS Blog: Seven Ways to Integrate Data with ArcGIS Online.
Is ArcGIS Data Pipelines like ModelBuilder but for ArcGIS Online?
ModelBuilder and Data Pipelines are similar in that they offer a low-code, drag-and-drop user experience for authoring repeatable workflows. However, there are some key differences:
- ModelBuilder is a capability included in ArcGIS Pro; Data Pipelines is a capability included in ArcGIS Online.
- ModelBuilder can be used to automate or streamline analysis workflows, leveraging any of the geoprocessing tools found in ArcGIS Pro (over a thousand); Data Pipelines can be used to automate or streamline data integration and preparation workflows, and includes a number of focused tools designed to clean, format, and prepare data for visualization and analysis.
- Note: ModelBuilder for Map Viewer is scheduled for a future release. It is designed for Online and Enterprise users, enabling them to chain map viewer tools and data into models to automate and share their analysis workflows.
Is ArcGIS Data Pipelines coming to ArcGIS Enterprise?
ArcGIS Data Pipelines was prioritized for ArcGIS Online because data integration options were more limited there. ArcGIS Enterprise already has robust data integration capabilities, with support for publishing web layers by reference from registered data stores, as well as an extensibility feature (Custom data feeds) that allows read-only feature services to be created from any source.
If you would like to see ArcGIS Data Pipelines in ArcGIS Enterprise, please share details about your use case via the Data Pipelines Esri Community here.
How many credits does it cost to use Data Pipelines?
At the time of the 2024 Esri User Conference, working in Data Pipelines interactively (in the editor when building a data pipeline) costs 50 credits per hour, charged on a per-minute basis, with a 10-minute minimum. Scheduled tasks, used to run a data pipeline at the interval of your choosing, cost 70 credits per hour, also charged on a per-minute basis, but with no minimum charge.
For example: if you spend 30 minutes working in the editor building a data pipeline, you will be charged 25 credits. If a scheduled run takes 4 minutes, you will be charged approximately 5 credits (4 minutes × 70/60 credits per minute ≈ 4.67).
Equate credits to the currency of your choosing and then compare the cost of using Data Pipelines to the cost of hiring a developer to author and maintain a Python script, or to the time it takes a GIS analyst to complete the process manually on a repeated basis: the value of Data Pipelines will be quickly realized!
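If you would like to run these numbers for your own workloads, the rates quoted above can be turned into a small calculator. The following is a minimal Python sketch, not an Esri tool: the function names are made up for illustration, the rates are the ones stated in this post as of the 2024 Esri User Conference and may change, and the price per credit is a placeholder you would replace with your organization's actual cost. It also assumes charges scale linearly with minutes and does not model any rounding ArcGIS Online may apply.

```python
# Rates quoted in this post (2024 Esri User Conference); they may change.
EDITOR_CREDITS_PER_HOUR = 50
SCHEDULED_CREDITS_PER_HOUR = 70
EDITOR_MINIMUM_MINUTES = 10  # interactive sessions have a 10-minute minimum


def editor_credits(minutes: float) -> float:
    """Credits for interactive editor time, billed per minute with a minimum."""
    billable_minutes = max(minutes, EDITOR_MINIMUM_MINUTES)
    return billable_minutes * EDITOR_CREDITS_PER_HOUR / 60


def scheduled_run_credits(minutes: float) -> float:
    """Credits for a single scheduled run, billed per minute with no minimum."""
    return minutes * SCHEDULED_CREDITS_PER_HOUR / 60


def monthly_schedule_credits(run_minutes: float, runs_per_day: float,
                             days: int = 30) -> float:
    """Estimated credits for a recurring schedule over a number of days."""
    return scheduled_run_credits(run_minutes) * runs_per_day * days


def credits_to_currency(credits: float, price_per_credit: float) -> float:
    """Convert credits to money; price_per_credit is a placeholder you supply."""
    return credits * price_per_credit


if __name__ == "__main__":
    print(editor_credits(30))                   # 25.0
    print(round(scheduled_run_credits(4), 2))   # 4.67
    # A 4-minute pipeline run once a day for 30 days:
    print(round(monthly_schedule_credits(4, 1), 2))  # 140.0
```

The first two printed values correspond to the 30-minute editor session and the 4-minute scheduled run in the example above. To compare against developer or analyst time, pass the credit totals to `credits_to_currency` with the per-credit price your organization actually pays.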
For more information on credits in ArcGIS Online, please review this documentation.
I am new to Data Pipelines. How do I get started?
A great place to start is in our product documentation. That said, there are a lot of other resources that can be helpful for both new and more experienced users alike. Here are a few to start with:
If you have other questions, feedback, or ideas for new features you’d like to see in ArcGIS Data Pipelines, please engage with us on the Data Pipelines Community.