What's New in Data Pipelines (October 2023)

By Bethany Scott and Corinne Walker

Data Pipelines (beta) is a new data engineering and integration application that was released in ArcGIS Online in June 2023. The Data Pipelines development team has been listening closely to your feedback and has implemented new features based on your requests. New features include scheduled data pipeline workflows and an easier way to configure the schema for delimited datasets. Keep reading to learn more!

Scheduled data pipelines

Data is constantly changing and growing. Working with the latest data can be critical in providing meaningful maps and dashboards, conducting accurate analysis, and making informed decisions. We’re excited to announce the ability to schedule data pipeline tasks so you can automatically keep your data up to date.

You can schedule data pipelines to run at specific intervals, such as every hour, at a certain time of day, or monthly. For example, let’s say your data pipeline ingests data from a public URL that updates every night at 6 pm. You can configure a task to run every evening, after that update, to refresh your feature layer with the newest data. You can use the Manage scheduling page to create, edit, and view tasks. For each task that is run, you can view the results and the run duration, inspect any messages returned from the run, and more.
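To make the daily example concrete, here is a minimal Python sketch of how a "run every day at a set time" schedule resolves to its next run. It is an illustration only, not how Data Pipelines implements scheduling: the function name `next_daily_run` and the 7 pm run time (chosen to fall after the 6 pm source update) are assumptions made for this example.

```python
from datetime import datetime, time, timedelta

def next_daily_run(now: datetime, run_at: time) -> datetime:
    """Return the next occurrence of a daily run time strictly after `now`.

    Hypothetical helper for illustration; not part of any Esri API.
    """
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        # Today's run time has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate

# Assumed run time: 7 pm, one hour after the source's 6 pm update.
RUN_AT = time(19, 0)

before = next_daily_run(datetime(2023, 10, 5, 17, 0), RUN_AT)  # same evening
after = next_daily_run(datetime(2023, 10, 5, 20, 0), RUN_AT)   # next evening
```

Picking a run time after the source's known update time is the key design choice: a task that runs before the update would only ever load the previous day's data.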

Watch the video below for a quick tour of scheduling.

To learn more about scheduling, see the help topic Schedule a data pipeline task.

Schema configuration

Another new feature is the ability to configure the schema for datasets in CSV and other delimited formats. This gives you more control over your data and how you want to represent fields throughout your workflow and in the output feature layer or table. With the new experience for configuring schema, you can modify field names, change field types, and choose which fields to include or exclude. You can return to the schema configuration dialog and modify these properties at any time.

Configure schema for delimited files in Data Pipelines.

For now, schema configuration is only available for delimited file formats, such as CSV. Unlike other formats supported in Data Pipelines, delimited files do not have field types defined in the source dataset; schema configuration enables you to define the types in the Data Pipelines app. For all other dataset formats, you can continue to use the Update fields tool to modify field names and types, and the Select fields tool to exclude fields from your dataset.
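The point about missing field types can be seen outside the app as well. The sketch below uses only Python's standard library to show that every value read from a delimited file arrives as text, and that a schema is what supplies names, types, and the choice of fields. It does not use or represent the Data Pipelines schema dialog; the sample data, the `SCHEMA` mapping, and the `apply_schema` helper are all hypothetical.

```python
import csv
import io
from datetime import date

# Hypothetical sample data. A delimited file stores every value as plain text.
RAW = (
    "station_id,reading,obs_date,notes\n"
    "101,3.5,2023-10-01,ok\n"
    "102,4.25,2023-10-02,check sensor\n"
)

# Hypothetical schema: source field -> (output field name, type converter).
# Fields left out of the mapping (here, "notes") are excluded from the output.
SCHEMA = {
    "station_id": ("station", int),
    "reading": ("reading_mm", float),
    "obs_date": ("observed", date.fromisoformat),
}

def apply_schema(text, schema):
    """Rename, retype, and filter the fields of delimited text."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append(
            {new: convert(record[old]) for old, (new, convert) in schema.items()}
        )
    return rows

# Without a schema, the reader can only return strings.
untyped = next(csv.DictReader(io.StringIO(RAW)))
rows = apply_schema(RAW, SCHEMA)
```

Here `untyped["reading"]` is the string `"3.5"`, while `rows[0]["reading_mm"]` is a floating-point number, which mirrors the three choices described above: field names, field types, and which fields to keep.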

General enhancements

In addition to scheduling and schema configuration, this release includes other enhancements:

Support for writing big integer and date-only fields
A new option to calculate summary statistics in the Join tool
Improved error reporting
And more!

To learn more about the new features and enhancements in Data Pipelines, see the What’s New topic in the documentation.

More information

We want to hear from you! There is much more to come in future updates of Data Pipelines, and we value your feedback on what we should do next. Share your ideas or ask us a question in the Data Pipelines community.

If you’re new to Data Pipelines and want to learn more about its powerful data integration capabilities, check out the introductory blog from June 2023, or read the documentation.

Bethany Scott

Bethany (she/her) is a Product Engineer on the Data Pipelines team and the GeoAnalytics team. Her background is in biology and GIS with experience in data management and spatial-temporal analysis.

Corinne Walker

Corinne is a Product Marketing Manager on Esri’s Spatial Analytics & Data Science team. She has a background in marketing and business analytics with experience working in the technology and geospatial industries.
