Did you know that as data moves and evolves, one crucial aspect often gets overlooked? Metadata, the information about the data, plays a vital role in preserving the context, accuracy, and integrity of your data. It also supports the FAIR data principles (Findable, Accessible, Interoperable, and Reusable). In the ETL (Extract, Transform, and Load) world, metadata is often neglected because the focus falls on data movement at the expense of data usability. In this article, we will discuss the importance of metadata, the value of using ArcGIS Data Interoperability, and how to create and maintain metadata as it moves through ArcGIS.
Specifically, we will focus on the first of three use cases for metadata management:
- Populating feature class and table metadata in ArcGIS Pro
Populating Feature Class and Table Metadata
Let’s start with an example of Case #1 – feature class & table metadata written at ETL time.
To start, we imported building permit data from Vancouver’s open data web site into an ArcGIS Pro project geodatabase using a spatial ETL tool (ImportBuildingPermits). The dots on the map are today’s permit locations in the feature class BuildingPermits.
Not only was the building permit data written to a feature class, but its metadata was also generated by the ETL process. In the screenshot below, we are viewing the metadata for the BuildingPermits feature class in ArcGIS Pro. Here we can see the item information (title, tags, summary, description, credits, use limitations).
If you’re familiar with writers in spatial ETL tools, you’ll be aware they don’t have much in the way of metadata support. However, because Data Interoperability is an ArcGIS Pro extension, it has access to everything ArcGIS Pro does, including ArcPy’s metadata module, and that is where we go for this support!
To back up a little bit, the Workbench app delivered by Data Interoperability has a feature called a shutdown script (there is also a startup script), which lets you run Python code with any installed interpreter on completion of any tool run. For the building permit data, we leverage this feature to do two things the built-in writers don’t know about:
- Creating relationship classes between output feature types
- Creating metadata for each output (yay!)
The only trick to using this feature is knowing that the shutdown script has a built-in dictionary object (fme.macroValues) with keys for all tool parameters, including input and output data paths. After that, the rest is easy! We can build the script starting from snippets of the Copy Python Command option, located in ArcGIS Pro’s History pane. Here is the shutdown script:
import arcpy
from datetime import datetime
import fme
import os
import pytz

arcpy.env.overwriteOutput = True
# fme.macroValues holds the tool parameters; OutputGDB is the output geodatabase path
gdb = fme.macroValues['OutputGDB']
arcpy.env.workspace = gdb

# Create relationship classes between BuildingPermits and child tables PropertyUse and SpecificUseCategory
origin = "BuildingPermits"
for destination in ["PropertyUse", "SpecificUseCategory"]:
    arcpy.management.CreateRelationshipClass(
        origin_table=origin,
        destination_table=destination,
        out_relationship_class=origin + '_' + destination,
        relationship_type="SIMPLE",
        forward_label=destination,
        backward_label=origin,
        message_direction="NONE",
        cardinality="ONE_TO_MANY",
        attributed="NONE",
        origin_primary_key="PermitNumber",
        origin_foreign_key="PermitNumber")

# Create layer level metadata
# Timestamp the credits with the current Pacific time
current_time_utc = datetime.now(pytz.utc)
pst = pytz.timezone('Canada/Pacific')
current_time_pst = current_time_utc.astimezone(pst)
current_time_pst_formatted = current_time_pst.strftime('%Y-%m-%d %H:%M:%S')
descriptions = {
    "BuildingPermits": "Construction projects and any change of land use or occupancy on private property requiring a building permit",
    "PropertyUse": "General use of property; multiple uses will be accessible in a 1:M lookup",
    "SpecificUseCategory": "Category of property use; multiple categories will be accessible in a 1:M lookup"}
for obj in ["BuildingPermits", "PropertyUse", "SpecificUseCategory"]:
    new_md = arcpy.metadata.Metadata()
    # Split the object name into words, e.g. BuildingPermits -> Building Permits
    name = obj.replace("gP", "g P").replace("yU", "y U").replace("cU", "c U").replace("eC", "e C")
    new_md.title = f"{name} of Vancouver, Canada 2024."
    new_md.tags = f"Demo,Esri,City of Vancouver,Canada,applications,{obj},{name}"
    new_md.summary = f"Layer includes information of all {name} issued to date by the City of Vancouver in 2024"
    new_md.description = descriptions[obj]
    new_md.credits = f"City of Vancouver, {current_time_pst_formatted} Pacific."
    new_md.accessConstraints = "https://opendata.vancouver.ca/pages/licence/"
    # Copy the new item information into the target object's metadata and save it
    tgt_item_md = arcpy.metadata.Metadata(obj)
    tgt_item_md.copy(new_md)
    tgt_item_md.save()
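If you want to try the non-ArcPy parts of this logic outside Workbench, one option is to separate them into small functions. The following is only a sketch under stated assumptions, not part of the original tool: the helper names (split_camel_case, output_gdb, build_item_info) are hypothetical, a plain dictionary stands in for fme.macroValues, and a regular expression replaces the chained replace() calls used to split the object names.

```python
import re


def split_camel_case(name: str) -> str:
    """Insert a space at each lowercase-to-uppercase boundary."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)


def output_gdb(macro_values: dict) -> str:
    """Return the output geodatabase path, failing early if it is missing.

    macro_values is a stand-in for fme.macroValues (an assumption made so the
    function can run outside Workbench); 'OutputGDB' is the parameter name
    used by the shutdown script.
    """
    gdb = macro_values.get("OutputGDB")
    if not gdb:
        raise KeyError("Tool parameter 'OutputGDB' is not set")
    return gdb


def build_item_info(obj: str, description: str, stamp: str) -> dict:
    """Assemble the item information values that the shutdown script writes."""
    name = split_camel_case(obj)
    return {
        "title": f"{name} of Vancouver, Canada 2024.",
        "tags": f"Demo,Esri,City of Vancouver,Canada,applications,{obj},{name}",
        "summary": f"Layer includes information of all {name} issued to date by the City of Vancouver in 2024",
        "description": description,
        "credits": f"City of Vancouver, {stamp} Pacific.",
        "accessConstraints": "https://opendata.vancouver.ca/pages/licence/",
    }


# Example with placeholder values (hypothetical path and timestamp)
info = build_item_info("SpecificUseCategory", "Category of property use", "2024-01-01 00:00:00")
print(info["title"])
```

Inside the shutdown script, the dictionary keys match the Metadata property names used above, so the values could be applied in a loop with setattr(new_md, key, value) before copying and saving.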
It is important to take the time to embed metadata automation in the ETL process itself. Doing so ensures that we don’t forget it later, and it matters because:
- It ensures that essential information is documented for our data.
- It helps consumers understand the context of our data and how it should be used in mapping and analysis.
- It enhances search and discovery of our data, making it easier for people to find.
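To make sure the essential fields really are documented, a small completeness check could run after the metadata is written. This is a minimal sketch and not part of the original tool: missing_fields and REQUIRED_FIELDS are hypothetical names, and a SimpleNamespace stands in for an arcpy.metadata.Metadata object so the example runs without ArcGIS Pro.

```python
from types import SimpleNamespace

# Item information fields populated by the shutdown script
REQUIRED_FIELDS = ("title", "tags", "summary", "description", "credits", "accessConstraints")


def missing_fields(md, required=REQUIRED_FIELDS):
    """Return the names of required metadata fields that are empty or unset."""
    missing = []
    for field in required:
        value = getattr(md, field, None)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Stand-in object for demonstration; in ArcGIS Pro this would instead be
# something like arcpy.metadata.Metadata("BuildingPermits").
sample = SimpleNamespace(
    title="Building Permits",
    tags="Demo,Esri",
    summary="",
    description=None,
    credits="City of Vancouver",
)
print(missing_fields(sample))
```

A check like this could log a warning, or stop the run, when a required field comes back empty.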
You’ll have noticed in the map and metadata screen grab above that we used the unique value renderer to display the data in categories (type of permitted work) and configured pop-ups. We then published the layer and related tables to ArcGIS Online – the target information product.
When our data is published, the metadata for each layer flows to the sublayers of the hosted feature layer.
Here it is in the BuildingPermits layer:
Because metadata flows from the data to the published sublayers of the hosted feature layer, we are saved the trouble of recreating it. So that’s how to get metadata into your information product at ETL time and why it’s important to do so.
Conclusion
As you can see from this first use case, ArcGIS Data Interoperability is a simple, easy-to-use tool that can help manage metadata within ArcGIS. Join us in Metadata – Data Interoperability’s Hidden Talent (Part Two), where we will discuss the other two use cases for metadata management:
- Populating metadata for a hosted feature layer in ArcGIS Online
- Updating metadata for multiple portal items in ArcGIS Online in bulk
We will also discuss deep copying data and metadata between environments using metadata.xml.
If you have any questions about the use case above, or if you need additional assistance managing your own metadata, please reach out to your regular account representative or visit us on our Esri Community website.
To obtain a free trial of ArcGIS Data Interoperability, please visit this page.