In a recent blog post we released a code sample for integrating MongoDB data with the ArcGIS system. This blog entry will help you use the code sample with an existing MongoDB database in three steps: downloading and compiling the code sample, pre-processing your data, and preparing your MongoDB database for use with ArcGIS.
You will need the following:
- Familiarity with plug-in data sources in ArcGIS
- Familiarity with MongoDB
- Visual Studio 2010
- ArcGIS Desktop
- Geographic data stored in MongoDB that you wish to view in ArcGIS
Getting the Code
Download the updated code sample and compile it. You might need to repair some of the references in the projects. After compiling, use the esriRegasm utility to register the MongoDBPlugIn and MongoDBCommands assemblies with ArcGIS Desktop.
Open ArcMap, right-click the toolbar, and choose the Customize option. Browse through the commands until you find the MongoDB-related buttons, and add them to a toolbar of your liking. At this point, you have to prepare your MongoDB data for display.
Included in the code sample package is a sample connection file. You should customize its contents to reference your MongoDB database. Open it in a text editor and set your IP address and database name:
mongodb:[IPADDRESS]?safe=true,[DATABASE NAME]
Preparing the Data
MongoDB and ArcGIS have significant differences that are worth noting before proceeding:
MongoDB Characteristics | ArcGIS Characteristics |
Documents support arbitrarily deep nestings of key-value pairs. | Rows must consist of a flat collection of values matching a defined list of fields. |
Cursors return heterogeneous documents with the collection keys varying arbitrarily. | Cursors return rows conforming to the same set of fields. |
Documents have a unique object id, which is a 12-byte binary value. | Rows have a unique object id, which is a 32-bit integer. |
While MongoDB is schema-less, ArcGIS assumes each dataset has a consistent, flat schema. The ArcObjects cursor model and the ArcGIS plug-in dataset architecture require that datasets return a cursor to sets of features with a fixed collection of fields. Unstructured or semi-structured MongoDB documents break this assumption. To bring MongoDB data into ArcGIS, you must first standardize and flatten it.
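To make the idea of flattening concrete, here is a minimal Python sketch that works on plain dictionaries rather than on a live database. It is not part of the code sample; the function names, the underscore separator, and the sample documents are all assumptions made for illustration. Nested keys are joined into a single name, and every row is forced onto the same list of fields.

```python
def flatten(document, parent_key="", sep="_"):
    """Collapse nested dicts into a single-level dict with joined key names."""
    flat = {}
    for key, value in document.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            # Non-dict values (including lists) are kept as they are.
            flat[name] = value
    return flat


def standardize(documents, fields):
    """Give every flattened document the same keys, in the same order."""
    rows = []
    for document in documents:
        flat = flatten(document)
        # Missing fields become None; keys outside `fields` are dropped.
        rows.append({field: flat.get(field) for field in fields})
    return rows


# Hypothetical, heterogeneous input documents.
docs = [
    {"name": "Well 7", "owner": {"first": "Ana", "last": "Ruiz"}},
    {"name": "Well 9", "depth": 41.5},
]
rows = standardize(docs, ["name", "owner_last", "depth"])
```

Filling gaps with None is only one possible policy; you may prefer a default value per field so that the resulting rows map cleanly onto ArcGIS field types.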
One approach is to extract some or all of the data into a new collection, performing the necessary transformations along the way. Because of the mismatch between object id value types, be sure to explicitly set the _id attribute in each result document to a unique 32-bit integer. Be sure to record the exact set of attributes, their names, and types during this stage, as you will need them later. Finally, each document’s point needs to be stored in a manner that the code sample can read. Each point should be stored in [X,Y] order as a BSON array under the key “shape.” Note that we have only tested datasets using the WGS 84 geographic coordinate system. Adapting the plug-in data source code sample to handle other projections would require a change to the MongoDBWorkspace CreateDataset method to set the spatial index’s minimum and maximum X and Y values.
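The transformation described above might look roughly like the following Python sketch. It operates on plain dictionaries so it can be tried without a database; the helper names, the "lon" and "lat" source fields, and the sample values are hypothetical and not taken from the code sample. It assigns sequential 32-bit integer ids, writes the point as an [X, Y] array under "shape", and records the attribute names and types for later use.

```python
MAX_INT32 = 2**31 - 1  # largest signed 32-bit integer


def prepare(rows, x_field, y_field, start_id=1):
    """Return copies of flat rows with integer _id values and an [X, Y] shape."""
    prepared = []
    for offset, row in enumerate(rows):
        new_id = start_id + offset
        if new_id > MAX_INT32:
            raise OverflowError("ran out of 32-bit object ids")
        # Drop the original _id and the separate coordinate fields.
        doc = {k: v for k, v in row.items() if k not in ("_id", x_field, y_field)}
        doc["_id"] = new_id
        doc["shape"] = [float(row[x_field]), float(row[y_field])]  # X first, then Y
        prepared.append(doc)
    return prepared


def describe_fields(prepared):
    """Record each attribute name and the Python type names seen for it."""
    schema = {}
    for doc in prepared:
        for key, value in doc.items():
            schema.setdefault(key, set()).add(type(value).__name__)
    return schema


# Hypothetical flat rows; the string ids stand in for MongoDB object ids.
rows = [
    {"_id": "original-id-a", "name": "Well 7", "lon": -117.19, "lat": 34.05},
    {"_id": "original-id-b", "name": "Well 9", "lon": -117.2, "lat": 34.06},
]
prepared = prepare(rows, "lon", "lat")
schema = describe_fields(prepared)
```

A field that shows more than one type name in the recorded schema is a sign that the data still needs cleaning before ArcGIS can treat it as a single typed field.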
Dataset Metadata
All plug-in dataset implementations need to be able to supply a set of fields and the geographic extent of the dataset on demand. In this code sample, we create a collection known as GDB_ITEMS that contains metadata documents. Please refer to the CatalogDataset.cs file for the relevant implementation. The code sample also includes functionality to create this metadata as part of a tool for loading data from a geodatabase feature class into MongoDB. See the MongoDataLoadCmd.cs and DataLoadUtilities.cs files for implementation details.
The data load tool is still useful for the current goal of bringing MongoDB data into ArcGIS. One way to generate the GDB_ITEMS collection and its contents is to create a dummy feature class in a geodatabase and then import it into MongoDB using the provided tools. The dummy feature class needs to be a point feature class, have a valid geographic extent, and have a field set matching the one you recorded earlier. Be certain to use a compatible geographic coordinate system when setting the feature class’s spatial reference. Empty feature classes have undefined geographic extents, but we need a feature class with a valid extent that covers the full range of values stored in MongoDB. The easiest way to remedy this is to insert points at the upper left and lower right corners of the area your MongoDB data occupies, and then delete the points. The feature class will now have a valid geographic extent.
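If you are unsure where those two corner points should go, you can compute them from the prepared documents. The Python sketch below is an illustration only; the function names and sample coordinates are assumptions, and it expects each document to carry an [X, Y] array under "shape" as described earlier.

```python
def extent(documents, key="shape"):
    """Return (min_x, min_y, max_x, max_y) over every [X, Y] point."""
    xs = [doc[key][0] for doc in documents]
    ys = [doc[key][1] for doc in documents]
    return min(xs), min(ys), max(xs), max(ys)


def priming_corners(documents, key="shape"):
    """Upper-left and lower-right corners to insert into the dummy feature class."""
    min_x, min_y, max_x, max_y = extent(documents, key)
    return [min_x, max_y], [max_x, min_y]


# Hypothetical prepared documents.
docs = [
    {"_id": 1, "shape": [-117.19, 34.05]},
    {"_id": 2, "shape": [-116.50, 33.70]},
    {"_id": 3, "shape": [-118.24, 34.40]},
]
upper_left, lower_right = priming_corners(docs)
```

You may want to pad the corners slightly if you expect more data to arrive later, since the extent stored in the metadata will not grow on its own.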
Once the dummy feature class has been created and its extent primed, open ArcMap and use the data loading tool included in the code sample to import it into your MongoDB database. Verify that the GDB_ITEMS collection exists and now contains a single metadata document. A collection with the same name as the dummy feature class will also exist. Drop this collection and replace it with your actual data by renaming your simplified and flattened collection to match the dummy feature class. Finally, use the MongoDB command line to make sure a spatial index exists on your spatial field.
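The post performs these steps from the MongoDB command line. As an alternative illustration, the same sequence could be scripted in Python, assuming a `db` object that behaves like a pymongo Database. The function name and the collection names in the usage comment are hypothetical, and the "2d" index type is an assumption about what the plug-in expects for [X, Y] arrays.

```python
def swap_in_real_data(db, flattened_name, dummy_name, spatial_key="shape"):
    """Replace the loader's dummy collection with the flattened one and index it.

    `db` is assumed to behave like a pymongo Database object.
    Returns the number of metadata documents found in GDB_ITEMS.
    """
    # Remove the collection created by importing the dummy feature class.
    db.drop_collection(dummy_name)
    # Give the real, flattened data the name the metadata refers to.
    db[flattened_name].rename(dummy_name)
    # "2d" is MongoDB's planar geospatial index type for [X, Y] arrays.
    db[dummy_name].create_index([(spatial_key, "2d")])
    return db["GDB_ITEMS"].count_documents({})


# Usage (requires the pymongo package and a reachable server; names are hypothetical):
# from pymongo import MongoClient
# db = MongoClient("localhost", 27017)["gisdata"]
# print(swap_in_real_data(db, "wells_flat", "wells"))
```

A result other than the expected number of metadata documents suggests the data load step did not complete as intended, so check GDB_ITEMS before adding the layer to a map.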
Add the Data to a Map
Now the data can be used in ArcGIS. In ArcMap, use the Add MongoDB Data button to navigate to your database and select your collection. You should be able to display data both in the map and in the attribute table. Symbolization and labeling should work normally. Additionally, you should be able to use your layer as a source for some geoprocessing tasks.
Thomas Breed, our resident MongoDB/NoSQL ninja who supplied the info for this post, talked about plug-in data sources and MongoDB in the Effective Geodatabase Programming session at this year’s Developer Conference. You can find a video of that session HERE; fast forward to about the 33:30 mark for the spiel on plug-in data sources.