The ArcGIS API for Python is a Python library that you can use to script and automate GIS workflows in ArcGIS Online and ArcGIS Enterprise. These workflows include administering your Web GIS, managing your organization’s data and content, visualizing GIS data and making web maps, and performing spatial analysis and data science.
While we recommend using the ArcGIS API for Python in ArcGIS Notebooks for consistency and compatibility reasons, we know that some of our customers want to install it via package managers and use it in Jupyter Notebooks, Visual Studio Code, PyCharm, Spyder, or whatever other IDE they prefer.
Recently, we’ve heard from several customers who are interested in installing and using the ArcGIS API for Python in Databricks Notebooks. The following sections walk you through the steps for getting this up and running in Microsoft Azure.
What is Databricks?
Databricks is a cloud-based, scalable lakehouse platform that enables organizations to use their data, analytics, and AI. It seamlessly integrates with the storage and security in your cloud account and deploys and manages cloud infrastructure on your behalf.
One of the key features of Databricks is its Notebook offering. These notebooks are fully managed (i.e. “hosted”) and come with a runtime containing pre-installed Python libraries. You also have the option to customize these runtimes by installing third-party or custom Python libraries, which in our case will be the ArcGIS API for Python.
Steps to install and use ArcGIS API for Python in Microsoft Azure Databricks
Prerequisites
- An active subscription to Microsoft Azure.
- A Databricks account. If you do not have one, you can create one by starting a Databricks free trial.
- A Databricks workspace. You can use these steps to create one, then open it in a web browser.
1. Create a cluster in Databricks
We first need to create a cluster, which is the set of computing resources available to you in your workspace.
- Click the Compute section of the workspace.

- Click Create compute.

- Change the access mode to Shared. Shared access mode allows you to share the cluster with multiple users. It requires either “credential passthrough” or “table access control” to be enabled.


- Scroll down and click on the Advanced options dropdown, then check the box to enable credential passthrough.

- While still in the Advanced options, click the tab for Init Scripts. This is where you define the path of the installation script, which contains commands for installing the arcgis package. In this case, our installation script is located in a Workspace folder called Shared.


- Click on the install.sh file, then click Add.

This Init script is a .sh file, which is a shell script that is run during cluster startup and is used to configure the cluster.
Here is an example of the Init script. You can use the example below to create the install.sh from scratch in v2.4.1. Note the installation of the arcgis package.
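As a rough illustration of what such an Init script might contain, the Python sketch below assembles the text of a minimal install.sh and writes it to a file. It rests on several assumptions that are not confirmed by this article: that a plain pip install pinned to 2.4.1 is sufficient, and that the cluster's pip executable lives at /databricks/python/bin/pip. The names build_init_script and write_init_script are hypothetical helpers, and a real script may need additional dependencies.

```python
from pathlib import Path

# Assumption: location of pip for the cluster's Python environment.
DATABRICKS_PIP = "/databricks/python/bin/pip"


def build_init_script(version: str = "2.4.1") -> str:
    """Return the text of a shell script that installs a pinned arcgis release."""
    lines = [
        "#!/bin/bash",
        "set -e",  # stop the script if the install command fails
        f"{DATABRICKS_PIP} install arcgis=={version}",
        "",  # ensures the file ends with a newline
    ]
    return "\n".join(lines)


def write_init_script(path: str = "install.sh", version: str = "2.4.1") -> Path:
    """Write the generated script to `path` (hypothetical location) and return it."""
    target = Path(path)
    target.write_text(build_init_script(version), encoding="utf-8")
    return target
```

Calling write_init_script() produces a local install.sh that you could then upload to the Shared Workspace folder referenced in the Init Scripts tab.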

- After configuring the Advanced options, at the bottom of the screen, click Create compute. This might take anywhere from a few seconds to a few minutes to complete.

2. Create and run a notebook in Databricks
- To create a notebook in Databricks, click the Workspace tab on the left menu.
- Click the Create dropdown in the top right corner, and click Notebook.

- Click the Connect dropdown in the top right corner and select the cluster we created in the previous step. This attaches the notebook to the cluster that is configured with the ArcGIS API for Python and its required dependencies.

This may take a few moments. When the cluster is attached, the cluster name will appear as shown below.

3. Verify ArcGIS API for Python installation and use it in a Databricks Notebook
- In a notebook cell, type the command import arcgis to import the ArcGIS API for Python into the notebook.
- Type arcgis.__version__ to ensure that you have the correct version of the package installed, and run the code cell.
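If you would rather check the version programmatically, for example at the top of a scheduled notebook, a small sketch along these lines may help. It uses the standard library's importlib.metadata instead of arcgis.__version__, so it also reports cleanly when the package is missing. The helper names and the expected version of 2.4.1 (taken from the Init script step) are assumptions for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "arcgis"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_expected(found, expected: str = "2.4.1") -> bool:
    """True only when a version was found and it equals the expected one."""
    return found is not None and found == expected


found = installed_version("arcgis")
if found is None:
    print("arcgis is not installed on this cluster")
elif matches_expected(found):
    print(f"arcgis {found} is installed as expected")
else:
    print(f"arcgis {found} is installed, but a different version was expected")
```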

Now you are ready to connect to your GIS and to start using the ArcGIS API for Python. Happy scripting!
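To give a sense of that next step, here is a minimal sketch of connecting to a GIS from a notebook cell. It relies on the GIS class from arcgis.gis, which accepts an optional url, username, and password. The helper functions and the example URL are hypothetical, and the import is deferred so the argument helper can be used even where arcgis is not installed.

```python
def build_gis_kwargs(url=None, username=None, password=None):
    """Collect only the connection arguments that were actually supplied."""
    candidates = {"url": url, "username": username, "password": password}
    return {key: value for key, value in candidates.items() if value is not None}


def connect(url=None, username=None, password=None):
    """Return a GIS connection; with no arguments this is an anonymous connection."""
    # Imported lazily so that build_gis_kwargs works without arcgis installed.
    from arcgis.gis import GIS

    return GIS(**build_gis_kwargs(url, username, password))
```

In a notebook attached to the cluster, gis = connect() would open an anonymous connection to ArcGIS Online, while connect(url="https://example.com/portal", username="...", password="...") would sign in to a specific portal. Avoid hard-coding credentials in notebook cells; reading them from a secret store is a safer pattern.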


Community and collaboration
You can use the ArcGIS API for Python Esri Community page to ask specific questions, suggest ideas for enhancements and improvements, connect with other users, and read recent blogs. You can also use the ArcGIS API for Python public GitHub repo to submit bugs, enhancement requests, and other issues. The team actively monitors these pages and greatly appreciates your feedback and suggestions, which guide our priorities for future API development. We encourage you to share your thoughts!