This is a minimal example of deploying the Databricks service on Azure. The cluster autoscales between a minimum of 1 and a maximum of 5 nodes, uses the smallest available node type and the latest Spark version with long-term support, and auto-terminates after 20 minutes. You will be able to log in automatically with your SSO user.
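Assuming the Databricks Terraform provider is used, a cluster with these settings could look roughly like the sketch below; the resource and data source names, and the cluster name, are illustrative only:

```hcl
# Smallest node type available in the workspace
data "databricks_node_type" "smallest" {
  local_disk = true
}

# Latest Spark version with long-term support
data "databricks_spark_version" "latest_lts" {
  long_term_support = true
}

resource "databricks_cluster" "minimal" {
  cluster_name            = "minimal-autoscaling-cluster" # illustrative name
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20

  autoscale {
    min_workers = 1
    max_workers = 5
  }
}
```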
Several manual steps are necessary to configure Terraform with an Azure Service Principal:
- Create the resource group DEPLOY_DATABRICKS
- Create the storage account deploydatabricks for the Terraform state
- Create the container deploy-databricks-terraform-state in the storage account (these resources back the Terraform state; see the backend sketch after this list)
- Create the Service Principal that will be used for Terraform operations:
```bash
az ad sp create-for-rbac --name "http://deploy-databricks-service-principal.<YOUR_DOMAIN>.onmicrosoft.com" --role contributor \
  --scopes /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP> \
  --sdk-auth
```
- Grant the Service Principal Contributor access to the resource group
- Grant the Service Principal Contributor access to the storage account
- Create the GitHub environment CLEAN_UP and add yourself as a reviewer, so that decommissioning resources requires manual approval
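The resource group, storage account, and container created above back the remote Terraform state. A minimal backend configuration using those names might look like this (the state file name is an assumption):

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "DEPLOY_DATABRICKS"
    storage_account_name = "deploydatabricks"
    container_name       = "deploy-databricks-terraform-state"
    key                  = "terraform.tfstate" # assumed state file name
  }
}
```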
Save the authentication credentials as GitHub secrets, along with the tenant ID and the administrator e-mail address that enables automatic login (a sketch of how Terraform consumes them follows the list):
- SERVICE_PRINCIPAL_ID
- SERVICE_PRINCIPAL_SECRET
- SUBSCRIPTION_ID
- TENANT_ID
- ADMIN_USER_MAIL
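In the workflows these secrets would typically be handed to Terraform as input variables; the sketch below shows how the azurerm provider could consume them. The variable names are illustrative, and the values would be passed in by the GitHub Actions workflow (for example as TF_VAR_* environment variables). ADMIN_USER_MAIL would be passed the same way and used to grant the administrator access to the workspace.

```hcl
variable "service_principal_id" {}
variable "service_principal_secret" {}
variable "subscription_id" {}
variable "tenant_id" {}

# Authenticate to Azure with the Service Principal created above
provider "azurerm" {
  features {}

  client_id       = var.service_principal_id
  client_secret   = var.service_principal_secret
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
}
```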
See the official documentation for configuring a Service Principal on Azure for Terraform.
Created With:
- Terraform - Infrastructure automation tool
- Databricks - Web-based processing platform for Spark
- GitHub Actions - CI/CD tool
- Azure - Cloud computing service owned by Microsoft
Screenshots
- Databricks GUI
- Databricks workspace tags
- GitHub secrets
- Manual approval setup
- Provisioning and decommissioning jobs