This is a minimal example of deploying a Databricks workspace on Azure with Terraform. The cluster autoscales between 1 and 5 nodes, uses the smallest available node type and the latest Spark version with long-term support, and auto-terminates after 20 minutes of inactivity. You will be able to log in automatically with your SSO user.
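The cluster described above can be sketched with the Databricks Terraform provider, which offers data sources for picking the smallest node type and the latest LTS Spark version. This is a minimal sketch, assuming the `databricks` provider is already configured against the workspace; the cluster name is an arbitrary placeholder:

```hcl
# Smallest node type available in the workspace's region
data "databricks_node_type" "smallest" {
  local_disk = true
}

# Latest Spark version with long-term support
data "databricks_spark_version" "latest_lts" {
  long_term_support = true
}

resource "databricks_cluster" "this" {
  cluster_name            = "minimal-cluster" # placeholder name
  node_type_id            = data.databricks_node_type.smallest.id
  spark_version           = data.databricks_spark_version.latest_lts.id
  autotermination_minutes = 20

  # Scale between 1 and 5 worker nodes
  autoscale {
    min_workers = 1
    max_workers = 5
  }
}
```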


Several manual steps are necessary to configure Terraform with an Azure Service Principal.

  1. Create resource group `DEPLOY_DATABRICKS`
  2. Create storage account `deploydatabricks` for the Terraform state
  3. Create container `deploy-databricks-terraform-state` in the storage account
  4. Create the Service Principal that will be used for Terraform operations:

     ```shell
     az ad sp create-for-rbac --name "http://deploy-databricks-service-principal.<YOUR_DOMAIN>.onmicrosoft.com" --role contributor \
       --scopes /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP> \
       --sdk-auth
     ```

  5. Grant the Service Principal Contributor access to the resource group
  6. Grant the Service Principal Contributor access to the storage account
  7. Create environment `CLEAN_UP` and add yourself as a reviewer, so that decommissioning resources requires manual approval
  8. Save the authentication credentials in GitHub secrets, along with the tenant ID and the administrator mail that will enable you to log in automatically:

    • SERVICE_PRINCIPAL_ID
    • SERVICE_PRINCIPAL_SECRET
    • SUBSCRIPTION_ID
    • TENANT_ID
    • ADMIN_USER_MAIL
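The state storage created in steps 1-3 can then be wired into Terraform with an `azurerm` backend block. This is a sketch using the resource names from the steps above; the state file name (`key`) is an assumption:

```hcl
terraform {
  backend "azurerm" {
    # Names match the resources created manually in steps 1-3
    resource_group_name  = "DEPLOY_DATABRICKS"
    storage_account_name = "deploydatabricks"
    container_name       = "deploy-databricks-terraform-state"

    # Name of the state blob inside the container (assumed)
    key = "terraform.tfstate"
  }
}
```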

Official documentation for configuring a Service Principal on Azure for Terraform
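The `azurerm` provider reads Service Principal credentials from `ARM_*` environment variables, so a CI workflow can map the GitHub secrets above to those variables instead of hard-coding credentials. A sketch, with the secret-to-variable mapping shown as comments (the exact workflow wiring is an assumption):

```hcl
provider "azurerm" {
  features {}

  # Credentials are picked up from environment variables,
  # populated in CI from the GitHub secrets above:
  #   ARM_CLIENT_ID       <- SERVICE_PRINCIPAL_ID
  #   ARM_CLIENT_SECRET   <- SERVICE_PRINCIPAL_SECRET
  #   ARM_SUBSCRIPTION_ID <- SUBSCRIPTION_ID
  #   ARM_TENANT_ID       <- TENANT_ID
}
```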


Screenshots

  1. Databricks GUI
  2. Databricks workspace tags
  3. GitHub secrets
  4. Manual approval setup
  5. Provisioning and decommissioning jobs