如何从azure DevOps管道运行/执行Databricks笔记本



我能够成功地将我的工件(笔记本(部署到Databrick Workspace中。但是我如何直接从Azure DevOps管道运行笔记本?

建议我有什么方法可以做到这一点?

Azure Devops有运行笔记本任务,您可以对其进行配置。

或者你也可以使用这个模板。创建一个新的azure_pipelines.yaml,并配置所需的变量:

resources:
- repo: self
trigger:
- master
variables:
databricks-host: 'https://${databricksRegion}.azuredatabricks.net'
notebook-folder: '/Shared/tmp/'
cluster-id: '1234-567890-bobby123'
notebook-name: 'tests'
steps:
- task: UsePythonVersion@0
displayName: 'Use Python 3.x'
- script: |
pip install databricks-cli
displayName: 'Install databricks-cli'
- script: |
databricks fs cp mnt/demo/ dbfs:/tmp/demo --recursive --overwrite
databricks workspace import_dir src/ $(notebook-folder) -o
JOB_ID=$(databricks jobs create --json '{
"name": "Testrun",
"existing_cluster_id": "$(cluster-id)",
"timeout_seconds": 3600,
"max_retries": 1,
"notebook_task": {
"notebook_path": "$(notebook-folder)$(notebook-name)",
"base_parameters": {}
}
}' | jq '.job_id')
RUN_ID=$(databricks jobs run-now --job-id $JOB_ID | jq '.run_id')
job_status="PENDING"
while [ $job_status = "RUNNING" ] || [ $job_status = "PENDING" ]
do
sleep 2
job_status=$(databricks runs get --run-id $RUN_ID | jq -r '.state.life_cycle_state')
echo Status $job_status
done
RESULT=$(databricks runs get-output --run-id $RUN_ID)
RESULT_STATE=$(echo $RESULT | jq -r '.metadata.state.result_state')
RESULT_MESSAGE=$(echo $RESULT | jq -r '.metadata.state.state_message')
if [ $RESULT_STATE = "FAILED" ]
then
echo "##vso[task.logissue type=error;]$RESULT_MESSAGE"
echo "##vso[task.complete result=Failed;done=true;]$RESULT_MESSAGE"
fi
echo $RESULT | jq .
displayName: 'Run Databricks Notebook'
env:
DATABRICKS_TOKEN: $(databricks-token)
DATABRICKS_HOST: $(databricks-host)

最新更新