Chinny Chukwudozie, AI Architecture.

AI Solutions and Agentic Engineering.

Tag: Azure Databricks Cluster

Publish PySpark Streaming Query Metrics to Azure Log Analytics using the Data Collector REST API.

Overview. At the time of this writing, there doesn’t seem to be built-in support for writing PySpark Structured Streaming query metrics from Azure Databricks to Azure Log Analytics. After some research, I found a workaround that enables capturing the streaming query metrics as a Python dictionary object from within a notebook session and publishing…

jbernec

November 27, 2020

PySpark Streaming Logs

Azure Databricks Cluster, Azure Log Analytics, Azure Monitor, HTTP Data Collector API, PySpark Application Logs, PySpark Streaming Logs, Python Wheel Package, setup.py
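
As a rough illustration of the approach named in the post title above, the sketch below builds (but does not send) a signed request for the Azure Monitor HTTP Data Collector API. The function name `build_log_analytics_request`, the workspace ID, the key, the `Log-Type` value, and the sample metrics dictionary are placeholders assumed for illustration; none of them come from the post. In a notebook, the dictionary would typically come from something like a streaming query's `lastProgress`, which is also an assumption here.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timezone
from email.utils import format_datetime


def build_log_analytics_request(workspace_id, shared_key, log_type, records, now=None):
    """Return (url, headers, body) for a Data Collector API POST. Nothing is sent."""
    body = json.dumps(records).encode("utf-8")
    # The API expects an RFC 1123 date in GMT, e.g. "Mon, 01 Jan 2024 00:00:00 GMT".
    rfc1123_date = format_datetime(now or datetime.now(timezone.utc), usegmt=True)

    # String-to-sign layout used by the SharedKey scheme of the Data Collector API.
    string_to_sign = (
        f"POST\n{len(body)}\napplication/json\nx-ms-date:{rfc1123_date}\n/api/logs"
    )
    # The workspace key is base64 text; decode it before using it as the HMAC key.
    digest = hmac.new(
        base64.b64decode(shared_key), string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")

    url = f"https://{workspace_id}.ods.opinsights.azure.com/api/logs?api-version=2016-04-01"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"SharedKey {workspace_id}:{signature}",
        "Log-Type": log_type,  # becomes the custom log table name
        "x-ms-date": rfc1123_date,
    }
    return url, headers, body


# Hypothetical values, for illustration only.
sample_metrics = {"id": "hypothetical-query-id", "numInputRows": 120, "inputRowsPerSecond": 40.0}
fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
url, headers, body = build_log_analytics_request(
    "00000000-0000-0000-0000-000000000000",
    fake_key,
    "StreamingMetrics",
    [sample_metrics],
    now=datetime(2020, 11, 27, tzinfo=timezone.utc),
)
print(url)
print(headers["Log-Type"], len(body), "bytes")
```

The returned triple can be passed to any HTTP client, for example `requests.post(url, data=body, headers=headers)`. Keeping request construction separate from sending makes the signing step easy to check without a live workspace.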

Build a Jar file for the Apache Spark SQL and Azure SQL Server Connector Using SBT.

The Apache Spark Azure SQL Connector is a huge upgrade to the built-in JDBC Spark connector. It is more than 15x faster than the generic JDBC connector for writing to SQL Server. In this short post, I outline the steps required to build a JAR file from the Apache Spark connector for Azure SQL that can…

jbernec

June 29, 2020

Unified Analytics

Apache Spark, Azure Databricks, Azure Databricks Cluster, Microsoft, sbt, Spark, sql-spark-connector, Unified Analytics

Programmatically Provision an Azure Databricks Workspace and Cluster using Python Functions.

Azure Databricks is a data analytics and machine learning platform based on Apache Spark. Before using Azure Databricks for any kind of data exploration or machine learning work, the first task is to create a Databricks workspace and cluster. The following Python functions were developed to enable the automated provision…

jbernec

May 16, 2019

Apache Spark, Azure Automation Account, Azure Databricks, Python

ARM Templates, Automation, Azure Automation, Azure Databricks, Azure Databricks Cluster, Create Cluster API, Databricks REST API 2.0, Python3, yaml
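
For the cluster step only, a minimal sketch of what such a function might look like against the Databricks REST API 2.0 `clusters/create` endpoint is shown below. It builds the request without sending it. The function name `build_create_cluster_request`, the workspace URL, the token, and the sample cluster settings are assumptions for illustration and are not the post's own code; workspace creation itself is not covered here.

```python
import json
import urllib.request


def build_create_cluster_request(workspace_url, token, cluster_name,
                                 spark_version, node_type_id, num_workers):
    """Build (but do not send) a POST request for /api/2.0/clusters/create."""
    payload = {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/clusters/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Personal access token; a placeholder value is used below.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_create_cluster_request(
    workspace_url="https://adb-0000000000000000.0.azuredatabricks.net/",
    token="dapi-placeholder-token",
    cluster_name="demo-cluster",
    spark_version="7.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
    num_workers=2,
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of `urllib.request.urlopen(request)` with a real workspace URL and token.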
