-
Databricks is a distributed data analytics and processing platform designed to run in the Cloud. This platform is built on Apache Spark which is currently at version 2.4.4. In this post, I will demonstrate the deployment and installation of custom R based machine learning packages into Azure Databricks Clusters using Cluster Init Scripts. So, what…
-
In my limited experience with processing big data workloads on the Azure Databricks platform powered by Apache Spark, it has become obvious that a significant part of the tasks are targeted towards Data Quality. Data quality in this context mostly refers to having data that is free of errors, inconsistencies, redundancies, poor formatting and other…
-
1) Install Jupyter on the local machine outside of any existing Python Virtual environment: pip install jupyter –no-cach-dir 2) Create a Python Virtual environment. mkdir virtualenv cd virtualenv python.exe -m venv dbconnect 3) Change directory into the virtual environment and activate .\scripts\activate 4) Install ipykernel package in the virtual environment pip install ipykernel 5) Use…
-
Azure Databricks is a data analytics and machine learning platform based on Apache Spark. The first set of tasks to be performed before using Azure Databricks for any kind of Data exploration and machine learning execution is to create a Databricks workspace and Cluster. The following Python functions were developed to enable the automated provision…
-
Introduction Thanks to a recent Azure Databricks project, I’ve gained insight into some of the configuration components, issues and key elements of the platform. Let’s take a look at this project to give you some insight into successfully developing, testing, and deploying artifacts and executing models. One note: This post is not meant to be…
-
In this post, I want to write about my experience testing and using Azure Kubernetes service to deploy a Jenkins Instance solution that is highly available and resilient. With the Kubernetes persistent volume feature, an Azure disk can be dynamically provisioned and attached to a Jenkins Instance container deployment. In another scenario, an existing Azure…
-
Introduction. Microsoft just updated the ASWPowerShell module to better enable Cloud administrators manage and provision cloud resources in the AWS cloud space while using the same familiar PowerShell tool. As at last count today, the AWSPowerShell module contains almost four thousand cmdlets: This means Microsoft is committed to expanding on PowerShell functionality as a robust…
-
Summary: A new class of security vulnerabilities referred to as “Speculative execution side-channel attacks” also known as “Meltdown and Spectre” were publicly disclosed by Cyber security researchers this week. Given the gravity of these flaws, many concerns have been rightly raised. In this article I will cover their impact as well as the Microsoft Cloud’s…
-
Azure Site Recovery service enables businesses ensure business continuity by keeping their workloads and applications running on Virtual Machines and physical servers available if a site goes down. Site Recovery replicates these workloads running on VMs and physical servers so that they remain available in a secondary location if the primary site isn’t available. It…