The Apache Spark connector for Azure SQL is a significant upgrade over Spark's built-in JDBC connector: Microsoft reports it can be up to 15x faster than the generic JDBC connector for writing to SQL Server. In this short post, I walk through the steps required to build a jar file from the connector's source that can be installed in a Spark cluster and used to read and write Spark DataFrames to and from Azure SQL and SQL Server.
1. Clone the microsoft/sql-spark-connector GitHub repository.
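For reference, the clone step looks like this (the repository URL is the microsoft/sql-spark-connector project on GitHub; the fallback message is just a guard for offline machines):

```shell
# Clone the connector source from GitHub
REPO_URL="https://github.com/microsoft/sql-spark-connector.git"
git clone "$REPO_URL" 2>/dev/null || echo "clone failed - check network access"
```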
2. Download sbt, the Scala build tool, from scala-sbt.org.
3. Open a shell console, navigate to the root folder of the cloned repository, and start the sbt shell.
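Concretely, this step amounts to the following (the folder name assumes the default clone directory):

```shell
# From the directory containing the clone, enter the repository root
cd sql-spark-connector
# Launch the interactive sbt shell (downloads dependencies on first run)
sbt
```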
4. Build the source into the jar package from the sbt shell.
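The exact command did not survive in this copy of the post; the standard sbt task for producing a jar is `package`, so a likely reconstruction is:

```shell
# Inside the sbt shell started at the repository root, run:
#   package
# or, equivalently, in one shot from the OS shell
# (the repo may use "+package" to cross-build for multiple Scala versions):
sbt package
```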
5. Navigate to the “target” subfolder to locate the built jar file, then upload it as a Spark library through the Azure Databricks cluster Libraries UI.
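Once the jar is attached to the cluster, the connector is addressed by its data source name, `com.microsoft.sqlserver.jdbc.spark`. Here is a minimal sketch of writing a DataFrame through it from a notebook; the helper name and all server, database, and credential values are placeholders of my own, not from the original post:

```python
# Data source name registered by the connector jar
CONNECTOR_FORMAT = "com.microsoft.sqlserver.jdbc.spark"

def jdbc_url(server: str, database: str) -> str:
    """Build a SQL Server JDBC URL for the given server and database."""
    return f"jdbc:sqlserver://{server};databaseName={database}"

def write_to_sql(df, server, database, table, user, password):
    """Overwrite `table` with the contents of `df` via the connector.

    Assumes `df` is a Spark DataFrame from an active SparkSession
    (e.g. in a Databricks notebook) and the target database is reachable.
    """
    (df.write
       .format(CONNECTOR_FORMAT)
       .mode("overwrite")
       .option("url", jdbc_url(server, database))
       .option("dbtable", table)
       .option("user", user)
       .option("password", password)
       .save())
```

Reads work the same way through `spark.read.format(CONNECTOR_FORMAT)` with the same `url`, `dbtable`, and credential options.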