Category: Scala

Asynchronous programming with Async / Await and the Scala Play Framework

Asynchronous programming has a number of advantages, most notably its well-touted ability to improve responsiveness. Asynchronous events occur independently of the main program flow, and asynchronous actions are executed in a non-blocking, lock-free manner. This, ultimately, allows the main program flow to continue unimpeded. On the flip side, asynchronous programming can be difficult to reason about. Many actions are often run simultaneously, which can lead to...
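As a rough illustration of the non-blocking flow described above, here is a minimal sketch that uses only the standard library's scala.concurrent.Future, not the Async / Await macros the post itself covers. The fetchPrice and fetchStock helpers are hypothetical stand-ins for slow calls, and the sleep durations are arbitrary.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NonBlockingSketch {
  // Hypothetical slow lookups; each runs on the global execution context.
  def fetchPrice(item: String): Future[Int] = Future { Thread.sleep(100); item.length * 10 }
  def fetchStock(item: String): Future[Int] = Future { Thread.sleep(100); item.length }

  def totalValue(item: String): Future[Int] = {
    // Start both futures before combining them so they run concurrently.
    val price = fetchPrice(item)
    val stock = fetchStock(item)
    for {
      p <- price
      s <- stock
    } yield p * s
  }

  def main(args: Array[String]): Unit = {
    val result = totalValue("widget")
    // The main flow is not blocked while the two lookups are in progress.
    println("main flow continues while the futures run")
    // Blocking here only so the demo prints a value before the JVM exits.
    println(Await.result(result, 2.seconds))
  }
}
```

Await.result is used only at the very end of the demo; in an application such as a Play controller, the Future would normally be returned or composed further instead of being awaited.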

Jupyter Notebook Server with pyspark over SSL

In this post, we will describe how to configure a publicly accessible Jupyter Notebook Server over SSL. The Jupyter notebook is, by default, accessible only via localhost. In some cases, it is useful to expose it publicly. Here is how to do it simply…
Configure a password for the public Notebook server. Open a Python REPL:
    $ python
    >>> from IPython.lib import passwd
    >>> passwd()
    Enter password
    Verify password
    'sha1:408a945027ad:fec843e6f020d6c172a16b5ad89989e3c3175d99'
Create a self-signed cert:
    openssl req -x509 -nodes -days...

Apache Zeppelin with SSL

Apache Zeppelin is an awesome web-based notebook that allows for interactive data analytics. It is architected to be language agnostic and (as of today) supports Scala (with Apache Spark), SparkSQL, Markdown and Shell. In this post, we will describe how to configure a Zeppelin notebook server with SSL. Here is how to do it simply…
First, install Zeppelin:
    git clone https://github.com/apache/incubator-zeppelin.git
    mvn clean package -Pspark-1.4 -Dhadoop.version=2.2.0 -Phadoop-2.2 -DskipTests
Note that, eventually,...

Apache Spark: Convert CSV to RDD

Below is a simple Spark / Scala example describing how to convert a CSV file to an RDD and perform some simple filtering. This example transforms each line in the CSV to a Map of the form header-name -> data-value. Each map key corresponds to a header name, and each data value corresponds to the value of that key on the specific line. This particular example also assumes that the header information is...
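The header-name -> data-value transformation described above can be sketched with plain Scala collections. This is not the post's Spark code: the sample lines, the column names, and the assumption that the header is the first line are all hypothetical, and the naive split on commas does not handle quoted fields. With Spark, the same map and filter steps would be applied to an RDD of lines (for example one obtained from sc.textFile) instead of a Seq.

```scala
object CsvToMaps {
  // Hypothetical sample data standing in for the lines of a CSV file.
  // Assumption: the first line is the header row.
  val lines: Seq[String] = Seq(
    "name,age,city",
    "alice,34,Boston",
    "bob,27,Denver"
  )

  // Split the header once so each data line can be zipped against it.
  val header: Array[String] = lines.head.split(",").map(_.trim)

  // Pair each header name with the value in the same column position.
  val rows: Seq[Map[String, String]] =
    lines.tail.map { line =>
      header.zip(line.split(",").map(_.trim)).toMap
    }

  // Simple filtering by header name rather than by column index.
  val fromBoston: Seq[Map[String, String]] =
    rows.filter(_.get("city").contains("Boston"))

  def main(args: Array[String]): Unit = {
    rows.foreach(println)
    fromBoston.foreach(println)
  }
}
```

Keying each row by header name keeps the filter readable and independent of column order, at the cost of building one Map per line.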

How to use the Play WS library in a standalone Scala app

The Play WS library makes it possible to execute HTTP requests and process the response asynchronously. It provides an awesome API that is incredibly easy to use. (I’ve provided a few simple WS examples toward the end of this post.) Prior to the release of Play 2.4 (the current 2.4 release is the M2 milestone release), it was possible to utilize the WS API in a standalone Play app, however...