Apache Spark tutorial


This is an Apache Spark tutorial in Java. We start with the easiest parts of Spark and then move on to more complicated topics.

Apache Spark is a powerful tool for processing large amounts of data. It operates on RDDs (Resilient Distributed Datasets). An RDD is an abstraction over a distributed collection. You can interact with an RDD in two ways: transformations, which derive a new RDD from an existing one, and actions, which compute a result and return it to the driver program. We will discuss both in this tutorial.
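To get a feel for the transformation/action split before touching Spark itself, here is a minimal sketch in plain Java. It is an analogy only: it uses java.util.stream instead of Spark (so it runs without the Spark dependency). The class name LazyPipelineSketch and the helper names squaresOfEvens and sum are made up for illustration. The intermediate stream operations stand in for transformations, which only describe the work, and the terminal operation stands in for an action, which triggers it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Analogy only: java.util.stream, not the Spark API.
class LazyPipelineSketch {

    // Counts how often the mapping function has actually been executed.
    static final AtomicInteger mapCalls = new AtomicInteger();

    // "Transformations": describe the pipeline; nothing is computed yet.
    static Stream<Integer> squaresOfEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> {
                    mapCalls.incrementAndGet();
                    return n * n;
                });
    }

    // "Action": forces evaluation of the pipeline and returns a plain value.
    static int sum(Stream<Integer> pipeline) {
        return pipeline.reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);

        Stream<Integer> pipeline = squaresOfEvens(numbers);
        System.out.println("map calls before action: " + mapCalls.get()); // 0

        int total = sum(pipeline);
        System.out.println("sum of squares of evens: " + total);          // 56
        System.out.println("map calls after action: " + mapCalls.get());  // 3
    }
}
```

In Spark's Java API the same shape appears with JavaRDD: filter and map are transformations, while reduce and collect are actions. The important difference is that an RDD is distributed across a cluster, whereas the stream above lives in a single JVM.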

1. Apache Spark basics

In this tutorial we go through the Spark essentials. By the end you will be familiar with the Spark API.

Common errors

Here we list typical Spark errors and exceptions.