Spark Applications
The driver, the executors, and the SparkSession.
Spark Applications consist of a driver process and a set of executor processes. The
driver runs your main() function, sits on a node in the cluster, and is responsible for three
things: maintaining information about the Spark Application; responding to a user's program or
input; and analyzing, distributing, and scheduling work across the executors. The driver is the
heart of a Spark Application and maintains all relevant information for the lifetime of the application.
The executors carry out the work the driver assigns. Each executor is responsible for only two things: executing code assigned to it by the driver, and reporting the state of the computation on that executor back to the driver node.
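The division of labor described above can be sketched as a toy model in plain Python. This is an analogy only, not Spark's API: the names TaskReport, run_task, and driver are hypothetical, and a thread pool stands in for executor processes that would normally run on separate cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TaskReport:
    """What a (hypothetical) executor sends back to the driver."""
    task_id: int
    status: str
    result: int


def run_task(task_id: int, chunk: List[int]) -> TaskReport:
    # Executor side: run the code assigned by the driver, then report
    # the state of the computation along with the partial result.
    return TaskReport(task_id=task_id, status="SUCCEEDED", result=sum(chunk))


def driver(data: List[int], num_executors: int = 3) -> Tuple[int, List[TaskReport]]:
    # Driver side: analyze the work and split it into one chunk per executor.
    chunks = [data[i::num_executors] for i in range(num_executors)]

    # Distribute and schedule the chunks. Threads stand in for executor
    # processes here; real Spark executors are separate JVM processes.
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        futures = [pool.submit(run_task, i, chunk) for i, chunk in enumerate(chunks)]
        reports = [future.result() for future in futures]

    # The driver keeps the application-level state and combines the results.
    total = sum(report.result for report in reports)
    return total, reports


if __name__ == "__main__":
    total, reports = driver(list(range(10)))
    print(total)
    for report in reports:
        print(report)
```

Only the driver function sees the whole job; each run_task call sees a single chunk and knows nothing about the others, which mirrors the narrow responsibilities of an executor.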
You control your Spark Application through the SparkSession, an object that lives in the driver
process. The SparkSession instance is the entry point through which Spark executes user-defined
manipulations across the cluster.