Category Archives: Architecture

Understand the scheduler component in spark-core

This blog has moved to a new address: http://www.trongkhoanguyen.com. 1. Introduction Today, let's look at what really happens behind the scenes after we submit a Spark job to the cluster. I promise you that there will be many interesting things … Continue reading

Posted in Architecture, Spark | 4 Comments

Apache Spark modules and their dependencies

As you can see, the spark-core module is the foundation for all the others. It provides the implementation of the Spark computing engine: rdd, schedule, deploy, executor, storage, shuffle, … Module spark-sql including spark-hive … Continue reading

Posted in Architecture, Spark

Apache Spark 1.3 architecture – module spark-core

After spending significant time reading the source code of the spark-core project, I can sketch the architecture, showing the relationships and the flow (messages passed) between important components in this … Continue reading

Posted in Architecture, Spark | 2 Comments