Tag Archives: Architecture

Apache Spark modules and their dependencies

This blog has moved to a new address: http://www.trongkhoanguyen.com. As you can see, the spark-core module is the foundation for all the others. It implements the Spark computing engine: rdd, scheduler, deploy, executor, storage, shuffle, … Module spark-sql including spark-hive … Continue reading

Posted in Architecture, Spark

Apache Spark 1.3 architecture – module spark-core

After spending significant time reading the source code of the spark-core project, I can sketch the architecture, showing the relationships and the flow of messages between the important components in this … Continue reading

Posted in Architecture, Spark | 2 Comments

[Source code analysis] Narrow dependency and wide dependency implementation in Spark

Files: Dependency.scala. The previous post introduced the different types of RDD dependencies; today I’m going to dive deeper into their implementation. As you can see from the class diagram, dependency is divided … Continue reading

Posted in Source code analysis, Spark | 1 Comment