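(1) Spark installation, deployment, and development environment
1. A Big Data Power Tool: Notes on Single-Machine Deployment and Testing of Spark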
http://bbs.chinahadoop.cn/article-4057-1.html
2. Spark 0.9.1 Distributed Deployment in Standalone Mode
http://chinasparker.sinaapp.com/?p=67
https://spark.apache.org/docs/latest/spark-standalone.html#installing-spark-standalone-to-a-cluster
3. Spark in Action: Setting Up a Spark Runtime Environment in Single-Node Local Mode
http://www.cstor.cn/textdetail_7500.html
4. Spark 1.0.0 Arrives: Deploying Spark on YARN (Hadoop 2.4)
http://blog.csdn.net/tntzbzc/article/details/27817189
5. Exploring Apache Spark: A Comparison of Three Distributed Deployment Modes
http://dongxicheng.org/framework-on-yarn/apache-spark-comparing-three-deploying-ways/
(2) Spark internals and programming
1. Understanding RDD, the Core of Spark
http://www.infoq.com/cn/articles/spark-core-rdd
2. How-to: Translate from MapReduce to Apache Spark
http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
3. Spark SQL Source Code Analysis: In-Memory Columnar Storage, cache table
http://blog.csdn.net/oopsoom/article/details/39525483
(3) Spark monitoring and management
1. Common Spark Troubleshooting
http://www.datastax.com/dev/blog/common-spark-troubleshooting
(4) YARN & Spark
1. Exploring Apache Spark: Multi-Process Model or Multi-Thread Model?
http://dongxicheng.org/framework-on-yarn/apache-spark-multi-threads-model/
(5) Spark data platform architecture
(6) Spark applications and practice
1. How-to: Do Near-Real Time Sessionization with Spark Streaming and Apache Hadoop
http://blog.cloudera.com/blog/2014/11/how-to-do-near-real-time-sessionization-with-spark-streaming-and-apache-hadoop/