Fast Data Processing With Spark
Packt Publishing Ltd, Oct 23, 2013 - 120 pages
This book is a basic, step-by-step tutorial that will help readers take advantage of all that Spark has to offer. Fast Data Processing with Spark is for software developers who want to learn how to write distributed programs with Spark. It will help developers who have faced problems too large to handle on a single computer. No previous experience with distributed programming is necessary. This book assumes knowledge of either Java, Scala, or Python.
Common terms and phrases: APIs, Boolean, Building your Spark, cache, chapter, cogroup, combineByKey, command, Common Java RDD, configuration, CSVReader(new StringReader(line, dataset, default, Deploying set, Deploying Spark, distributed, doctest, DoubleRDD functions, environment variables, example, export, following code, garbage collection, General RDD functions, groupByKey, Hadoop, HBase, Hive queries, input, installed, instance type, Integer, interactive, in the, Iterator, JAR file, Java function classes, Java RDD functions, JavaDoubleRDD, JavaPairRDD functions, key-value pair, Links and references, Loading data, logistic regression, machines over SSH, Manipulating your RDD, map function, MapReduce, master, Maven, Mesos, numPartitions, output, package, PairRDD functions, parsed, partition, Partitioner, plugin, provided function, Python, reduceByKey, result, Returns an RDD, Running Spark, sbt/sbt, Scala, SCALA_HOME, scripts, serializer, set of machines, Shark, Spark cluster, Spark Java function, Spark job, Spark project, Spark shell, spark.SparkContext, SparkContext, sparkHome, sparkuser, Standard RDD functions, String, Summary, the Spark, variable, wget, worker, you can, zeroValue
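Among the terms above are pair-RDD operations such as reduceByKey and groupByKey, which the book covers. As an illustration only, here is a minimal plain-Python sketch of what reduceByKey does conceptually (merging the values for each key with an associative function); it does not use Spark itself, and the function name `reduce_by_key` is a hypothetical stand-in:

```python
from operator import add

def reduce_by_key(pairs, func):
    """Sketch of Spark's reduceByKey semantics: combine the values
    belonging to each key using an associative reduce function."""
    acc = {}
    for key, value in pairs:
        # Start a new accumulator for unseen keys; otherwise merge.
        acc[key] = func(acc[key], value) if key in acc else value
    return list(acc.items())

# Word count, the classic Spark example: map each word to (word, 1),
# then sum the counts per key.
words = ["spark", "rdd", "spark", "shell", "rdd", "spark"]
counts = reduce_by_key([(w, 1) for w in words], add)
print(sorted(counts))  # [('rdd', 2), ('shell', 1), ('spark', 3)]
```

In Spark itself the same pattern runs in parallel across partitions, which is why the reduce function must be associative.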