scala> val readmeFile = sc.textFile("input/tmp/README.md")
scala> // how to inspect the readmeFile?
For example, .take(x) selects the first x elements and foreach prints them:
scala> val readmeFile = sc.textFile("input/tmp/README.md")
scala> readmeFile.take(5).foreach(println)
# Apache Spark
Spark is a fast and general cluster computing system for Big Data. It provides
high-level APIs in Scala, Java, and Python, and an optimized engine that
supports general computation graphs for data analysis. It also supports a
and ...
scala> val linesContainingSpark = readmeFile.filter(line => line.contains("Spark"))
scala> linesContainingSpark.take(5).foreach(println)
# Apache Spark
Spark is a fast and general cluster computing system for Big Data. It provides
rich set of higher-level tools including Spark SQL for SQL and structured
and Spark Streaming.
You can find the latest Spark documentation, including a programming
The examples below are equivalent, but use pyspark:
>>> readmeFile = sc.textFile("input/tmp/README.md")
>>> for line in readmeFile.take(5): print(line)
...
# Apache Spark
Spark is a fast and general cluster computing system for Big Data. It provides
high-level APIs in Scala, Java, and Python, and an optimized engine that
supports general computation graphs for data analysis. It also supports a
and
>>> linesContainingSpark = readmeFile.filter(lambda line: "Spark" in line)
>>> for line in linesContainingSpark.take(5): print(line)
...
# Apache Spark
Spark is a fast and general cluster computing system for Big Data. It provides
rich set of higher-level tools including Spark SQL for SQL and structured
and Spark Streaming.
You can find the latest Spark documentation, including a programming
As you have probably noticed already, 'take(5)' is essentially like 'head' in unix, and the 'filter' you used in your posted message is more like a grep. However, 'filter' on its own did not give you a similar result because you were not collecting the filtered lines; the easiest way is to add 'take' after 'filter'. – lrnzcig
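To make the head/grep comparison concrete, here is a minimal plain-Python sketch of the same idea. It does not use Spark at all: the `lines` list is a hypothetical stand-in for the README contents, and the `take` helper is an illustrative function written here, not the RDD method. It assumes Python 3, where the built-in `filter()` is lazy, which loosely mirrors how a filter on an RDD produces nothing until an action such as take asks for results.

```python
from itertools import islice

# Hypothetical stand-in for the lines of README.md (not the real file contents).
lines = [
    "# Apache Spark",
    "Spark is a fast and general cluster computing system for Big Data.",
    "high-level APIs in Scala, Java, and Python",
    "and Spark Streaming.",
    "## Building",
]


def take(iterable, n):
    """Illustrative helper: return the first n items as a list, like `head -n`."""
    return list(islice(iterable, n))


# In Python 3, filter() is lazy: no line has been tested at this point.
lines_containing_spark = filter(lambda line: "Spark" in line, lines)

# take() is what pulls items through the filter, like `grep Spark | head -n 2`.
first_matches = take(lines_containing_spark, 2)
for line in first_matches:
    print(line)
```

The point of the sketch is only the ordering: the filter describes what to keep, and nothing is shown until something like take consumes it.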