Technology, November 23, 2022

1. Input data

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.
It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.

2. Implementation

package big.data.analyse.wordcount

import org.apache.spark.sql.SparkSession

/**
 * Created by zhen on 2019/3/9.
 */
object WordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("WordCount")
      .master("local[2]")
      .getOrCreate()
    // Load the data
    val textRDD = spark.sparkContext.textFile("src/big/data/analyse/wordcount/wordcount.txt")
    val result = textRDD.map(row => row.replace(",", "")) // remove commas so that "Java," and "Java" count as the same word
      .flatMap(row => row.split(" ")) // split each line into words
      .map(row => (row, 1)) // pair each word with an initial count of 1
      .reduceByKey(_ + _) // sum the counts per word
    // Print the results
    result.foreach(println)
    // Release the local Spark resources
    spark.stop()
  }
}
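
To see what each step of the job does without starting Spark, the same pipeline can be sketched on plain Scala collections. This is only an illustration: the object name `LocalWordCount`, the method `countWords`, and the sample lines are assumptions made for this sketch, and `groupBy` followed by a sum stands in for Spark's `reduceByKey`.

```scala
object LocalWordCount {
  // Mirrors the Spark job step by step on an in-memory Seq instead of an RDD.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .map(line => line.replace(",", ""))   // remove commas
      .flatMap(line => line.split(" "))     // split each line into words
      .map(word => (word, 1))               // pair each word with a count of 1
      .groupBy(_._1)                        // group the pairs by word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum per word

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input, not the article's wordcount.txt
    val sample = Seq("Spark SQL for SQL", "Spark Streaming, and Spark")
    countWords(sample).foreach(println)
  }
}
```

As with the RDD version, the order in which the pairs are printed is not guaranteed.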

3. Results

(Spark,3)
(GraphX,1)
(graphs.,1)
(learning,1)
(general-purpose,1)
(Python,1)
(APIs,1)
(provides,1)
(that,1)
(is,1)
(a,2)
(R,1)
(high-level,1)
(general,1)
(processing,2)
(fast,1)
(including,1)
(higher-level,1)
(optimized,1)
(Apache,1)
(in,1)
(SQL,2)
(system.,1)
(Java,1)
(of,1)
(data,1)
(tools,1)
(cluster,1)
(also,1)
(graph,1)
(structured,1)
(execution,1)
(It,2)
(MLlib,1)
(for,3)
(Scala,1)
(an,1)
(computing,1)
(machine,1)
(supports,2)
(and,5)
(engine,1)
(set,1)
(rich,1)
(Streaming.,1)
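
Because the job removes only commas, tokens such as `system.`, `graphs.` and `Streaming.` keep their trailing period, and the pairs come out in no particular order. The sketch below shows one possible way to tidy the output, assuming one wants to strip a trailing period and list the most frequent words first. The object `ResultFormatting`, its two helper methods, and the sample pairs are hypothetical names introduced for this example and are not part of the original job.

```scala
object ResultFormatting {
  // Strip one trailing period from each word, then merge the counts of
  // words that became identical (e.g. "graphs." and "graphs").
  def normalize(counts: Seq[(String, Int)]): Map[String, Int] =
    counts
      .map { case (word, n) => (word.stripSuffix("."), n) }
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  // Order by descending count, breaking ties alphabetically.
  def sorted(counts: Map[String, Int]): Seq[(String, Int)] =
    counts.toSeq.sortBy { case (word, n) => (-n, word) }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample pairs in the same (word, count) shape as the job output
    val sample = Seq(("and", 5), ("Spark", 3), ("graphs.", 1), ("graph", 1))
    sorted(normalize(sample)).foreach(println)
    // prints:
    // (and,5)
    // (Spark,3)
    // (graph,1)
    // (graphs,1)
  }
}
```

For a result this small, the pairs could be brought to the driver with `result.collect().toSeq` and passed to `normalize`. For large data it would be preferable to do the cleanup inside the RDD pipeline before `reduceByKey`.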