
Building a Distributed Hadoop Big Data Platform Cluster


HDFS architecture

Note: as the architecture diagram above shows, when a client accesses a file on HDFS it first asks the namenode (nn) for the file's metadata; the nn tells the client where the file's blocks live on the datanodes; the client then fetches the corresponding data from each datanode in turn and finally assembles the pieces into the complete file.

One concept to note here: a datanode (dn) stores file data split up according to the file size and the block size. What does that mean? Take a 100 MB file and assume the block size on the dn is 10 MB: the file is split into ten 10 MB blocks, and those ten blocks are placed on different dns. Each block also has replicas on other dns, so a file's blocks end up scattered and redundantly stored across multiple dns.

The nn mainly keeps track of which nodes hold a given file's data, and which files' blocks a given dn holds (the latter is reported to the nn periodically by the dns). You can think of the nn as holding two tables internally: one records which dns hold the blocks of each file (file-centric), and the other records which files' blocks each dn holds (node-centric). From this description it is easy to see that if the nn goes down, every file stored on HDFS becomes unreachable, so in production we use ZooKeeper (zk) to make the nn highly available. Also, HDFS is not an in-kernel file system; it relies on the local Linux file system underneath.

MapReduce computation process

Note: as shown above, MapReduce first splits the given data into multiple pieces (the programmer writes code that splits the input and extracts key/value pairs), then starts multiple mappers to run the map computation. The mappers' results are merged by a combiner (also implemented by the programmer, who defines the merge rule), which merges values with the same key according to some rule. The merged results are then dispatched to different reducers via a partitioner (again written by the programmer; it associates the post-combine results with the corresponding reducers), and each reducer finally produces a single result.

In short: a mapper reads key/value pairs and emits new key/value pairs, so new KVs are produced. A combiner merges the same-key pairs emitted by the current mapper; how they are merged is a rule the programmer defines. In essence a combiner reads KV pairs and outputs KV pairs without producing new KVs. A partitioner routes the combined pairs to reducers; which reducer a pair goes to, and how many reducers handle the job, is decided by the programmer. Finally, the reducers fold the values and produce new key/value pairs.

Hadoop v1 vs v2 architecture

Note: in the Hadoop v1 architecture, all compute jobs run on MapReduce, which plays two roles at once: cluster resource manager and data-processing framework. In Hadoop v2 the architecture becomes HDFS + YARN + a collection of jobs; we can think of that collection of jobs as v1's MapReduce, except that in v2 MapReduce is only responsible for computation and no longer for cluster resource management, which YARN takes over. In v2 all compute jobs run on top of YARN. HDFS plays the same role in both v1 and v2: storing files.

Resource scheduling for a compute job in Hadoop v2

Note: the rm (ResourceManager) receives a job request from a client and, based on the status reports that each nm (NodeManager) running on the dns sends periodically, decides which nm should run the job. Once the rm has chosen an nm, it sends the job there; that nm starts an ApplicationMaster (am) container that acts as the controller for this job. When the am needs containers to run tasks, it asks the rm, and the rm launches one or more containers on the appropriate nms. The results from the containers are sent to the am, which returns them to the rm, which returns them to the client. In this flow, the rm receives node status reports from the nms, schedules resources, and relays job results from the ams back to the clients; an nm manages the resources on its node and reports status to the rm; an am manages resource requests for its job and returns the job's results to the rm.

Hadoop ecosystem

Note: the diagram above shows the Hadoop v2 ecosystem. HDFS and YARN are Hadoop's core components; the various workloads running on top of them all depend on Hadoop and must support the MapReduce interface.

Part 2: Hadoop cluster deployment

Environment

Name     Roles         IP
node01   nn, snn, rm   192.168.0.41
node02   dn, nm        192.168.0.42
node03   dn, nm        192.168.0.43
node04   dn, nm        192.168.0.44

Synchronize time across all nodes

Configure /etc/hosts on every node so that each node's hostname resolves
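A minimal sketch of the /etc/hosts entries, using the IPs from the environment table above (the test.org domain is taken from the ssh output later in this post):

192.168.0.41 node01.test.org node01
192.168.0.42 node02.test.org node02
192.168.0.43 node03.test.org node03
192.168.0.44 node04.test.org node04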

Install the JDK on each node

yum install -y java-1.8.0-openjdk-devel

Note: the devel package is required for the jps command.

Verify that the JDK installed correctly and the version is right, and locate the java command

Add the JAVA_HOME environment variable

Verify that JAVA_HOME is configured correctly
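A minimal sketch of these two steps, assuming the OpenJDK symlink path that the CentOS package creates (adjust to wherever the JDK actually landed on your system):

cat > /etc/profile.d/java.sh <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=$PATH:${JAVA_HOME}/bin
EOF
source /etc/profile.d/java.sh
echo ${JAVA_HOME}              # should print the JDK directory
${JAVA_HOME}/bin/java -version # should report 1.8.0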

Create a directory to hold the Hadoop installation

mkdir /bigdata

With that, the base environment is ready; next, download the Hadoop binary package.

[root@node01 ~]# wget https://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
--2020-09-27 22:50:16--  https://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
Resolving mirror.bit.edu.cn (mirror.bit.edu.cn)... 202.204.80.77, 219.143.204.117, 2001:da8:204:1205::22
Connecting to mirror.bit.edu.cn (mirror.bit.edu.cn)|202.204.80.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 366447449 (349M) [application/octet-stream]
Saving to: ‘hadoop-2.9.2.tar.gz’
100%[============================================================================>] 366,447,449  1.44MB/s   in 2m 19s
2020-09-27 22:52:35 (2.51 MB/s) - ‘hadoop-2.9.2.tar.gz’ saved [366447449/366447449]
[root@node01 ~]# ls
hadoop-2.9.2.tar.gz
[root@node01 ~]#

Extract hadoop-2.9.2.tar.gz into /bigdata/ and symlink the extracted directory to hadoop
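A minimal sketch of the extract-and-link step:

tar xf hadoop-2.9.2.tar.gz -C /bigdata/
ln -sv /bigdata/hadoop-2.9.2 /bigdata/hadoop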

Export the Hadoop environment variables

[root@node01 ~]# cat /etc/profile.d/hadoop.sh
export HADOOP_HOME=/bigdata/hadoop
export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
export HADOOP_YARN_HOME=${HADOOP_HOME}
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
[root@node01 ~]#

Create a hadoop user and set its password to admin

[root@node01 ~]# useradd hadoop
[root@node01 ~]# echo "admin" | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.
[root@node01 ~]#

Set up passwordless SSH between the nodes for the hadoop user

[hadoop@node01 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:6CNhqdagySJXc4iRBVSoLENddO7JLZMCsdjQzqSFnmw hadoop@node01.test.org
The key's randomart image is:
+---[RSA 2048]----+
(randomart omitted)
+----[SHA256]-----+
[hadoop@node01 ~]$ ssh-copy-id node01
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'node01 (192.168.0.41)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@node01's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'node01'"
and check to make sure that only the key(s) you wanted were added.
[hadoop@node01 ~]$ scp -r ./.ssh node02:/home/hadoop/
The authenticity of host 'node02 (192.168.0.42)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node02,192.168.0.42' (ECDSA) to the list of known hosts.
hadoop@node02's password:
id_rsa                                   100% 1679   636.9KB/s   00:00
id_rsa.pub                               100%  404   186.3KB/s   00:00
known_hosts                              100%  362   153.4KB/s   00:00
authorized_keys                          100%  404   203.9KB/s   00:00
[hadoop@node01 ~]$ scp -r ./.ssh node03:/home/hadoop/
The authenticity of host 'node03 (192.168.0.43)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node03,192.168.0.43' (ECDSA) to the list of known hosts.
hadoop@node03's password:
id_rsa                                   100% 1679   755.1KB/s   00:00
id_rsa.pub                               100%  404   165.7KB/s   00:00
known_hosts                              100%  543   350.9KB/s   00:00
authorized_keys                          100%  404   330.0KB/s   00:00
[hadoop@node01 ~]$ scp -r ./.ssh node04:/home/hadoop/
The authenticity of host 'node04 (192.168.0.44)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node04,192.168.0.44' (ECDSA) to the list of known hosts.
hadoop@node04's password:
id_rsa                                   100% 1679   707.0KB/s   00:00
id_rsa.pub                               100%  404   172.8KB/s   00:00
known_hosts                              100%  724   437.7KB/s   00:00
authorized_keys                          100%  404   165.2KB/s   00:00
[hadoop@node01 ~]$

Verify: log in from node01 to node02, node03, and node04 and confirm the login is passwordless

Create the data directories /data/hadoop/hdfs/{nn,snn,dn} and change their owner and group to hadoop
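A minimal sketch of this step:

mkdir -p /data/hadoop/hdfs/{nn,snn,dn}
chown -R hadoop.hadoop /data/hadoop/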

Go into the Hadoop install directory, create its logs directory, and change the owner and group of the install directory to hadoop
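A minimal sketch, using the /bigdata/hadoop link created earlier:

cd /bigdata/hadoop
mkdir logs
chown -R hadoop.hadoop /bigdata/hadoop/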

Note: all of the steps above must be repeated on every node.

Configure Hadoop's core-site.xml

Note: Hadoop's configuration files are all in XML syntax. <property> and </property> form a pair of tags; inside them, the name tag names the key of the option being set and the value tag gives that key's value. The configuration above sets the default file system address; hdfs://node01:8020 is the address used to access the HDFS file system.

The complete configuration

[root@node01 hadoop]# cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node01:8020</value>
    <final>true</final>
  </property>
</configuration>
[root@node01 hadoop]#

Configure hdfs-site.xml

Note: the configuration above mainly sets the HDFS-related directories, the web access ports, and the number of replicas.

The complete configuration

[root@node01 hadoop]# cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/hadoop/hdfs/nn</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node01:50090</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>node01:50070</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hadoop/hdfs/dn</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>file:///data/hadoop/hdfs/snn</value>
  </property>
  <property>
    <name>fs.checkpoint.edits.dir</name>
    <value>file:///data/hadoop/hdfs/snn</value>
  </property>
</configuration>
[root@node01 hadoop]#

Configure mapred-site.xml

Note: the configuration above mainly sets the MapReduce framework to yarn. There is no mapred-site.xml by default; we need to copy mapred-site.xml.template to mapred-site.xml. Be aware that because I created the file by copying it as root, its owner and group become root; don't forget to change them to hadoop.
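A minimal sketch of that copy-and-chown, assuming Hadoop's default etc/hadoop configuration directory:

cd /bigdata/hadoop/etc/hadoop
cp mapred-site.xml.template mapred-site.xml
chown hadoop.hadoop mapred-site.xml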

The complete configuration

[root@node01 hadoop]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
[root@node01 hadoop]#

Configure yarn-site.xml

Note: the configuration above mainly sets the addresses for YARN's rm and nm components and specifies the relevant classes.

The complete configuration

[root@node01 hadoop]# cat yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>node01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>node01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>node01:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
</configuration>
[root@node01 hadoop]#

Configure the slaves file

[root@node01 hadoop]# cat slaves
node02
node03
node04
[root@node01 hadoop]#

Copy the configuration files to the other nodes
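A minimal sketch using scp, assuming the same /bigdata/hadoop layout on every node:

for node in node02 node03 node04; do
    scp /bigdata/hadoop/etc/hadoop/{core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml,slaves} ${node}:/bigdata/hadoop/etc/hadoop/
done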

That completes the Hadoop configuration.

Next, switch to the hadoop user and initialize HDFS:

hdfs namenode -format

Note: if hdfs namenode -format finishes with a message reporting that the storage directory has been successfully formatted, the HDFS format succeeded.

Start the HDFS cluster

Note: HDFS consists mainly of the namenode, secondarynamenode, and datanodes; as long as the corresponding process is running on each node, there is not much to worry about.
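A minimal sketch of starting HDFS and checking the processes (start-dfs.sh ships in ${HADOOP_HOME}/sbin, which is already on PATH):

start-dfs.sh    # run as hadoop on node01; starts the nn, snn, and the dns listed in slaves
jps             # on node01 expect NameNode and SecondaryNameNode; on node02-04 expect DataNode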

The HDFS cluster is now up and running.

Verify: upload /etc/passwd to the /test directory on HDFS and see whether the upload works.
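A minimal sketch of the upload:

hdfs dfs -mkdir /test
hdfs dfs -put /etc/passwd /test/
hdfs dfs -ls /test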

Note: as you can see, /etc/passwd has been uploaded to the /test directory on HDFS.

Verify: read the passwd file under /test on HDFS and check whether its content matches /etc/passwd.
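A minimal sketch of the comparison:

hdfs dfs -cat /test/passwd
diff <(hdfs dfs -cat /test/passwd) /etc/passwd    # no output means the contents are identical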

Note: the content of /test/passwd on HDFS matches /etc/passwd.

Verify: on a dn node, look at the file content under the corresponding data directory and check whether it matches /etc/passwd.

[root@node02 ~]# tree /data
/data
└── hadoop
    └── hdfs
        ├── dn
        │   ├── current
        │   │   ├── BP-157891879-192.168.0.41-1601224158145
        │   │   │   ├── current
        │   │   │   │   ├── finalized
        │   │   │   │   │   └── subdir0
        │   │   │   │   │       └── subdir0
        │   │   │   │   │           ├── blk_1073741825
        │   │   │   │   │           └── blk_1073741825_1001.meta
        │   │   │   │   ├── rbw
        │   │   │   │   └── VERSION
        │   │   │   ├── scanner.cursor
        │   │   │   └── tmp
        │   │   └── VERSION
        │   └── in_use.lock
        ├── nn
        └── snn

13 directories, 6 files
[root@node02 ~]# cat /data/hadoop/hdfs/dn/current/BP-157891879-192.168.0.41-1601224158145/current/finalized/subdir0/subdir0/blk_1073741825
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
polkitd:x:999:997:User for polkitd:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
chrony:x:998:996::/var/lib/chrony:/sbin/nologin
hadoop:x:1000:1000::/home/hadoop:/bin/bash
[root@node02 ~]#

Note: the passwd file we uploaded can be found under the dn directory on the dn node.

Verify: do the other nodes have the same file? Do we have the number of replicas we specified?
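Rather than inspecting every node by hand, hdfs fsck can report where each block and replica lives; a minimal sketch:

hdfs fsck /test/passwd -files -blocks -locations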

Note: node03 and node04 have the same directory and file, which shows that our replica count of 3 took effect.

Start the YARN cluster
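A minimal sketch (start-yarn.sh also ships in ${HADOOP_HOME}/sbin):

start-yarn.sh    # run as hadoop on node01; starts the rm locally and an nm on each slave
jps              # on node01 expect ResourceManager; on node02-04 expect NodeManager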

Note: the nm started on each slave node, and the rm started normally on the master node.

Access ports 50070 and 8088 on the nn and see whether the corresponding web pages load.
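With the addresses configured above, those URLs are:

http://192.168.0.41:50070    # HDFS web UI (dfs.namenode.http-address)
http://192.168.0.41:8088     # YARN web UI (yarn.resourcemanager.webapp.address)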

Note: the first address is the HDFS web UI; on this page you can see the storage status of HDFS and operate on the files stored in it.

Note: 8088 is the YARN cluster management address; this page shows the status of running compute jobs, cluster configuration, logs, and so on.

Verify: run a compute job on YARN that counts the words in /test/passwd, and see whether the job runs.

[hadoop@node01 hadoop]$ yarn jar /bigdata/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar
An example program must be given as the first argument.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
[hadoop@node01 hadoop]$ yarn jar /bigdata/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount
Usage: wordcount <in> [<in>...] <out>
[hadoop@node01 hadoop]$ yarn jar /bigdata/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount /test/passwd /test/passwd-word-count
20/09/28 00:58:01 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.0.41:8032
20/09/28 00:58:01 INFO input.FileInputFormat: Total input files to process : 1
20/09/28 00:58:01 INFO mapreduce.JobSubmitter: number of splits:1
20/09/28 00:58:01 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
20/09/28 00:58:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1601224871685_0001
20/09/28 00:58:02 INFO impl.YarnClientImpl: Submitted application application_1601224871685_0001
20/09/28 00:58:02 INFO mapreduce.Job: The url to track the job: http://node01:8088/proxy/application_1601224871685_0001/
20/09/28 00:58:02 INFO mapreduce.Job: Running job: job_1601224871685_0001
20/09/28 00:58:08 INFO mapreduce.Job: Job job_1601224871685_0001 running in uber mode : false
20/09/28 00:58:08 INFO mapreduce.Job:  map 0% reduce 0%
20/09/28 00:58:14 INFO mapreduce.Job:  map 100% reduce 0%
20/09/28 00:58:20 INFO mapreduce.Job:  map 100% reduce 100%
20/09/28 00:58:20 INFO mapreduce.Job: Job job_1601224871685_0001 completed successfully
20/09/28 00:58:20 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=1144
                FILE: Number of bytes written=399079
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1053
                HDFS: Number of bytes written=1018
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=2753
                Total time spent by all reduces in occupied slots (ms)=2779
                Total time spent by all map tasks (ms)=2753
                Total time spent by all reduce tasks (ms)=2779
                Total vcore-milliseconds taken by all map tasks=2753
                Total vcore-milliseconds taken by all reduce tasks=2779
                Total megabyte-milliseconds taken by all map tasks=2819072
                Total megabyte-milliseconds taken by all reduce tasks=2845696
        Map-Reduce Framework
                Map input records=22
                Map output records=30
                Map output bytes=1078
                Map output materialized bytes=1144
                Input split bytes=95
                Combine input records=30
                Combine output records=30
                Reduce input groups=30
                Reduce shuffle bytes=1144
                Reduce input records=30
                Reduce output records=30
                Spilled Records=60
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=87
                CPU time spent (ms)=620
                Physical memory (bytes) snapshot=444997632
                Virtual memory (bytes) snapshot=4242403328
                Total committed heap usage (bytes)=285212672
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=958
        File Output Format Counters
                Bytes Written=1018
[hadoop@node01 hadoop]$

View the report generated by the job

[hadoop@node01 hadoop]$ hdfs dfs -ls -R /test
-rw-r--r--   3 hadoop supergroup        958 2020-09-28 00:32 /test/passwd
drwxr-xr-x   - hadoop supergroup          0 2020-09-28 00:58 /test/passwd-word-count
-rw-r--r--   3 hadoop supergroup          0 2020-09-28 00:58 /test/passwd-word-count/_SUCCESS
-rw-r--r--   3 hadoop supergroup       1018 2020-09-28 00:58 /test/passwd-word-count/part-r-00000
[hadoop@node01 hadoop]$ hdfs dfs -cat /test/passwd-word-count/part-r-00000
Management:/:/sbin/nologin	1
Network	1
SSH:/var/empty/sshd:/sbin/nologin	1
User:/var/ftp:/sbin/nologin	1
adm:x:3:4:adm:/var/adm:/sbin/nologin	1
bin:x:1:1:bin:/bin:/sbin/nologin	1
bus:/:/sbin/nologin	1
chrony:x:998:996::/var/lib/chrony:/sbin/nologin	1
daemon:x:2:2:daemon:/sbin:/sbin/nologin	1
dbus:x:81:81:System	1
for	1
ftp:x:14:50:FTP	1
games:x:12:100:games:/usr/games:/sbin/nologin	1
hadoop:x:1000:1000::/home/hadoop:/bin/bash	1
halt:x:7:0:halt:/sbin:/sbin/halt	1
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin	1
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin	1
message	1
nobody:x:99:99:Nobody:/:/sbin/nologin	1
ntp:x:38:38::/etc/ntp:/sbin/nologin	1
operator:x:11:0:operator:/root:/sbin/nologin	1
polkitd:/:/sbin/nologin	1
polkitd:x:999:997:User	1
postfix:x:89:89::/var/spool/postfix:/sbin/nologin	1
root:x:0:0:root:/root:/bin/bash	1
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown	1
sshd:x:74:74:Privilege-separated	1
sync:x:5:0:sync:/sbin:/bin/sync	1
systemd-network:x:192:192:systemd	1
tcpdump:x:72:72::/:/sbin/nologin	1
[hadoop@node01 hadoop]$

Check the job's status information on the 8088 page

That completes the Hadoop v2 cluster setup.

Author: Linux-1874

Source: https://www.cnblogs.com/qiuhom-1874/
