
Hadoop Big Data Platform Distributed Cluster (Building a Hadoop Cluster)



Author: Linux-1874

Source: https://www.cnblogs.com/qiuhom-1874/

HDFS Architecture

Note: As the HDFS architecture diagram above shows, when a client accesses a file on HDFS, it first asks the NameNode for the file's metadata; the NameNode (nn) then tells the client where on the DataNodes the file's data lives, and the client fetches the corresponding data from those DataNodes in turn and reassembles it into the complete file. One concept deserves attention here: a DataNode stores file data split according to the file size and the block size. For example, given a 100 MB file and a DataNode (dn) block size of 10 MB, the file is split into 10 blocks of 10 MB each, and those 10 blocks are stored on different DataNodes; each block is additionally replicated on other DataNodes, so a file's blocks end up stored across many DataNodes in a distributed, redundant fashion. The NameNode mainly maintains which nodes hold which files' data and which files' blocks each DataNode holds (the latter is reported to the nn periodically by the DataNodes). You can think of the NameNode as keeping two internal tables: one records on which DataNodes each file's blocks are stored (file-centric), and the other records which files' blocks each DataNode stores (node-centric). From this description it is easy to see that if the nn goes down, every file stored on HDFS becomes unreachable, which is why in production we use ZooKeeper (zk) to make the nn highly available. Also note that HDFS is not a kernel-level file system in its own right; it relies on the local Linux file system underneath.

MapReduce Computation Process

Note: As shown above, MapReduce first splits the given input into multiple pieces (before the split, programmer-written code divides the input and extracts it into key-value pairs), then starts multiple mappers to run the map computation on them. The outputs of the mappers are then merged by a combiner (also implemented by the programmer, who defines the merge rule), which combines values sharing the same key according to that rule. The results are then passed through a partitioner (again written by the programmer; it associates the map output with the appropriate reducer) and sent to the different reducers for computation, and each reducer finally produces a single, definitive result. In short: a mapper reads key-value pairs and emits new key-value pairs, so new KV pairs are produced. A combiner merges the current mapper's output pairs that share the same key; how they are merged, and by what rule, is defined by the programmer, so the combiner is programmer-written code. Essentially a combiner reads KV pairs and emits KV pairs without producing new KV pairs. A partitioner dispatches the combiner's merged key-value pairs to the reducers; which reducer a pair is sent to, and how many reducers handle the work, is decided by the programmer. Finally, the reducers fold the values and generate the resulting key-value pairs.

Hadoop v1 vs. v2 Architecture

Note: In the Hadoop v1 architecture, all computation ran on top of MapReduce, which played two roles at once: cluster resource manager and data processing engine. In Hadoop v2 the architecture becomes HDFS + YARN + a collection of jobs; those jobs correspond to what MapReduce ran in v1, but unlike in v1, MapReduce in v2 is responsible only for data computation and no longer for cluster resource management, which is handled by YARN. In v2, computation jobs run on top of YARN, while HDFS plays the same role in both v1 and v2: storing files.

Hadoop v2 Job Resource Scheduling

Note: The ResourceManager (rm) receives a job request from a client and, based on the status reports that each DataNode's NodeManager (nm) sends periodically, decides which nm should run the job. Once the rm has selected an nm, it sends the job to it; that nm starts an ApplicationMaster (am) container internally, which acts as the controller for this job. The am then needs containers to run the tasks, so it requests them from the rm, and the rm launches one or more containers on the appropriate NodeManagers. Finally, the results from the containers are sent back to the am, which returns them to the rm, and the rm returns them to the client. In this flow, the rm mainly receives node status information from the NodeManagers, schedules resources, and collects job results from the AMs to feed back to clients; the nm mainly manages the resources on its node and reports status information to the rm; the am mainly manages a job's resource requests and returns the job's results to the rm.

The Hadoop Ecosystem

Note: The diagram above shows the Hadoop v2 ecosystem. HDFS and YARN are Hadoop's core components; the various workloads that run on top of them all depend on Hadoop and must also support calling the MapReduce interface.

II. Hadoop Cluster Deployment

Environment

Name    Roles        IP
node01  nn, snn, rm  192.168.0.41
node02  dn, nm       192.168.0.42
node03  dn, nm       192.168.0.43
node04  dn, nm       192.168.0.44

Synchronize the clocks on all nodes
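The post does not show the commands for this step; a minimal sketch using chrony, the CentOS 7 default (an assumed choice, ntpd works just as well):

yum install -y chrony
systemctl enable --now chronyd
# confirm the node is tracking a time source
chronyc sources -v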

Configure /etc/hosts on every node to resolve each node's hostname
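Based on the address plan above (the .test.org domain matches the FQDN visible in the ssh-keygen output later and is otherwise an assumption), each node's /etc/hosts would carry entries like:

192.168.0.41 node01.test.org node01
192.168.0.42 node02.test.org node02
192.168.0.43 node03.test.org node03
192.168.0.44 node04.test.org node04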

Install the JDK on all nodes

yum install -y java-1.8.0-openjdk-devel

Note: the jps command is only available once the -devel package is installed.

Verify that the JDK installed correctly and the version is right, and locate the java command
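For example:

java -version
which java
# resolve the real JDK path behind the alternatives symlink
readlink -f $(which java)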

Add the JAVA_HOME environment variable

Verify that the JAVA_HOME variable is configured correctly
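A minimal sketch covering both steps; the JDK path below is the usual location for java-1.8.0-openjdk on CentOS 7 and is an assumption, so substitute whatever readlink reported above:

cat > /etc/profile.d/java.sh <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=$PATH:$JAVA_HOME/bin
EOF
source /etc/profile.d/java.sh
# verify
echo $JAVA_HOME
$JAVA_HOME/bin/java -version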

Create a directory to hold the Hadoop installation package

mkdir /bigdata

With that, the base environment is ready; next, download the Hadoop binary package.

[root@node01 ~]# wget https://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
--2020-09-27 22:50:16--  https://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
Resolving mirror.bit.edu.cn (mirror.bit.edu.cn)... 202.204.80.77, 219.143.204.117, 2001:da8:204:1205::22
Connecting to mirror.bit.edu.cn (mirror.bit.edu.cn)|202.204.80.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 366447449 (349M) [application/octet-stream]
Saving to: ‘hadoop-2.9.2.tar.gz’

100%[============================================================================>] 366,447,449  1.44MB/s   in 2m 19s

2020-09-27 22:52:35 (2.51 MB/s) - ‘hadoop-2.9.2.tar.gz’ saved [366447449/366447449]

[root@node01 ~]# ls
hadoop-2.9.2.tar.gz
[root@node01 ~]#

Extract hadoop-2.9.2.tar.gz into the /bigdata/ directory and symlink the extracted directory to hadoop
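For example (the hadoop link name matches the HADOOP_HOME exported below):

tar xf hadoop-2.9.2.tar.gz -C /bigdata/
ln -sv /bigdata/hadoop-2.9.2 /bigdata/hadoop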

Export the Hadoop environment variables

[root@node01 ~]# cat /etc/profile.d/hadoop.sh
export HADOOP_HOME=/bigdata/hadoop
export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
export HADOOP_YARN_HOME=${HADOOP_HOME}
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
[root@node01 ~]#

Create a hadoop user and set its password to admin

[root@node01 ~]# useradd hadoop
[root@node01 ~]# echo "admin" |passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.
[root@node01 ~]#

Set up passwordless SSH logins between all nodes for the hadoop user

[hadoop@node01 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:6CNhqdagySJXc4iRBVSoLENddO7JLZMCsdjQzqSFnmw hadoop@node01.test.org
The key's randomart image is:
+---[RSA 2048]----+
| o*==o .|
| o=Bo o|
|=oX.|
| E =.oo.|
|o.o B.oBS.|
|.o * =. o|
|=.o o|
|oo. .|
|                 |
+----[SHA256]-----+
[hadoop@node01 ~]$ ssh-copy-id node01
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'node01 (192.168.0.41)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@node01's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'node01'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@node01 ~]$ scp -r ./.ssh node02:/home/hadoop/
The authenticity of host 'node02 (192.168.0.42)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node02,192.168.0.42' (ECDSA) to the list of known hosts.
hadoop@node02's password:
id_rsa            100% 1679   636.9KB/s   00:00
id_rsa.pub        100%  404   186.3KB/s   00:00
known_hosts       100%  362   153.4KB/s   00:00
authorized_keys   100%  404   203.9KB/s   00:00
[hadoop@node01 ~]$ scp -r ./.ssh node03:/home/hadoop/
The authenticity of host 'node03 (192.168.0.43)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node03,192.168.0.43' (ECDSA) to the list of known hosts.
hadoop@node03's password:
id_rsa            100% 1679   755.1KB/s   00:00
id_rsa.pub        100%  404   165.7KB/s   00:00
known_hosts       100%  543   350.9KB/s   00:00
authorized_keys   100%  404   330.0KB/s   00:00
[hadoop@node01 ~]$ scp -r ./.ssh node04:/home/hadoop/
The authenticity of host 'node04 (192.168.0.44)' can't be established.
ECDSA key fingerprint is SHA256:lE8/Vyni4z8hsXaa8OMMlDpu3yOIRh6dLcIr oE57oE.
ECDSA key fingerprint is MD5:14:59:02:30:c0:16:b8:6c:1a:84:c3:0f:a7:ac:67:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node04,192.168.0.44' (ECDSA) to the list of known hosts.
hadoop@node04's password:
id_rsa            100% 1679   707.0KB/s   00:00
id_rsa.pub        100%  404   172.8KB/s   00:00
known_hosts       100%  724   437.7KB/s   00:00
authorized_keys   100%  404   165.2KB/s   00:00
[hadoop@node01 ~]$

Verify: from node01, ssh into node02, node03, and node04 to confirm the logins are now passwordless.

Create the data directories /data/hadoop/hdfs/{nn,snn,dn} and change their owner and group to hadoop
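A sketch of the commands this step implies:

mkdir -p /data/hadoop/hdfs/{nn,snn,dn}
chown -R hadoop.hadoop /data/hadoop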

Go into the Hadoop installation directory, create its logs directory, and change the owner and group of the installation directory to hadoop
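And likewise for the installation directory:

cd /bigdata/hadoop
mkdir logs
# the trailing slash makes chown -R descend through the hadoop symlink
chown -R hadoop.hadoop /bigdata/hadoop/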

Note: all of the steps above must be repeated on every node.

Configure Hadoop's core-site.xml

Note: Hadoop's configuration files use XML syntax. <property> and </property> form a pair of tags; inside them, the name tag names the key of the option being configured, and the value tag supplies the corresponding value. The configuration here sets the default file system address; hdfs://node01:8020 is the address through which the HDFS file system is accessed.

The complete configuration:

[root@node01 hadoop]# cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node01:8020</value>
    <final>true</final>
  </property>
</configuration>
[root@node01 hadoop]#

Configure hdfs-site.xml

Note: this configuration mainly specifies the HDFS-related directories, the web UI address and port, and the replica count.

The complete configuration:

[root@node01 hadoop]# cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/hadoop/hdfs/nn</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node01:50090</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>node01:50070</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hadoop/hdfs/dn</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>file:///data/hadoop/hdfs/snn</value>
  </property>
  <property>
    <name>fs.checkpoint.edits.dir</name>
    <value>file:///data/hadoop/hdfs/snn</value>
  </property>
</configuration>
[root@node01 hadoop]#

Configure mapred-site.xml

Note: this configuration mainly sets the MapReduce framework to yarn. There is no mapred-site.xml by default; we need to turn mapred-site.xml.template into mapred-site.xml. Note that I did this by copying the template to the new file name, which leaves the copy owned by root, so don't forget to change the owner and group back to hadoop.
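A sketch of that rename-by-copy, assuming the stock etc/hadoop configuration directory of the 2.9.2 tarball (an assumed path):

cd /bigdata/hadoop/etc/hadoop
# copying leaves the new file owned by root, so hand it back to hadoop
cp mapred-site.xml.template mapred-site.xml
chown hadoop.hadoop mapred-site.xml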

The complete configuration:

[root@node01 hadoop]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
[root@node01 hadoop]#

Configure yarn-site.xml

Note: this configuration mainly sets the RM- and NM-related addresses for the YARN framework and specifies the relevant handler and scheduler classes.

The complete configuration:

[root@node01 hadoop]# cat yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>node01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>node01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>node01:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
</configuration>
[root@node01 hadoop]#

Configure the slaves file

[root@node01 hadoop]# cat slaves
node02
node03
node04
[root@node01 hadoop]#

Copy the configuration files to the other nodes
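One way to push them out, assuming the same etc/hadoop directory exists on every node (path is an assumption based on the stock layout):

cd /bigdata/hadoop/etc/hadoop
for node in node02 node03 node04; do
    scp ./* ${node}:/bigdata/hadoop/etc/hadoop/
done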

That completes the Hadoop configuration.

Next, switch to the hadoop user and format HDFS:

hdfs namenode -format

Note: if hdfs namenode -format finishes with a message saying the storage directory has been successfully formatted, the HDFS format succeeded.

Start the HDFS cluster
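The original post shows this step as screenshots; a minimal sketch of one common way, run as the hadoop user on node01 with the sbin scripts on PATH via the hadoop.sh profile above:

# start the NameNode, SecondaryNameNode, and the DataNodes listed in slaves
start-dfs.sh
# then list the running JVM processes on each node to confirm
jps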

Note: HDFS consists mainly of the namenode, secondarynamenode, and datanode processes; as long as the corresponding process has come up on each node, there is not much to worry about.

At this point the HDFS cluster is up and running.

Verify: upload /etc/passwd to the /test directory on HDFS and see whether the upload succeeds.
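A sketch of the upload:

hdfs dfs -mkdir /test
hdfs dfs -put /etc/passwd /test/
hdfs dfs -ls /test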

Note: the /etc/passwd file has been uploaded to the /test directory on HDFS.

Verify: view the passwd file under /test on HDFS and check whether its content matches the /etc/passwd file.
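For example:

hdfs dfs -cat /test/passwd
# or compare the two directly; no output means they match
diff <(hdfs dfs -cat /test/passwd) /etc/passwd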

Note: the content of /test/passwd on HDFS is identical to that of /etc/passwd.

Verify: on a dn node, look at the files in the corresponding data directory and check whether the block content matches /etc/passwd.

[root@node02 ~]# tree /data
/data
└── hadoop
    └── hdfs
        ├── dn
        │   ├── current
        │   │   ├── BP-157891879-192.168.0.41-1601224158145
        │   │   │   ├── current
        │   │   │   │   ├── finalized
        │   │   │   │   │   └── subdir0
        │   │   │   │   │       └── subdir0
        │   │   │   │   │           ├── blk_1073741825
        │   │   │   │   │           └── blk_1073741825_1001.meta
        │   │   │   │   ├── rbw
        │   │   │   │   └── VERSION
        │   │   │   ├── scanner.cursor
        │   │   │   └── tmp
        │   │   └── VERSION
        │   └── in_use.lock
        ├── nn
        └── snn

13 directories, 6 files
[root@node02 ~]# cat /data/hadoop/hdfs/dn/current/BP-157891879-192.168.0.41-1601224158145/current/scanner.cursor
tmp/
[root@node02 ~]# cat /data/hadoop/hdfs/dn/current/BP-157891879-192.168.0.41-1601224158145/current/finalized/subdir0/subdir0/blk_1073741825
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
polkitd:x:999:997:User for polkitd:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
chrony:x:998:996::/var/lib/chrony:/sbin/nologin
hadoop:x:1000:1000::/home/hadoop:/bin/bash
[root@node02 ~]#

Note: the passwd file we uploaded can indeed be found under the dn directory on the DataNode.

Verify: do the other nodes hold the same file? Is the number of replicas what we specified?

Note: node03 and node04 have the same directories and files, which shows that the replica count of 3 we configured has taken effect.

Start the YARN cluster
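As with HDFS, one common way, assuming the bundled sbin scripts are used:

# start the ResourceManager here and the NodeManagers on the slaves
start-yarn.sh
# confirm with jps on each node
jps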

Note: the nm processes have started on the worker nodes, and the rm on the master node started normally as well.

Visit ports 50070 and 8088 on the nn and see whether the corresponding web pages load.

Note: 50070 is the HDFS web address; this page shows the storage status of HDFS and lets you operate on the files stored in it.

Note: 8088 is the YARN cluster management address; this page shows the status of running computation jobs, the cluster configuration, logs, and so on.

Verify: run a computation job on YARN that counts the words in /test/passwd, and see whether the job actually runs.

[hadoop@node01 hadoop]$ yarn jar /bigdata/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar
An example program must be given as the first argument.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
[hadoop@node01 hadoop]$ yarn jar /bigdata/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount
Usage: wordcount <in> [<in>...] <out>
[hadoop@node01 hadoop]$ yarn jar /bigdata/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount /test/passwd /test/passwd-word-count
20/09/28 00:58:01 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.0.41:8032
20/09/28 00:58:01 INFO input.FileInputFormat: Total input files to process : 1
20/09/28 00:58:01 INFO mapreduce.JobSubmitter: number of splits:1
20/09/28 00:58:01 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
20/09/28 00:58:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1601224871685_0001
20/09/28 00:58:02 INFO impl.YarnClientImpl: Submitted application application_1601224871685_0001
20/09/28 00:58:02 INFO mapreduce.Job: The url to track the job: http://node01:8088/proxy/application_1601224871685_0001/
20/09/28 00:58:02 INFO mapreduce.Job: Running job: job_1601224871685_0001
20/09/28 00:58:08 INFO mapreduce.Job: Job job_1601224871685_0001 running in uber mode : false
20/09/28 00:58:08 INFO mapreduce.Job:  map 0% reduce 0%
20/09/28 00:58:14 INFO mapreduce.Job:  map 100% reduce 0%
20/09/28 00:58:20 INFO mapreduce.Job:  map 100% reduce 100%
20/09/28 00:58:20 INFO mapreduce.Job: Job job_1601224871685_0001 completed successfully
20/09/28 00:58:20 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=1144
		FILE: Number of bytes written=399079
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=1053
		HDFS: Number of bytes written=1018
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=2753
		Total time spent by all reduces in occupied slots (ms)=2779
		Total time spent by all map tasks (ms)=2753
		Total time spent by all reduce tasks (ms)=2779
		Total vcore-milliseconds taken by all map tasks=2753
		Total vcore-milliseconds taken by all reduce tasks=2779
		Total megabyte-milliseconds taken by all map tasks=2819072
		Total megabyte-milliseconds taken by all reduce tasks=2845696
	Map-Reduce Framework
		Map input records=22
		Map output records=30
		Map output bytes=1078
		Map output materialized bytes=1144
		Input split bytes=95
		Combine input records=30
		Combine output records=30
		Reduce input groups=30
		Reduce shuffle bytes=1144
		Reduce input records=30
		Reduce output records=30
		Spilled Records=60
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=87
		CPU time spent (ms)=620
		Physical memory (bytes) snapshot=444997632
		Virtual memory (bytes) snapshot=4242403328
		Total committed heap usage (bytes)=285212672
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=958
	File Output Format Counters
		Bytes Written=1018
[hadoop@node01 hadoop]$

View the output the job generated

[hadoop@node01 hadoop]$ hdfs dfs -ls -R /test
-rw-r--r--   3 hadoop supergroup        958 2020-09-28 00:32 /test/passwd
drwxr-xr-x   - hadoop supergroup          0 2020-09-28 00:58 /test/passwd-word-count
-rw-r--r--   3 hadoop supergroup          0 2020-09-28 00:58 /test/passwd-word-count/_SUCCESS
-rw-r--r--   3 hadoop supergroup       1018 2020-09-28 00:58 /test/passwd-word-count/part-r-00000
[hadoop@node01 hadoop]$ hdfs dfs -cat /test/passwd-word-count/part-r-00000
Management:/:/sbin/nologin	1
Network	1
SSH:/var/empty/sshd:/sbin/nologin	1
User:/var/ftp:/sbin/nologin	1
adm:x:3:4:adm:/var/adm:/sbin/nologin	1
bin:x:1:1:bin:/bin:/sbin/nologin	1
bus:/:/sbin/nologin	1
chrony:x:998:996::/var/lib/chrony:/sbin/nologin	1
daemon:x:2:2:daemon:/sbin:/sbin/nologin	1
dbus:x:81:81:System	1
for	1
ftp:x:14:50:FTP	1
games:x:12:100:games:/usr/games:/sbin/nologin	1
hadoop:x:1000:1000::/home/hadoop:/bin/bash	1
halt:x:7:0:halt:/sbin:/sbin/halt	1
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin	1
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin	1
message	1
nobody:x:99:99:Nobody:/:/sbin/nologin	1
ntp:x:38:38::/etc/ntp:/sbin/nologin	1
operator:x:11:0:operator:/root:/sbin/nologin	1
polkitd:/:/sbin/nologin	1
polkitd:x:999:997:User	1
postfix:x:89:89::/var/spool/postfix:/sbin/nologin	1
root:x:0:0:root:/root:/bin/bash	1
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown	1
sshd:x:74:74:Privilege-separated	1
sync:x:5:0:sync:/sbin:/bin/sync	1
systemd-network:x:192:192:systemd	1
tcpdump:x:72:72::/:/sbin/nologin	1
[hadoop@node01 hadoop]$

Check the job's status information on the 8088 page.

That completes the Hadoop v2 cluster setup.

