1. HDFS-HA Cluster Configuration
Official documentation: Apache Hadoop 3.3.4 – HDFS High Availability Using the Quorum Journal Manager
1.1 Environment Preparation
- Configure static IP addresses
- Set the hostnames and the hostname-to-IP mappings
- Disable the firewall
- Set up passwordless SSH login
- Install the JDK and configure the environment variables
1.2 Cluster Plan
| linux121 | linux122 | linux123 |
| --- | --- | --- |
| NameNode | NameNode | |
| JournalNode | JournalNode | JournalNode |
| DataNode | DataNode | DataNode |
| ZK | ZK | ZK |
| ResourceManager | | |
| NodeManager | NodeManager | NodeManager |
1.3 Start the ZooKeeper Cluster
Start the ZooKeeper cluster:
zk.sh start
Check the status:
zk.sh status
Note: zk.sh here is a group-start script that I wrote myself.
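For reference, a minimal sketch of what such a group-start script might look like. The ZooKeeper install path /opt/lagou/servers/zookeeper-3.4.14 is an assumption here; adjust the path and hostnames to your own environment.

#!/bin/bash
# zk.sh -- run the same zkServer.sh action on all three nodes over SSH
# NOTE: the ZooKeeper path below is assumed; change it to your install location
ZK_HOME=/opt/lagou/servers/zookeeper-3.4.14
case $1 in
start|stop|status)
  for host in linux121 linux122 linux123; do
    echo "---------- zookeeper $1 on $host ----------"
    # source /etc/profile so JAVA_HOME is visible in the non-interactive SSH shell
    ssh $host "source /etc/profile; $ZK_HOME/bin/zkServer.sh $1"
  done
  ;;
*)
  echo "Usage: zk.sh {start|stop|status}"
  ;;
esac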
1.4 Configure the HDFS-HA Cluster
(1) Stop the original HDFS cluster
stop-dfs.sh
(2) On every node, create an ha directory under /opt/lagou/servers
mkdir /opt/lagou/servers/ha
(3) Copy hadoop-2.9.2 from /opt/lagou/servers/ into the ha directory
cp -r hadoop-2.9.2 ha
(4) Delete the data directory copied over from the original cluster
rm -rf /opt/lagou/servers/ha/hadoop-2.9.2/data
(5) Configure hdfs-site.xml (for this and the following files, clear out the original configuration between the <configuration> tags first)
<property>
<name>dfs.nameservices</name>
<value>lagoucluster</value>
</property>
<property>
<name>dfs.ha.namenodes.lagoucluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.lagoucluster.nn1</name>
<value>linux121:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.lagoucluster.nn2</name>
<value>linux122:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.lagoucluster.nn1</name>
<value>linux121:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.lagoucluster.nn2</name>
<value>linux122:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://linux121:8485;linux122:8485;linux123:8485/lagou</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.lagoucluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/journalnode</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
(6) Configure core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://lagoucluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/lagou/servers/ha/hadoop-2.9.2/data/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>linux121:2181,linux122:2181,linux123:2181</value>
</property>
(7) Copy the configured hadoop environment to the other nodes
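For example, from linux121 you could push the ha directory to the other two nodes with rsync (scp -r works just as well); the paths follow the layout used above:

rsync -av /opt/lagou/servers/ha/ linux122:/opt/lagou/servers/ha/
rsync -av /opt/lagou/servers/ha/ linux123:/opt/lagou/servers/ha/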
1.5 Start the HDFS-HA Cluster
(1) On each JournalNode node, run the following command to start the journalnode service (use the binaries under the HA installation directory, not the commands on the PATH from the environment variables)
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/hadoop-daemon.sh start journalnode
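Before moving on, it is worth confirming on each of the three nodes that the process is actually up:

jps | grep JournalNode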
(2) On [nn1], format the NameNode and start it
/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs namenode -format
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/hadoop-daemon.sh start namenode
(3) On [nn2], synchronize the metadata from nn1
/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs namenode -bootstrapStandby
(4) On [nn1], initialize the ZKFC state in ZooKeeper
/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs zkfc -formatZK
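This creates a znode for the nameservice under the default parent znode /hadoop-ha in ZooKeeper. If you want to double-check, you can look at it with the ZooKeeper client (assuming zkCli.sh is available on one of the ZooKeeper nodes):

zkCli.sh -server linux121:2181
# then, inside the ZooKeeper shell:
ls /hadoop-ha
# expected output: [lagoucluster]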
(5) On [nn1], start the cluster
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/start-dfs.sh
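With automatic failover enabled, start-dfs.sh brings up the NameNodes, DataNodes, JournalNodes and the ZKFC (DFSZKFailoverController) processes. After it finishes, jps on linux121 should show roughly the following (QuorumPeerMain belongs to ZooKeeper, which was started earlier):

jps
# NameNode
# DataNode
# JournalNode
# DFSZKFailoverController
# QuorumPeerMain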
(6) Verification
- Kill the Active NameNode process:
- kill -9 <NameNode pid>
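One way to check the failover, using the standard HDFS HA admin command (which NameNode starts out as active may differ in your cluster; find the pid with jps on the active node):

hdfs haadmin -getServiceState nn1   # e.g. active
hdfs haadmin -getServiceState nn2   # e.g. standby
jps                                 # on the active node, note the NameNode pid
kill -9 <NameNode pid>
hdfs haadmin -getServiceState nn2   # a few seconds later: active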
2. YARN-HA Configuration
2.1 How YARN-HA Works
Official documentation:
Apache Hadoop 3.3.4 – ResourceManager High Availability
[Figure: YARN-HA working mechanism]
2.2 Configure the YARN-HA Cluster
(1) Environment preparation
- Configure static IP addresses
- Set the hostnames and the hostname-to-IP mappings
- Disable the firewall
- Set up passwordless SSH login
- Install the JDK and configure the environment variables
- Set up the ZooKeeper cluster
(2) Detailed configuration
(3) yarn-site.xml (clear the original contents first)
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Declare the addresses of the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster-yarn</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>linux122</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>linux123</value>
</property>
<!-- Specify the ZooKeeper cluster addresses -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>linux121:2181,linux122:2181,linux123:2181</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store the ResourceManager state in the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
</configuration>
(4) Sync the updated configuration to the other nodes
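For example, if yarn-site.xml was edited on linux121, it can be pushed to the other nodes like this (the paths assume the HA installation from section 1):

rsync -av /opt/lagou/servers/ha/hadoop-2.9.2/etc/hadoop/yarn-site.xml linux122:/opt/lagou/servers/ha/hadoop-2.9.2/etc/hadoop/
rsync -av /opt/lagou/servers/ha/hadoop-2.9.2/etc/hadoop/yarn-site.xml linux123:/opt/lagou/servers/ha/hadoop-2.9.2/etc/hadoop/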
(5) Start YARN (HDFS should already be running)
sbin/start-yarn.sh
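Note that in Hadoop 2.x, start-yarn.sh starts a ResourceManager only on the node where it is executed (plus the NodeManagers), so the second ResourceManager usually has to be started by hand on the other node. A sketch, assuming the HA install path from section 1 and the rm1/rm2 to linux122/linux123 mapping configured above:

# on linux122 (rm1)
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/start-yarn.sh
# on linux123 (rm2), start the standby ResourceManager manually
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/yarn-daemon.sh start resourcemanager
# check the HA state of both ResourceManagers
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2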