Setting Up a Fully-Distributed HBase Cluster


0. Prerequisite
There are 3 VMs - hadoop3/hadoop4/hadoop5 - for the fully-distributed HBase cluster. The setup plan looks like:

             hadoop3                 hadoop4               hadoop5
Hadoop hdfs  NameNode:8020           DataNode:50010        SecondaryNameNode:50090
             DataNode:50010                                DataNode:50010
             JobHistoryServer:19888
Hadoop yarn  NodeManager:8040        ResourceManager:8088  NodeManager:8040
                                     NodeManager:8040
Zookeeper    QuorumPeerMain:2181     QuorumPeerMain:2181   QuorumPeerMain:2181
HBase        HMaster:16000           HRegionServer:16020   HRegionServer:16020
             HRegionServer:16020

JDK/Zookeeper/Hadoop/HBase have already been installed under /opt on all 3 VMs as user sunxo:

$ ls /opt
hadoop-2.10.2  hbase-2.4.16  jdk  zookeeper-3.8.1

1) configure passwordless SSH access
hadoop3, which hosts the NameNode, needs passwordless access to all VMs as sunxo:

$ ssh-keygen -t rsa
$ ssh-copy-id hadoop3
$ ssh-copy-id hadoop4
$ ssh-copy-id hadoop5

and as root as well:

# ssh-keygen -t rsa
# ssh-copy-id hadoop3
# ssh-copy-id hadoop4
# ssh-copy-id hadoop5

hadoop4, which hosts the ResourceManager, needs to access all VMs as sunxo:

$ ssh-keygen -t rsa
$ ssh-copy-id hadoop3
$ ssh-copy-id hadoop4
$ ssh-copy-id hadoop5
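
To verify the passwordless setup (a quick check, not part of the original steps), each ssh call below should print the remote hostname without prompting for a password:

$ for host in hadoop3 hadoop4 hadoop5; do ssh $host hostname; done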

2) on hadoop3, add environment variables in $HOME/.bashrc (not .bash_profile)

export JAVA_HOME=/opt/jdk
export ZOOKEEPER_HOME=/opt/zookeeper-3.8.1
export KAFKA_HOME=/opt/kafka-3.3.1
export HADOOP_HOME=/opt/hadoop-2.10.2
export HBASE_HOME=/opt/hbase-2.4.16
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:$JAVA_HOME/bin:$HOME/bin:.:$PATH

Then distribute .bashrc to hadoop4 and hadoop5:

$ rsync.sh .bashrc hadoop4 hadoop5
rsync -rvl /home/sunxo/.bashrc sunxo@hadoop4:/home/sunxo
sending incremental file list
.bashrc

sent 614 bytes  received 37 bytes  1302.00 bytes/sec
total size is 541  speedup is 0.83
rsync -rvl /home/sunxo/.bashrc sunxo@hadoop5:/home/sunxo
sending incremental file list
.bashrc
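
Note that the new variables only take effect in a fresh login shell, or after running source ~/.bashrc on each node.

The rsync.sh helper is not listed in the article; judging from the echoed command above, it behaves roughly like this sketch (everything beyond the printed rsync line is an assumption):

#!/bin/sh
# rsync.sh <src> <host>... - assumed sketch of the helper: push a file or
# directory under the current directory to the same path on each host.
src=$1; shift
for host in "$@"
do
    echo "rsync -rvl $PWD/$src $USER@$host:$PWD"
    rsync -rvl "$PWD/$src" "$USER@$host:$PWD"
done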

1. Zookeeper on hadoop3
1) configure and distribute zoo.cfg

$ cd $ZOOKEEPER_HOME/conf 
$ diff -u zoo_sample.cfg zoo.cfg
--- zoo_sample.cfg	2023-01-26 00:31:05.000000000 +0800
+++ zoo.cfg	2023-10-17 14:30:06.598229298 +0800
@@ -9,7 +9,7 @@
 # the directory where the snapshot is stored.
 # do not use /tmp for storage, /tmp here is just 
 # example sakes.
-dataDir=/tmp/zookeeper
+dataDir=/opt/zookeeper-3.8.1/tmp
 # the port at which the clients will connect
 clientPort=2181
 # the maximum number of client connections.
@@ -25,7 +25,7 @@
 #autopurge.snapRetainCount=3
 # Purge task interval in hours
 # Set to "0" to disable auto purge feature
-#autopurge.purgeInterval=1
+autopurge.purgeInterval=1
 
 ## Metrics Providers
 #
@@ -35,3 +35,7 @@
 #metricsProvider.httpPort=7000
 #metricsProvider.exportJvmInfo=true
 
+# cluster
+server.3=hadoop3:2888:3888
+server.4=hadoop4:2888:3888
+server.5=hadoop5:2888:3888

$ rsync.sh zoo.cfg hadoop4 hadoop5

2) create and distribute the data dir assigned in zoo.cfg

$ cd $ZOOKEEPER_HOME
$ mkdir -p tmp
$ rsync.sh tmp hadoop4 hadoop5
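
One step the article skips: each ZooKeeper server also needs a myid file in its data directory whose content matches its server.N id in zoo.cfg, otherwise the quorum cannot form. Assuming the ids configured above:

$ echo 3 > $ZOOKEEPER_HOME/tmp/myid
$ ssh hadoop4 'echo 4 > /opt/zookeeper-3.8.1/tmp/myid'
$ ssh hadoop5 'echo 5 > /opt/zookeeper-3.8.1/tmp/myid'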

3) start zookeeper cluster

$ hosts="hadoop3 hadoop4 hadoop5"
$ for host in $hosts
> do
>     echo "============= zk start on $host ============="
>     ssh $host $ZOOKEEPER_HOME/bin/zkServer.sh start
> done
============= zk start on hadoop3 =============
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.8.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
============= zk start on hadoop4 =============
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.8.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
============= zk start on hadoop5 =============
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.8.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
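
Optionally, check which node was elected leader with zkServer.sh status, run the same way:

$ for host in $hosts
> do
>     ssh $host $ZOOKEEPER_HOME/bin/zkServer.sh status
> done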

$ jps.sh hadoop3 hadoop4 hadoop5
============= hadoop3 =============
30495 QuorumPeerMain
============= hadoop4 =============
313 QuorumPeerMain
============= hadoop5 =============
4264 QuorumPeerMain
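
jps.sh is another small helper not shown in the article; from its output it is presumably something like:

#!/bin/sh
# jps.sh <host>... - assumed sketch: run jps on each host under a banner
# (the output above contains no "Jps" line, so assume it is filtered out).
for host in "$@"
do
    echo "============= $host ============="
    ssh $host jps | grep -v Jps
done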

2. Hadoop on hadoop3
1) configure

$ cd $HADOOP_HOME/etc/hadoop
$ diff -u hadoop-env.sh.orig hadoop-env.sh
...
-export JAVA_HOME=${JAVA_HOME}
+export JAVA_HOME=/opt/jdk
$ cat core-site.xml
...
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop3:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-2.10.2/data/tmp</value>
    </property>
</configuration>
$ cat hdfs-site.xml
...
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop5:50090</value>
    </property>
</configuration>
$ cat mapred-site.xml
...
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop3:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop3:19888</value>
    </property>
</configuration>
$ cat yarn-site.xml
...
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop4</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
</configuration>
$ cat slaves
hadoop3
hadoop4
hadoop5

2) distribute configuration to hadoop4, hadoop5

$ cd $HADOOP_HOME/etc
$ rsync.sh hadoop/ hadoop4 hadoop5

3) format the filesystem

$ cd $HADOOP_HOME
$ rm.sh data/ hadoop3 hadoop4 hadoop5    # remove old data if needed
$ rm.sh logs/ hadoop3 hadoop4 hadoop5    # remove old logs if needed
$ bin/hdfs namenode -format
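
rm.sh is likewise a helper the article does not list; given how it is invoked, a plausible sketch is:

#!/bin/sh
# rm.sh <path> <host>... - assumed sketch: remove a path relative to the
# current directory on every listed host.
target=$1; shift
for host in "$@"
do
    ssh $host rm -rf "$PWD/$target"
done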

4) start hdfs/yarn/historyserver

$ echo "============= dfs start from hadoop3 ============="
$ ssh hadoop3 $HADOOP_HOME/sbin/start-dfs.sh
$ echo "============= yarn start from hadoop4 ============="
$ ssh hadoop4 $HADOOP_HOME/sbin/start-yarn.sh
$ echo "============= history start on hadoop3 ============="
$ ssh hadoop3 $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
============= dfs start from hadoop3 =============
Starting namenodes on [hadoop3]
hadoop3: starting namenode, logging to /opt/hadoop-2.10.2/logs/hadoop-sunxo-namenode-hadoop3.out
hadoop4: starting datanode, logging to /opt/hadoop-2.10.2/logs/hadoop-sunxo-datanode-hadoop4.out
hadoop3: starting datanode, logging to /opt/hadoop-2.10.2/logs/hadoop-sunxo-datanode-hadoop3.out
hadoop5: starting datanode, logging to /opt/hadoop-2.10.2/logs/hadoop-sunxo-datanode-hadoop5.out
Starting secondary namenodes [hadoop5]
============= yarn start from hadoop4 =============
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.10.2/logs/yarn-sunxo-resourcemanager-hadoop4.out
hadoop3: starting nodemanager, logging to /opt/hadoop-2.10.2/logs/yarn-sunxo-nodemanager-hadoop3.out
hadoop4: starting nodemanager, logging to /opt/hadoop-2.10.2/logs/yarn-sunxo-nodemanager-hadoop4.out
hadoop5: starting nodemanager, logging to /opt/hadoop-2.10.2/logs/yarn-sunxo-nodemanager-hadoop5.out
============= history start on hadoop3 =============
starting historyserver, logging to /opt/hadoop-2.10.2/logs/mapred-sunxo-historyserver-hadoop3.out

$ jps.sh hadoop3 hadoop4 hadoop5
============= hadoop3 =============
816 DataNode
616 NameNode
1385 JobHistoryServer
1166 NodeManager
30495 QuorumPeerMain
============= hadoop4 =============
2065 DataNode
2354 NodeManager
313 QuorumPeerMain
2222 ResourceManager
============= hadoop5 =============
5892 DataNode
6023 SecondaryNameNode
4264 QuorumPeerMain
6120 NodeManager
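
As an optional smoke test (not in the original article), put a small file into HDFS and list it back:

$ hdfs dfs -mkdir -p /test
$ hdfs dfs -put $HADOOP_HOME/etc/hadoop/core-site.xml /test
$ hdfs dfs -ls /test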

3. HBase on hadoop3
1) configure

$ cd $HBASE_HOME/conf
$ diff -u hbase-env.sh.orig hbase-env.sh
--- hbase-env.sh.orig	2020-01-22 23:10:15.000000000 +0800
+++ hbase-env.sh	2023-10-19 18:21:33.098131203 +0800
@@ -25,7 +25,7 @@
 # into the startup scripts (bin/hbase, etc.)
 
 # The java implementation to use.  Java 1.8+ required.
-# export JAVA_HOME=/usr/java/jdk1.8.0/
+export JAVA_HOME=/opt/jdk
 
 # Extra Java CLASSPATH elements.  Optional.
 # export HBASE_CLASSPATH=
@@ -123,7 +123,7 @@
 # export HBASE_SLAVE_SLEEP=0.1
 
 # Tell HBase whether it should manage it's own instance of ZooKeeper or not.
-# export HBASE_MANAGES_ZK=true
+export HBASE_MANAGES_ZK=false
 
 # The default log rolling policy is RFA, where the log file is rolled as per the size defined for the

$ cat hbase-site.xml
<configuration>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop3,hadoop4,hadoop5</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop3:8020/hbase</value>
    </property>
</configuration>

$ cat regionservers 
hadoop3
hadoop4
hadoop5
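
The article does not show it, but the edited HBase configuration must also be present on hadoop4 and hadoop5 before start-hbase.sh launches the remote region servers; presumably it was pushed with the same helper:

$ cd $HBASE_HOME
$ rsync.sh conf/ hadoop4 hadoop5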

2) start hbase

$ echo "============= hbase start from hadoop3 ============="
$ $HBASE_HOME/bin/start-hbase.sh
============= hbase start from hadoop3 =============
running master, logging to /opt/hbase-2.4.16/logs/hbase-sunxo-master-hadoop3.out
hadoop3: running regionserver, logging to /opt/hbase-2.4.16/logs/hbase-sunxo-regionserver-hadoop3.out
hadoop4: running regionserver, logging to /opt/hbase-2.4.16/logs/hbase-sunxo-regionserver-hadoop4.out
hadoop5: running regionserver, logging to /opt/hbase-2.4.16/logs/hbase-sunxo-regionserver-hadoop5.out

$ jps.sh hadoop3 hadoop4 hadoop5
============= hadoop3 =============
816 DataNode
2064 HMaster
616 NameNode
2280 HRegionServer
1385 JobHistoryServer
1166 NodeManager
30495 QuorumPeerMain
============= hadoop4 =============
2065 DataNode
2354 NodeManager
2995 HRegionServer
313 QuorumPeerMain
2222 ResourceManager
============= hadoop5 =============
5892 DataNode
6023 SecondaryNameNode
4264 QuorumPeerMain
6120 NodeManager
6616 HRegionServer
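
As a final check (optional, not shown in the article), the HBase shell can create, write to and scan a small table:

$ hbase shell
hbase:001:0> create 'test', 'cf'
hbase:002:0> put 'test', 'row1', 'cf:a', 'value1'
hbase:003:0> scan 'test'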

Note: check the related URLs
Hdfs - http://hadoop3:50070/explorer.html#/
Yarn - http://hadoop4:8088/cluster
HBase - http://hadoop3:16010/master-status
