Verifying HBase Cluster Replication


0. Prerequisites
Suppose two HBase pseudo-distributed clusters have both been started as follows:

relevant parameters in hbase-site.xml   source   destination
hbase.zookeeper.quorum                  macos    ubuntu
hbase.zookeeper.property.clientPort     2181     2181
zookeeper.znode.parent                  /hbase   /hbase
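These three parameters matter because they make up the replication cluster key used when adding a peer in step 2, in the form `hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent`. A minimal sketch of how the key for the destination cluster is assembled (the helper name `cluster_key` is my own, not an HBase API):

```python
def cluster_key(quorum: str, client_port: int, znode_parent: str) -> str:
    """Assemble an HBase replication cluster key:
    <hbase.zookeeper.quorum>:<clientPort>:<znode.parent>"""
    return f"{quorum}:{client_port}:{znode_parent}"

# The destination cluster from the table above:
print(cluster_key("ubuntu", 2181, "/hbase"))  # -> ubuntu:2181:/hbase
```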

1. Create table for replication
1) Start hbase shell on the source cluster and create a table

$ cd $HOME_HBASE
$ bin/hbase shell
> create 'peTable', {NAME => 'info0', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536', METADATA => {'IN_MEMORY_COMPACTION' => 'NONE'}}

2) Create exactly the same table on the destination cluster

2. Add the destination cluster as a peer in source cluster hbase shell

> add_peer 'ubt_pe', CLUSTER_KEY => "ubuntu:2181:/hbase", TABLE_CFS => { "peTable" => []}

3. Enable the table for replication in source cluster hbase shell

> enable_table_replication 'peTable'
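Note that the table was created with REPLICATION_SCOPE => '0' (not replicated); `enable_table_replication` alters the table so that every column family gets REPLICATION_SCOPE 1. A toy model of that state change, for illustration only (this is not the real Admin API):

```python
def enable_table_replication(families: dict) -> dict:
    """Illustrative model: flip each column family's REPLICATION_SCOPE
    from 0 (not replicated) to 1 (replicated), leaving other attributes as-is."""
    return {cf: {**attrs, "REPLICATION_SCOPE": 1} for cf, attrs in families.items()}

# The single column family of peTable, as created above:
table = {"info0": {"VERSIONS": 1, "REPLICATION_SCOPE": 0}}
print(enable_table_replication(table))
```

After running the shell command, `describe 'peTable'` on the source cluster should show REPLICATION_SCOPE => '1' for info0.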

4. Put data by using HBase PerformanceEvaluation tool

$ cd $HOME_HBASE
$ bin/hbase pe --table=peTable --nomapred --valueSize=100 randomWrite 1
2023-09-08 19:57:55,256 INFO  [main] hbase.PerformanceEvaluation: RandomWriteTest test run options={"cmdName":"randomWrite","nomapred":true,"filterAll":false,"startRow":0,"size":0.0,"perClientRunRows":1048576,"numClientThreads":1,"totalRows":1048576,"measureAfter":0,"sampleRate":1.0,"traceRate":0.0,"tableName":"peTable","flushCommits":true,"writeToWAL":true,"autoFlush":false,"oneCon":false,"connCount":-1,"useTags":false,"noOfTags":1,"reportLatency":false,"multiGet":0,"multiPut":0,"randomSleep":0,"inMemoryCF":false,"presplitRegions":0,"replicas":1,"compression":"NONE","bloomType":"ROW","blockSize":65536,"blockEncoding":"NONE","valueRandom":false,"valueZipf":false,"valueSize":100,"period":104857,"cycles":1,"columns":1,"families":1,"caching":30,"latencyThreshold":0,"addColumns":true,"inMemoryCompaction":"NONE","asyncPrefetch":false,"cacheBlocks":true,"scanReadType":"DEFAULT","bufferSize":"2097152"}
...
2023-09-08 19:57:58,476 INFO  [TestClient-0] hbase.PerformanceEvaluation: row [start=0, current=104857, last=1048576], latency [mean=19.87, min=0.00, max=328487.00, stdDev=1355.87, 95th=1.00, 99th=8.00]
2023-09-08 19:57:59,679 INFO  [TestClient-0] hbase.PerformanceEvaluation: row [start=0, current=209714, last=1048576], latency [mean=15.34, min=0.00, max=328487.00, stdDev=1026.36, 95th=1.00, 99th=4.00]
...
2023-09-08 19:58:10,520 INFO  [TestClient-0] hbase.PerformanceEvaluation: row [start=0, current=1048570, last=1048576], latency [mean=13.17, min=0.00, max=328487.00, stdDev=780.16, 95th=0.00, 99th=1.00]
2023-09-08 19:58:10,569 INFO  [TestClient-0] hbase.PerformanceEvaluation: Test : RandomWriteTest, Thread : TestClient-0
2023-09-08 19:58:10,577 INFO  [TestClient-0] hbase.PerformanceEvaluation: Latency (us) : mean=13.17, min=0.00, max=328487.00, stdDev=780.16, 50th=0.00, 75th=0.00, 95th=0.00, 99th=1.00, 99.9th=19.00, 99.99th=28853.39, 99.999th=58579.15
2023-09-08 19:58:10,577 INFO  [TestClient-0] hbase.PerformanceEvaluation: Num measures (latency) : 1048575
2023-09-08 19:58:10,584 INFO  [TestClient-0] hbase.PerformanceEvaluation: Mean      = 13.17
Min       = 0.00
Max       = 328487.00
StdDev    = 780.16
50th      = 0.00
75th      = 0.00
95th      = 0.00
99th      = 1.00
99.9th    = 19.00
99.99th   = 28853.39
99.999th  = 58579.15
2023-09-08 19:58:10,584 INFO  [TestClient-0] hbase.PerformanceEvaluation: No valueSize statistics available
2023-09-08 19:58:10,586 INFO  [TestClient-0] hbase.PerformanceEvaluation: Finished class org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest in 14286ms at offset 0 for 1048576 rows (9.24 MB/s)
2023-09-08 19:58:10,586 INFO  [TestClient-0] hbase.PerformanceEvaluation: Finished TestClient-0 in 14286ms over 1048576 rows
2023-09-08 19:58:10,586 INFO  [main] hbase.PerformanceEvaluation: [RandomWriteTest] Summary of timings (ms): [14286]
2023-09-08 19:58:10,595 INFO  [main] hbase.PerformanceEvaluation: [RandomWriteTest duration ]	Min: 14286ms	Max: 14286ms	Avg: 14286ms
2023-09-08 19:58:10,595 INFO  [main] hbase.PerformanceEvaluation: [ Avg latency (us)]	13
2023-09-08 19:58:10,596 INFO  [main] hbase.PerformanceEvaluation: [ Avg TPS/QPS]	73399	 row per second
2023-09-08 19:58:10,596 INFO  [main] client.AsyncConnectionImpl: Connection has been closed by main.
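The summary lines above are internally consistent: 1,048,576 rows written in 14,286 ms works out to the reported average of 73,399 rows per second. A quick arithmetic check (no HBase dependency; the helper name is my own):

```python
def avg_tps(total_rows: int, duration_ms: int) -> int:
    """Average rows written per second, rounded to the nearest integer."""
    return round(total_rows / (duration_ms / 1000))

print(avg_tps(1048576, 14286))  # matches the "Avg TPS/QPS" summary line: 73399
```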

Note: the help text of PerformanceEvaluation can be shown as:

$ bin/hbase pe

5. Count rows on source and peer
1) in source cluster hbase shell

> count 'peTable'
Current count: 1000, row: 00000000000000000000001563                                                
Current count: 2000, row: 00000000000000000000003160 
...
Current count: 663000, row: 00000000000000000001048457                                              
663073 row(s)
Took 12.9970 seconds

2) in peer cluster hbase shell

> count 'peTable'
Current count: 1000, row: 00000000000000000000001563                                                
Current count: 2000, row: 00000000000000000000003160
...
Current count: 663000, row: 00000000000000000001048457                                              
663073 row(s)
Took 7.1883 seconds                                                                                 
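Both clusters report the same 663,073 rows even though 1,048,576 puts were issued. This is expected: `randomWrite` draws row keys at random from the [0, 1048576) keyspace, so duplicate keys overwrite the same row. A back-of-envelope estimate of the distinct-key count, N·(1 − (1 − 1/N)^N) ≈ N·(1 − 1/e), lands close to the observed number:

```python
import math

N = 1_048_576  # keyspace size, which also equals the number of random writes
# Expected number of distinct keys after N uniform draws from N keys:
expected_distinct = N * (1 - (1 - 1 / N) ** N)  # roughly N * (1 - 1/e)
print(round(expected_distinct))  # ~662.8k, close to the observed 663,073
```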

6. Verify replication by running the VerifyReplication MapReduce job on the source cluster

$ cd $HOME_HBASE
$ bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication 'ubt_pe' 'peTable'
2023-09-08 20:14:37,199 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=VerifyReplication connecting to ZooKeeper ensemble=localhost:2181
...
2023-09-08 20:14:44,393 INFO  [main] mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1694172104063_0001/
2023-09-08 20:14:44,394 INFO  [main] mapreduce.Job: Running job: job_1694172104063_0001
2023-09-08 20:14:54,521 INFO  [main] mapreduce.Job: Job job_1694172104063_0001 running in uber mode : false
2023-09-08 20:14:54,524 INFO  [main] mapreduce.Job:  map 0% reduce 0%
2023-09-08 20:20:18,907 INFO  [main] mapreduce.Job:  map 100% reduce 0%
2023-09-08 20:20:19,924 INFO  [main] mapreduce.Job: Job job_1694172104063_0001 completed successfully
2023-09-08 20:20:20,040 INFO  [main] mapreduce.Job: Counters:
	Job Counters
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=321487
		Total vcore-milliseconds taken by all map tasks=321487
		Total megabyte-milliseconds taken by all map tasks=329202688
	Map-Reduce Framework
		Map input records=663073
		Map output records=0
		Input split bytes=105
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=707
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=114819072
	HBaseCounters
		BYTES_IN_REMOTE_RESULTS=103439388
		BYTES_IN_RESULTS=103439388
		MILLIS_BETWEEN_NEXTS=313921
		NOT_SERVING_REGION_EXCEPTION=0
		REGIONS_SCANNED=1
		REMOTE_RPC_CALLS=60
		REMOTE_RPC_RETRIES=0
		ROWS_FILTERED=17
		ROWS_SCANNED=663073
		RPC_CALLS=60
		RPC_RETRIES=0
	org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
		GOODROWS=663073
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0
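GOODROWS=663073 matches the row count observed on both clusters, and no mismatch counters (such as BADROWS) appear in the output, so every scanned row was found identical on the peer. A sketch of the check this implies (GOODROWS and BADROWS are real VerifyReplication counters; the helper function is my own, and absent counters are treated as 0):

```python
def replication_ok(counters: dict, source_rows: int) -> bool:
    """True when every source row was found identical on the peer:
    GOODROWS equals the source row count and BADROWS is zero."""
    return (counters.get("GOODROWS", 0) == source_rows
            and counters.get("BADROWS", 0) == 0)

# Counters from the job output above:
print(replication_ok({"GOODROWS": 663073}, 663073))  # -> True
```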

Note: the help text of VerifyReplication can be shown as:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --help
