flume组件以及通过命令监控大数据平台转态

这篇具有很好参考价值的文章主要介绍了flume组件以及通过命令监控大数据平台转态。希望对大家有所帮助。如果存在错误或未考虑完全的地方,请大家不吝赐教,您也可以点击"举报违法"按钮提交疑问。

实验一、Flume 组件安装配置

1、下载和解压 Flume

可 以 从 官 网 下 载 Flume 组 件 安 装 包 , 下 载 地 址 如 下 URL 链 接 所 示 https://archive.apache.org/dist/flume/1.6.0/

[root@master ~]# ls
anaconda-ks.cfg               jdk-8u152-linux-x64.tar.gz
apache-flume-1.6.0-bin.tar.gz mysql
apache-hive-2.0.0-bin.tar.gz   mysql-connector-java-5.1.46.jar
derby.log                     sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
hadoop-2.7.1.tar.gz           zookeeper-3.4.8.tar.gz
hbase-1.2.1-bin.tar.gz
#使用 root 用户解压 Flume 安装包到“/usr/local/src”路径,并修改解压后文件夹名为 flume。
[root@master ~]# tar xf apache-flume-1.6.0-bin.tar.gz -C /usr/local/src/
[root@master ~]# cd /usr/local/src/
[root@master src]# ls
apache-flume-1.6.0-bin hadoop hbase hive jdk sqoop zookeeper
[root@master src]# mv apache-flume-1.6.0-bin flume
[root@master src]# ls
flume hadoop hbase hive jdk sqoop zookeeper

2、Flume 组件部署

步骤一:使用 root 用户设置 Flume 环境变量,并使环境变量对所有用户生效。

[root@master ~]# vim /etc/profile.d/flume.sh
export FLUME_HOME=/usr/local/src/flume
export PATH=${FLUME_HOME}/bin:$PATH

步骤二:修改 Flume 相应配置文件。

首先,切换到 hadoop 用户,并切换当前工作目录到 Flume 的配置文件夹。

[root@master ~]# chown -R hadoop.hadoop /usr/local/src/
[root@master ~]# su - hadoop
Last login: Fri Apr 14 16:31:48 CST 2023 on pts/1
[hadoop@master ~]$ ls
derby.log input student.java zookeeper.out
[hadoop@master ~]$ cd /usr/local/src/flume/conf/
[hadoop@master conf]$ ls
flume-conf.properties.template flume-env.sh.template
flume-env.ps1.template         log4j.properties

拷贝 flume-env.sh.template 文件并重命名为 flume-env.sh。

[hadoop@master conf]$ cp flume-env.sh.template flume-env.sh
[hadoop@master conf]$ ls
flume-conf.properties.template flume-env.sh           log4j.properties
flume-env.ps1.template         flume-env.sh.template

步骤三:修改并配置 flume-env.sh 文件。

删除 JAVA_HOME 变量前的注释,修改为 JDK 的安装路径。

[hadoop@master conf]$ vi flume-env.sh
export JAVA_HOME=/usr/loocal/src/jdk
#export HBASE_CLASSPATH=/usr/local/src/hadoop/etc/hadoop

使用 flume-ng version 命令验证安装是否成功,若能够正常查询 Flume 组件版本为 1.6.0,则表示安装成功。

[hadoop@master conf]$ flume-ng version
Error: Could not find or load main class org.apache.flume.tools.GetJavaProperty
Flume 1.6.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
From source with checksum b29e416802ce9ece3269d34233baf43f

3、使用 Flume 发送和接受信息

通过 Flume 将 Web 服务器中数据传输到 HDFS 中。

步骤一:在 Flume 安装目录中创建 xxx.conf 文件。

[hadoop@master flume]$ vi xxx.conf
a1.sources=r1
a1.sinks=k1
a1.channels=c1
a1.sources.r1.type=spooldir
a1.sources.r1.spoolDir=/usr/local/src/hadoop/logs
a1.sources.r1.fileHeader=true
a1.sinks.k1.type=hdfs
a1.sinks.k1.hdfs.path=hdfs://master:9000/tmp/flume
a1.sinks.k1.hdfs.rollsize=1048760
a1.sinks.k1.hdfs.rollCount=0
a1.sinks.k1.hdfs.rollInterval=900
a1.sinks.k1.hfds.useLocalTimeStamp=true
a1.channels.c1.type=file
a1.channels.c1.capacity=1000
a1.channels.c1.transactionCapacity=100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
[hadoop@master flume]$ ls /usr/local/src/hadoop/
bin etc     lib     LICENSE.txt NOTICE.txt sbin   tmp
dfs include libexec logs         README.txt share
[hadoop@master flume]$ ls /usr/local/src/hadoop/logs/
hadoop-hadoop-namenode-master.example.com.log
hadoop-hadoop-namenode-master.example.com.out
hadoop-hadoop-namenode-master.example.com.out.1
hadoop-hadoop-namenode-master.example.com.out.2
hadoop-hadoop-namenode-master.example.com.out.3
hadoop-hadoop-namenode-master.example.com.out.4
hadoop-hadoop-namenode-master.example.com.out.5
hadoop-hadoop-secondarynamenode-master.example.com.log
hadoop-hadoop-secondarynamenode-master.example.com.out
hadoop-hadoop-secondarynamenode-master.example.com.out.1
hadoop-hadoop-secondarynamenode-master.example.com.out.2
hadoop-hadoop-secondarynamenode-master.example.com.out.3
hadoop-hadoop-secondarynamenode-master.example.com.out.4
hadoop-hadoop-secondarynamenode-master.example.com.out.5
SecurityAuth-hadoop.audit
yarn-hadoop-resourcemanager-master.example.com.log
yarn-hadoop-resourcemanager-master.example.com.out
yarn-hadoop-resourcemanager-master.example.com.out.1
yarn-hadoop-resourcemanager-master.example.com.out.2
yarn-hadoop-resourcemanager-master.example.com.out.3
yarn-hadoop-resourcemanager-master.example.com.out.4
yarn-hadoop-resourcemanager-master.example.com.out.5

步骤二:使用 flume-ng agent 命令加载 simple-hdfs-flume.conf 配置信息,启 动 flume 传输数据。

[[hadoop@master flume]$ flume-ng agent --conf-file xxx.conf --name a1
Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hadoop libraries found via (/usr/local/src/hadoop/bin/hadoop) for HDFS access
Info: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
Info: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
Info: Including HBASE libraries found via (/usr/local/src/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local/src/hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
Info: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
Info: Including Hive libraries found via (/usr/local/src/hive) for Hive access
...
23/04/21 16:02:35 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: r1. src.append.accepted == 0
23/04/21 16:02:35 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: r1. src.append.received == 0
23/04/21 16:02:35 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: r1. src.events.accepted == 17
23/04/21 16:02:35 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: r1. src.events.received == 17
23/04/21 16:02:35 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SOURCE, name: r1. src.open-connection.count == 0
23/04/21 16:02:35 INFO source.SpoolDirectorySource: SpoolDir source r1 stopped. Metrics: SOURCE:r1{src.events.accepted=17, src.open-connection.count=0, src.append.received=0, src.append-batch.received=1, src.append-batch.accepted=1, src.append.accepted=0, src.events.received=17}

ctrl+c 退出 flume 传输

#首先得开启所有节点
[hadoop@master flume]$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.example.com.out
192.168.88.201: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.example.com.out
192.168.88.200: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.example.com.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.example.com.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.example.com.out
192.168.88.200: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.example.com.out
192.168.88.201: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.example.com.out
[hadoop@master flume]$ ss -antl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 192.168.88.101:9000 *:*
LISTEN 0 128 *:50090 *:*
LISTEN 0 128 *:50070 *:*
LISTEN 0 128 *:22 *:*
LISTEN 0 128 ::ffff:192.168.88.101:8030 :::*
LISTEN 0 128 ::ffff:192.168.88.101:8031 :::*
LISTEN 0 128 ::ffff:192.168.88.101:8032 :::*
LISTEN 0 128 ::ffff:192.168.88.101:8033 :::*
LISTEN 0 80 :::3306 :::*
LISTEN 0 128 :::22 :::*
LISTEN 0 128 ::ffff:192.168.88.101:8088 :::*

步骤三:查看 Flume 传输到 HDFS 的文件,若能查看到 HDFS 上/tmp/flume 目 录有传输的数据文件,则表示数据传输成功。

[hadoop@master flume]$ hdfs dfs -ls /
Found 5 items
drwxr-xr-x - hadoop supergroup 0 2023-04-07 15:20 /hbase
drwxr-xr-x - hadoop supergroup 0 2023-03-17 17:33 /input
drwxr-xr-x - hadoop supergroup 0 2023-03-17 18:45 /output
drwx------ - hadoop supergroup 0 2023-04-21 16:02 /tmp
drwxr-xr-x - hadoop supergroup 0 2023-04-14 20:48 /user
[hadoop@master flume]$ hdfs dfs -ls /tmp/flume
Found 72 items
-rw-r--r-- 2 hadoop supergroup 1560 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143693
-rw-r--r-- 2 hadoop supergroup 1398 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143694
-rw-r--r-- 2 hadoop supergroup 1456 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143695
-rw-r--r-- 2 hadoop supergroup 1398 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143696
-rw-r--r-- 2 hadoop supergroup 1403 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143697
-rw-r--r-- 2 hadoop supergroup 1434 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143698
-rw-r--r-- 2 hadoop supergroup 1383 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143699
...
-rw-r--r-- 2 hadoop supergroup 1508 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143760
-rw-r--r-- 2 hadoop supergroup 1361 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143761
-rw-r--r-- 2 hadoop supergroup 1359 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143762
-rw-r--r-- 2 hadoop supergroup 1502 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143763
-rw-r--r-- 2 hadoop supergroup 1399 2023-04-21 16:02 /tmp/flume/FlumeData.1682064143764

实验二、通过命令监控大数据平台运行状态

1、通过命令查看大数据平台状态

步骤一:查看 Linux 系统的信息(uname -a)

[root@master ~]# uname -a
Linux master.example.com 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

结果显示,该 Linux 节点名称为 master,内核发行号为 3.10.0-693.el7.x86_64。

步骤二:查看硬盘信息

(1)查看所有分区(fdisk -l)

[root@master ~]# fdisk -l

Disk /dev/sda: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000af885

Device Boot Start End Blocks Id System
/dev/sda1 * 2048 2099199 1048576 83 Linux
/dev/sda2 2099200 209715199 103808000 8e Linux LVM

Disk /dev/mapper/centos-root: 53.7 GB, 53687091200 bytes, 104857600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/centos-swap: 8455 MB, 8455716864 bytes, 16515072 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/centos-home: 44.1 GB, 44149243904 bytes, 86228992 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

结果显示,硬盘空间为 107.4GB。

(2)查看所有交换分区(swapon -s)

[root@master ~]# swapon -s
Filename Type Size Used Priority
/dev/dm-1 partition 8257532 0 -1

结果显示,交换分区为 8257532。

(3)查看文件系统占比(df -h)

[root@master ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 50G 4.6G 46G 10% /
devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs 1.9G 12M 1.9G 1% /run
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/mapper/centos-home 42G 36M 42G 1% /home
/dev/sda1 1014M 142M 873M 14% /boot
tmpfs 378M 0 378M 0% /run/user/0

结果显示,挂载点“/”的容量为 50G,已使用 4.6G。

步骤三:查看网络 IP 地址(ip a)

[root@master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:20:cd:03 brd ff:ff:ff:ff:ff:ff
inet 192.168.88.101/24 brd 192.168.88.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::75a3:9da2:ede2:5be7/64 scope link noprefixroute
valid_lft forever preferred_lft forever
inet6 fe80::1c1b:d0e3:f01a:7c11/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever

结果显示 ens33 的 IP 地址为 192.168.88.101,子网掩码为 255.255.255.0;回环地址 为 127.0.0.1,子网掩码为 255.0.0.0。

步骤四:查看所有监听端口(netstat -lntp)

[root@master ~]# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 908/sshd
tcp6 0 0 :::3306 :::* LISTEN 1053/mysqld
tcp6 0 0 :::22 :::* LISTEN 908/sshd

结果显示,在监听的端口分别为 22、3306。

步骤五:查看所有已经建立的连接(netstat -antp)

[root@master ~]# netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 908/sshd
tcp 0 36 192.168.88.101:22 192.168.88.1:61450 ESTABLISHED 1407/sshd: root@pts
tcp6 0 0 :::3306 :::* LISTEN 1053/mysqld
tcp6 0 0 :::22 :::* LISTEN 908/sshd

结果显示,已经连接上的本地端口分别为 50608、22、9000、8031 等。

步骤六:实时显示进程状态(top),该命令可以查看进程对 CPU、内存的占比 等。

image-20230423214902071

步骤七:查看 CPU 信息( cat /proc/cpuinfo)

[root@master ~]# cat /proc/cpuinfo 
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 80
model name : AMD Ryzen 7 5800U with Radeon Graphics
stepping : 0
cpu MHz : 1896.438
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl tsc_reliable nonstop_tsc extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext retpoline_amd vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero ibpb ibrs arat pku ospke overflow_recov succor
bogomips : 3792.87
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 45 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : AuthenticAMD
cpu family : 25
model : 80
model name : AMD Ryzen 7 5800U with Radeon Graphics
stepping : 0
cpu MHz : 1896.438
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl tsc_reliable nonstop_tsc extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext retpoline_amd vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero ibpb ibrs arat pku ospke overflow_recov succor
bogomips : 3792.87
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 45 bits physical, 48 bits virtual

步骤八:查看内存信息( cat /proc/meminfo),该命令可以查看总内存、空 闲内存等信息。

[root@master ~]# cat /proc/meminfo 
MemTotal: 3863564 kB
MemFree: 2971196 kB
MemAvailable: 3301828 kB
Buffers: 2120 kB
Cached: 529608 kB
SwapCached: 0 kB
Active: 626584 kB
Inactive: 130828 kB
Active(anon): 226348 kB
Inactive(anon): 11152 kB
Active(file): 400236 kB
Inactive(file): 119676 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 8257532 kB
SwapFree: 8257532 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 225716 kB
Mapped: 28960 kB
Shmem: 11816 kB
Slab: 59428 kB
SReclaimable: 30680 kB
SUnreclaim: 28748 kB
KernelStack: 4432 kB
PageTables: 3664 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 10189312 kB
Committed_AS: 773684 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 185376 kB
VmallocChunk: 34359310332 kB
HardwareCorrupted: 0 kB
AnonHugePages: 180224 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 73536 kB
DirectMap2M: 3072000 kB
DirectMap1G: 3145728 kB

2、通过命令查看 Hadoop 状态

步骤一:切换到 hadoop 用户

[root@master ~]# su - hadoop
Last login: Fri Apr 21 15:26:05 CST 2023 on pts/0
Last failed login: Fri Apr 21 16:02:08 CST 2023 from slave1 on ssh:notty
There were 4 failed login attempts since the last successful login.

若当前的用户为 root,请切换到 hadoop 用户进行操作。

步骤二:切换到 Hadoop 的安装目录

[hadoop@master ~]$ cd /usr/local/src/hadoop/

步骤三:启动 Hadoop

[hadoop@master ~]$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.example.com.out
192.168.88.201: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.example.com.out
192.168.88.200: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.example.com.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.example.com.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.example.com.out
192.168.88.200: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.example.com.out
192.168.88.201: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.example.com.out

步骤四:关闭 Hadoop

[hadoop@master hadoop]$ stop-all.sh 
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [master]
master: stopping namenode
192.168.88.201: stopping datanode
192.168.88.200: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
stopping yarn daemons
stopping resourcemanager
192.168.88.200: stopping nodemanager
192.168.88.201: stopping nodemanager
no proxyserver to stop

实验三、通过命令监控大数据平台资源状态

1、通过命令查看 YARN 状态

步骤一:确认切换到目录 /usr/local/src/hadoop

[hadoop@master hadoop]$ cd /usr/local/src/hadoop/

步骤二:返回主机界面在 Master 主机上执行 start-all.sh

#master 节点启动 zookeeper
[hadoop@master hadoop]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

#slave1 节点启动 zookeeper
[root@slave1 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

#slave2 节点启动 zookeeper
[root@slave2 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

master节点
[hadoop@master hadoop]$ start-all.sh

步 骤 三 : 执 行 JPS 命 令 , 发 现 Master 上 有 NodeManager 进程和 ResourceManager 进程,则 YARN 启动完成。

[hadoop@master hadoop]$ jps

执行结果如下,说明 YARN 已启动。

[hadoop@master hadoop]$ jps
2642 QuorumPeerMain
2994 SecondaryNameNode
3154 ResourceManager
3413 Jps
2795 NameNode

2、通过命令查看 HDFS 状态

步骤一:目录操作

切换到 hadoop 目录,执行 cd /usr/local/src/hadoop 命令

[hadoop@master hadoop]$ cd /usr/local/src/hadoop/

查看 HDFS 目录

[hadoop@master hadoop]$  ./bin/hdfs dfs -ls /
Found 5 items
drwxr-xr-x - hadoop supergroup 0 2023-04-07 15:20 /hbase
drwxr-xr-x - hadoop supergroup 0 2023-03-17 17:33 /input
drwxr-xr-x - hadoop supergroup 0 2023-03-17 18:45 /output
drwx------ - hadoop supergroup 0 2023-04-21 16:02 /tmp
drwxr-xr-x - hadoop supergroup 0 2023-04-14 20:48 /user

步骤二:查看 HDSF 的报告,执行命令: bin/hdfs dfsadmin -report

[hadoop@master hadoop]$ bin/hdfs dfsadmin -report
Configured Capacity: 107321753600 (99.95 GB)
Present Capacity: 102855995392 (95.79 GB)
DFS Remaining: 102852079616 (95.79 GB)
DFS Used: 3915776 (3.73 MB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.88.201:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 53660876800 (49.98 GB)
DFS Used: 1957888 (1.87 MB)
Non DFS Used: 2232373248 (2.08 GB)
DFS Remaining: 51426545664 (47.89 GB)
DFS Used%: 0.00%
DFS Remaining%: 95.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Apr 23 21:56:46 CST 2023


Name: 192.168.88.200:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 53660876800 (49.98 GB)
DFS Used: 1957888 (1.87 MB)
Non DFS Used: 2233384960 (2.08 GB)
DFS Remaining: 51425533952 (47.89 GB)
DFS Used%: 0.00%
DFS Remaining%: 95.83%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Apr 23 21:56:46 CST 2023

步骤三:查看 HDFS 空间情况,执行命令:hdfs dfs -df

[hadoop@master hadoop]$ hdfs dfs -df /
Filesystem Size Used Available Use%
hdfs://master:9000 107321753600 3915776 102852079616 0%

3、通过命令查看 HBase 状态

步骤一:启动运行 HBase

切换到 HBase 安装目录/usr/local/src/hbase,命令如下:

[hadoop@master hadoop]$ cd /usr/local/src/hbase/
[hadoop@master hbase]$ hbase version
HBase 1.2.1
Source code repository git://asf-dev/home/busbey/projects/hbase revision=8d8a7107dc4ccbf36a92f64675dc60392f85c015
Compiled by busbey on Wed Mar 30 11:19:21 CDT 2016
From source with checksum f4bb4a14bb4e0b72b46f729dae98a772

结果显示 HBase1.2.1,说明 HBase 正在运行,版本号为 1.2.1。

如果没有启动,则执行命令 start-hbase.sh 启动 HBase。
[hadoop@master hbase]$ start-hbase.sh
master: starting zookeeper, logging to /usr/local/src/hbase/logs/hbase-hadoop-zookeeper-master.example.com.out
slave1: starting zookeeper, logging to /usr/local/src/hbase/logs/hbase-hadoop-zookeeper-slave1.example.com.out
slave2: starting zookeeper, logging to /usr/local/src/hbase/logs/hbase-hadoop-zookeeper-slave2.example.com.out
starting master, logging to /usr/local/src/hbase/logs/hbase-hadoop-master-master.example.com.out
slave2: starting regionserver, logging to /usr/local/src/hbase/logs/hbase-hadoop-regionserver-slave2.example.com.out
slave1: starting regionserver, logging to /usr/local/src/hbase/logs/hbase-hadoop-regionserver-slave1.example.com.out

步骤二:查看 HBase 版本信息

执行命令hbase shell,进入HBase命令交互界面

[hadoop@master hbase]$ hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016

hbase(main):001:0>

输入 version,查询 HBase 版本

hbase(main):001:0> version
1.2.1, r8d8a7107dc4ccbf36a92f64675dc60392f85c015, Wed Mar 30 11:19:21 CDT 2016

结果显示 HBase 版本为 1.2.1。

步骤三:查询 HBase 状态,在 HBase 命令交互界面,执行 status 命令

hbase(main):002:0> status
1 active master, 0 backup masters, 2 servers, 0 dead, 1.0000 average load

查询结果显示,1 台活动 master,0 台备份 masters,共 3 台服务主机,平均加载 时间为 0.6667 秒。

我们还可以“简单”查询 HBase 的状态,执行命令 status 'simple'

hbase(main):003:0> status 'simple'
active master: master:16000 1682258321433
0 backup masters
2 live servers
slave1:16020 1682258322908
requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=19, maxHeapMB=440, numberOfStores=2, numberOfStorefiles=0, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=7, writeRequestsCount=4, rootIndexSizeKB=0 totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[MultiRowMutationEndpoint]
slave2:16020 1682258322896
requestsPerSecond=0.0, numberOfOnlineRegions=0, usedHeapMB=11, maxHeapMB=440, numberOfStores=0, numberOfStorefiles=0, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[]
0 dead servers
Aggregate load: 0, regions: 2

显示更多的关于 Master、Slave1 和 Slave2 主机的服务端口、请求时间等详细信息。

如果需要查询更多关于 HBase 状态,执行命令 help 'status'

hbase(main):004:0> help 'status'
Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The
default is 'summary'. Examples:

hbase> status
hbase> status 'simple'
hbase> status 'summary'
hbase> status 'detailed'
hbase> status 'replication'
hbase> status 'replication', 'source'
hbase> status 'replication', 'sink'

hbase(main):005:0> quit
[hadoop@master hbase]$

结果显示出所有关于 status 的命令。

步骤四 停止 HBase 服务

停止 HBase 服务,则执行命令 stop-hbase.sh。

[hadoop@master hbase]$ stop-hbase.sh 
stopping hbase.................
slave1: no zookeeper to stop because no pid file /tmp/hbase-hadoop-zookeeper.pid
slave2: no zookeeper to stop because no pid file /tmp/hbase-hadoop-zookeeper.pid
master: no zookeeper to stop because no pid file /tmp/hbase-hadoop-zookeeper.pid

没有错误提示,显示$提示符时,即停止了 HBase 服务。

4、通过命令查看 Hive 状态

步骤一:启动 Hive

切换到/usr/local/src/hive 目录,输入 hive,回车。

[hadoop@master hbase]$ cd /usr/local/src/hive/
[hadoop@master hive]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>

当显示 hive>时,表示启动成功,进入到了 Hive shell 状态。

步骤二:Hive 操作基本命令

注意:Hive 命令行语句后面一定要加分号。

(1)查看数据库

hive> show databases;
OK
default
sample
Time taken: 0.973 seconds, Fetched: 2 row(s)

显示默认的数据库 default。

(2)查看 default 数据库所有表

hive> use default;
OK
Time taken: 0.02 seconds
hive> show tables;
OK
test
Time taken: 2.201 seconds, Fetched: 1 row(s)

显示 default 数据中没有任何表。

(3)创建表 stu,表的 id 为整数型,name 为字符型

hive> create table stu(id int,name string);
OK
Time taken: 0.432 seconds

(4)为表 stu 插入一条信息,id 号为 001,name 为liuyaling

hive> insert into stu values (001,"liuyaling");
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20230423222915_a95e9891-fdf5-4739-a63e-fcadecc85e28
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1682258121749_0001, Tracking URL = http://master:8088/proxy/application_1682258121749_0001/
Kill Command = /usr/local/src/hadoop/bin/hadoop job -kill job_1682258121749_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2023-04-23 22:31:36,420 Stage-1 map = 0%, reduce = 0%
2023-04-23 22:31:43,892 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.72 sec
MapReduce Total cumulative CPU time: 2 seconds 720 msec
Ended Job = job_1682258121749_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://master:9000/user/hive/warehouse/stu/.hive-staging_hive_2023-04-23_22-30-32_985_529079703757687911-1/-ext-10000
Loading data to table default.stu
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 2.72 sec HDFS Read: 4135 HDFS Write: 79 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 720 msec
OK
Time taken: 72.401 seconds

按照以上操作,继续插入两条信息:id 和 name 分别为 1002、1003 和 yanhaoxiang、tnt。

(5)插入数据后查看表的信息

hive> show tables;
OK
stu
test
values__tmp__table__1
values__tmp__table__2
Time taken: 0.026 seconds, Fetched: 4 row(s)

(6)查看表 stu 结构

hive> desc stu;
OK
id int
name string
Time taken: 0.041 seconds, Fetched: 2 row(s)

(7)查看表 stu 的内容

hive> select * from stu;
OK
1 liuyaling
1002 yanhaoxiang
1002 yanhaoxiang
1003 tnt
Time taken: 0.118 seconds, Fetched: 4 row(s)

步骤三:通过 Hive 命令行界面查看文件系统和历史命令

(1)查看本地文件系统,执行命令 ! ls /usr/local/src;

hive> ! ls /usr/local/src;
flume
hadoop
hbase
hive
jdk
sqoop
zookeeper

(2)查看 HDFS 文件系统,执行命令 dfs -ls /;

hive> dfs -ls /;
Found 5 items
drwxr-xr-x - hadoop supergroup 0 2023-04-23 21:58 /hbase
drwxr-xr-x - hadoop supergroup 0 2023-03-17 17:33 /input
drwxr-xr-x - hadoop supergroup 0 2023-03-17 18:45 /output
drwx------ - hadoop supergroup 0 2023-04-21 16:02 /tmp
drwxr-xr-x - hadoop supergroup 0 2023-04-14 20:48 /user

hive> exit;

(3)查看在 Hive 中输入的所有历史命令

进入到当前用户 Hadoop 的目录/home/hadoop,查看.hivehistory 文件。

[hadoop@master hive]$ cd /home/hadoop/
[hadoop@master ~]$ cat .hivehistory
show databases;
create database sample;
show databases;
use sample;
create table student(number string,name string);
exit
show databases;
exit;
use sample;
select * from student;
sqoop export --connect "jdbc:mysql://master:3306/sample?useUnicode=true&characterEncoding=utf-8" --username root --password Password@123! --table student --input-fields-terminated-by '|' --export-dir /user/hive/warehouse/sample.db/student/*
sqoop export --connect "jdbc:mysql://master:3306/sample?useUnicode=true&characterEncoding=utf-8" --username root --password Password@123! --table student --input-fields-terminated-by '|' --export-dir /user/hive/warehouse/sample.db/student/*;
sqoop export --connect "jdbc:mysql://master3306/sample?useUnicode=true&characterEncoding=utf-8" --username root --password Password@123! --table student --input-fields-terminated-by '|' --export-dir /user/hive/warehouse/sample.db/student/*
sqoop export --connect "jdbc:mysql://master3306/sample?useUnicode=true&characterEncoding=utf-8" --username root --password Password@123! --table student --input-fields-terminated-by '|' --export-dir /user/hive/warehouse/sample.db/student/*;
sqoop export --connect "jdbc:mysql://master3306/sample?useUnicode=true&characterEncoding=utf-8" --username root --password Password@123! --table student --input-fields-terminated-by '|' --export-dir /user/hive/warehouse/sample.db/student/*
exit
show databases;
use sample;
show tables;
use default;
show tables;
quit
'
select*from default.test;
quit;
show databases;
use default;
show tables;
insert into stu values (001,"liuyaling");
create table stu(id int,name string);
insert into stu values (001,"liuyaling");
insert into stu values (1002,"yanhaoxiang");
hive
insert into stu values (1003,"tnt");
quit
exit
exit;
quit;
show tables;
insert into stu values (1002,"yanhaoxiang");
insert into stu values (1003,"tnt");
show tables;
desc stu;
select * from stu;
! ls /usr/local/src;
dfs -ls /;
exit;

结果显示,之前在 Hive 命令行界面下运行的所有命令(含错误命令)都显示了出 来,有助于维护、故障排查等工作。

实验四、通过命令监控大数据平台服务状态

1、通过命令查看 ZooKeeper 状态

步骤一: 查看 ZooKeeper 状态,执行命令 zkServer.sh status,结果显示如下

[hadoop@master ~]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: follower

以上结果中,Mode:follower 表示为 ZooKeeper 的跟随者。

步骤二: 查看运行进程

QuorumPeerMain:QuorumPeerMain 是 ZooKeeper 集群的启动入口类,是用来加载配 置启动 QuorumPeer 线程的。

执行命令 jps 以查看进程情况。

[hadoop@master ~]$ jps
2642 QuorumPeerMain
2994 SecondaryNameNode
3154 ResourceManager
5400 Jps
2795 NameNode

此时 QuorumPeerMain 进程已启动。

步骤四: 在成功启动 ZooKeeper 服务后,输入命令 zkCli.sh,连接到 ZooKeeper 服务。

[hadoop@master ~]$ zkCli.sh 
Connecting to localhost:2181
2023-04-23 22:39:17,093 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
2023-04-23 22:39:17,096 [myid:] - INFO [main:Environment@100] - Client environment:host.name=master
2023-04-23 22:39:17,096 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_152
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/local/src/jdk/jre
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/usr/local/src/zookeeper/bin/../build/classes:/usr/local/src/zookeeper/bin/../build/lib/*.jar:/usr/local/src/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/src/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/src/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/usr/local/src/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/local/src/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/src/zookeeper/bin/../zookeeper-3.4.8.jar:/usr/local/src/zookeeper/bin/../src/java/lib/*.jar:/usr/local/src/zookeeper/bin/../conf:/usr/local/src/sqoop/lib:
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.el7.x86_64
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:user.name=hadoop
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/hadoop
2023-04-23 22:39:17,097 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/home/hadoop
2023-04-23 22:39:17,099 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@69d0a921
Welcome to ZooKeeper!
2023-04-23 22:39:17,122 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2023-04-23 22:39:17,208 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
2023-04-23 22:39:17,223 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x187ae88b8390000, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]

步骤五: 使用 Watch 监听/hbase 目录,一旦/hbase 内容有变化,将会有提 示。打开监视,执行命令 get /hbase 1。

[zk: localhost:2181(CONNECTED) 0] get /hbase 1

cZxid = 0x500000002
ctime = Sun Apr 23 21:58:42 CST 2023
mZxid = 0x500000002
mtime = Sun Apr 23 21:58:42 CST 2023
pZxid = 0x500000062
cversion = 18
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 14

[zk: localhost:2181(CONNECTED) 1] quit
Quitting...
2023-04-23 22:40:16,816 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x187ae88b8390001 closed
2023-04-23 22:40:16,817 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for session: 0x187ae88b8390001
[hadoop@master ~]$

结果显示,当执行命令 set /hbase value-update 后,数据版本由 0 变成 1,说明 /hbase 处于监控中。

2、通过命令查看 Sqoop 状态

步骤一: 查询 Sqoop 版本号,验证 Sqoop 是否启动成功。

首先切换到/usr/local/src/sqoop 目录,执行命令:./bin/sqoop-version

[hadoop@master ~]$ cd /usr/local/src/sqoop/
[hadoop@master sqoop]$ ./bin/sqoop-version
Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
23/04/23 22:40:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017

结果显示:Sqoop 1.4.7,说明 Sqoop 版本号为 1.4.7,并启动成功。

步骤二: 测试 Sqoop 是否能够成功连接数据库

切换到 Sqoop 的 目 录 , 执 行 命 令 bin/sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password123$,命令中 “master:3306”为数据库主机名和端口。

[hadoop@master sqoop]$  bin/sqoop list-databases --connect jdbc:mysql://master:3306/ --username root --password Password@123!
Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
23/04/23 22:42:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
23/04/23 22:42:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
23/04/23 22:42:16 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Sun Apr 23 22:42:16 CST 2023 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
information_schema
hive
mysql
performance_schema
sample
sys

结果显示,可以连接到 MySQL,并查看到 Master 主机中 MySQL 的所有库实例,如 information_schema、hive、mysql、performance_schema 和 sys 等数据库。

步骤三: 执行命令 sqoop help,可以看到如下内容,代表 Sqoop 启动成功。

[hadoop@master sqoop]$ sqoop help
Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
23/04/23 22:42:37 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]

Available commands:
codegen Generate code to interact with database records
create-hive-table Import a table definition into Hive
eval Evaluate a SQL statement and display the results
export Export an HDFS directory to a database table
help List available commands
import Import a table from a database to HDFS
import-all-tables Import tables from a database to HDFS
import-mainframe Import datasets from a mainframe server to HDFS
job Work with saved jobs
list-databases List available databases on a server
list-tables List available tables in a database
merge Merge results of incremental imports
metastore Run a standalone Sqoop metastore
version Display version information

See 'sqoop help COMMAND' for information on a specific command.

结果显示了 Sqoop 的常用命令和功能,如下表所示。

序号 命令 功能
1 import 将数据导入到集群
2 export 将集群数据导出
3 codegen 生成与数据库记录交互的代码
4 create-hivetable 创建 Hive 表
5 eval 查看 SQL 执行结果
6 import-all-tables 导入某个数据库下所有表到 HDFS 中
7 job 生成一个 job
8 list-databases 列出所有数据库名
9 list-tables 列出某个数据库下所有的表
10 merge 将 HDFS 中不同目录下数据合在一起,并存放在指定的目录 中
11 metastore 记录 Sqoop job 的元数据信息,如果不启动 metastore 实 例,则默认的元数据存储目录为:~/.sqoop
12 help 打印 Sqoop 帮助信息
13 version 打印 Sqoop 版本信息

3、通过命令查看 Flume 状态

步骤一:检查 Flume 安装是否成功,执行 flume-ng version 命令,查看 Flume 的版本。

[hadoop@master sqoop]$ cd /usr/local/src/flume/
[hadoop@master flume]$ flume-ng version
Flume 1.6.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
From source with checksum b29e416802ce9ece3269d34233baf43f
[hadoop@master flume]$

步骤二:添加 example.conf 到/usr/local/src/flume

[hadoop@master flume]$ vim /usr/local/src/flume/example.conf
a1.sources=r1
a1.sinks=k1
a1.channels=c1
a1.sources.r1.type=spooldir
a1.sources.r1.spoolDir=/usr/local/src/flume/
a1.sources.r1.fileHeader=true
a1.sinks.k1.type=hdfs
a1.sinks.k1.hdfs.path=hdfs://master:9000/flume
a1.sinks.k1.hdfs.rollsize=1048760
a1.sinks.k1.hdfs.rollCount=0
a1.sinks.k1.hdfs.rollInterval=900
a1.sinks.k1.hdfs.useLocalTimeStamp=true
a1.channels.c1.type=file
a1.channels.c1.capacity=1000
a1.channels.c1.transactionCapacity=100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

步骤三:启动 Flume Agent a1 日志控制台

[hadoop@master flume]$ flume-ng agent --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hadoop libraries found via (/usr/local/src/hadoop/bin/hadoop) for HDFS access
Info: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
Info: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
Info: Including HBASE libraries found via (/usr/local/src/hbase/bin/hbase) for HBASE access
Info: Excluding /usr/local/src/hbase/lib/slf4j-api-1.7.7.jar from classpath
Info: Excluding /usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar from classpathxxxxxxxxxx [hadoop@master flume]$ flume-ng agent --conf-file example.conf --name a1 -Dflume.root.logger=INFO,consoleWarning: No configuration directory set! Use --conf <dir> to override.Info: Including Hadoop libraries found via (/usr/local/src/hadoop/bin/hadoop) for HDFS accessInfo: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpathInfo: Excluding /usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpathInfo: Including HBASE libraries found via (/usr/local/src/hbase/bin/hbase) for HBASE accessInfo: Excluding /usr/local/src/hbase/lib/slf4j-api-1.7.7.jar from classpathInfo: Excluding /usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar from classpath[hadoop@master flume]$ /usr/local/src/flume/bin/flume-ng agent --conf ./conf --conf-file ./example.conf  --name a1 -Dflume.root.logger=INFO,consoleInfo: Sourcing environment configuration script /usr/local/src/flume/conf/flume-env.shInfo: Including Hadoop libraries found via (/usr/local/src/hadoop/bin/hadoop) for HDFS access.../lib/native:/usr/local/src/hadoop/lib/native org.apache.flume.node.Application --conf-file ./example.conf --name a1/usr/local/src/flume/bin/flume-ng: line 241: /usr/loocal/src/jdk/bin/java: No such file or directory
...
23/04/23 22:52:56 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: k1. sink.connection.failed.count == 0
23/04/23 22:52:56 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: k1. sink.event.drain.attempt == 1918
23/04/23 22:52:56 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: k1. sink.event.drain.sucess == 1918
[hadoop@master flume]$ ^C

步骤四: 查看结果文章来源地址https://www.toymoban.com/news/detail-423686.html

[hadoop@master flume]$ hdfs dfs -lsr /flume
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r-- 2 hadoop supergroup 1499 2023-04-23 22:52 /flume/FlumeData.1682261559722
-rw-r--r-- 2 hadoop supergroup 1419 2023-04-23 22:52 /flume/FlumeData.1682261559723
-rw-r--r-- 2 hadoop supergroup 1468 2023-04-23 22:52 /flume/FlumeData.1682261559724
...
-rw-r--r-- 2 hadoop supergroup 1795 2023-04-23 22:52 /flume/FlumeData.1682261559817
-rw-r--r-- 2 hadoop supergroup 1841 2023-04-23 22:52 /flume/FlumeData.1682261559818
-rw-r--r-- 2 hadoop supergroup 1665 2023-04-23 22:52 /flume/FlumeData.1682261559819
-rw-r--r-- 2 hadoop supergroup 1439 2023-04-23 22:52 /flume/FlumeData.1682261559820
 

到了这里,关于flume组件以及通过命令监控大数据平台转态的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处: 如若内容造成侵权/违法违规/事实不符,请点击违法举报进行投诉反馈,一经查实,立即删除!

领支付宝红包 赞助服务器费用

相关文章

  • (二十四)大数据实战——Flume数据流监控之Ganglia的安装与部署

    本节内容我们主要介绍一下Flume数据流的监控工具Ganglia。Ganglia是一个开源的分布式系统性能监控工具。它被设计用于监视大规模的计算机群集(包括集群、网格和云环境),以便收集和展示系统和应用程序的性能数据。Ganglia 可以轻松地扩展到数千台计算机节点,并支持跨多

    2024年02月08日
    浏览(47)
  • 大数据组件-Flume集群环境的启动与验证

    🥇🥇【大数据学习记录篇】-持续更新中~🥇🥇 个人主页:beixi@ 本文章收录于专栏(点击传送):【大数据学习】 💓💓持续更新中,感谢各位前辈朋友们支持学习~💓💓 上一篇文章写到了Flume集群环境的安装,这篇文章接着上篇文章延伸Flume集群环境的启动与验证,如果

    2024年02月10日
    浏览(47)
  • 【数仓】通过Flume+kafka采集日志数据存储到Hadoop

    【数仓】基本概念、知识普及、核心技术 【数仓】数据分层概念以及相关逻辑 【数仓】Hadoop软件安装及使用(集群配置) 【数仓】Hadoop集群配置常用参数说明 【数仓】zookeeper软件安装及集群配置 【数仓】kafka软件安装及集群配置 【数仓】flume软件安装及配置 【数仓】flum

    2024年03月17日
    浏览(59)
  • 【云原生】Docker容器命令监控+Prometheus监控平台

    目录 1.常用命令监控 docker ps docker top docker stats 2.weave scope 1.下载 2.安装 3.访问查询即可 3.Prometheus监控平台 1.部署数据收集器cadvisor 2.部署Prometheus 3.部署可视化平台Gragana 4.进入后台控制台 1.常用命令监控 docker ps 字段含义 docker top 查看指定容器内的进程 选项 查看详细docker容器

    2024年02月15日
    浏览(44)
  • 教程:通过navicat修改视频监控EasyCVR云平台的登录密码

    TSINGSEE青犀视频监控管理平台EasyCVR可以根据不同的应用场景需求,让平台在内网、专网、VPN、广域网、互联网等各种环境下进行音视频的采集、接入与多端分发。在视频能力上,平台可实现视频实时直播、云端录像、云存储、回放与检索、告警上报、视频快照、视频转码与分

    2024年02月13日
    浏览(31)
  • CDH大数据平台 14Cloudera Manager Console之flume安装和配置(markdown新版)

    💖个人主页:@与自己作战 💯作者简介: CSDN@博客专家 、 CSDN@大数据领域优质创作者 、 CSDN@内容合伙人 、 阿里云@专家博主 🆘希望大佬们多多支持,携手共进 📝 如果文章对你有帮助的话,欢迎评论💬点赞👍收藏📂加关注 ⛔ 如需要支持请私信我 , 💯 必支持

    2024年02月02日
    浏览(47)
  • 1、电商数仓(用户行为采集平台)数据仓库概念、用户行为日志、业务数据、模拟数据、用户行为数据采集模块、日志采集Flume

    数据仓库( Data Warehouse ),是为企业制定决策,提供数据支持的。可以帮助企业,改进业务流程、提高产品质量等。 数据仓库的输入数据通常包括:业务数据、用户行为数据和爬虫数据等。 业务数据:就是各行业在处理事务过程中产生的数据。比如用户在电商网站中登录、

    2024年02月12日
    浏览(45)
  • 视频云存储/安防监控EasyCVR视频汇聚平台如何通过角色权限自行分配功能模块?

    视频云存储/安防监控EasyCVR视频汇聚平台基于云边端智能协同,支持海量视频的轻量化接入与汇聚、转码与处理、全网智能分发、视频集中存储等。音视频流媒体视频平台EasyCVR拓展性强,视频能力丰富,具体可实现视频监控直播、视频轮播、视频录像、云存储、回放与检索、

    2024年02月12日
    浏览(59)
  • 关于视频监控平台EasyCVR视频汇聚平台建设“明厨亮灶”具体实施方案以及应用

    一、方案背景 近几年来,餐饮行业的食品安全、食品卫生等新闻频频发生,比如某火锅店、某网红奶茶,食材以次充好、后厨卫生被爆堪忧,种种问题引起大众关注和热议。这些负面新闻不仅让餐饮门店的品牌口碑暴跌,附带的连锁效应导致门店收益直接巨额亏损。与此同时

    2024年02月11日
    浏览(43)
  • Proteus平台下基于Arduino的通过UART串口可靠通信系统仿真、传感器数据采集、以及LCD屏幕二级菜单功能实现(附工程源码、设计报告)

    三个按键控制菜单,功能分别为:选择功能1,选择功能2,以及返回上一级; 通过三个外部中断对页面状态参量进行控制: 停止等待 当U1收到U2的ACK后才会发送下一次传感器采集到的数据 超时重传 当关闭U2后,U1到达设定的超时时间后,进行重传操作,直到收到U2的确认收到

    2024年02月16日
    浏览(46)

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

博客赞助

微信扫一扫打赏

请作者喝杯咖啡吧~博客赞助

支付宝扫一扫领取红包,优惠每天领

二维码1

领取红包

二维码2

领红包