网约车大数据综合项目——数据分析Hive

这篇具有很好参考价值的文章主要介绍了网约车大数据综合项目——数据分析Hive。希望对大家有所帮助。如果存在错误或未考虑完全的地方,请大家不吝赐教,您也可以点击"举报违法"按钮提交疑问。

第1关:Hive 储存数据

开启Hadoop服务,并使得 Hive 连接 MySQL 初始化

start-all.sh
schematool -dbType mysql -initSchema

进入hive命令行

hive

创建数据库并连接数据库

create database trafficdata;
use trafficdata;

创建表并从本地导入数据

create table cancelorder(companyid string,address string,districtname string,orderid string,ordertime string,canceltime string,operator string,canceltypecode string,cancelreason string) row format delimited fields terminated by '|';
load data local inpath '/data/workspace/myshixun/data/canceldata.txt' into table cancelorder;
create table createorder(companyid string,address string,districtname string,orderid string,departtime string,ordertime string,departure string,deplongitude string,deplatitude string,destination string,destlongitude string,destlatitude string) row format delimited fields terminated by '\t';
load data local inpath '/data/workspace/myshixun/data/createdata.txt' into table createorder;

第2关:统计撤销订单中撤销理由最多的前 10 种理由

开启Hadoop服务,并使得 Hive 连接 MySQL 初始化(已开启无需重复启动)

start-all.sh
schematool -dbType mysql -initSchema

进入hive命令行

hive

连接数据库

use trafficdata;

创建表

create table cancelreason(cancelreason string,num int) row format delimited fields terminated by '\t';

查询数据并插入表

insert into cancelreason  select cancelreason,count(*) num from cancelorder where cancelreason != '未知' group by cancelreason order by num desc limit 10;

上传表

export table cancelreason to'/user/hadoop/cancelreason';

另开命令行进入MySQL

mysql -h127.0.0.1 -uroot -p123123

创建数据库并连接数据库

create database trafficdata;
use trafficdata;

创建表

create table cancelreason(
    cancelreason varchar(255),
    num int not null
);

退出MySQL

exit;

下载数据到MySQL的表

sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/cancelreason/data/000000_0' --table cancelreason --fields-terminated-by '\t';

第3关:查询出成功订单最多的 10 个行政区名

开启Hadoop服务,并使得 Hive 连接 MySQL 初始化(已开启无需重复启动)

start-all.sh
schematool -dbType mysql -initSchema

进入hive命令行

hive

连接数据库

use trafficdata;

创建表

create table order_district(district string,num int) row format delimited fields terminated by '\t';

查询数据并插入表

insert into order_district select districtname,count(*) num from createorder group by districtname order by num desc limit 10;

上传表

export table order_district to'/user/hadoop/order_district';

另开命令行进入MySQL

mysql -h127.0.0.1 -uroot -p123123

连接数据库(无数据库需要新建数据库)

use trafficdata;

创建表

create table order_district(
    cancelreason varchar(255),
    num int not null
);

退出MySQL

exit;

下载数据到MySQL的表

sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/order_district/data/000000_0' --table order_district --fields-terminated-by '\t';

第4关:查询湖南省各个市的所有订单总量

开启Hadoop服务,并使得 Hive 连接 MySQL 初始化(已开启无需重复启动)

start-all.sh
schematool -dbType mysql -initSchema

进入hive命令行

hive

连接数据库

use trafficdata;

创建表

create table orderbycity(city varchar(255),num int) row format delimited fields terminated by '\t';

查询数据并插入表

INSERT INTO orderbycity (city, num) SELECT '湖南省长沙市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省长沙市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省长沙市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省株洲市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省株洲市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省株洲市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省湘潭市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省湘潭市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省湘潭市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省衡阳市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省衡阳市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省衡阳市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省邵阳市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省邵阳市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省邵阳市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省岳阳市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省岳阳市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省岳阳市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省常德市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省常德市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省常德市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省张家界市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省张家界市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省张家界市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省益阳市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省益阳市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省益阳市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省娄底市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省娄底市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省娄底市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省郴州市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省郴州市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省郴州市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省永州市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省永州市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省永州市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省怀化市' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省怀化市%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省怀化市%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;
INSERT INTO orderbycity (city, num) SELECT '湖南省湘西土家族苗族自治州' AS districtname, SUM(total_num) AS total_num FROM (SELECT districtname, SUM(num) AS total_num FROM (SELECT districtname, COUNT(*) AS num FROM cancelorder WHERE districtname LIKE '湖南省湘西土家族苗族自治州%' GROUP BY districtname UNION ALL SELECT districtname, COUNT(*) AS num FROM createorder WHERE districtname LIKE '湖南省湘西土家族苗族自治州%' GROUP BY districtname) AS combined GROUP BY districtname) AS final_result;

上传表

export table orderbycity to'/user/hadoop/orderbycity';

另开命令行进入MySQL

mysql -h127.0.0.1 -uroot -p123123

连接数据库(无数据库需要新建数据库)

use trafficdata;

创建表

create table orderbycity(
    city varchar(255),
    num int not null
);

退出MySQL

exit;

下载数据到MySQL的表

sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_1' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_2' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_3' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_4' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_5' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_6' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_7' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_8' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_9' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_10' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_11' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_12' --table orderbycity --fields-terminated-by '\t';
sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/orderbycity/data/000000_0_copy_13' --table orderbycity --fields-terminated-by '\t';

第5关:统计湖南省当天的每分钟订单总数量

开启Hadoop服务,并使得 Hive 连接 MySQL 初始化(已开启无需重复启动)

start-all.sh
schematool -dbType mysql -initSchema

进入hive命令行

hive

连接数据库

use trafficdata;

创建表

create table order_province_time(`time` string,num int) row format delimited fields terminated by '\t';

查询数据并插入表

insert into order_province_time select datetime, sum(num) as total_num from (select date_format(ordertime, 'yyyy-MM-dd HH:mm') as datetime, count(ordertime) as num from cancelorder where districtname like '湖南省%' group by date_format(ordertime, 'yyyy-MM-dd HH:mm') union all select date_format(departtime, 'yyyy-MM-dd HH:mm') as datetime, count(ordertime) as num from createorder where districtname like '湖南省%' group by date_format(departtime, 'yyyy-MM-dd HH:mm')) as combined group by datetime order by datetime;

上传表

export table order_province_time to'/user/hadoop/order_province_time';

另开命令行进入MySQL

mysql -h127.0.0.1 -uroot -p123123

连接数据库(无数据库需要新建数据库)

use trafficdata;

创建表

create table order_province_time(
    times varchar(255),
    num int not null
);

退出MySQL

exit;

下载数据到MySQL的表

sqoop export --connect jdbc:mysql://127.0.0.1:3306/trafficdata --username root --password 123123 --export-dir '/user/hadoop/order_province_time/data/000000_0' --table order_province_time --fields-terminated-by '\t';

文章来源地址https://www.toymoban.com/news/detail-788266.html

到了这里,关于网约车大数据综合项目——数据分析Hive的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处: 如若内容造成侵权/违法违规/事实不符,请点击违法举报进行投诉反馈,一经查实,立即删除!

领支付宝红包 赞助服务器费用

相关文章

  • 【纯开源uniapp】小程序APP网约车代驾顺风车货运货拉拉网约车司机入驻双端小程序包搭建

    系统是基于Thinkphp+uniapp开发的,开源源码买这套源码可以拿回去自己做二开 特此声明:本店所售程序只供娱乐模拟测试研究,不得使用于非法用途,不得违反国家法律,否则后果自负!购买以后使用一切法律责任后果都由购买者承担于本店无任何关系! 一、乘客端 1. 多版本

    2024年02月13日
    浏览(40)
  • 2023网约车考试报名全流程&考试内容

    一、网约车申请条件 1.取得相应准驾车型机动车驾驶证并具有3年以上驾驶经历; 2.无交通肇事犯罪、危险驾驶犯罪记录; 3.无吸毒记录,无饮酒后驾驶记录; 4.无暴力犯罪记录的人员; 5.地方政府规定的其他条件。 二、申请需要的材料 1.机动车驾驶证及复印件; 2.无交通肇

    2024年02月09日
    浏览(101)
  • 代币化对网约车区块链平台的影响

    再一次分析一下一篇关于区块链的文章,这篇文章比较新,2023年发表在POMS上。 由于这篇文章跟之前那几篇关注假货的文章的重点不一样,所以需要仔细读一下他的INTRODUCTION。 引言部分: 区块链近年来受到学术界和工业界的日益增长的关注。 在传统的对等市场(peer-to-peer

    2024年02月05日
    浏览(34)
  • 网约车围城:百万司机涌入,狼多肉少

    狼多肉少,网约车围城已经成为了一个不争的事实。虽然市场竞争越来越激烈,但机会还是有的,司机们也应该不断提高自身的服务质量和素质,以满足消费者的需求,获取更多的订单和收益。 网约车市场一直以来都是人们生活中不可或缺的一部分,它的便捷、高效和实惠受

    2024年02月09日
    浏览(41)
  • 第七期 | 网约车司机的“捞偏门”手段:作弊抢单、空跑刷单

    顶象防御云业务安全情报中心监测到,多个网约车出行平台存在作弊软件抢单、空跑刷单等欺诈行为,不仅损害乘客利益,更严重影响平台正常运营。 据顶象防御云业务安全情报BSL-2022-a3c7号显示,部分网约车平台上的司机与黑灰产勾结,通过作弊工具,截取长距离的订单,

    2024年02月02日
    浏览(37)
  • python数据分析案例——天猫订单综合分析

    前言 大家早好、午好、晚好吖 ❤ ~欢迎光临本文章 什么是数据分析 明确目的–获得数据(爬虫,现有,公开的数据)–数据预处理——数据可视化——结论 准备 环境使用: 在开始写我们的代码之前,我们要准备好运行代码的程序 Anaconda (python3.9) – 识别我们写的代码 开发工

    2024年02月03日
    浏览(46)
  • 电影数据可视化综合分析

    1.1 沈腾参演电影数据获取 1.2 电影数据可视化分析 大家好✨,这里是bio🦖。点赞+关注不迷路。数据可视化在数据科学和数据分析中非常重要,例如论文中配色精美的结果图、PPT汇报中突出数据差异数据分析图等。通过可视化,我们可以直观地观察和理解数据的分布、趋势、

    2024年02月12日
    浏览(42)
  • 第11章综合案例2影评大数据分析

    实验目的及要求 (1)现有电影、影评和用户信息3个数据文件,将对其进行大数据分析。 实验系统环境及版本 Linux Ubuntu 20.04 JDK1.8 Hadoop3.1.0 MySQL8.0.28 Hive3.1.2 实验任务 评分次数最多的10部电影; 性别当中评分最高的10部电影; 一部电影各年龄段的平均影评; 评分最高的10部电

    2024年02月07日
    浏览(42)
  • Springboot计算机毕业设计微信小程序的网约车代驾【附源码】开题+论文+mysql+程序+部署

    本系统 (程序+源码) 带文档lw万字以上   文末可获取一份本项目的java源码和数据库参考。 研究背景 随着移动互联网技术的飞速发展和智能手机普及率的提高,人们的出行方式正在发生深刻变革。网约车服务以其便捷、高效的特点,迅速成为城市出行的新宠。而代驾服务作

    2024年04月26日
    浏览(31)
  • 图解大数据 | 综合案例-使用Spark分析挖掘音乐专辑数据

    作者:韩信子@ShowMeAI 教程地址:http://www.showmeai.tech/tutorials/84 本文地址:http://www.showmeai.tech/article-detail/178 声明:版权所有,转载请联系平台与作者并注明出处 收藏ShowMeAI查看更多精彩内容 文娱影音是目前大数据与AI应用最广泛的场景之一,本案例以音乐专辑发行数据为背景

    2024年02月09日
    浏览(51)

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

博客赞助

微信扫一扫打赏

请作者喝杯咖啡吧~博客赞助

支付宝扫一扫领取红包,优惠每天领

二维码1

领取红包

二维码2

领红包