【BST811】DATA ANALYTICS


For part one you are required to submit two files: a PDF/Word document and an Excel
working file.
PDF/Word document: 1) Introduction (150 words): explain, from your perspective, the
objective of this coursework. 2) Stage one (200 words): summarise the tasks completed in this
part and reflect on how the sample size changes from one task to another. 3) Stage two (500
words): reflect on your analysis and on the design of your display panel. 4) Stage three (500 words):
provide short essay-type answers to the questions. 5) Conclusion (150 words): a self-reflection on
your learning after completing this coursework (e.g. a pros vs cons analysis).
Excel spreadsheet: you need to structure a spreadsheet similar to the one explained in the
“Case Study: US Crude Oil Trade Flows”, which is part of the week 5 learning material.
In doing so you are required to complete the following tasks:
Stage One: cleaning, manipulating and structuring the dataset.
You are required to clean and manipulate the dataset “AG-Tradeflows-2020” before running
the analysis and structuring your final display panel.
1) Before cleaning the data (by deleting unneeded data columns) you need to filter the
dataset to include only shipments from “Saudi Arabia” and only for the year 2020 by
filtering the data columns “Load Country” and “Departure Date”, respectively.
(2%)
2) Create the following new variable “Cargo” by multiplying the column “Volume” by
1000 (Volume × 1000).
(2%)
3) Filter the dataset to include only the following indicators (columns): Vessel Name,
Vessel IMO, Load Port, Departure Date, Discharge Country, Discharge Port, Product,
Grade, Cargo, Discharge Country/Sub-Country, Discharge Region, Discharge Zone.
(2%)
4) Filter the dataset to exclude observations with missing values by deleting observations
that include blank or error data for the following data columns: Vessel Name, Load
Port, Departure Date and Discharge Port. (2%)
5) Create the following new variables: Vessel Type and DWT, by merging information
from the second table (the LOOKUP sheet) into the main dataset. (6%)
6) After merging both datasets, use the new column “Vessel
Type” to filter the dataset to include only four vessel types, namely 1) Crude Oil Tanker,
2) Products Tanker, 3) Chemical/ Products Tanker, 4) Crude/Oil Products Tanker.
(2%)
7) The final sample should only include the following variables “Vessel”, “Vessel Type”,
“DWT”, “Load Port”, “Departure Date”, “Discharge Country”, “Discharge Port”,
“Product”, “Grade”, “Cargo”, “Discharge Country/Sub-Country”, “Discharge
Region” and “Discharge Zone”. (2%)
i. Check the data type (format) of these variables and, if necessary, modify the data
type.
ii. Check whether these variables contain missing values. Exclude all observations where at
least one of these variables contains a missing value. (2%)
Note: each step should be clearly shown in a separate sheet of the spreadsheet
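The Stage One steps can also be prototyped outside Excel. The following is a minimal sketch in plain Python, not the required Excel workflow. It assumes that the LOOKUP sheet is keyed by “Vessel IMO” (the brief does not state the key), that dates are stored as ISO strings (YYYY-MM-DD), and that all sample rows, port names and the helper names make_row and clean_shipments are hypothetical. It carries only a subset of the final columns, and it checks for missing values before parsing the date, which slightly reorders tasks 1 and 4.

```python
from datetime import date

# Columns that must not be blank (task 4 of the brief).
REQUIRED = ("Vessel Name", "Load Port", "Departure Date", "Discharge Port")

# The four vessel types kept in task 6, spelled as in the brief.
TANKER_TYPES = {
    "Crude Oil Tanker",
    "Products Tanker",
    "Chemical/ Products Tanker",
    "Crude/Oil Products Tanker",
}


def make_row(name, imo, country, departure, volume):
    """Build one hypothetical shipment record (illustration only)."""
    return {
        "Vessel Name": name, "Vessel IMO": imo, "Load Country": country,
        "Load Port": "Port A", "Departure Date": departure,
        "Discharge Port": "Port B", "Volume": volume,
    }


def clean_shipments(rows, lookup):
    """Apply the Stage One filters and the lookup merge to a list of dict rows."""
    cleaned = []
    for row in rows:
        # Task 1 (country part): keep only shipments loaded in Saudi Arabia.
        if row.get("Load Country") != "Saudi Arabia":
            continue
        # Task 4: drop rows with a blank value in any required column.
        if any(not row.get(col) for col in REQUIRED):
            continue
        # Task 1 (period part): keep only departures in 2020.
        if date.fromisoformat(row["Departure Date"]).year != 2020:
            continue
        # Task 5: merge Vessel Type and DWT from the lookup table.
        # Assumption: the lookup is keyed by "Vessel IMO".
        vessel = lookup.get(row["Vessel IMO"])
        # Task 6 (and 7 ii): drop unmatched vessels and other vessel types.
        if vessel is None or vessel["Vessel Type"] not in TANKER_TYPES:
            continue
        cleaned.append({
            "Vessel Name": row["Vessel Name"],
            "Vessel Type": vessel["Vessel Type"],
            "DWT": vessel["DWT"],
            "Load Port": row["Load Port"],
            "Departure Date": row["Departure Date"],
            "Discharge Port": row["Discharge Port"],
            # Task 2: Cargo = Volume x 1000.
            "Cargo": row["Volume"] * 1000,
        })
    return cleaned


# Made-up data: only the first row should survive every filter.
rows = [
    make_row("Alpha", "9000001", "Saudi Arabia", "2020-01-15", 270.0),
    make_row("Bravo", "9000002", "Kuwait", "2020-02-01", 100.0),        # wrong country
    make_row("", "9000003", "Saudi Arabia", "2020-03-10", 80.0),        # missing name
    make_row("Delta", "9000001", "Saudi Arabia", "2019-12-30", 260.0),  # wrong year
    make_row("Echo", "9000005", "Saudi Arabia", "2020-04-02", 60.0),    # wrong type
]
lookup = {
    "9000001": {"Vessel Type": "Crude Oil Tanker", "DWT": 300000},
    "9000005": {"Vessel Type": "Bulk Carrier", "DWT": 80000},
}

cleaned = clean_shipments(rows, lookup)
print(len(cleaned), cleaned[0]["Cargo"])
```

Printing the number of rows that remain after each filter is one way to gather material for the Stage one reflection on how the sample size changes.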
Stage Two: data analysis and designing the display panel.
8) Create a monthly time series of vessel shipments (a count of the number of ships fixed
each month), total cargo shipped and cargo capacity utilization.
(3%)
9) Plot a monthly time series showing the total number of vessel shipments and the total cargo
capacity loaded onboard ships. You need to provide a table with the data used to plot
the time series.
(3%)
10) Identify the month that had the highest number of vessel shipments, the most cargo
loaded in tonnes and the highest percentage of cargo capacity utilization.
(3%)
11) Structure tables and provide suitable illustrations that categorise total shipments and
cargo capacity by vessel type, load port, type of product and discharge zone. (3%)
12) Similar to the Case Study: US Crude Oil Trade Flows, which is part of week 5 learning
material, you need to structure your spreadsheet providing tables and illustrations and
design a display panel.
(3%)
Note: each step should be clearly shown in a separate sheet of the spreadsheet
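The monthly aggregation in tasks 8 and 10 can be sketched as follows, in plain Python. The brief does not define cargo capacity utilization, so this sketch assumes it is total cargo divided by total DWT for the month, expressed as a percentage. The sample records and the function name monthly_summary are hypothetical.

```python
from collections import defaultdict

# Made-up cleaned records; real values would come from the Stage One output.
shipments = [
    {"Departure Date": "2020-01-05", "Cargo": 270000.0, "DWT": 300000},
    {"Departure Date": "2020-01-20", "Cargo": 90000.0, "DWT": 100000},
    {"Departure Date": "2020-02-11", "Cargo": 150000.0, "DWT": 300000},
]


def monthly_summary(shipments):
    """Aggregate shipment count, total cargo and utilization per month."""
    totals = defaultdict(lambda: {"shipments": 0, "cargo": 0.0, "dwt": 0.0})
    for record in shipments:
        month = record["Departure Date"][:7]  # "YYYY-MM" from an ISO date string
        totals[month]["shipments"] += 1
        totals[month]["cargo"] += record["Cargo"]
        totals[month]["dwt"] += record["DWT"]

    summary = {}
    for month in sorted(totals):
        t = totals[month]
        summary[month] = {
            "shipments": t["shipments"],
            "cargo": t["cargo"],
            # Assumed definition: cargo carried as a share of total deadweight.
            "utilization_pct": 100 * t["cargo"] / t["dwt"],
        }
    return summary


summary = monthly_summary(shipments)

# Task 10: the month with the most shipments, cargo and utilization.
busiest_month = max(summary, key=lambda m: summary[m]["shipments"])
most_cargo_month = max(summary, key=lambda m: summary[m]["cargo"])
best_utilized_month = max(summary, key=lambda m: summary[m]["utilization_pct"])

for month, values in summary.items():
    print(month, values)
```

The same table is what a pivot table grouped by month would produce in Excel. Grouping by vessel type, load port, product or discharge zone (task 11) only changes the key used in place of the month.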
Stage Three: answer the following questions.
13) Historical time series may contain information that is useful for decision
makers. Do you see any pattern in the monthly time series of vessel shipments and
total cargo capacity?
(5%)
14) Forecasts are required to support decisions in the future. We need to provide a forecast
that supports operational planning one month in advance. Use the naïve and simple moving
average methods to provide a one-month-ahead forecast. Reflect on which approach you would
recommend for this forecasting task. Explain your answer and plot your
forecasts.
(5%)
15) Reflect on how useful a Linear Programming method is for this type of data (e.g. cargo
capacity, amount of cargo shipped, different sizes of vessels, etc.).
(5%)
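For question 14, the two forecasting methods can be illustrated with a short sketch in plain Python. The naïve forecast repeats the last observed value, and the simple moving average forecast is the mean of the most recent observations. The monthly counts, the window of 3 months and the use of mean absolute error to compare the methods are assumptions made for illustration; the brief does not prescribe a window length or an accuracy measure.

```python
def naive_forecast(series):
    """One-step-ahead naive forecast: the last observed value."""
    return series[-1]


def sma_forecast(series, window=3):
    """One-step-ahead simple moving average of the last `window` values."""
    recent = series[-window:]
    return sum(recent) / len(recent)


def mean_absolute_error(series, forecaster, start):
    """Average absolute error of rolling one-step-ahead forecasts.

    For each t from `start` onward, forecast series[t] using only series[:t].
    """
    errors = [abs(series[t] - forecaster(series[:t]))
              for t in range(start, len(series))]
    return sum(errors) / len(errors)


# Made-up monthly shipment counts for January to December 2020.
monthly_shipments = [42, 45, 40, 38, 30, 33, 36, 39, 41, 44, 43, 46]

# Forecasts for the month after the last observation.
naive_next = naive_forecast(monthly_shipments)
sma_next = sma_forecast(monthly_shipments, window=3)

# Compare both methods over the same months (from the 4th month onward,
# so that the moving average always has a full window available).
naive_mae = mean_absolute_error(monthly_shipments, naive_forecast, start=3)
sma_mae = mean_absolute_error(monthly_shipments, sma_forecast, start=3)

print("naive forecast:", naive_next, "MAE:", round(naive_mae, 2))
print("SMA(3) forecast:", round(sma_next, 2), "MAE:", round(sma_mae, 2))
```

A lower error on past months is one argument for recommending a method, though the choice also depends on whether the series shows a trend or a seasonal pattern: the moving average smooths out noise but reacts more slowly to recent changes than the naïve method.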
Data analytics is generally used to provide evidence and inform decisions. In a typical business
data analytics task, you can use data to inform decisions, verify claims and assumptions, and answer
or refine questions. In this part of the coursework, you are first asked to choose a dataset and
discuss a problem relevant to the dataset that needs to be informed by data analysis; it could take
the form of questions, claims or assumptions.
For you to have the greatest chance of success with this coursework it is important that you
choose a manageable dataset. This means that the data should be readily accessible and large
enough that multiple relationships can be explored. As such, your dataset must have at least 50
observations (rows) and between 3 and 5 variables (columns). The variables in the data should
include categorical variables, numerical variables, or date/time variables. The dataset can be
in text (.txt) or Excel (.csv, .xls, .xlsx) format.
If you are using a dataset that comes in a format that we haven’t encountered in class, make
sure that you are able to load it into R as this can be tricky depending on the source. If you are
having trouble, ask for help before it is too late.
Note on reusing datasets from class: Do not reuse datasets used in examples, homework
assignments, or labs in the class.
