[R] How to communicate with your data? - ggplot2


So far we have covered the basics of cleaning and processing data before analysis.

How to communicate with your data?

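R offers many ways to produce graphics.

Not every graphics feature is necessary for beginners, and complex plots require long code.

We will start with simple graphical elements and then customize more complex plots step by step.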

Which package do we need? ggplot2

> library(ggplot2)

What can we do?

For continuous variables:

Creating, editing, and coloring histograms

For categorical variables:

Creating, editing, and coloring bar plots

# Load the ggplot2 package
library(ggplot2)

# Create a data frame
data <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 3, 4, 5, 6)
)

# Create a scatter plot with ggplot()
ggplot(data, aes(x = x, y = y)) +
  geom_point()

Separate parts or layers

In ggplot2, a plot can be subdivided into separate parts or layers, each of which contributes to the final appearance of the plot. This layering system allows you to add different elements to the plot, such as data points, lines, text, and annotations, in a flexible and customizable way.

Here's a brief explanation of the key components of a ggplot2 plot:

  1. Data: The data you want to visualize, typically in the form of a data frame.

  2. Aesthetic Mapping (aes): Aesthetic mappings define how variables in the data are mapped to visual properties of the plot, such as x and y positions, colors, shapes, and sizes.

  3. Geoms (Geometric Objects): Geoms are the visual elements that represent the data in the plot, such as points, lines, bars, and polygons. Each geom function adds a new layer to the plot.

  4. Facets: Facets allow you to create multiple plots, each showing a different subset of the data. You can facet by one or more variables to create small multiples.

  5. Stats (Statistical Transformations): Stats are used to calculate summary statistics or perform transformations on the data before plotting. Each stat function can be thought of as producing a new dataset that is then plotted using a geom.

  6. Scales: Scales control how the data values are mapped to the visual properties of the plot, such as axes, colors, and sizes. You can customize scales to change the appearance of the plot.

  7. Coordinate Systems: Coordinate systems determine how the plot is spatially arranged. The default is Cartesian coordinates, but ggplot2 also supports polar coordinates and other specialized coordinate systems.

By combining these components and adding them in layers, you can create complex and informative visualizations that effectively communicate insights from your data.
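The components listed above can be combined in a single plot. The following is a minimal sketch using the built-in mtcars dataset; the choice of variables, palette, and axis limits is only an illustration, not something prescribed by the article.

```r
library(ggplot2)

# Data + aesthetic mapping: weight on x, fuel efficiency on y, colour by cylinder count
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +                                               # geom: one point per car
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +    # stat: fitted line per group
  facet_wrap(~ am) +                                           # facets: one panel per value of am
  scale_colour_brewer(palette = "Set1", name = "Cylinders") +  # scale: colours and legend title
  coord_cartesian(ylim = c(10, 35)) +                          # coordinate system: Cartesian, zoomed
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

p  # printing the object draws the plot
```

Each `+` adds one layer or setting, so any line can be removed or swapped without touching the others.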

Using mtcars dataset to explore:

The mtcars dataset in R contains information about various features of 32 different automobiles from the early 1970s. Here are the meanings of the variables in the mtcars dataset:

  1. mpg: Miles per gallon (fuel efficiency).
  2. cyl: Number of cylinders.
  3. disp: Displacement (engine size) in cubic inches.
  4. hp: Gross horsepower.
  5. drat: Rear axle ratio.
  6. wt: Weight (in 1000 lbs).
  7. qsec: 1/4 mile time (in seconds).
  8. vs: Engine type, where 0 = V-shaped and 1 = straight.
  9. am: Transmission type, where 0 = automatic and 1 = manual.
  10. gear: Number of forward gears.
  11. carb: Number of carburetors.
# Load ggplot2 and mtcars, then inspect the structure of the data
library(ggplot2)
data("mtcars")
str(mtcars)

'data.frame':	32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

It describes the fuel consumption and performance of cars, taken from the 1974 US magazine Motor Trend.

ggplot(mtcars,aes(x=mpg))+geom_histogram()


ggplot(mtcars,aes(x=cyl))+geom_histogram()


It looks poor: cyl takes only three distinct values (4, 6, and 8), so a histogram leaves most bins empty.
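One possible remedy, sketched here as an assumption rather than the article's own method, is to treat cyl as a category and draw one bar per value with geom_bar(). The axis labels are illustrative.

```r
library(ggplot2)

# Count how many cars have each number of cylinders
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)  # 4 cylinders: 11 cars, 6 cylinders: 7 cars, 8 cylinders: 14 cars

# factor() turns the numeric column into a categorical one, so each value gets one bar
p_cyl <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  labs(x = "Number of cylinders", y = "Number of cars")

p_cyl
```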

ggplot(mtcars,aes(x=mpg))+geom_dotplot()


The resulting image is a dot plot where each dot represents a car from the mtcars dataset, and the position of the dot on the x-axis represents its miles per gallon value. The dot plot can give you an idea of the distribution of miles per gallon values in the dataset and can help identify any patterns or outliers.

ggplot(mtcars,aes(x=qsec))+geom_area(stat="bin")


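The code creates an area plot of the qsec variable from the mtcars dataset. Because no y variable is supplied, stat = "bin" counts the observations in each bin and uses those counts as the height of the area.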

ggplot(mtcars,aes(x=disp))+geom_density()

#or

ggplot(mtcars,aes(x=disp))+geom_density(kernel ="gaussian")


The code creates a density plot using the disp (displacement) variable from the mtcars dataset. Here's a breakdown of the code:

  • ggplot(mtcars, aes(x = disp)): This sets up the basic plot using the mtcars dataset and specifies that the x-axis of the plot should represent the disp variable.

  • geom_density(): This adds a layer to the plot, specifying that the data should be displayed as a density plot.

Density plots are useful for visualizing the distribution of a continuous variable and can help identify patterns such as peaks, valleys, and skewness in the data.

In a density plot created using geom_density(), the y-axis represents the density of the data at each point along the x-axis. Density is a way of representing the distribution of data values. It is calculated using kernel density estimation, which estimates the probability density function of the underlying variable.
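To make the meaning of the y-axis concrete, the sketch below computes a kernel density estimate with base R's density() function, which implements the same technique, and checks that the area under the curve is close to 1. The trapezoid-rule integration is just an illustration.

```r
# Kernel density estimate of displacement with a Gaussian kernel
d <- density(mtcars$disp, kernel = "gaussian")

# Approximate the area under the estimated curve with the trapezoid rule
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
print(area)  # close to 1, as expected for a probability density
```

This is why density values are usually small numbers: they are scaled so that the whole curve integrates to 1, not so that they count observations.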

Graphing


The default histogram is too poor for publication. Four things to improve:

1. Bin width

2. Color

3. Title and labels

4. Gaussian curve: does the variable follow a normal distribution or not?

To change the bar design, set four parameters on the geom:

binwidth = number: changes the bar width

fill = "name of the colour": changes the colour with which the bar is filled

colour = "name of the colour": changes the outline of the bar

alpha = number: changes the transparency of the colour

ggplot(mtcars,aes(x=mpg))+geom_histogram(binwidth = 5)
ggplot(mtcars,aes(x=mpg))+geom_histogram(fill="blue",binwidth=5)
ggplot(mtcars,aes(x=mpg))+geom_histogram(fill="skyblue",alpha=0.7,binwidth=5,colour="grey")
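The list of improvements also mentions a Gaussian curve. One common way to add it, shown here as a sketch under the assumption that the histogram is rescaled to densities, is to overlay dnorm() with the sample mean and standard deviation using stat_function(). The colours and labels are illustrative.

```r
library(ggplot2)

# Parameters of the normal curve, estimated from the data
mpg_mean <- mean(mtcars$mpg)
mpg_sd   <- sd(mtcars$mpg)

p_norm <- ggplot(mtcars, aes(x = mpg)) +
  # after_stat(density) rescales the bars so they are comparable to a density curve
  geom_histogram(aes(y = after_stat(density)), binwidth = 5,
                 fill = "skyblue", colour = "grey", alpha = 0.7) +
  # Overlay the normal density with the same mean and standard deviation
  stat_function(fun = dnorm, args = list(mean = mpg_mean, sd = mpg_sd),
                colour = "red") +
  labs(title = "Distribution of mpg with a normal curve",
       x = "Miles per gallon", y = "Density")

p_norm
```

If the bars follow the red curve closely, a normal distribution is a plausible description of the variable; large gaps suggest it is not.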


# Let's practice: histogram of BMI in purple
# after importing the Excel file with File -> Import Dataset -> From Excel
ggplot(SEE_students_data_2,aes(x=BMI))+geom_histogram(binwidth = 1, fill="purple",colour="black",alpha=0.5)


  • ggplot(SEE_students_data_2, aes(x = BMI)): This sets up the basic plot using the SEE_students_data_2 dataset and maps the BMI variable to the x-axis.

  • geom_histogram(binwidth = 1, fill = "purple", colour = "black", alpha = 0.5): This adds a histogram layer to the plot. binwidth = 1 specifies the width of each histogram bin as 1 (i.e., each unit). fill = "purple" sets the fill color of the histogram bars to purple, colour = "black" sets the border color to black, and alpha = 0.5 sets the transparency to 0.5, giving the histogram bars some transparency.

Tips:

1. Since the split into male and female depends on the variable Gender, the fill option must be specified inside aes().

2. geom_area() requires the option stat = "bin" when no variable is mapped to the y-axis.

ggplot(SEE_students_data_2,aes(x=BMI, fill=Gender))+geom_density(colour="black",alpha=0.5)


  • ggplot(SEE_students_data_2, aes(x = BMI, fill = Gender)): Sets up the basic plot using the SEE_students_data_2 dataset. The aes() function maps the BMI variable to the x-axis and uses the Gender variable to fill the density curves by gender.

  • geom_density(colour = "black", alpha = 0.5): Adds a density plot layer to the plot. The colour = "black" argument sets the color of the density curve outlines to black, and the alpha = 0.5 argument sets the transparency of the density curves to 0.5, making them partially transparent.
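The SEE_students_data_2 dataset is not available outside the course, so the following sketch reproduces the same idea with mtcars, using the transmission type (am) as a stand-in for Gender. The data frame name mt and the column transmission are assumptions made for this illustration.

```r
library(ggplot2)

# Label the transmission type so the legend is readable (0 = automatic, 1 = manual)
mt <- transform(mtcars,
                transmission = factor(am, labels = c("automatic", "manual")))

# Fill mapped inside aes(): one density curve and one legend entry per group
p_mapped <- ggplot(mt, aes(x = mpg, fill = transmission)) +
  geom_density(colour = "black", alpha = 0.5)

# Fill set outside aes(): a single curve drawn in one fixed colour
p_fixed <- ggplot(mt, aes(x = mpg)) +
  geom_density(fill = "purple", colour = "black", alpha = 0.5)

p_mapped
p_fixed
```

The difference is where fill appears: inside aes() it refers to a variable in the data, outside it is a constant colour.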

 

ggplot(SEE_students_data_2,aes(x=BMI, fill=Gender)) + geom_area(stat="bin", colour="black",alpha=0.5,binwidth=1)

geom_area() requires the option stat = "bin" when there is no variable to plot on the y-axis.

# Add a title and axis titles to the BMI geom_density graph
ggplot(SEE_students_data_2,aes(x=BMI, fill=Gender))+geom_density(colour="black",alpha=0.5)+labs(title="Body Mass Index per Gender\nSEE Students", y="Density",x="Body Mass Index")

Univariate categorical data

#Graphing a factor variable using geom_bar()

ggplot(SEE_students_data_2,aes(x=Gender))+geom_bar()


 

#adding color to the bar using a set, a given color, manually defined colors
ggplot(SEE_students_data_2,aes(x=Gender, fill=Gender))+geom_bar(alpha=0.5)+scale_fill_brewer(palette="Set1")
ggplot(SEE_students_data_2,aes(x=Gender, fill=Gender))+geom_bar()+scale_fill_brewer(palette = "Blues")
ggplot(SEE_students_data_2,aes(x=Gender,fill=Gender))+geom_bar(alpha=0.75)+scale_fill_manual(values=c("pink","blue"))
  1. ggplot(SEE_students_data_2, aes(x = Gender, fill = Gender)) + geom_bar(alpha = 0.5) + scale_fill_brewer(palette = "Set1"): This code creates a bar plot where each bar is filled with a color from the "Set1" color palette, which is part of the RColorBrewer package. The alpha = 0.5 argument sets the transparency of the bars to 0.5, making them partially transparent.

  2. ggplot(SEE_students_data_2, aes(x = Gender, fill = Gender)) + geom_bar() + scale_fill_brewer(palette = "Blues"): This code creates a bar plot with bars filled with shades of blue from the "Blues" color palette. The bars are fully opaque by default.

  3. Manually defined color: ggplot(SEE_students_data_2, aes(x = Gender, fill = Gender)) + geom_bar(alpha = 0.75) + scale_fill_manual(values = c("pink", "blue")): This code creates a bar plot with bars filled with the colors "pink" and "blue", using the scale_fill_manual() function to manually specify the colors. The alpha = 0.75 argument sets the transparency of the bars to 0.75, making them partially transparent.
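When colors are given as an unnamed vector, they are assigned to the factor levels in order. A slightly more robust variant, sketched here with mtcars because the students dataset is not available, names each color after the level it belongs to. The names mt, transmission, and fill_colours are assumptions made for this illustration.

```r
library(ggplot2)

# Stand-in for a two-level categorical variable such as Gender
mt <- transform(mtcars,
                transmission = factor(am, labels = c("automatic", "manual")))

# A named vector ties each colour to a specific level, whatever the level order
fill_colours <- c(automatic = "pink", manual = "blue")

p_manual <- ggplot(mt, aes(x = transmission, fill = transmission)) +
  geom_bar(alpha = 0.75) +
  scale_fill_manual(values = fill_colours)

p_manual
```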


Order the bars by frequency:

# Install (only needed once) and load the forcats package
install.packages("forcats")
library(forcats)

# Create the plot with the factor levels reordered by frequency
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(alpha = 0.75) +
  scale_fill_brewer(palette = "Blues")

  • ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)): This sets up the basic plot using the CUHKSZ_employment_survey_1 dataset. The x aesthetic uses the fct_infreq() function from the forcats package to reorder the Occupation variable based on frequency. The fill aesthetic fills the bars based on the Occupation variable.

  • geom_bar(alpha = 0.75): This adds a bar plot layer to the plot. The alpha parameter sets the transparency of the bars to 0.75, making them partially transparent.

  • scale_fill_brewer(palette = "Blues"): This sets the fill color of the bars using the "Blues" color palette from the RColorBrewer package.

  • The fill = Occupation aesthetic is used to fill the bars of the bar plot based on the levels of the Occupation variable. Each unique level of the Occupation variable will be represented by a different color in the plot, which can be helpful for distinguishing between different categories or groups in the data.

  • Additional resources: STHDA, http://www.sthda.com/english/
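To see what fct_infreq() does to the level order, here is a minimal sketch on a small made-up vector. The occupation values are hypothetical and are not taken from the CUHKSZ_employment_survey_1 dataset.

```r
library(forcats)

# Hypothetical occupations, for illustration only
occupation <- factor(c("Engineer", "Teacher", "Engineer", "Analyst",
                       "Engineer", "Teacher"))

# By default, factor levels are sorted alphabetically
levels(occupation)              # "Analyst" "Engineer" "Teacher"

# fct_infreq() reorders the levels from most to least frequent
levels(fct_infreq(occupation))  # "Engineer" "Teacher" "Analyst"
```

Since geom_bar() draws bars in level order, reordering the levels is what moves the tallest bar to the left.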
