Lecturer: Emi Tanaka
ETC5521.Clayton-x@monash.edu
Week 6 - Session 1
Thrips | Spiders |
---|---|
16.6 | 0.8 |
16.4 | 0.8 |
11.0 | 0.6 |
16.8 | 0.4 |
10.6 | 0.6 |
18.4 | 0.8 |
14.2 | 0.0 |
10.2 | 0.6 |
Kevin Wright (2018). agridat: Agricultural Datasets. R package version 1.16.
L. A. Hothorn, 2005. Evaluation of Bt-Maize Field Trials by a Proof of Safety. http://www.seedtest.org/upload/cms/user/presentation7Hothorn.pdf
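The two columns in the table sit on very different scales: thrips counts are in the tens, while spider counts stay below one. A minimal base R sketch, using only the eight rows shown above (the vector and column names are illustrative assumptions), makes that gap explicit. It is one reason a shared axis can hide variation in the smaller group.

```r
# Values copied from the table above; the vector names are illustrative.
thrips  <- c(16.6, 16.4, 11.0, 16.8, 10.6, 18.4, 14.2, 10.2)
spiders <- c(0.8, 0.8, 0.6, 0.4, 0.6, 0.8, 0.0, 0.6)

# Summarise each species separately: mean and range.
summary_tbl <- data.frame(
  species = c("Thrips", "Spiders"),
  mean    = c(mean(thrips), mean(spiders)),
  min     = c(min(thrips), min(spiders)),
  max     = c(max(thrips), max(spiders))
)
print(summary_tbl)

# Ratio of the means: how many times more abundant thrips are on average.
mean(thrips) / mean(spiders)
```

Because the largest spider value is smaller than the smallest thrips value, any plot that puts both species on one axis compresses the spider points near zero.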
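library(tidyverse)   # dplyr, tidyr, ggplot2, forcats, readr
library(colorspace)  # scale_color_discrete_qualitative()
library(patchwork)   # combine plots with +, | and /

data(gathmann.bt, package = "agridat")
# One row per genotype-species measurement
df1 <- gathmann.bt %>%
  pivot_longer(-gen, values_to = "abundance", names_to = "species") %>%
  mutate(species = case_when(species == "thysan" ~ "Thrip",
                             TRUE ~ "Spider"))
skimr::skim(df1)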
## ── Data Summary ────────────────────────
##                            Values
## Name                       df1
## Number of rows             32
## Number of columns          3
## _______________________
## Column type frequency:
##   character                1
##   factor                   1
##   numeric                  1
## ________________________
## Group variables            None
##
## ── Variable type: character ────────────────────────
##   skim_variable n_missing complete_rate min max empty n_unique whitespace
## 1 species               0             1   5   6     0        2          0
##
## ── Variable type: factor ────────────────────────
##   skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 gen                   0             1 FALSE          2 Bt: 16, ISO: 16
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75  p100 hist
## 1 abundance             0             1  6.01  6.36     0   0.6   2.5  10.7  18.4 ▇▂▃▁▂
# (A)-(C): genotype on the x-axis, coloured by species
g1 <- ggplot(df1, aes(gen, abundance, color = species)) +
  geom_point(size = 3) +
  facet_wrap(~species, scales = "free") +
  scale_color_discrete_qualitative() +
  guides(color = "none") +
  labs(x = "", y = "Abundance", tag = "(A)") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
g2 <- ggplot(df1, aes(gen, abundance, color = species)) +
  geom_point(size = 3) +
  scale_color_discrete_qualitative() +
  guides(color = "none") +
  labs(x = "", y = "Abundance", tag = "(B)", color = "Species") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
g3 <- ggplot(df1, aes(gen, abundance, color = species)) +
  geom_point(size = 3) +
  facet_wrap(~species) +
  scale_color_discrete_qualitative() +
  labs(x = "", y = "Abundance", tag = "(C)", color = "Species") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
# (D)-(F): species on the x-axis, coloured by genotype
g4 <- ggplot(df1, aes(species, abundance, color = gen)) +
  geom_point(size = 3) +
  facet_wrap(~gen, scales = "free") +
  scale_color_discrete_qualitative(palette = "Harmonic") +
  guides(color = "none") +
  labs(x = "", y = "Abundance", tag = "(D)") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
g5 <- ggplot(df1, aes(species, abundance, color = gen)) +
  geom_point(size = 3) +
  scale_color_discrete_qualitative(palette = "Harmonic") +
  guides(color = "none") +
  labs(x = "", y = "Abundance", tag = "(E)") +
  theme(axis.text.x = element_text(angle = 90))
g6 <- ggplot(df1, aes(species, abundance, color = gen)) +
  geom_point(size = 3) +
  facet_wrap(~gen) +
  scale_color_discrete_qualitative(palette = "Harmonic") +
  labs(x = "", y = "Abundance", tag = "(F)", color = "Genotype") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
(g1 + g2 + g3) / (g4 + g5 + g6)
Comparisons should be fair: any differences should be due to the factor you wish to investigate.
Comparable populations and measurements
Image source: https://www.infonet-biovision.org/PlantHealth/Pests/Thrips and https://cropwatch.unl.edu/2016/managing-spider-mites-corn-and-soybean
Raymond Pearl, 1911. The Personal Equation In Breeding Experiments Involving Certain Characters of Maize, Biol. Bull., 21, 339-366.
Comparable conditions
Comparable variables and sources
data(australia.soybean, package = "agridat")
skimr::skim(australia.soybean)

## ── Data Summary ────────────────────────
##                            Values
## Name                       australia.soybean
## Number of rows             464
## Number of columns          10
## _______________________
## Column type frequency:
##   factor                   3
##   numeric                  7
## ________________________
## Group variables            None
##
## ── Variable type: factor ────────────────────────
##   skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 env                   0             1 FALSE          8 B70: 58, B71: 58, L70: 58, L71: 58
## 2 loc                   0             1 FALSE          4 Bro: 116, Law: 116, Nam: 116, Red: 116
## 3 gen                   0             1 FALSE         58 G01: 8, G02: 8, G03: 8, G04: 8
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate   mean    sd    p0   p25   p50   p75  p100 hist
## 1 year                  0             1 1970.  0.501  1970  1970 1970.  1971  1971 ▇▁▁▁▇
## 2 yield                 0             1   2.05 0.752 0.282  1.52  2.07  2.56  4.38 ▂▆▇▃▁
## 3 height                0             1  0.883 0.272  0.25 0.708 0.888  1.04  1.73 ▂▆▇▂▁
## 4 lodging               0             1   2.31 0.976     1   1.5  2.25     3  4.75 ▇▅▅▂▁
## 5 size                  0             1   11.1  4.45     4  7.84   9.5  14.0  23.6 ▅▇▂▃▁
## 6 protein               0             1   40.3  2.93  33.2  38.1  40.2  42.2  48.5 ▁▇▇▅▁
## 7 oil                   0             1   19.9  2.67  13.0  18.0  19.8  22.1  26.8 ▁▆▇▆▂
australia.soybean %>%
  mutate(gen = reorder(gen, yield)) %>%
  ggplot(aes(gen, yield, color = loc, shape = as.factor(year))) +
  geom_point(size = 3) +
  labs(x = "Genotype", y = "Yield (tons/hectare)",
       shape = "Year", color = "Location") +
  scale_color_discrete_qualitative() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

# Genotypes to highlight against the gray background of all genotypes
highlighted <- filter(australia.soybean, gen %in% c("G49", "G48", "G50"))
ggplot(australia.soybean, aes(env, yield, group = gen)) +
  geom_point(size = 6, color = "gray") +
  geom_line(size = 1.3, color = "gray") +
  geom_point(data = highlighted, aes(color = gen), size = 6) +
  geom_line(data = highlighted, aes(color = gen), size = 1.3) +
  scale_color_discrete_qualitative() +
  labs(x = "Environment", y = "Yield", color = "Genotype")
data(urquhart.feedlot, package = "agridat")
df4 <- urquhart.feedlot %>%
  pivot_longer(c(weight1, weight2), names_to = "when", values_to = "weight") %>%
  mutate(when = factor(as.character(when),
                       labels = c("initial", "final"),
                       levels = c("weight1", "weight2")),
         diet = factor(diet, levels = c("High", "Medium", "Low")))
skimr::skim(df4)

## ── Data Summary ────────────────────────
##                            Values
## Name                       df4
## Number of rows             134
## Number of columns          5
## _______________________
## Column type frequency:
##   factor                   2
##   numeric                  3
## ________________________
## Group variables            None
##
## ── Variable type: factor ────────────────────────
##   skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 diet                  0             1 FALSE          3 Low: 64, Med: 46, Hig: 24
## 2 when                  0             1 FALSE          2 ini: 67, fin: 67
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75  p100 hist
## 1 animal                0             1    34  19.4     1  17.2    34  50.8    67 ▇▇▇▇▇
## 2 herd                  0             1  25.6  10.7     3    19    31    34    36 ▂▁▂▂▇
## 3 weight                0             1  863.  182.   530   705  842.  1027  1248 ▃▇▃▆▃
df4 %>%
  ggplot(aes(diet, weight, color = diet)) +
  geom_point(size = 3) +
  facet_grid(when ~ herd, scales = "free_y") +
  scale_color_discrete_qualitative() +
  labs(x = "Diet", y = "Weight", title = "Weight by herd, timing and diet") +
  guides(color = "none") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

# Connect each animal's initial and final weight
ggplot(df4, aes(when, weight, color = diet, group = animal)) +
  geom_point(size = 3) +
  facet_wrap(~herd, nrow = 2) +
  geom_line() +
  labs(x = "", y = "Weight", color = "Diet")

# Absolute weight gain by diet
g1 <- urquhart.feedlot %>%
  mutate(diet = factor(diet, levels = c("High", "Medium", "Low"))) %>%
  ggplot(aes(diet, weight2 - weight1, color = diet)) +
  geom_boxplot() +
  labs(x = "", y = "Weight gain", color = "Diet") +
  guides(color = "none")
# Weight gain relative to the initial weight
g2 <- urquhart.feedlot %>%
  mutate(diet = factor(diet, levels = c("High", "Medium", "Low"))) %>%
  ggplot(aes(diet, (weight2 - weight1)/weight1, color = diet)) +
  geom_boxplot() +
  labs(x = "", y = "Relative weight\ngain", color = "Diet") +
  guides(color = "none")
g1 + g2
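The two boxplots above compare `weight2 - weight1` with `(weight2 - weight1)/weight1`. A small base R sketch with made-up weights (the numbers below are hypothetical, not taken from `urquhart.feedlot`) shows how the same absolute gain can correspond to different relative gains when animals start at different weights.

```r
# Hypothetical initial and final weights for two animals.
weight1 <- c(600, 900)
weight2 <- c(900, 1200)

gain          <- weight2 - weight1              # absolute weight gain
relative_gain <- (weight2 - weight1) / weight1  # gain as a fraction of initial weight

data.frame(weight1, weight2, gain, relative_gain)
```

Both animals gain 300 units, but the lighter animal gains half its initial weight while the heavier one gains a third, so the choice of measure changes which comparison looks fair.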
Cleveland, William S., and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association 79 (387): 531–554.
data(bank, package = "gclus")
df5 <- bank %>%
  mutate(status = ifelse(Status == 0, "genuine", "forgery"))
skimr::skim(bank)

## ── Data Summary ────────────────────────
##                            Values
## Name                       bank
## Number of rows             200
## Number of columns          7
## _______________________
## Column type frequency:
##   numeric                  7
## ________________________
## Group variables            None
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75  p100 hist
## 1 Status                0             1   0.5 0.501     0     0   0.5     1     1 ▇▁▁▁▇
## 2 Length                0             1  215. 0.377  214.  215.  215.  215.  216. ▁▇▇▂▁
## 3 Left                  0             1  130. 0.361   129  130.  130.  130.   131 ▁▅▇▇▁
## 4 Right                 0             1  130. 0.404   129  130.   130  130.  131. ▃▆▇▅▁
## 5 Bottom                0             1  9.42  1.44   7.2   8.2   9.1  10.6  12.7 ▇▆▅▅▂
## 6 Top                   0             1  10.7 0.803   7.7  10.1  10.6  11.2  12.3 ▁▂▆▇▃
## 7 Diagonal              0             1  140.  1.15  138.  140.  140.  142.  142. ▂▇▅▅▇
g1 <- ggplot(df5, aes(Diagonal, fill = status)) +
  geom_histogram(binwidth = 0.2, color = "white") +
  facet_grid(status ~ .) +
  labs(x = "Diagonal length (mm)", y = "Count") +
  guides(fill = "none") +
  scale_fill_manual(values = c("#C7A76C", "#7DB0DD"))
g1
data("barley", package = "lattice")
skimr::skim(barley)

## ── Data Summary ────────────────────────
##                            Values
## Name                       barley
## Number of rows             120
## Number of columns          4
## _______________________
## Column type frequency:
##   factor                   3
##   numeric                  1
## ________________________
## Group variables            None
##
## ── Variable type: factor ────────────────────────
##   skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 variety               0             1 FALSE         10 Sva: 12, No.: 12, Man: 12, No.: 12
## 2 year                  0             1 FALSE          2 193: 60, 193: 60
## 3 site                  0             1 FALSE          6 Gra: 20, Dul: 20, Uni: 20, Mor: 20
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75  p100 hist
## 1 yield                 0             1  34.4  10.3  14.4  26.9  32.9  41.4  65.8 ▃▇▅▂▁
ggplot(barley, aes(yield, variety, shape = year)) +
  geom_point(size = 3) +
  facet_wrap(~site) +
  theme(plot.title.position = "plot",
        plot.title = element_text(face = "bold")) +
  labs(x = "Yield", shape = "Year", y = "Variety")
Immer, R. F., H. K. Hayes, and LeRoy Powers. (1934). Statistical Determination of Barley Varietal Adaptation. Journal of the American Society of Agronomy 26 403–419
ggplot(barley, aes(yield, variety, color = year)) +
  geom_point(size = 3) +
  facet_wrap(~site) +
  theme(plot.title.position = "plot",
        plot.title = element_text(face = "bold")) +
  labs(x = "Yield", y = "Variety", color = "Year") +
  scale_color_discrete_qualitative()
How about now?
ggplot(barley, aes(yield, variety, color = year)) +
  # All observations, faded
  geom_point(size = 3, alpha = 0.4) +
  # Two site-variety combinations drawn at full opacity
  geom_point(data = subset(barley,
                           (site == "University Farm" & variety == "No. 475") |
                             (site == "Grand Rapids" & variety == "Velvet")),
             size = 3) +
  facet_wrap(~site) +
  theme(plot.title.position = "plot",
        plot.title = element_text(face = "bold")) +
  labs(x = "Yield", y = "Variety", color = "Year") +
  scale_color_discrete_qualitative()
How about now?
Cleveland, W. S. (1993) Visualizing Data, Summit, NJ: Hobart Press.
Wright, Kevin (2013). Revisiting Immer's Barley Data. The American Statistician. 67 (3) 129–133.
data(olives, package = "classifly")
df2 <- olives %>%
  mutate(Region = factor(Region, labels = c("South", "Sardinia", "North")))
skimr::skim(df2)

## ── Data Summary ────────────────────────
##                            Values
## Name                       df2
## Number of rows             572
## Number of columns          10
## _______________________
## Column type frequency:
##   factor                   2
##   numeric                  8
## ________________________
## Group variables            None
##
## ── Variable type: factor ────────────────────────
##   skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 Region                0             1 FALSE          3 Sou: 323, Nor: 151, Sar: 98
## 2 Area                  0             1 FALSE          9 Sou: 206, Inl: 65, Cal: 56, Umb: 51
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75  p100 hist
## 1 palmitic              0             1 1232.  169.   610  1095  1201  1360  1753 ▁▂▇▆▁
## 2 palmitoleic           0             1  126.  52.5    15  87.8   110  169.   280 ▂▇▅▃▁
## 3 stearic               0             1  229.  36.7   152   205   223   249   375 ▂▇▃▁▁
## 4 oleic                 0             1 7312.  406.  6300  7000 7302.  7680  8410 ▁▇▇▇▁
## 5 linoleic              0             1  981.  243.   448  771.  1030 1181.  1470 ▃▅▃▇▃
## 6 linolenic             0             1  31.9  13.0     0    26    33  40.2    74 ▂▅▇▂▁
## 7 arachidic             0             1  58.1  22.0     0    50    61    70   105 ▂▁▇▇▂
## 8 eicosenoic            0             1  16.3  14.1     1     2    17    28    58 ▇▃▅▂▁
# Boxplots by area, ordered by palmitic acid content
g1 <- df2 %>%
  mutate(Area = fct_reorder(Area, palmitic)) %>%
  ggplot(aes(Area, palmitic, color = Region)) +
  geom_boxplot() +
  scale_color_discrete_qualitative() +
  guides(color = "none", x = guide_axis(n.dodge = 2))
# Boxplots by region
g2 <- ggplot(df2, aes(Region, palmitic, color = Region)) +
  geom_boxplot() +
  scale_color_discrete_qualitative() +
  guides(color = "none")
# Density estimates by region
g3 <- ggplot(df2, aes(palmitic, color = Region)) +
  geom_density() +
  scale_color_discrete_qualitative() +
  guides(color = "none")
# Empirical cumulative distribution functions by region
g4 <- ggplot(df2, aes(palmitic, color = Region)) +
  stat_ecdf() +
  scale_color_discrete_qualitative()
g1 / (g2 | (g3 / g4)) + plot_layout(guides = "collect", byrow = FALSE)
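The last panel in the layout above is drawn with `stat_ecdf()`. As a rough guide to reading such a plot, the base R function `ecdf()` returns, for a value x, the proportion of observations less than or equal to x. The sketch below uses made-up values for two groups (hypothetical numbers, not from the `olives` data).

```r
# Hypothetical measurements for two groups.
group_a <- c(1000, 1100, 1200, 1300)
group_b <- c(1250, 1350, 1400, 1500)

# ecdf() returns a function giving the proportion of values <= x.
F_a <- ecdf(group_a)
F_b <- ecdf(group_b)

# Compare the two groups at the same threshold.
at_1200 <- c(a = F_a(1200), b = F_b(1200))
at_1200
```

At the same threshold, three quarters of group a but none of group b fall at or below 1200. When one curve sits to the right of another across the whole range, that group tends to have larger values.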
data(EastIndiesTrade, package = "GDAdata")
skimr::skim(EastIndiesTrade)

## ── Data Summary ────────────────────────
##                            Values
## Name                       EastIndiesTrade
## Number of rows             81
## Number of columns          3
## _______________________
## Column type frequency:
##   numeric                  3
## ________________________
## Group variables            None
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75  p100 hist
## 1 Year                  0             1  1740  23.5  1700  1720  1740  1760  1780 ▇▇▇▇▇
## 2 Exports               0             1  518.  421.   100   145   370   840  1395 ▇▂▃▂▂
## 3 Imports               0             1 1005.  320.   460   835   975  1000  1550 ▃▃▇▁▅
# Shaded year ranges shared by all three panels
shaded_periods <- list(
  annotate("rect", xmin = 1701, xmax = 1714, ymin = -Inf, ymax = Inf, fill = "red", alpha = 0.3),
  annotate("rect", xmin = 1756, xmax = 1763, ymin = -Inf, ymax = Inf, fill = "red", alpha = 0.3),
  annotate("rect", xmin = 1775, xmax = 1780, ymin = -Inf, ymax = Inf, fill = "red", alpha = 0.3)
)
# (A) Exports and imports as separate lines
g1 <- ggplot(EastIndiesTrade, aes(Year, Exports)) +
  shaded_periods +
  geom_line(color = "#339933", size = 2) +
  geom_line(aes(Year, Imports), color = "red", size = 2) +
  geom_ribbon(aes(ymin = Exports, ymax = Imports), fill = "gray") +
  labs(y = "<span style='color:#339933'>Export</span>/<span style='color:red'>Import</span>", tag = "(A)") +
  theme(axis.title.y = ggtext::element_markdown())
# (B) Difference between exports and imports
g2 <- ggplot(EastIndiesTrade, aes(Year, Exports - Imports)) +
  shaded_periods +
  geom_line(size = 2) +
  labs(tag = "(B)")
# (C) Difference relative to the mean of exports and imports
g3 <- ggplot(EastIndiesTrade, aes(Year, (Exports - Imports)/(Exports + Imports) * 2)) +
  shaded_periods +
  geom_line(color = "#001a66", size = 2) +
  labs(y = "Relative difference", tag = "(C)")
g1 / g2 / g3
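Panel (C) above plots `(Exports - Imports)/(Exports + Imports) * 2`, which is the difference divided by the mean of the two series. A minimal sketch of that calculation follows; the helper name `rel_diff` and the input values are illustrative assumptions, not taken from `EastIndiesTrade`.

```r
# Illustrative helper: difference divided by the mean of the two values.
rel_diff <- function(x, y) (x - y) / (x + y) * 2

# Hypothetical export and import values.
exports <- c(100, 500, 1200)
imports <- c(300, 500, 800)

rd <- rel_diff(exports, imports)
rd

# Swapping the arguments only flips the sign.
all.equal(rel_diff(imports, exports), -rd)
```

For positive inputs this measure stays strictly between -2 and 2, so years with very different trade volumes can be compared on a common scale, unlike the raw difference in panel (B).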
df9 <- read_csv(here::here("data", "melb_temp.csv")) %>%
  janitor::clean_names() %>%
  rename(temp = maximum_temperature_degree_c) %>%
  filter(!is.na(temp)) %>%
  dplyr::select(year, month, day, temp)
skimr::skim(df9)

## ── Data Summary ────────────────────────
##                            Values
## Name                       df9
## Number of rows             18310
## Number of columns          4
## _______________________
## Column type frequency:
##   character                2
##   numeric                  2
## ________________________
## Group variables            None
##
## ── Variable type: character ────────────────────────
##   skim_variable n_missing complete_rate min max empty n_unique whitespace
## 1 month                 0             1   2   2     0       12          0
## 2 day                   0             1   2   2     0       31          0
##
## ── Variable type: numeric ────────────────────────
##   skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75  p100 hist
## 1 year                  0             1 1995.  14.5  1970  1983  1995  2008  2020 ▇▇▇▇▇
## 2 temp                  0             1  19.9  6.48   5.7  14.8  18.6  23.6  46.8 ▃▇▃▁▁
ggplot(df9, aes(month, temp)) + geom_boxplot() + labs(x = "Month", y = "Maximum temperature (°C)")
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.