Welcome to the part three of this training module on data visualization using ggplot2. If this is your first time seeing this post, please go back and see the previous posts where I covered the basic steps of using ggplot2.

In this part I will cover scales, axes, legends, positioning and themes.

Scales

Before we move on, let’s load the tidyverse library with the code below. If you have not installed tidyverse remove the “#” symbol and install it now.

# install.packages("tidyverse")
library("tidyverse")
## ── Attaching packages ────────────────────────────────────────── tidyverse 1.2.1 ──

## ✔ ggplot2 3.0.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.6
## ✔ tidyr   0.8.1     ✔ stringr 1.3.1
## ✔ readr   1.1.1     ✔ forcats 0.3.0

## ── Conflicts ───────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

We will still be using our dataset on United States colleges which consists of the school name, city, state, region, highest degree offered by each school, SAT average scores, tution and so on. Once again, I would like to reiterate that this dataset is not mine. I found it online and it can be accessed through “http://672258.youcanlearnit.net”.

Let’s import our data and store it in R with the variable name visualData

visualData <- read_csv('https://olawaleayilara.github.io/visualData.csv')
## Parsed with column specification:
## cols(
##   id = col_integer(),
##   name = col_character(),
##   city = col_character(),
##   state = col_character(),
##   region = col_character(),
##   highest_degree = col_character(),
##   control = col_character(),
##   gender = col_character(),
##   admission_rate = col_double(),
##   sat_avg = col_integer(),
##   undergrads = col_integer(),
##   tuition = col_integer(),
##   faculty_salary_avg = col_integer(),
##   loan_default_rate = col_character(),
##   median_debt = col_double(),
##   lon = col_double(),
##   lat = col_double()
## )
head(visualData)
## # A tibble: 6 x 17
##       id name  city  state region highest_degree control gender
##    <int> <chr> <chr> <chr> <chr>  <chr>          <chr>   <chr> 
## 1 102669 Alas… Anch… AK    West   Graduate       Private CoEd  
## 2 101648 Mari… Mari… AL    South  Associate      Public  CoEd  
## 3 100830 Aubu… Mont… AL    South  Graduate       Public  CoEd  
## 4 101879 Univ… Flor… AL    South  Graduate       Public  CoEd  
## 5 100858 Aubu… Aubu… AL    South  Graduate       Public  CoEd  
## 6 100663 Univ… Birm… AL    South  Graduate       Public  CoEd  
## # ... with 9 more variables: admission_rate <dbl>, sat_avg <int>,
## #   undergrads <int>, tuition <int>, faculty_salary_avg <int>,
## #   loan_default_rate <chr>, median_debt <dbl>, lon <dbl>, lat <dbl>

Take time to play with the data to ensure it is clean enough before you proceed to visualization. In this case, our data is reasonably clean. Therefore, we can proceed to the next stage.

Scales are integral part of ggplot2. They control the mapping from data to aesthetics. Scales also provide the tools that let you read the plot: the axes and legends. Scale is made up of three parts separated by “_ “: 1. scale 2. The name of the aesthetic (e.g., colour, shape or x) 3. The name of the scale (e.g., continuous, discrete, brewer).

Technically, scale is in every aesthetic used on our plot. For example, when we write:

ggplot(data=visualData,mapping=aes(x=tuition, y=faculty_salary_avg)) +
  geom_point(aes(colour = control))

unnamed-chunk-3-1.png

This is what is going on in the background

ggplot(data=visualData,mapping=aes(x=tuition, y=faculty_salary_avg)) +
  geom_point(aes(colour = control)) +
  scale_x_continuous() +
  scale_y_continuous() +
  scale_colour_discrete()

unnamed-chunk-4-1.png

This is because manually adding a scale every time we use a new aesthetic makes it tedious to use ggplot2, hence, ggplot2 adds it automatically. We can override the defaults to what we want easily. For example,

ggplot(data=visualData,mapping=aes(x=tuition, y=faculty_salary_avg)) +
  geom_point(aes(colour = control)) +
  scale_x_continuous("Tuition in United States") +
  scale_y_continuous("Average Salary of Faculty")

unnamed-chunk-5-1.png

You can as well use different scales together. For example,

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_reverse("Tuition") +
  scale_colour_brewer() 

unnamed-chunk-6-1.png

In the name argument for scale function, we can supply text strings (using for line breaks) or mathematical expressions in quote(). For example,

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous("Tuition") +
  scale_y_continuous(name = quote(Salary + mathematical ^ expression)) 

unnamed-chunk-7-1.png

We can as well use the xlab(), ylab() and labs() to change axis names. For example;

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  xlab("Tuition") +
  ylab("Salary") +
  labs(title = "Tution vs Salary", subtitle = "Can we say faculty earn more in Private Universities?")

unnamed-chunk-8-1.png

We can remove the axis label if we choose to do so. This can be done by setting the axis to “”, this quotes omits the label, but still allocates space; NULL removes the label and its space. Try it out and observe the difference.

Another attribute you can specify is the breaks ** and **labels argument. The breaks argument controls which values appear as tick marks on axes and keys on legends. Each break has an associated label, controlled by the labels argument. For example,

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous(name = "Tuition", breaks = c(20000,45000)) +
  scale_y_continuous(name = "Salary", breaks = c(1000,20000)) 

unnamed-chunk-9-1.png

Let’s add the labels argument.

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous(name = "Tuition", breaks = c(20000,45000), labels = c("20K", "45K")) +
  scale_y_continuous(name = "Salary", breaks = c(1000,20000),labels = c("1K", "20K")) 

unnamed-chunk-10-1.png

We can as well relabel the breaks in a categorical scale. For example,

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous(name = "Tuition", breaks = c(20000,45000), labels = c("Moderate", "Too Expensive")) +
  scale_y_continuous(name = "Salary", breaks = c(1000,5000,10000,20000),labels = c("Very Bad", "Bad","Manageable","Good Pay")) 

unnamed-chunk-11-1.png

Also we can supply a function to breaks or labels. For example;

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous(name = "Tuition",labels = scales::dollar_format("$")) +
  scale_y_continuous(name = "Salary",labels = scales::dollar_format("$")) 

unnamed-chunk-12-1.png

We suppress breaks (and for axes, grid lines) or labels, by setting them to NULL, For example,

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous(labels = NULL) +
  scale_y_continuous(labels = NULL) 

unnamed-chunk-13-1.png

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous(name = NULL, breaks = NULL) +
  scale_y_continuous(name = NULL, breaks = NULL) 

unnamed-chunk-13-2.png

We can also modify the limits using the ** limits ** parameter of the scale. Continuous scales should have a numeric vector of length two, and if we only want to set the upper or lower limit, we can set the other value to NA. Discrete scales should have a character vector which lists all possible values.

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  scale_x_continuous(name = "Tuition",labels = scales::dollar_format("$"), limits = c(3000, 50000)) +
  scale_y_continuous(name = "Salary",labels = scales::dollar_format("$"),limits = c(1000, 20000)) 
## Warning: Removed 3 rows containing missing values (geom_point).

unnamed-chunk-14-1.png

We can as well use the xlim(), ylim() and lims() functions.

ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
 geom_point(aes(colour = control)) +
  xlim(3000, 50000) +
  ylim(1000, 20000) 
## Warning: Removed 3 rows containing missing values (geom_point).

unnamed-chunk-15-1.png

Another aesthetic that we commonly use is colour. I’ll briefly show how we can use this aesthetic effectively by dividing this part into two categories: (i) continuous, and (ii) discrete.

Continuous Variables

There are four continuous colour scales:

  1. scale_colour_gradient() and scale_fill_gradient(): a two-colour gradient, low-high (light blue-dark blue)
ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
  geom_bin2d()  +
  scale_fill_gradient(low = "white", high = "black")

unnamed-chunk-16-1.png

  1. scale_colour_gradient2() and scale_fill_gradient2(): a three-colour gradient, low-med-high (red-white-blue)
ggplot(data=visualData,mapping=aes(x=tuition, y = faculty_salary_avg)) +
  geom_bin2d()  +
  scale_fill_gradient2(low = "red", mid = "blue", high = "green")

unnamed-chunk-17-1.png

  1. scale_colour_gradientn() and scale_fill_gradientn(): a custom n-colour gradient.
ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
  geom_bin2d()  +
  scale_fill_gradientn(colours = terrain.colors(5))

unnamed-chunk-18-1.png

  1. scale_color_distiller() and scale_fill_gradient() apply the Color-Brewer colour scales to continuous data.
ggplot(data=visualData,mapping=aes(x=tuition, y= faculty_salary_avg)) +
  geom_bin2d()  +
  scale_fill_distiller()

unnamed-chunk-19-1.png

Discrete Variables

There are four discrete colour scales in ggplot2.

  1. The default colour scheme, scale_colour_hue(), chooses evenly spaced hues around the HCL colour wheel. We can control the default chroma (c), luminance (l), and the range of hues (h) arguments. For example,
ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() 

unnamed-chunk-20-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_hue(c = 30)

unnamed-chunk-21-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_hue(l = 30)

unnamed-chunk-22-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_hue(h = c(200, 400))

unnamed-chunk-23-1.png

  1. scale_colour_brewer() allows us to choose colors from a built in color palettes in “ColorBrewer” which can be accessed through “http://colorbrewer2.org/”. We can use RColorBrewer::display.brewer.all() to list all palettes.
ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_brewer(palette =  "Spectral")

unnamed-chunk-24-1.png

We can as well change the order of the colors by specifying the direction. For example,

ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_brewer(palette =  "Spectral", direction = -1)

unnamed-chunk-25-1.png

  1. scale_colour_grey() maps discrete data to grays, from light to dark. For example,
ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_grey()

unnamed-chunk-26-1.png

We can change the direction as well by specifying start and end parameters.

ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_grey(start = 1, end = 0)

unnamed-chunk-27-1.png

  1. scale_colour_manual allows us to set specify our own color palettes. For example,
ggplot(data=visualData,mapping=aes(x=highest_degree, fill = highest_degree)) +
  geom_bar() +
  scale_fill_manual(values = c(Associate = "blue", Bachelor = "red", Graduate = "green"))

unnamed-chunk-28-1.png

Positioning

The four components that control positioning are: Position adjustments, Position scales, Facetting and Coordinate systems. Let’s look at these controls one after the other.

  1. Position Adjustments: Position adjustments apply minor tweaks to the position of elements within a layer. I will show adjustments that apply directly to bars and points.
ggplot(data=visualData,mapping=aes(x=highest_degree, fill = region)) +
  geom_bar()

unnamed-chunk-29-1.png

#geom_bar() is similar to geom_bar(position ="stack")
ggplot(data=visualData,mapping=aes(x=highest_degree, fill = region)) +
  geom_bar(position = "fill") 

unnamed-chunk-30-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, fill = region)) +
  geom_bar(position = "dodge") 

unnamed-chunk-31-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point()

unnamed-chunk-32-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point(position = "jitter")

unnamed-chunk-33-1.png

I explained jittering in detail in my previous post.

  1. Position Scales (discussed at the beginning of this module)

  2. Facetting: It is a very useful command to present visuals with the same varables, but for different subsets using the same data. There are three types of facetting:

  1. facet_null(): a single plot, the default.
  2. facet_wrap(): “wraps” a 1d ribbon of panels into 2d.
  3. facet_grid(): produces a 2d grid of panels defined by variables which form the rows and columns.
ggplot(data=visualData,mapping=aes(x=highest_degree,  fill = highest_degree)) +
  geom_bar() +
  facet_wrap(~region)

unnamed-chunk-34-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree,  fill = highest_degree)) +
  geom_bar() +
  facet_grid(.~region)

unnamed-chunk-35-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree,  fill = highest_degree)) +
  geom_bar() +
  facet_grid(region~.)

unnamed-chunk-36-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree,  fill = highest_degree)) +
  geom_bar() +
  facet_grid(region~control)

unnamed-chunk-37-1.png

Just to show we can do some more intresting facetting using facet_grid. let’s look at the code below

ggplot(data=visualData,mapping=aes(x=highest_degree,  fill = highest_degree)) +
  geom_bar() +
  facet_grid(control~region+gender)

unnamed-chunk-38-1.png

Note the use of the sign ~ in the facet_grid command.

Before we move to the next section, let’s briefly see how we can control whether the position scales are the same in all panels (fixed) or allowed to vary between panels (free) with the scales parameter for both facet_wrap() and facet_grid(). Also, we can specify the space parameter. This is most useful for categorical scales, where we can assign space proportionally based on the number of levels in each facet.

ggplot(economics_long, aes(date, value)) +
     geom_line() +
     facet_wrap(~variable, scales = "free_y")

unnamed-chunk-39-1.png

ggplot(data=visualData,mapping=aes(x=tuition,  y = region , colour = region)) +
geom_point() +
facet_grid(control ~ ., scales = "free", space = "free")

unnamed-chunk-40-1.png

  1. Coordinate System: The coordinate system combines the two position aesthetics to produce a 2d position on the plot. Also, in coordination with the faceter, coordinate systems draw axes and panel backgrounds.There are two types of coordinate system.

  2. Linear coordinate systems which preserves the shape of geoms:

  • coord_cartesian(): the default Cartesian coordinate system, where the 2d position of an element is given by the combination of the x and y positions. This function allows us to zoom in on our plot.
ggplot(data=visualData,mapping=aes(x=region ,  y = tuition , colour = region)) +
   geom_point(position = "jitter") +
   scale_y_continuous(limits= c(10000,30000))
## Warning: Removed 704 rows containing missing values (geom_point).

unnamed-chunk-41-1.png

ggplot(data=visualData,mapping=aes(x=region ,  y = tuition , colour = region)) +
geom_point(position = "jitter") +
coord_cartesian(ylim = c(10000, 30000))

unnamed-chunk-42-1.png

Essentially, the coord_cartesian function is advantageous because the data outside the limits are not discarded as we saw with setting axis limits using the scale_ function.

  • coord flip(): Cartesian coordinate system with x and y axes flipped. This is useful if we are interested in x conditional on y, or we just want to rotate the plot 90 degrees.
ggplot(data=visualData,mapping=aes(x=region ,  y = tuition , colour = region)) +
geom_point(position = "jitter") +
coord_flip()

unnamed-chunk-43-1.png

  • coord fixed(): Cartesian coordinate system with a fixed aspect ratio. The default ratio ensures that the x and y axes have equal scales.
ggplot(mtcars, aes(mpg, wt)) + 
  geom_point()

unnamed-chunk-44-1.png

ggplot(mtcars, aes(mpg, wt)) + 
  geom_point() +
  coord_fixed(ratio = 1)

unnamed-chunk-44-2.png

ggplot(mtcars, aes(mpg, wt)) + 
  geom_point() +
  coord_fixed(ratio = 5)

unnamed-chunk-44-3.png

ggplot(mtcars, aes(mpg, wt)) + 
  geom_point() +
  coord_fixed(ratio = 1/5)

unnamed-chunk-44-4.png

ggplot(mtcars, aes(mpg, wt)) + 
  geom_point() +
  coord_fixed(xlim = c(15, 30))

unnamed-chunk-44-5.png

  1. Non-linear coordinate systems which can change the shape geoms. For example, a straight line may no longer be straight. The closest distance between two points may no longer be a straight line.
  • coord polar(): Polar coordinates.
ggplot(data=visualData,mapping=aes(x=region ,  y = tuition , colour = region)) +
geom_point(alpha = 0.4, position = "jitter") +
coord_polar("x")

unnamed-chunk-45-1.png

ggplot(data=visualData,mapping=aes(x=region ,  y = tuition , colour = region)) +
geom_point(alpha = 0.4, position = "jitter") +
coord_polar("y")

unnamed-chunk-46-1.png

ggplot(data=visualData,mapping=aes(x=control,  fill = region)) +
  geom_bar() 

unnamed-chunk-47-1.png

ggplot(data=visualData,mapping=aes(x=control,  fill = region)) +
  geom_bar() +
  coord_polar()

unnamed-chunk-48-1.png

  • coord map()/coord quickmap(): Map projections.
ggplot(data=map_data("state"), mapping = aes(x=long,y=lat, group=group)) +
  geom_polygon(fill = "white", color = "black") +
  coord_map()
## 
## Attaching package: 'maps'

## The following object is masked from 'package:purrr':
## 
##     map

unnamed-chunk-49-1.png

ggplot(data=map_data("state"), mapping = aes(x=long,y=lat, group=group)) +
  geom_polygon(fill = "white", color = "black") +
  coord_map("ortho")

unnamed-chunk-50-1.png

ggplot(data=map_data("state"), mapping = aes(x=long,y=lat, group=group)) +
  geom_polygon(fill = "white", color = "black") +
  coord_map("stereographic")

unnamed-chunk-51-1.png

ggplot(data=map_data("world"), mapping = aes(x=long,y=lat, group=group)) +
  geom_polygon(fill = "white", color = "black") +
  coord_map("ortho")

unnamed-chunk-52-1.png

  • coord trans(): Apply arbitrary transformations to x and y positions, after the data has been processed by the stat. See documentation in R.

Finally let’s look at themes and legends.

Themes and Legends

ggplot2 has several built in themes. Let’s look at some of the themes. The complete themes are a great place to start but we don’t have enough control. For example,

ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point() +
  theme_grey() + 
  ggtitle("theme_grey()")

unnamed-chunk-53-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point() +
  theme_bw() + 
  ggtitle("theme_bw()")

unnamed-chunk-54-1.png

ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point() +
  theme_dark() + 
  ggtitle("theme_dark()")

unnamed-chunk-55-1.png

We can as well load the ggthemes package for more themes.

library(ggthemes)
ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point() +
  theme_excel() + 
  ggtitle("theme_excel()")

unnamed-chunk-56-1.png

For a full control on the appearance of our theme, we need to use the theme() function to override the default settings. This can be achieved by using code like plot + theme(element.name = element function()). There are four basic types of built-in element functions: text, lines, rectangles, and blank. Each element function has a set of parameters that control the appearance. For example,

ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point(position = "jitter") +
  ggtitle("Salary by Degree") +
  theme(plot.title = element_text(size = 16, face = "bold", colour = "red"),
        panel.grid.major = element_line(colour = "black", size = 0.5 , linetype = "dotted"),
        plot.background = element_rect(colour = "red", size = 2))

unnamed-chunk-57-1.png

The last element function, element_blank() draws nothing. We can use this if we don’t want anything drawn, and no space allocated for that element. For example,

ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point(position = "jitter") +
  ggtitle("Salary by Degree") +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank()
    
  )

unnamed-chunk-58-1.png

The elements that control the appearance of the plot can be roughly grouped into five categories: plot, axis, legend, panel and facet.

Plot Elements

Element Setter Description
plot.background element_rect() Plot background
plot.title element_text() Plot title
plot.margin margin() Margins around plot

Axis Elements

Element Setter Description
axis.line element_line() Line parallel to axis
axis.text element_text() Tick labels
axis.text.x element_text() x-axis tick labels
axis.text.y element_text() y-axis tick labels
axis.title element_text() Axis titles
axis.title.x element_text() x-axis title
axis.title.y element_text() y-axis title
axis.ticks element_line() Axis tick marks
axis.ticks.length unit() Length of tick marks
ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point(position = "jitter") +
  ggtitle("Salary by Degree") +
  theme(axis.line = element_line(colour = "grey50", size = 1),
        axis.text = element_text(color = "blue", size = 12),
        axis.text.x = element_text(angle = -90, vjust = 0.5)
       )

unnamed-chunk-59-1.png

Legend Elements

The legend elements control the appearance of all legends.

Element Setter Description
legend.background element_rect() Legend background
legend.key element rect() Background of legend keys
legend.key.size unit() Legend key size
legend.key.height unit() Legend key height
legend.key.width unit() Legend key width
legend.margin unit() Legend margin
legend.text element_text() Legend labels
legend.text.align 0–1 Legend label alignment (0 = right, 1 = left)
legend.title element_text() Legend name
legend.title.align 0–1 Legend name alignment (0 = right, 1 = left)
ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point(aes(colour = region), position = "jitter") +
  ggtitle("Salary by Degree") +
  theme(legend.background = element_rect(fill = "red", colour = "grey50", size = 1),
        legend.key = element_rect(color = "grey50"),
        legend.key.width = unit(0.5, "cm"),
        legend.key.height = unit(0.5, "cm"),
        legend.text = element_text(size = 12),
        legend.title = element_text(size = 12, face = "bold")
    
  ) 

unnamed-chunk-60-1.png

Panel Elements

Panel elements control the appearance of the plotting panels.

Element Setter Description
panel.background element_rect() Panel background (under data)
panel.border element_rect() Panel border (over data)
panel.grid.major element_line() Major grid lines
panel.grid.major.x element_line() Vertical major grid lines
panel.grid.major.y element_line() Horizontal major grid lines
panel.grid.minor element_line() Minor grid lines
panel.grid.minor.x element_line() Vertical minor grid lines
panel.grid.minor.y element_line() Horizontal minor grid lines
aspect.ratio numeric Plot aspect ratio
ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point(aes(colour = region), position = "jitter") +
  ggtitle("Salary by Degree") +
  theme(panel.background = element_rect(fill = "lightblue"),
        panel.grid.major = element_line(color = "gray60", size = 0.8),
        panel.grid.minor = element_line(color = "gray60", size = 0.8),
        aspect.ratio = 1
  ) 

unnamed-chunk-61-1.png

Facetting Elements

Element Setter Description
strip.background element_rect() Background of panel strips
strip.text element_text() Strip text
strip.text.x element_text() Horizontal strip text
strip.text.y element_text() Vertical strip text
panel.margin unit() Margin between facets
panel.margin.x unit() Margin between facets (vertical)
panel.margin.y unit() Margin between facets (horizontal)
ggplot(data=visualData,mapping=aes(x=highest_degree, y=faculty_salary_avg)) +
  geom_point(aes(colour = region), position = "jitter") +
  ggtitle("Salary by Degree") +
  facet_wrap(~control) +
  theme(panel.margin = unit(0.5, "in"),
       strip.background = element_rect(fill = "red", color = "blue", size = 2),
       strip.text = element_text(colour = "white")
  ) 
## Warning: `panel.margin` is deprecated. Please use `panel.spacing` property
## instead

unnamed-chunk-62-1.png

Saving our outputs

There are two ways to save output from ggplot2.

  1. The standard R approach where you open a graphics device, create the plots, then close the device.
pdf("MyOutput.pdf", width = 6, height = 6)

ggplot(data=visualData,mapping=aes(x=region, fill=control)) +
  geom_bar() +
  theme(panel.background=element_blank()) +
  theme(plot.background=element_blank()) +
  scale_y_continuous(limits=c(0,500)) +
  scale_fill_manual(values=c("blue","red")) +
  labs(x = "Region",
       y = "Number of Schools",
       title = "Do we have more private schools than public schools in US ?",
       fill = "Institution Type")
  
dev.off()
## quartz_off_screen 
##                 2
  1. Use the ggsave function
ggplot(data=visualData,mapping=aes(x=region, fill=control)) +
  geom_bar() +
  theme(panel.background=element_blank()) +
  theme(plot.background=element_blank()) +
  scale_y_continuous(limits=c(0,500)) +
  scale_fill_manual(values=c("blue","red")) +
  labs(x = "Region",
       y = "Number of Schools",
       title = "Do we have more private schools than public schools in US ?",
       fill = "Institution Type")

unnamed-chunk-64-1.png

ggsave("MyOutput.pdf")
## Saving 7 x 5 in image

In this post, I have covered scales, axes, legends, positioning and themes. Right now, you should be confident of creating some fancy visuals using ggplot2. In my next post, I will cover how to program with ggplot2. Until then, you can go ahead and pick different variables to produce the plots we have covered in this session.

Reference

  • Wickham H. (2016). ggplot2: Elegant Graphics for Data Analysis (Use R!), 2nd Edition. Springer, New York

For any question and contributions, please feel free to email ayilarof@myumanitoba.ca