This cookbook provides examples of the code used to produce various chart types using sgplot. There are also examples to demonstrate how to apply further customisation to sgplot charts.
If there is a chart type or task which you think would be useful to include here, please submit a suggestion.
use_sgplot
The examples in this cookbook use
the sgplot theme and colour functions explicitly, however it may be
easier to make use of the use_sgplot()
function if your
charts all require a similar style. More information on
use_sgplot
can be found on the homepage.
Note on use of titles, subtitles and captions
Titles, subtitles and captions have been embedded in the charts in this
cookbook for demonstration purposes. However, for accessibility reasons,
it is usually preferable to provide titles in the body of the page
rather than embedded within the image of the plot.
The following packages are required to produce the example charts in this cookbook:
library(sgplot)
library(ggplot2)
library(dplyr)
# Use gapminder data for cookbook charts
library(gapminder)
Line charts
Line chart with one line
gapminder |>
filter(country == "United Kingdom") |>
ggplot(aes(x = year, y = lifeExp)) +
geom_line(linewidth = 1, colour = sg_colour_values["dark-blue"]) +
theme_sg() +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(1952, 2007, 5)) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom 1952-2007",
caption = "Source: Gapminder"
)
Line chart with multiple lines
gapminder |>
filter(country %in% c("United Kingdom", "China")) |>
ggplot(aes(x = year, y = lifeExp, colour = country)) +
geom_line(linewidth = 1) +
theme_sg(legend = "bottom") +
scale_colour_discrete_sg() +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(1952, 2007, 5)) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom and China 1952-2007",
caption = "Source: Gapminder",
colour = NULL
)
An example with line labels and no legend can be found in the Adding annotations section.
Bar charts
bar_data <-
gapminder |>
filter(year == 2007 & continent == "Europe") |>
slice_max(order_by = lifeExp, n = 5)
ggplot(bar_data, aes(x = reorder(country, -lifeExp), y = lifeExp)) +
geom_col(fill = sg_colour_values["dark-blue"]) +
theme_sg() +
scale_y_continuous(expand = c(0, 0)) +
labs(
x = NULL,
y = NULL,
title = "Iceland has the highest life expectancy in Europe",
subtitle = "Life expectancy in European countries, 2007",
caption = "Source: Gapminder"
)
A bar chart can sometimes look better with horizontal bars. This can
also be a good option if your bar labels are long and difficult to
display horizontally on the x axis. To produce a horizontal bar chart,
swap the variables defined for x and y in aes()
and make a
few tweaks to theme_sg()
; draw grid lines for the x axis
only by setting the grid
argument, and draw an axis line
for the y axis only by setting the axis
argument.
ggplot(bar_data, aes(x = lifeExp, y = reorder(country, lifeExp))) +
geom_col(fill = sg_colour_values["dark-blue"]) +
theme_sg(grid = "x", axis = "y") +
scale_x_continuous(expand = c(0, 0)) +
labs(
x = NULL,
y = NULL,
title = "Iceland has the highest life expectancy in Europe",
subtitle = "Life expectancy in European countries, 2007",
caption = "Source: Gapminder"
)
Grouped bar chart
To create a grouped bar chart, set stat = "identity"
and
position = "dodge"
in the call to geom_bar()
.
Also assign a variable to fill
within aes()
to
determine what variable is used to create bars within groups. The
legend
argument in theme_sg()
can be used to
set the position of the legend.
grouped_bar_data <-
gapminder |>
filter(year %in% c(1967, 2007) &
country %in% c("United Kingdom", "Ireland", "France", "Belgium"))
ggplot(grouped_bar_data,
aes(x = country, y = lifeExp, fill = as.factor(year))) +
geom_bar(stat = "identity", position = "dodge") +
scale_y_continuous(expand = c(0, 0)) +
theme_sg(legend = "bottom") +
scale_fill_discrete_sg() +
labs(
x = "Country",
y = NULL,
fill = NULL,
title = "Living longer",
subtitle = "Difference in life expectancy, 1967-2007",
caption = "Source: Gapminder"
)
Stacked bar chart
To create a stacked bar chart, set stat = "identity
and
position = "fill"
in the call to geom_bar()
and assign a variable to fill
as before. This will plot
your data as part-to-whole. To plot counts, set
position = "identity"
.
Caution should be taken when producing stacked bar charts. They can quickly become difficult to interpret if plotting non part-to-whole data, and/or if plotting more than two categories per stack. First and last categories in the stack will always be easier to compare across bars than those in the middle. Think carefully about the story you are trying to tell with your chart.
stacked_bar_data <-
gapminder |>
filter(year == 2007) |>
mutate(lifeExpGrouped = cut(lifeExp,
breaks = c(0, 75, Inf),
labels = c("Under 75", "75+"))) |>
group_by(continent, lifeExpGrouped) |>
summarise(n_countries = n(), .groups = "drop")
ggplot(stacked_bar_data,
aes(x = continent, y = n_countries, fill = lifeExpGrouped)) +
geom_bar(stat = "identity", position = "fill") +
theme_sg(legend = "bottom") +
scale_y_continuous(expand = c(0, 0), labels = scales::percent) +
coord_cartesian(clip = "off") +
scale_fill_discrete_sg() +
labs(
x = NULL,
y = NULL,
fill = "Life Expectancy",
title = "How life expectancy varies across continents",
subtitle = "Percentage of countries by life expectancy band, 2007",
caption = "Source: Gapminder"
)
Histograms
gapminder |>
filter(year == 2007) |>
ggplot(aes(x = lifeExp)) +
geom_histogram(binwidth = 5,
colour = "white",
fill = sg_colour_values["dark-blue"]) +
theme_sg() +
scale_y_continuous(expand = c(0, 0)) +
labs(
x = NULL,
y = "Number of \ncountries",
title = "How life expectancy varies",
subtitle = "Distribution of life expectancy, 2007",
caption = "Source: Gapminder"
)
Scatterplots
gapminder |>
filter(year == 2007) |>
ggplot(aes(x = gdpPercap, y = lifeExp, size = pop)) +
geom_point(colour = sg_colour_values["dark-blue"]) +
theme_sg(axis = "none", grid = "xy") +
scale_x_continuous(
labels = function(x) scales::dollar(x, prefix = "£")
) +
scale_size_continuous(labels = scales::comma) +
labs(
x = "GDP",
y = "Life\nExpectancy",
size = "Population",
title = stringr::str_wrap(
"The relationship between GDP and Life Expectancy is complex", 40
),
subtitle = "GDP and Life Expectancy for all countires, 2007",
caption = "Source: Gapminder"
)
Small multiples
gapminder |>
filter(continent != "Oceania") |>
group_by(continent, year) |>
summarise(pop = sum(as.numeric(pop)), .groups = "drop") |>
ggplot(aes(x = year, y = pop, fill = continent)) +
geom_area() +
theme_sg(axis = "none", ticks = "none", legend = "none") +
scale_fill_discrete_sg() +
facet_wrap(~ continent, ncol = 2) +
scale_y_continuous(breaks = c(0, 2e9, 4e9),
labels = c(0, "2bn", "4bn")) +
coord_cartesian(clip = "off") +
theme(axis.text.x = element_blank()) +
labs(
x = NULL,
y = NULL,
title = "Asia's rapid growth",
subtitle = "Population growth by continent, 1952-2007",
caption = "Source: Gapminder"
)
Pie charts
stacked_bar_data |>
filter(continent == "Europe") |>
ggplot(aes(x = "", y = n_countries, fill = lifeExpGrouped)) +
geom_col(colour = "white", position = "fill") +
coord_polar(theta = "y") +
theme_sg(grid = "none", axis = "none", ticks = "none") +
theme(axis.text = element_blank()) +
scale_fill_discrete_sg() +
labs(
x = NULL,
y = NULL,
fill = NULL,
title = "How life expectancy varies in Europe",
subtitle = "Percentage of countries by life expectancy band, 2007",
caption = "Source: Gapminder"
)
Focus charts
bar_data |>
ggplot(
aes(x = reorder(country, -lifeExp), y = lifeExp,
fill = country == "Sweden")
) +
geom_col() +
theme_sg(legend = "none") +
scale_y_continuous(expand = c(0, 0)) +
scale_fill_discrete_sg("focus", reverse = TRUE) +
labs(
x = NULL,
y = NULL,
title = "Sweden has the fourth highest life expectancy in Europe",
subtitle = "Life expectancy in European countries, 2007",
caption = "Source: Gapminder"
)
Interactive charts
To make a ggplot2
chart interactive, use
ggplotly()
from the plotly
package. Note
however that ggplotly()
has a number of ‘quirks’, including
the following:
sgplot uses the ‘sans’ font family, however
plotly
does not recognise this font. To work around this you should add a further call totheme
to set the font family for text to""
.Subtitles and captions are not supported in
ggplotly()
. As stated elsewhere in this guidance, titles and subtitles should ideally be included in the body of text surrounding a chart rather than embedded in the chart itself, and so this is hopefully not a big issue. This example therefore has no title, subtitle or caption.
p <-
bar_data |>
# Format text for tooltips
mutate(tooltip = paste0(
"Country: ", country, "\n",
"Life Expectancy: ", round(lifeExp, 1)
)) |>
ggplot(aes(x = reorder(country, -lifeExp), y = lifeExp, text = tooltip)) +
geom_col(fill = sg_colour_values["dark-blue"]) +
theme_sg(ticks = "x") +
theme(text = element_text(family = "")) +
scale_y_continuous(expand = c(0, 0)) +
labs(
x = NULL,
y = NULL
)
plotly::ggplotly(p, tooltip = "text") |>
plotly::config(
modeBarButtons = list(list("resetViews")),
displaylogo = FALSE
)
sgplot currently only works with ggplot2
charts, however
there are plans to develop the
package further to support interactive Highcharts produced using the
highcharter
package.
Annotations
Labelling your chart is often preferable to using a legend, as often
this relies on a user matching the legend to the data using colour
alone. The legend can be removed from a chart by setting
legend = "none"
in theme_sg()
.
The easiest way to add an annotation is to manually define the co-ordinates of the required position.
ann_data |>
ggplot(aes(x = year, y = lifeExp, colour = country)) +
geom_line(linewidth = 1) +
theme_sg(legend = "none") +
scale_colour_discrete_sg() +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(limits = c(1952, 2017),
breaks = seq(1952, 2017, 5)) +
annotate(geom = "label", x = 2008, y = 73, label = "China",
colour = sg_colour_values[1],
label.size = NA,
hjust = 0, vjust = 0.5) +
annotate(geom = "label", x = 2008, y = 79.4, label = "United Kingdom",
colour = sg_colour_values[2],
label.size = NA,
hjust = 0, vjust = 0.5) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom and China 1952-2007",
caption = "Source: Gapminder"
)
However, this makes the code difficult to reuse as values are hard coded and not automatically generated from the data. Automating the position of annotations is possible, but more fiddly.
The following examples use geom_label()
to use values
from the data to position annotations. geom_label()
draws a
rectangle behind the text (white by default) and a border the same
colour as the text (label_size = NA
can be used to remove
the border). geom_text()
is also an option for annotations,
but this does not include a background and so can be harder for text to
read if it overlaps with other chart elements. These functions also have
nudge
arguments that can be used to displace text to
improve the positioning.
Note that in the previous examples, annotate()
also
requires a geom (label
or text
). These operate
in the same way as geom_label()
and
geom_text()
, but as discussed, annotate()
is
only able to deal with fixed values.
ann_labs <- ann_data |>
group_by(country) |>
mutate(min_year = min(year)) |>
filter(year == max(year)) |>
ungroup()
ann_data |>
ggplot(aes(x = year, y = lifeExp, colour = country)) +
geom_line(linewidth = 1) +
theme_sg(legend = "none") +
scale_colour_discrete_sg() +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(limits = c(1952, 2017),
breaks = seq(1952, 2017, 5)) +
geom_label(data = ann_labs,
aes(x = year, y = lifeExp, label = country, colour = country),
hjust = 0,
vjust = 0.5,
nudge_x = 0.5,
label.size = NA) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom and China 1952-2007",
caption = "Source: Gapminder"
)
Annotations may also be used to add value labels to a bar chart. Note
that geom_text()
is used here as a background is not
required.
ggplot(bar_data, aes(x = reorder(country, -lifeExp), y = lifeExp)) +
geom_col(fill = sg_colour_values["dark-blue"]) +
geom_text(aes(label = round(lifeExp, 1)),
nudge_y = -5, colour = "white") +
theme_sg() +
scale_y_continuous(expand = c(0, 0)) +
labs(
x = NULL,
y = NULL,
title = "Iceland has the highest life expectancy in Europe",
subtitle = "Life expectancy in European countries, 2007",
caption = "Source: Gapminder"
)
Note: The annotate()
function should be
used to add annotations with manually defined positioning co-ordinates,
whereas geom_label()
and geom_text()
should be
used when using co-ordinates defined in a data frame. Although the
reverse may work, text can appear blurry.
Other customisations
theme_sg()
has arguments to control the legend position
and appearance of grid lines, axis lines and axis ticks. More
information on accepted values can be found in the help
file.
Sorting a bar chart
To control the order of bars in a chart, wrap the variable you want
to arrange with reorder()
and specify what variable you
want to sort by. The following example sorts bars in ascending order of
life expectancy. To sort in descending order, you would change this to
reorder(country, desc(lifeExp))
.
bar_data |>
ggplot(aes(x = lifeExp, y = reorder(country, lifeExp))) +
geom_col(fill = sg_colour_values["dark-blue"]) +
theme_sg(axis = "y", grid = "x")
Examples in the following sections build on this chart.
Changing chart titles
Chart titles such as the main title, subtitle, caption, axis titles
and legend titles, can be controlled using labs()
. A title
can be removed using NULL
.
Reducing space between chart and axis
By default, a bar chart will have a gap between the bottom of the bars and the axis. This can be removed as follows:
last_plot() + scale_x_continuous(expand = c(0, 0))
The equivalent adjustment can be made for the y axis using
scale_y_continuous
.
Changing axis limits, breaks and labels
Axis limits, breaks and labels for continuous variables can be
controlled using scale_x/y_continuous()
. For discrete
variables, labels can be changes using scale_x/y_discrete()
or alternatively by recoding the variable in the data before creating a
chart.
last_plot() +
scale_x_continuous(expand = c(0, 0),
limits = c(0, 85),
breaks = seq(0, 80, 20),
labels = c(seq(0, 70, 20), "80 years"))
Note that further calls to scale_x/y_continuous
will
overwrite previous calls, hence why expand = c(0, 0)
has
been included again in this example.
Formatting labels
Formatting axis labels or legend labels is easily handled using the
scales
package. The following example formats y axis labels
as percentages, however scales
can also handle currency and
thousands separators.
stacked_bar_data |>
ggplot(aes(x = continent, y = n_countries, fill = lifeExpGrouped)) +
geom_bar(stat = "identity", position = "fill") +
theme_sg(legend = "bottom") +
scale_y_continuous(expand = c(0, 0), labels = scales::percent) +
scale_fill_discrete_sg() +
labs(
x = NULL,
y = NULL,
fill = "Life Expectancy",
title = "How life expectancy varies across continents",
subtitle = "Percentage of countries by life expectancy band, 2007",
caption = "Source: Gapminder"
)
Avoiding axis/grid lines being cut off
Axis lines and grid lines can sometimes appear ‘cut off’ if they are drawn at the limits of the chart range. You can see in the example in the previous section that the top grid line is slightly narrower than the adjacent tick mark on the y axis. This is because the y axis limit is 100%. As the grid line is centred at 100%, the top half of the line is ‘cut off’. This can be corrected as follows:
last_plot() + coord_cartesian(clip = "off")
Adding a line
To add a horizontal or vertical line across the whole plot, use
geom_hline()
or geom_vline()
. This can be
useful to highlight a threshold or average level.
gapminder |>
filter(country == "United Kingdom") |>
ggplot(aes(x = year, y = lifeExp)) +
geom_line(linewidth = 1, colour = sg_colour_values[1]) +
geom_hline(yintercept = 75, colour = sg_colour_values[2],
linewidth = 1, linetype = "dashed") +
annotate(geom = "text", x = 2007, y = 70, label = "Age 70") +
theme_sg() +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(1952, 2007, 5)) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom 1952-2007",
caption = "Source: Gapminder"
)
Wrapping text
If text is too long, it may be cut off or distort the dimensions of the chart.
plot <-
ggplot(bar_data, aes(x = reorder(country, -lifeExp), y = lifeExp)) +
geom_col(fill = sg_colour_values["dark-blue"]) +
theme_sg() +
scale_y_continuous(expand = c(0, 0)) +
labs(
x = NULL,
subtitle = "Life expectancy in European countries, 2007",
caption = "Source: Gapminder"
)
plot +
labs(
y = "Percentage of countries",
title = paste("Iceland has the highest life expectancy in Europe",
"followed closely by Switzerland")
)
There are two suggested ways to solve this issue; Insert
\n
within a string to force a line break; Use
stringr::str_wrap()
to set a maximum character width of the
string. See examples of both of these methods as follows:
Adjusting theme elements
If you find you need to adjust theme elements for your chart, this
can be done using theme()
. Note that this should be done
after the call to theme_sg()
, otherwise
theme_sg()
may overwrite the specifications you’ve
made.
ggplot(bar_data, aes(x = reorder(country, -lifeExp), y = lifeExp)) +
geom_col(fill = sg_colour_values["dark-blue"]) +
theme_sg(axis = "xy") +
theme(axis.line = element_line(colour = "black"),
axis.ticks = element_line(colour = "black")) +
scale_y_continuous(expand = c(0, 0)) +
labs(
x = NULL,
y = NULL,
title = "Iceland has the highest life expectancy in Europe",
subtitle = "Life expectancy in European countries, 2007",
caption = "Source: Gapminder"
)
Using different colour palettes
sgplot provides colour palettes as set out by the Scottish Government Design System. These palettes have been developed to meet the Web Content Accessibility Guidelines 2.1 for graphical objects.
The main palette is the default for discrete colour/fill functions, and the sequential palette for continuous colour/fill functions.
The Scottish Government palettes are used by default, however palettes are also available for Social Security Scotland and the Government Analysis Function.
More information on the colours used in sgplot can be found at
vignette("colours")
.
Using non-default sgplot colour palettes
There may be instances where you want to use an sgplot colour palette
other than the default. The full list of available palettes can be found
by running either sgplot::sg_colour_palettes
or
sgplot::af_colour_palettes
.
To use the Scottish Government main-extended
palette:
gapminder |>
filter(country %in% c("United Kingdom", "China", "India",
"Sweden", "Namibia", "Brazil")) |>
ggplot(aes(x = year, y = lifeExp, colour = country)) +
geom_line(linewidth = 1) +
theme_sg(legend = "bottom") +
scale_colour_discrete_sg("main-extended") +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(1952, 2007, 5)) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom and China 1952-2007",
caption = "Source: Gapminder",
colour = NULL
)
Note: This chart is for demonstration purposes only. Accessibility guidance recommends using a maximum of four colours to avoid clutter.
To use an Analysis Function palette, set
palette_type = "af"
when using any of the
scale_
colour
functions. For example, to use the Analysis Function
main2
palette:
gapminder |>
filter(country %in% c("United Kingdom", "China")) |>
ggplot(aes(x = year, y = lifeExp, colour = country)) +
geom_line(linewidth = 1) +
theme_sg(legend = "bottom") +
scale_colour_discrete_sg("main2", palette_type = "af") +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(1952, 2007, 5)) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom and China 1952-2007",
caption = "Source: Gapminder",
colour = NULL
)
Using your own colour palette
There may be instances where you’d like to use a colour palette that is not available in sgplot. If so, this should be carefully considered to ensure it meets accessibility requirements. The Analysis Function guidance outlines appropriate steps for choosing your own accessibile colour palette and should be used.
my_palette <- c("#0F820D", "#000000")
gapminder |>
filter(country == "United Kingdom") |>
ggplot(aes(x = year, y = lifeExp)) +
geom_line(linewidth = 1, colour = my_palette[1]) +
theme_sg() +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(1952, 2007, 5)) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom 1952-2007",
caption = "Source: Gapminder"
)
gapminder |>
filter(country %in% c("United Kingdom", "China")) |>
ggplot(aes(x = year, y = lifeExp, colour = country)) +
geom_line(linewidth = 1) +
theme_sg(legend = "bottom") +
scale_colour_manual(values = my_palette) +
scale_y_continuous(limits = c(0, 82),
breaks = seq(0, 80, 20),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(1952, 2007, 5)) +
labs(
x = "Year",
y = NULL,
title = "Living Longer",
subtitle = "Life Expectancy in the United Kingdom and China 1952-2007",
caption = "Source: Gapminder",
colour = NULL
)
Adding a new colour palette to sgplot
If you use a different palette regularly and feel it would be useful for this to be added to sgplot, please make a suggestion as per the contributing guidance.
Acknowledgments
This cookbook and the examples it contains have been inspired by the BBC Visual and Data Journalism cookbook for R graphics and their bbplot package.