Changing ggplot colors with scale_color_brewer
What you will learn
- How to transform data from wide to long format with tidyr;
- How to choose a different color palette with RColorBrewer.
Table of Contents
- Data source
- Coding the past: change colors in ggplot with RColorBrewer
‘War is over, if you want it, war is over, now’
Ever visited a webpage with clashing colors or poor text contrast? You’re not alone! Choosing the perfect color palette for data visualizations can be complex.
Fortunately, R offers several packages with excellent color palettes created by professional designers. In this lesson, you will learn about one of them, the RColorBrewer package. To make things more interesting, we’ll use data on the military expenses of leading capitalist countries during the Cold War era. Let’s paint your data story!
Data used in this lesson is available on the World Bank website.
Coding the past: change colors in ggplot with RColorBrewer
1. Importing data into R
Download the data file here and load the libraries we will need. To read the data, use read_csv() from the readr package. We are only interested in the first five rows and in columns 3 and 5 to 36, which are selected with [1:5, c(3, 5:36)].
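A minimal sketch of the read-and-subset step follows. Because it cannot rely on the downloaded file, it first writes a small stand-in CSV. The layout (four identifier columns followed by one column per year, named like `1960 [YR1960]`), the placeholder country names, and the values are all assumptions made for illustration, not the World Bank data.

```r
library(readr)

# Stand-in for the downloaded file: 4 identifier columns + 32 year columns.
# Column names, countries, and values are assumptions for illustration only.
toy <- data.frame(
  `Series Name`  = "Military expenditure (toy series)",
  `Series Code`  = "TOY.CODE",
  `Country Name` = paste("Country", LETTERS[1:7]),
  `Country Code` = paste0("C", LETTERS[1:7]),
  check.names = FALSE
)
for (y in 1960:1991) {
  toy[[sprintf("%d [YR%d]", y, y)]] <- seq(1, 7) + (y - 1960) / 10
}

path <- tempfile(fileext = ".csv")
write_csv(toy, path)

# Read the file, then keep rows 1 to 5 and columns 3 and 5 to 36
military <- read_csv(path, show_col_types = FALSE)[1:5, c(3, 5:36)]

dim(military)
```

With this layout, column 3 holds the country name and columns 5 to 36 hold the yearly values, so the result has one identifier column plus 32 year columns.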
2. How to use pivot_longer?
If you take a look at the dataframe you just loaded, you will see that it has one column for each year. To use ggplot, your data has to be tidy. According to Hadley Wickham, in a tidy dataframe:
- Each variable must have its own column;
- Each observation must have its own row;
- Each value must have its own cell.
To make our data tidy, we will gather all the year columns into one variable called “year” and move the values contained in these columns to a single new variable called “expense”. Note the syntax of the pivot_longer function. The first argument is the dataframe we want to transform, and the second specifies the columns to reshape. Finally, names_to indicates the name of the new column that will receive the years, and values_to indicates the name of the new column that will receive the values of the year columns.
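The reshaping described above can be sketched on a tiny table. The countries and values here are placeholders, and the year column names only imitate the `1960 [YR1960]` pattern mentioned in this lesson.

```r
library(tidyr)

# Small wide table: one row per country, one column per year (toy values)
wide <- data.frame(
  `Country Name`  = c("Country A", "Country B"),
  `1960 [YR1960]` = c(8.9931, 6.3312),
  `1961 [YR1961]` = c(9.1578, 6.2049),
  check.names = FALSE
)

# Gather every column except the country name into year/expense pairs
long <- pivot_longer(
  wide,
  cols = -`Country Name`,
  names_to = "year",
  values_to = "expense"
)

long
```

Each country now appears once per year, so the two-row wide table becomes a four-row long table with the columns Country Name, year, and expense.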
The mutate function makes two adjustments to the new long dataset. First, it removes the second part of the year names, e.g., [YR1960]. Second, it rounds the expense values to two decimal places.
Finally, we change the names of the columns (variables) in our dataset.
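Here is one possible way to write the cleaning and renaming steps, shown on toy values. Using base R's `substr()` to keep the first four characters of the year and assigning to `names()` are choices made for this sketch. The new column names "country", "year", and "expense" are also assumptions, and the lesson's own code may differ.

```r
library(dplyr)

# Toy long-format data, as it might look right after pivot_longer
long <- data.frame(
  `Country Name` = c("Country A", "Country A"),
  year = c("1960 [YR1960]", "1961 [YR1961]"),
  expense = c(8.9931, 9.1578),
  check.names = FALSE
)

tidy <- long %>%
  mutate(
    year = substr(year, 1, 4),    # keep only the four-digit year
    expense = round(expense, 2)   # round to two decimal places
  )

# Replace the column names with short, lowercase ones
names(tidy) <- c("country", "year", "expense")

tidy
```

Note that the year column is still text after this step, which is why a discrete x-axis scale is used when plotting.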
3. Using scale_color_brewer to improve your plots’ colors
To see all the color palettes that RColorBrewer offers, use the following code:
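As a sketch, RColorBrewer's `display.brewer.all()` draws every palette in a single chart, and `brewer.pal()` returns the hex codes of one palette. Asking for five colors from Set1 is just an illustrative choice.

```r
library(RColorBrewer)

# Draw every palette in the package in one chart
display.brewer.all()

# Hex codes of a single palette, here five colors from Set1
set1 <- brewer.pal(n = 5, name = "Set1")
set1
```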
We will be using palette Set1 in our line plot. To set it, add the layer scale_color_brewer(palette = 'Set1'). Note that we also set the x-axis to have labels every 4 years with scale_x_discrete(breaks = seq(1960, 1990, by=4)). Color and group aesthetics were mapped to countries so that each country has a different color.
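Putting these layers together, a minimal sketch of the line plot could look like this. The data is synthetic (three placeholder countries with made-up values), and the column names country, year, and expense are assumptions. The breaks are passed as text here, a defensive choice of this sketch so that they match the text year column.

```r
library(ggplot2)

# Toy long-format data: 3 placeholder countries, yearly values 1960 to 1991
tidy <- expand.grid(
  country = c("Country A", "Country B", "Country C"),
  year = as.character(1960:1991),
  stringsAsFactors = FALSE
)
tidy$expense <- round(
  5 + match(tidy$country, unique(tidy$country)) + sin(as.numeric(tidy$year) / 3),
  2
)

p <- ggplot(tidy, aes(x = year, y = expense, color = country, group = country)) +
  geom_line() +
  # One Set1 color per country
  scale_color_brewer(palette = "Set1") +
  # Label the x-axis every 4 years
  scale_x_discrete(breaks = as.character(seq(1960, 1990, by = 4)))

p
```

The group aesthetic matters because the x-axis is discrete: without it, geom_line would not know which points to connect into a line for each country.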
4. Adding a theme to the plot
To customize our plot, we will use the ggplot theme developed in the lesson ‘Climate data visualization’. Small adjustments were made to adapt the theme to this plot. For instance, the legend was moved to the bottom of the plot and its title was removed.
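The two legend adjustments can be sketched with a small stand-in theme. `theme_sketch()` is a hypothetical helper built on `theme_minimal()`. It is not the theme from the ‘Climate data visualization’ lesson and only illustrates the legend settings described above, using toy data.

```r
library(ggplot2)

# Hypothetical stand-in theme, not the one from the other lesson
theme_sketch <- function() {
  theme_minimal() +
    theme(
      legend.position = "bottom",     # legend below the plot
      legend.title = element_blank()  # no legend title
    )
}

# Toy data with placeholder countries and values
df <- data.frame(
  country = rep(c("Country A", "Country B"), each = 3),
  year = rep(c("1960", "1961", "1962"), times = 2),
  expense = c(9.0, 8.7, 8.9, 6.3, 6.2, 6.0)
)

p <- ggplot(df, aes(x = year, y = expense, color = country, group = country)) +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  theme_sketch()

p
```

Wrapping the settings in a function keeps them reusable, so the same look can be added to other plots with a single layer.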
Feel free to test other color palettes and check the one you like the most!
- You can transform your dataframe from wide to long format using pivot_longer;
- RColorBrewer offers color palettes to make your plots more effective and beautiful.