GGPLOT examples

Zulquar Nain
AMU

Importing and Exploring the Data


library(readxl)
gdata <- read_excel("ggplotdata.xlsx")


head(gdata,5)
# A tibble: 5 × 28
  States    CS  Rank Rankc   HCR   CFI   GER FLFPR   GDP  GDP1  EODB    UR  LFPR
  <chr>  <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Andhr…    72     4 Fron…  15.6    87  32.4  0.55  3.84   3.8 52.4    5.7  63.5
2 Aruna…    60    22 Perf…  24.4    68  29.7  0.26  2.52   2.5  0      7.7  44.8
3 Assam     57    26 Perf…  36.2    85  18.7  0.17  5.26   5.3  5.93   7.1  49.4
4 Bihar     52    28 Perf…  52.5    94  13.6  0.06  7.63   7.6  2.78  10.6  41.4
5 Chand…    79     1 Fron…   4.8    77  50.6  0.35  6.11   6.1  0      7.8  54.9
# ℹ 15 more variables: SCSTLA <chr>, RTCSC <chr>, TOTP <dbl>, TOTM <dbl>,
#   TOTF <dbl>, P_LIT <dbl>, M_LIT <dbl>, F_LIT <dbl>, TOT_P_U <dbl>,
#   TOT_M_U <dbl>, TOT_F_U <dbl>, P_LIT_U <dbl>, M_LIT_U <dbl>, F_LIT_U <dbl>,
#   scol <dbl>

About the data I

  • The data comprise different measures of the Sustainable Development Goals (SDGs)

  • The data cover selected states/union territories (30 in total) for the year 2020-21

  • The data come from NITI Aayog and Census 2011

  • Main variables are

    • CS: Composite score of the SDG ranking
    • Rankc: Category based on the composite score
    • GER: Gross Enrolment Ratio in higher education (18-23 years)
    • FLFPR: Ratio of female to male Labour Force Participation Rate (LFPR) (15-59 years)
    • GDP: Annual growth rate of GDP (constant prices) per capita
    • HCR: Head count ratio as per the Multidimensional Poverty Index (%)
    • TOTP: Total population; TOTM: Total male population; TOTF: Total female population
  • There are 28 variables in total
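
A quick way to check this variable list against the data is to look at its dimensions, names and a summary. The sketch below is only illustrative: it uses a small data frame called toy (a hypothetical name, not part of the original slides) built from the five rows printed by head(gdata,5), with four of the 28 columns, so that it runs without the Excel file. With the real data the same functions would be called on gdata.

```r
# Toy data: five rows and four columns copied from the head(gdata, 5) output.
# `toy` is an illustrative name; the real analysis uses `gdata`.
toy <- data.frame(
  States = c("Andhra Pradesh", "Arunachal Pradesh", "Assam", "Bihar", "Chandigarh"),
  CS     = c(72, 60, 57, 52, 79),
  GER    = c(32.4, 29.7, 18.7, 13.6, 50.6),
  FLFPR  = c(0.55, 0.26, 0.17, 0.06, 0.35)
)

dim(toy)         # number of rows and columns
names(toy)       # variable names
str(toy)         # type of each variable
summary(toy$CS)  # minimum, quartiles, mean and maximum of the composite score
```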

An Example

  • Let's explore the relationship between female employment and education
  • We will use a scatter plot
library(ggplot2)
ggplot(gdata, aes(x=GER, y=FLFPR)) +
  geom_point()

An Example

  • Adding the details of the graph (labels)
ggplot(gdata, aes(x=GER, y=FLFPR)) +
  geom_point()+
  labs(subtitle="FLFPR Vs GER", 
        y="Ratio of Female to Male LFPR", 
        x="Gross Enrollment Ratio", 
        title="Female Employment Vs Education", 
        caption = "Data Source: NITI Aayog")

An Example

  • Changing the colour of the dots in the graph
ggplot(gdata, aes(x=GER, y=FLFPR)) +
  geom_point(col="red")+
  labs(subtitle="FLFPR Vs GER", 
        y="Ratio of Female to Male LFPR", 
        x="Gross Enrollment Ratio", 
        title="Female Employment Vs Education", 
        caption = "Data Source: NITI Aayog")

An Example

  • Adding the colour based on the categories of the states.
  • States have been categorised by NITI Aayog based on the composite score
ggplot(gdata, aes(x=GER, y=FLFPR)) +
  geom_point(aes(col=Rankc))+
  labs(subtitle="FLFPR Vs GER", 
        y="Ratio of Female to Male LFPR", 
        x="Gross Enrollment Ratio", 
        title="Female Employment Vs Education", 
        caption = "Data Source: NITI Aayog")

An Example

  • Adding the size aesthetic based on the value of composite score
ggplot(gdata, aes(x=GER, y=FLFPR)) +
  geom_point(aes(col=Rankc,size=CS))+
  labs(subtitle="FLFPR Vs GER", 
        y="Ratio of Female to Male LFPR", 
        x="Gross Enrollment Ratio", 
        title="Female Employment Vs Education", 
        caption = "Data Source: NITI Aayog")

An Example

  • Replacing the abbreviated legend titles with descriptive ones
ggplot(gdata, aes(x=GER, y=FLFPR)) +
  geom_point(aes(col=Rankc,size=CS))+
  labs(subtitle="FLFPR Vs GER", 
        y="Ratio of Female to Male LFPR", 
        x="Gross Enrollment Ratio", 
        title="Female Employment Vs Education", 
        caption = "Data Source: NITI Aayog")+
  labs(size="Composite Score",col="Rank Categories")
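
The slides so far repeat the full plot code and change one piece at a time. Because a ggplot object can be stored and extended with +, the same build-up can be sketched incrementally. This is a minimal illustration, not the code used for the slides: the data frame toy and the object names p0, p1 and p2 are hypothetical, and toy holds only the five rows shown by head(gdata,5).

```r
library(ggplot2)

# Toy data copied from the head(gdata, 5) output; `toy` is an illustrative name.
toy <- data.frame(
  States = c("Andhra Pradesh", "Arunachal Pradesh", "Assam", "Bihar", "Chandigarh"),
  CS     = c(72, 60, 57, 52, 79),
  GER    = c(32.4, 29.7, 18.7, 13.6, 50.6),
  FLFPR  = c(0.55, 0.26, 0.17, 0.06, 0.35)
)

# Step 1: the base plot holds the data and the x/y mapping, but draws no points yet
p0 <- ggplot(toy, aes(x = GER, y = FLFPR))

# Step 2: add a point layer whose size follows the composite score
p1 <- p0 + geom_point(aes(size = CS))

# Step 3: add axis and legend titles on top of the stored plot
p2 <- p1 +
  labs(title = "Female Employment Vs Education",
       x = "Gross Enrollment Ratio",
       y = "Ratio of Female to Male LFPR",
       size = "Composite Score")

p2  # printing the object draws the plot
```

Storing the intermediate objects makes it easy to try a different layer on p0 or p1 without retyping the earlier steps.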

An Example

  • Change the colour based on GDP growth rate
ggplot(gdata, aes(x=GER, y=FLFPR)) +
  geom_point(aes(col=GDP1,size=CS))+
  labs(subtitle="FLFPR Vs GER", 
        y="Ratio of Female to Male LFPR", 
        x="Gross Enrollment Ratio", 
        title="Female Employment Vs Education", 
        caption = "Data Source: NITI Aayog")+
  labs(size="Composite Score",col="GDP Growth Rate")

An Example

  • Adding data labels (state names) to the points
library(ggrepel)
ggplot(gdata, aes(x=GER, y=FLFPR,label=States)) +
  geom_point(aes(col=GDP1,size=CS))+
  geom_label_repel(aes(label = States),
                  box.padding   = 0.35, 
                  point.padding = 0.5,
                  segment.color = 'grey50')+
  labs(subtitle="FLFPR Vs GER", 
        y="Ratio of Female to Male LFPR", 
        x="Gross Enrollment Ratio", 
        title="Female Employment Vs Education", 
        caption = "Data Source: NITI Aayog")+
  labs(size="Composite Score",col="GDP Growth Rate")

An Example

  • A Bar Chart based on Composite Score
ggplot(gdata, aes(x=States, y=CS)) +
  geom_bar(stat="identity", width=.4, fill="tomato3") +
  labs(title="Rank based on CS",
       y="Composite Score",
       caption="Source: NITI Aayog") +
  theme(axis.text.x = element_text(angle=90, vjust=0.2, colour="blue"))
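
The bar chart above places the states in alphabetical order, so the ranking has to be read off the bar heights. One possible variation, sketched here under assumptions, is to sort the bars by score with reorder(). The data frame toy and the object name p_ranked are hypothetical, the data are the five rows shown by head(gdata,5), and geom_col() is used as a shorthand for geom_bar(stat="identity").

```r
library(ggplot2)

# Toy data copied from the head(gdata, 5) output; `toy` is an illustrative name.
toy <- data.frame(
  States = c("Andhra Pradesh", "Arunachal Pradesh", "Assam", "Bihar", "Chandigarh"),
  CS     = c(72, 60, 57, 52, 79)
)

# reorder(States, -CS) turns States into a factor whose levels are sorted
# by decreasing CS, so the highest-scoring state is drawn first
p_ranked <- ggplot(toy, aes(x = reorder(States, -CS), y = CS)) +
  geom_col(width = 0.4, fill = "tomato3") +
  labs(title = "States ordered by Composite Score",
       x = "States",
       y = "Composite Score") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.2))

p_ranked
```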

An Example

  • Share of Male and Female population in total population
  • To do this, we first create a subset of the data containing the male and female population
# Select the columns by name (positions 1, 17 and 18 in gdata)
gdata1 <- gdata[, c("States", "TOTM", "TOTF")]
class(gdata1)
[1] "tbl_df"     "tbl"        "data.frame"
head(gdata1,6)
# A tibble: 6 × 3
  States                TOTM     TOTF
  <chr>                <dbl>    <dbl>
1 Andhra Pradesh    42442146 42138631
2 Arunachal Pradesh   713912   669815
3 Assam             15939443 15266133
4 Bihar             54278157 49821295
5 Chandigarh          580663   474787
6 Chhattisgarh      12832895 12712303

An Example

  • Now melt the data into long format using the melt() function from the reshape package
library(reshape)
df.m <- melt(as.data.frame(gdata1),id.vars = "States")
head(df.m,5)
             States variable    value
1    Andhra Pradesh     TOTM 42442146
2 Arunachal Pradesh     TOTM   713912
3             Assam     TOTM 15939443
4             Bihar     TOTM 54278157
5        Chandigarh     TOTM   580663
tail(df.m,5)
          States variable    value
54    Tamil Nadu     TOTF 36009055
55     Telangana     TOTF  1799541
56 Uttar Pradesh     TOTF 95331831
57   Uttarakhand     TOTF  4948519
58   West Bengal     TOTF 44467088
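
The same wide-to-long reshaping can also be done with pivot_longer() from the tidyr package, shown here as an alternative to melt() rather than as the method used in these slides. The sketch assumes tidyr is installed; the data frame names wide and long are hypothetical, and the values are the first three rows of the head(gdata1,6) output.

```r
library(tidyr)

# Three rows copied from the head(gdata1, 6) output; `wide` is an illustrative name.
wide <- data.frame(
  States = c("Andhra Pradesh", "Arunachal Pradesh", "Assam"),
  TOTM   = c(42442146, 713912, 15939443),
  TOTF   = c(42138631, 669815, 15266133)
)

# Stack the TOTM and TOTF columns into one "value" column, with a
# "variable" column recording which original column each value came from
long <- pivot_longer(wide,
                     cols      = c(TOTM, TOTF),
                     names_to  = "variable",
                     values_to = "value")
long
```

Unlike melt(), which lists all TOTM rows followed by all TOTF rows, pivot_longer() keeps the two rows of each state next to each other. The row order does not affect the plot.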

An Example

  • Now plotting the graph
ggplot(df.m, aes(x=States, y=value, fill=variable)) +
  geom_bar(stat="identity", width=.2) +
  labs(title="Male and Female",
       caption="Source: Census 2011") +
  theme(axis.text.x = element_text(angle=90, vjust=0.6))
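
The plot above stacks the absolute male and female counts, so populous states dominate the picture. If the aim is to compare shares, one option is position = "fill", which rescales every bar to a total of 1. This is a sketch under assumptions: the names long and p_share are hypothetical, the data are three states taken from the head(gdata1,6) output, and geom_col() stands in for geom_bar(stat="identity").

```r
library(ggplot2)

# Long-format toy data for three states, values copied from head(gdata1, 6).
long <- data.frame(
  States   = rep(c("Andhra Pradesh", "Arunachal Pradesh", "Assam"), times = 2),
  variable = rep(c("TOTM", "TOTF"), each = 3),
  value    = c(42442146, 713912, 15939443,   # TOTM
               42138631, 669815, 15266133)   # TOTF
)

# position = "fill" stacks the bars and rescales each one to sum to 1,
# so the y-axis shows proportions instead of counts
p_share <- ggplot(long, aes(x = States, y = value, fill = variable)) +
  geom_col(position = "fill", width = 0.2) +
  labs(title = "Share of Male and Female Population",
       y = "Share of total population",
       caption = "Source: Census 2011") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.6))

p_share
```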

Let’s try

Practice makes a man perfect 🏃