library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
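# Load the UN data set from carData; country names are stored as row names
data(UN, package="carData")
# Convert to a tibble and move the row names into a `country` column
UN <- as_tibble(UN, rownames="country")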

3. Exercises

  1. What are the fertility rate and ppgdp of the 4 countries with the highest female life expectancy (lifeExpF)? Exclude any country where region is missing.
UN |> 
  drop_na(region) |> 
  arrange(-lifeExpF) |> 
  select(country,fertility,ppgdp) |> 
  slice_head(n=4)
## # A tibble: 4 × 3
##   country   fertility  ppgdp
##   <chr>         <dbl>  <dbl>
## 1 Japan          1.42 43141.
## 2 Hong Kong      1.14 31824.
## 3 France         1.99 39546.
## 4 Spain          1.50 30543.
  2. What are the fertility rate and ppgdp of the 2 most urbanised countries (pctUrban) in each region? Exclude any country where region is missing.
UN |> 
  drop_na(region) |>                            # drop countries with no region
  group_by(region) |> 
  arrange(-pctUrban) |>                         # most urbanised first
  select(region, country, fertility, ppgdp) |>  # keep the grouping column explicitly
  slice_head(n=2)                               # first two rows of each region
## # A tibble: 15 × 4
## # Groups:   region [8]
##    region        country       fertility  ppgdp
##    <fct>         <chr>             <dbl>  <dbl>
##  1 Africa        Gabon              3.20 12469.
##  2 Africa        Libya              2.41 11321.
##  3 Asia          Hong Kong          1.14 31824.
##  4 Asia          Macao              1.16 49990.
##  5 Caribbean     Anguilla           2    13750.
##  6 Caribbean     Bermuda            1.76 92625.
##  7 Europe        Belgium            1.84 43815.
##  8 Europe        Malta              1.28 19599.
##  9 Latin Amer    Venezuela          2.39 13503.
## 10 Latin Amer    Argentina          2.17  9162.
## 11 North America United States      2.08 46546.
## 12 North America Canada             1.69 46361.
## 13 NorthAtlantic Greenland          2.22 35293.
## 14 Oceania       Nauru              3.3   6190.
## 15 Oceania       Australia          1.95 57119.
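The "top n per group" step above can also be written without a separate arrange(). The following is a minimal sketch using dplyr's slice_max(); it assumes dplyr 1.1.0 or later for the `by` argument, and the `toy` tibble holds made-up values that only mirror the UN column names.

```r
library(dplyr)

# Toy data with made-up values; column names mirror the UN data set.
toy <- tibble(
  country   = c("A", "B", "C", "D", "E"),
  region    = c("North", "North", "North", "South", "South"),
  pctUrban  = c(90, 75, 60, 40, 55),
  fertility = c(1.5, 2.0, 2.5, 3.0, 3.5)
)

# Two most urbanised countries per region.
# with_ties = FALSE guarantees at most n rows per group.
top2 <- toy |>
  slice_max(pctUrban, n = 2, by = region, with_ties = FALSE) |>
  select(region, country, fertility)

print(top2)
```

With the default with_ties = TRUE, a group can return more than n rows when several countries share the same pctUrban value, so the choice depends on whether ties should be reported.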
  3. Create a variable called fertility1000, defined as fertility*1000-infantMortality. Which country has the highest fertility1000 in each region? Use only countries where both fertility and infantMortality are known.
UN |> 
  drop_na(fertility,infantMortality) |> 
  mutate(fertility1000=fertility*1000-infantMortality) |> 
  group_by(region) |> 
  arrange(-fertility1000) |> 
  slice_head(n=1)
## # A tibble: 7 × 9
## # Groups:   region [7]
##   country        region group fertility  ppgdp lifeExpF pctUrban infantMortality
##   <chr>          <fct>  <fct>     <dbl>  <dbl>    <dbl>    <dbl>           <dbl>
## 1 Niger          Africa afri…      6.92   358.     55.8       17           85.8 
## 2 Timor Leste    Asia   other      5.92   706.     64.2       29           56.5 
## 3 Haiti          Carib… other      3.16   613.     63.9       54           58.3 
## 4 Iceland        Europe other      2.10 39278      83.8       94            2.06
## 5 Guatemala      Latin… other      3.84  2882.     75.1       50           26.3 
## 6 United States  North… oecd       2.08 46546.     81.3       83            6.46
## 7 Marshall Isla… Ocean… other      4.38  3069.     70.6       72           21   
## # ℹ 1 more variable: fertility1000 <dbl>
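In the printout above the new column is cut off ("1 more variable"), so the values of fertility1000 are not visible. One way to surface them is to keep only the columns of interest at the end of the pipeline. The sketch below illustrates this on a small tibble with made-up values (`toy` is hypothetical and only mirrors the UN column names).

```r
library(dplyr)
library(tidyr)

# Toy data with made-up values; column names mirror the UN data set.
toy <- tibble(
  country         = c("A", "B", "C", "D"),
  region          = c("North", "North", "South", "South"),
  fertility       = c(2.0, 3.0, NA, 1.5),
  infantMortality = c(10, 50, 20, 5)
)

top1 <- toy |>
  drop_na(fertility, infantMortality) |>                        # C is dropped (fertility is NA)
  mutate(fertility1000 = fertility * 1000 - infantMortality) |>
  group_by(region) |>
  arrange(desc(fertility1000)) |>                               # highest value first
  slice_head(n = 1) |>                                          # first row of each region
  ungroup() |>
  select(region, country, fertility1000)                        # only the columns needed

print(top1)
```

Applied to the UN tibble, the same final select() would show region, country and fertility1000 side by side instead of truncating the computed column.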