library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
data(UN, package = "carData")
UN <- as_tibble(UN, rownames = "country")
3. Exercises
- What are the fertility rate and ppgdp of the 4 countries with the highest female life expectancy (lifeExpF)? Exclude any country where region is missing.
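UN |>
  drop_na(region) |>                    # keep only countries with a known region
  arrange(desc(lifeExpF)) |>            # highest female life expectancy first
  select(country, fertility, ppgdp) |>
  slice_head(n = 4)                     # first 4 rows after sorting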
## # A tibble: 4 × 3
## country fertility ppgdp
## <chr> <dbl> <dbl>
## 1 Japan 1.42 43141.
## 2 Hong Kong 1.14 31824.
## 3 France 1.99 39546.
## 4 Spain 1.50 30543.
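The arrange-then-slice pattern above can also be written with dplyr's slice_max(), which picks the rows with the largest values of a column directly. The following is a minimal sketch on a made-up toy tibble; the countries and numbers are hypothetical and are not taken from UN. Note that slice_max() keeps ties by default, so with_ties = FALSE is set here to guarantee exactly n rows.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for UN; all values are invented
toy <- tibble(
  country   = c("A", "B", "C", "D", "E"),
  region    = c("North", NA, "South", "South", "North"),
  lifeExpF  = c(80, 90, 85, 70, 75),
  fertility = c(1.5, 1.2, 2.0, 3.1, 2.4)
)

top2 <- toy |>
  drop_na(region) |>                                # drop rows with unknown region
  slice_max(lifeExpF, n = 2, with_ties = FALSE) |>  # 2 highest lifeExpF, largest first
  select(country, fertility)

top2
```

Country B has the highest lifeExpF in the toy data but is excluded because its region is missing, which mirrors the requirement in the exercise.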
- What are the fertility rate and ppgdp of the 2 most urbanised countries (pctUrban) in each region? Exclude any country where region is missing.
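UN |>
  drop_na(region) |>                    # keep only countries with a known region
  group_by(region) |>
  arrange(desc(pctUrban)) |>            # most urbanised first (sorts across all regions)
  select(country, fertility, ppgdp) |>  # region is kept automatically as the grouping variable
  slice_head(n = 2)                     # first 2 rows within each region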
## Adding missing grouping variables: `region`
## # A tibble: 15 × 4
## # Groups: region [8]
## region country fertility ppgdp
## <fct> <chr> <dbl> <dbl>
## 1 Africa Gabon 3.20 12469.
## 2 Africa Libya 2.41 11321.
## 3 Asia Hong Kong 1.14 31824.
## 4 Asia Macao 1.16 49990.
## 5 Caribbean Anguilla 2 13750.
## 6 Caribbean Bermuda 1.76 92625.
## 7 Europe Belgium 1.84 43815.
## 8 Europe Malta 1.28 19599.
## 9 Latin Amer Venezuela 2.39 13503.
## 10 Latin Amer Argentina 2.17 9162.
## 11 North America United States 2.08 46546.
## 12 North America Canada 1.69 46361.
## 13 NorthAtlantic Greenland 2.22 35293.
## 14 Oceania Nauru 3.3 6190.
## 15 Oceania Australia 1.95 57119.
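Two details of the output above are worth illustrating. The message about missing grouping variables appears because select() on grouped data always keeps the grouping column, and arrange() ignores groups unless .by_group = TRUE is set. The sketch below, on a hypothetical toy tibble with invented values, sorts within each group and names region explicitly so that no message is printed.

```r
library(dplyr)

# Hypothetical toy data; all values are invented for illustration
toy <- tibble(
  country  = c("A", "B", "C", "D", "E"),
  region   = c("North", "North", "North", "South", "South"),
  pctUrban = c(50, 90, 70, 30, 60)
)

top_urban <- toy |>
  group_by(region) |>
  arrange(desc(pctUrban), .by_group = TRUE) |>  # sort by region, then by pctUrban within it
  slice_head(n = 1) |>                          # first row of each region
  ungroup() |>                                  # drop the grouping before selecting
  select(region, country, pctUrban)             # region named explicitly, so no message

top_urban
```

Either ordering gives the same rows here, because slice_head() works within each group regardless of how arrange() ordered the groups relative to one another.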
- Create a variable called fertility1000, defined as fertility*1000-infantMortality. Which country has the highest fertility1000 in each region? Use only countries where both fertility and infantMortality are known.
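UN |>
  drop_na(fertility, infantMortality) |>                          # both inputs must be known
  mutate(fertility1000 = fertility * 1000 - infantMortality) |>   # new variable
  group_by(region) |>
  arrange(desc(fertility1000)) |>                                 # highest value first
  slice_head(n = 1)                                               # top country within each region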
## # A tibble: 7 × 9
## # Groups: region [7]
## country region group fertility ppgdp lifeExpF pctUrban infantMortality
## <chr> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Niger Africa afri… 6.92 358. 55.8 17 85.8
## 2 Timor Leste Asia other 5.92 706. 64.2 29 56.5
## 3 Haiti Carib… other 3.16 613. 63.9 54 58.3
## 4 Iceland Europe other 2.10 39278 83.8 94 2.06
## 5 Guatemala Latin… other 3.84 2882. 75.1 50 26.3
## 6 United States North… oecd 2.08 46546. 81.3 83 6.46
## 7 Marshall Isla… Ocean… other 4.38 3069. 70.6 72 21
## # ℹ 1 more variable: fertility1000 <dbl>
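In the printout above the new fertility1000 column is cut off because the tibble is too wide for the console. One way to make it visible is to keep only the columns of interest. This is a minimal sketch under assumed toy data (the tibble below is hypothetical and its values are invented); it also uses a grouped slice_max() in place of arrange() plus slice_head().

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; all values are invented for illustration
toy <- tibble(
  country         = c("A", "B", "C", "D"),
  region          = c("North", "North", "South", "South"),
  fertility       = c(2.0, 3.0, NA, 1.5),
  infantMortality = c(10, 20, 5, 8)
)

best <- toy |>
  drop_na(fertility, infantMortality) |>                         # C is dropped (fertility unknown)
  mutate(fertility1000 = fertility * 1000 - infantMortality) |>
  group_by(region) |>
  slice_max(fertility1000, n = 1, with_ties = FALSE) |>          # top row within each region
  ungroup() |>
  select(region, country, fertility1000)                         # narrow enough to print fully

best
```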