class: center, middle, inverse, title-slide

# Tidyverse
### Italo Cegatta
### 20/10/2021

---

# Today's topics

- What is the Tidyverse
- Reference book
- The packages
- Examples
- Cheat sheets
- References

---

# What is the Tidyverse

Until 2016, the term **hadleyverse** was used to refer to the packages created by Hadley Wickham. At useR! 2016, however, he coined the term **Tidyverse**, which has been used ever since.

.pull-left[

<div class="figure" style="text-align: center">
<img src="img/hadley_site.jpg" alt="http://hadley.nz" width="100%" />
<p class="caption"><a href="http://hadley.nz/">http://hadley.nz</a></p>
</div>

]

.pull-right[

<div class="figure" style="text-align: center">
<img src="img/tidyverse_talk.jpg" alt="Tidy tools for data science (aka Tidyverse)" width="100%" />
<p class="caption"><a href="https://channel9.msdn.com/Events/useR-international-R-User-conference/useR2016/Towards-a-grammar-of-interactive-graphics">Tidy tools for data science (aka Tidyverse)</a></p>
</div>

]

---

# What is the Tidyverse

<div class="figure" style="text-align: center">
<img src="img/tidyverse_site.jpg" alt="https://www.tidyverse.org" width="80%" />
<p class="caption"><a href="https://www.tidyverse.org">https://www.tidyverse.org</a></p>
</div>

---

# Reference book

<div class="figure" style="text-align: center">
<img src="img/r4ds_site.jpg" alt="https://r4ds.had.co.nz" width="80%" />
<p class="caption"><a href="https://r4ds.had.co.nz/">https://r4ds.had.co.nz</a></p>
</div>

---

# The packages

.pull-left[

```r
install.packages("tidyverse")
```

```r
install.packages("ggplot2")
install.packages("dplyr")
install.packages("tidyr")
install.packages("readr")
install.packages("purrr")
install.packages("tibble")
install.packages("stringr")
install.packages("forcats")
install.packages("lubridate")
install.packages("hms")
install.packages("DBI")
install.packages("haven")
install.packages("httr")
install.packages("jsonlite")
install.packages("readxl")
install.packages("rvest")
install.packages("xml2")
install.packages("modelr")
install.packages("broom")
```

]

.pull-right[

```r
library("tidyverse")
```

```r
library("ggplot2")
library("dplyr")
library("tidyr")
library("readr")
library("purrr")
library("tibble")
library("stringr")
library("forcats")
```

]

---

# The packages

<div class="figure" style="text-align: center">
<img src="img/data-science_cicle_packages.png" alt="https://osf.io/69gub/wiki/home/" width="90%" />
<p class="caption"><a href="https://osf.io/69gub/wiki/home/">https://osf.io/69gub/wiki/home/</a></p>
</div>

---

class: inverse, center

# Examples!

<img src="img/cat-computer.gif" width="50%" style="display: block; margin: auto;" />

---

```r
library(tidyverse)
```

```
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
```

```
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.5     v dplyr   1.0.7
## v tidyr   1.1.4     v stringr 1.4.0
## v readr   2.0.2     v forcats 0.5.1
```

```
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
```

```r
library(broom)
```

---

# Importing data with **readr** and **tibble**

```r
# read_csv2() reads semicolon-separated files that use a comma as the decimal mark
dados_raw <- read_csv2(
  "https://github.com/italocegatta/italocegatta.github.io_source/raw/master/content/dados/tume_55_24.csv"
)

dados_raw
```

```
## # A tibble: 1,881 x 9
##    N_tume I_meses Esp          Parc_m2 N_arv DAP_cm   H_m   Cod  Cod2
##     <dbl>   <dbl> <chr>          <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
##  1     55      24 E_botryoides     600     1    4.1   6.5    NA    NA
##  2     55      24 E_botryoides     600     2    9.7   8      NA    NA
##  3     55      24 E_botryoides     600     3   NA    NA       5    NA
##  4     55      24 E_botryoides     600     4    7.6   7.5     2    NA
##  5     55      24 E_botryoides     600     5    3.8   5      NA    NA
##  6     55      24 E_botryoides     600     6   NA    NA       1    NA
##  7     55      24 E_botryoides     600     7   12.6   9       6    NA
##  8     55      24 E_botryoides     600     8   NA    NA       1    NA
##  9     55      24 E_botryoides     600     9    7     8      NA    NA
## 10     55      24 E_botryoides     600    10    7.5   7.5    NA    NA
## # ... with 1,871 more rows
```

---

# Manipulating data with **dplyr** and the **pipe**

```r
# Keep species (Esp), diameter at breast height (DAP_cm) and height (H_m),
# then keep only the two species of interest
dados <- dados_raw %>%
  select(Esp, DAP_cm, H_m) %>%
  filter(Esp %in% c("E_saligna", "E_urophylla"))

dados
```

```
## # A tibble: 160 x 3
##    Esp       DAP_cm   H_m
##    <chr>      <dbl> <dbl>
##  1 E_saligna    8.6   9.5
##  2 E_saligna   11.8  11
##  3 E_saligna   NA    NA
##  4 E_saligna    9.9  10.5
##  5 E_saligna    9.9  10.5
##  6 E_saligna    6.4   8.5
##  7 E_saligna    9.9  11
##  8 E_saligna   10.5  10.5
##  9 E_saligna    8.1   9.5
## 10 E_saligna   10    10
## # ... with 150 more rows
```

---

# Manipulating data with **dplyr** and the **pipe**

```r
dados %>%
  select(
    especie = Esp,
    dap = DAP_cm
  ) %>%
  group_by(especie) %>%
  summarise(
    n_vivas = sum(!is.na(dap)),          # trees with a measured diameter
    dap_medio = mean(dap, na.rm = TRUE)  # mean diameter, ignoring missing values
  )
```

```
## # A tibble: 2 x 3
##   especie     n_vivas dap_medio
##   <chr>         <int>     <dbl>
## 1 E_saligna        70      9.56
## 2 E_urophylla      74      8.96
```

---

# Making plots with **ggplot2**

```r
ggplot(dados, aes(DAP_cm, H_m)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm") +
  facet_wrap(~Esp) +
  theme_bw()
```

<img src="index_files/figure-html/unnamed-chunk-16-1.png" width="40%" style="display: block; margin: auto;" />

---

# Reshaping data with **tidyr**

```r
dados %>%
  group_by(Esp) %>%
  nest()
```

```
## # A tibble: 2 x 2
## # Groups:   Esp [2]
##   Esp         data
##   <chr>       <list>
## 1 E_saligna   <tibble [80 x 2]>
## 2 E_urophylla <tibble [80 x 2]>
```

---

# Functional programming with **purrr**

```r
dados_modl <- dados %>%
  group_by(Esp) %>%
  nest() %>%
  mutate(
    # fit one height-diameter model per species
    ajuste = map(data, ~ lm(log(H_m) ~ I(1/DAP_cm), data = .x)),
    # broom::glance() returns a one-row summary of each fit
    resumo = map(ajuste, glance)
  )

dados_modl
```

```
## # A tibble: 2 x 4
## # Groups:   Esp [2]
##   Esp         data              ajuste resumo
##   <chr>       <list>            <list> <list>
## 1 E_saligna   <tibble [80 x 2]> <lm>   <tibble [1 x 12]>
## 2 E_urophylla <tibble [80 x 2]> <lm>   <tibble [1 x 12]>
```

---

# Unnesting the data

```r
dados_modl %>%
  select(Esp, resumo) %>%
  unnest(resumo)
```

```
## # A tibble: 2 x 13
## # Groups:   Esp [2]
##   Esp      r.squared adj.r.squared  sigma statistic  p.value    df logLik    AIC
##   <chr>        <dbl>         <dbl>  <dbl>     <dbl>    <dbl> <dbl>  <dbl>  <dbl>
## 1 E_salig~     0.798         0.795 0.0607      269. 2.47e-25     1   97.9 -190.
## 2 E_uroph~     0.769         0.766 0.126       240. 1.22e-24     1   49.5  -93.1
## # ... with 4 more variables: BIC <dbl>, deviance <dbl>, df.residual <int>,
## #   nobs <int>
```

---

# Applying the models

```r
dados_pred <- dados_modl %>%
  mutate(
    # exp() back-transforms the predictions from the log scale to metres
    hpred = map2(ajuste, data, ~ exp(predict(.x, newdata = .y)))
  ) %>%
  select(Esp, data, hpred)

dados_pred
```

```
## # A tibble: 2 x 3
## # Groups:   Esp [2]
##   Esp         data              hpred
##   <chr>       <list>            <list>
## 1 E_saligna   <tibble [80 x 2]> <dbl [80]>
## 2 E_urophylla <tibble [80 x 2]> <dbl [80]>
```

---

# Predicted data

```r
# tidyr >= 1.0.0 expects the columns to unnest wrapped in c()
dados_compl <- dados_pred %>%
  unnest(c(data, hpred))

dados_compl
```

```
## # A tibble: 160 x 4
## # Groups:   Esp [2]
##    Esp       DAP_cm   H_m hpred
##    <chr>      <dbl> <dbl> <dbl>
##  1 E_saligna    8.6   9.5 10.5
##  2 E_saligna   11.8  11   11.7
##  3 E_saligna   NA    NA   NA
##  4 E_saligna    9.9  10.5 11.1
##  5 E_saligna    9.9  10.5 11.1
##  6 E_saligna    6.4   8.5  9.20
##  7 E_saligna    9.9  11   11.1
##  8 E_saligna   10.5  10.5 11.3
##  9 E_saligna    8.1   9.5 10.3
## 10 E_saligna   10    10   11.1
## # ... with 150 more rows
```

---

# Observed vs. predicted

```r
dados_compl %>%
  ggplot(aes(H_m, hpred)) +
  geom_point() +
  geom_abline() +
  facet_wrap(~Esp) +
  theme_bw()
```

<img src="index_files/figure-html/unnamed-chunk-22-1.png" width="40%" height="20%" style="display: block; margin: auto;" />

---

# Cheat sheets

<div class="figure" style="text-align: center">
<img src="img/cheatsheets.jpg" alt="https://www.rstudio.com/resources/cheatsheets/" width="80%" />
<p class="caption"><a href="https://www.rstudio.com/resources/cheatsheets/">https://www.rstudio.com/resources/cheatsheets/</a></p>
</div>

---

# References

- Gimenez, Olivier. 2020. Introduction to the Tidyverse. https://github.com/oliviergimenez/intro_tidyverse
- Grolemund, Garrett. 2019. Remaster the Tidyverse. https://github.com/rstudio-education/remaster-the-tidyverse
- Wickham, Charlotte. 2018. Data Science in the tidyverse. https://github.com/cwickham/data-science-in-tidyverse
- Wickham, Hadley. 2014. Tidy data. The Journal of Statistical Software 59 (10). http://www.jstatsoft.org/v59/i10/
- Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag. https://ggplot2-book.org/
- Wickham, Hadley and Garrett Grolemund. 2016. R for Data Science. O’Reilly Media. http://r4ds.had.co.nz/

---

class: inverse, center, middle

## Thank you!