Tidyverse

Collection of R packages

The tidyverse is a collection of open source packages for the R programming language introduced by Hadley Wickham[4] and his team that "share an underlying design philosophy, grammar, and data structures" of tidy data.[5] Characteristic features of tidyverse packages include extensive use of non-standard evaluation and encouraging piping.[6][7][8]
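The two features named above can be illustrated with a minimal sketch. It assumes the dplyr package is installed and uses R's built-in mtcars data set; the choice of columns and the threshold are arbitrary and only serve the illustration. Column names such as hp, cyl, and mpg are written unquoted inside the verbs (non-standard evaluation), and the %>% operator passes each intermediate result to the next step (piping).

```r
library(dplyr)

# mtcars ships with R; hp, cyl and mpg are columns of that data frame.
# Inside dplyr verbs they are referenced unquoted (non-standard evaluation).
by_cyl <- mtcars %>%
  filter(hp > 100) %>%                        # keep cars with more than 100 hp
  group_by(cyl) %>%                           # one group per cylinder count
  summarise(mean_mpg = mean(mpg), n = n()) %>%  # per-group mean and row count
  arrange(desc(mean_mpg))                     # most fuel-efficient group first

print(by_cyl)
```

Read top to bottom, the chain mirrors the order in which the operations are applied, which is the readability argument usually made for piping.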

Initial release: September 15, 2016[1][2]
Stable release: 2.0.0[3] (23 February 2023)
Written in: R
Type: Package collection
License: MIT
Website: www.tidyverse.org
Repository: github.com/tidyverse/tidyverse

As of November 2018, the tidyverse package and some of its individual packages accounted for 5 of the 10 most downloaded R packages.[9] The tidyverse is the subject of multiple books and papers.[10][11][12][13] In 2019, a paper describing the ecosystem was published in the Journal of Open Source Software.[14]

Its syntax has been referred to as "supremely readable",[15] and some[16] have argued that the tidyverse is an effective way to introduce complete beginners to programming, because it allows students to begin doing data processing tasks quickly.[17][16] Some practitioners have also pointed out that data processing tasks are intuitively easier to chain together with the tidyverse than with pandas, Python's equivalent data processing package.[18] There is also an active R community around the tidyverse. For example, the TidyTuesday social data project, organised by the Data Science Learning Community (DSLC),[19] releases a varied real-world dataset each week so that community members can participate, share their work, and practise working with data.[20] Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their built-in, base R equivalents and that are too dissimilar to some programming languages.[21][22]

More generally, the tidyverse principles encourage a universe of streamlined packages that, in principle, helps alleviate dependency issues and maintain compatibility with current and future features.[23] An example of this tidyverse-principled approach is the pharmaverse, a collection of R packages for clinical reporting in the pharmaceutical industry.[24]

Packages

The core tidyverse packages, which provide functionality to model, transform, and visualize data, include:[25]

  • tidyr – helps transform data into tidy data, a table in which each row is a single observation, each column is a variable describing the observation, and each cell holds a single value
  • ggplot2 – for data visualization
  • dplyr – for wrangling and transforming data
  • readr – helps read common delimited text files containing data
  • purrr – a functional programming toolkit
  • tibble – a modern implementation of the built-in data frame data structure
  • stringr – helps manipulate string data
  • forcats – helps manipulate categorical data (factors)
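The notion of tidy data used in the list above can be sketched with tibble and tidyr. The example assumes both packages are installed; the table contents, the column names country, year, and cases, and the object names wide and tidy are hypothetical and chosen only for illustration.

```r
library(tibble)
library(tidyr)

# Hypothetical "wide" table: one row per country, one column per year.
# Here the year (a variable) is hidden in the column names.
wide <- tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(12, 24)
)

# pivot_longer() reshapes the table so that each row is one observation
# (a country in a given year) and each variable has its own column.
tidy <- pivot_longer(
  wide,
  cols      = !country,   # every column except country holds measurements
  names_to  = "year",     # former column names become values of `year`
  values_to = "cases"     # former cell values go into `cases`
)

print(tidy)
```

In the reshaped table, year is an ordinary column, so it can be filtered, grouped, or plotted like any other variable.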

Additional packages assist the core collection.[26] Other packages based on the tidy data principles are regularly developed, such as tidytext[27] for text analysis, tidymodels[28] for machine learning, or tidyquant[29] for financial operations.

References
