Using dplyr's slice functions to pick specific and random rows from a data frame in R (CC042)

In this screencast tutorial, Pat Schloss shows how you can use dplyr's slice functions including slice, slice_head, slice_tail, and slice_sample to pick specific and random rows from a data frame in R. We'll then modify code from a previous episode to recalculate the specificity of 16S rRNA genes for each taxonomic group at each taxonomic rank using slice_sample to incorporate randomness into the analysis. This episode is part of a larger arc of episodes investigating the sensitivity and specificity of amplicon sequence variants (ASVs), also known as exact sequence variants (ESVs). ASVs are growing in popularity for analyzing microbial communities using 16S rRNA gene sequences. Pat demonstrates these concepts by live coding at the command line interface using RStudio, GitHub Flow, and make.
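As a rough illustration of the four functions named above, here is a minimal sketch on a made-up data frame. The `genomes` tibble and its column names are hypothetical and are not the data used in the episode.

```r
library(dplyr)

# Hypothetical toy data: one row per genome (not the episode's data)
genomes <- tibble(
  genome  = paste0("genome_", 1:6),
  species = c("A", "A", "A", "B", "B", "C")
)

# slice() picks rows by position
picked <- slice(genomes, c(2, 4))

# slice_head() and slice_tail() take rows from the top and bottom
first_two <- slice_head(genomes, n = 2)
last_two  <- slice_tail(genomes, n = 2)

# slice_sample() draws rows at random, without replacement by default;
# setting a seed makes the draw repeatable
set.seed(42)
random_three <- slice_sample(genomes, n = 3)
```

Because `slice_sample()` is random, which three rows come back depends on the seed; only the number of rows is fixed.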

0:00 Introduction
2:13 Today's issue
5:46 Slice commands
9:50 Outlining approach to downsample species with pseudocode
14:10 Filling in code to address uneven sampling of species
21:41 Trying different number of genomes per species
23:07 Comparing results using git diff
25:18 Conclusion
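The downsampling idea in the chapters above, drawing the same number of genomes from each species so that heavily sequenced species do not dominate, might look something like the sketch below. This is an assumption about the general pattern (`group_by()` followed by `slice_sample()`), not the code written in the episode; the `genomes` tibble, its columns, and `genomes_per_species` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data with uneven sampling: species A has 5 genomes,
# B has 3, and C has 2 (not the episode's data)
genomes <- tibble(
  genome  = paste0("genome_", 1:10),
  species = c(rep("A", 5), rep("B", 3), rep("C", 2))
)

# Hypothetical setting: how many genomes to keep per species
genomes_per_species <- 2

set.seed(1)  # make the random draw repeatable

# Sample the same number of genomes within each species
downsampled <- genomes %>%
  group_by(species) %>%
  slice_sample(n = genomes_per_species) %>%
  ungroup()

# Confirm that every species now contributes the same number of genomes
counts <- count(downsampled, species)
```

Changing `genomes_per_species` and rerunning is one way to try different numbers of genomes per species, as the 21:41 chapter describes.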

The accompanying blog post, which contains the exercises and solutions, can be found at http://www.riffomonas.org/code_club/2...
