Text analysis in R. Part 1: Preprocessing

This is a short series of videos on the basics of computational text analysis in R. It is loosely inspired by our Text analysis in R paper (http://vanatteveldt.com/p/welbers-tex...), closely related to our R course material GitHub page (https://github.com/ccs-amsterdam/r-co...), and 42% a love letter to quanteda.



#### Useful links ####

Low-level string processing:
A good place to start is by learning how to use the stringr package. (I personally prefer the stringi package because I'm used to it, but stringr is probably more accessible to most people, as it has that tidyverse flair.)

stringr vignette:
https://cran.r-project.org/web/packag...



Another great resource on stringr is the R for Data Science book, which also covers regular expressions in more depth:
https://r4ds.had.co.nz/strings.html
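
To give a flavour of the low-level string processing that stringr makes easy, here is a minimal sketch of a few common cleaning steps. The example texts and the variable names (txt, clean, tokens) are made up for illustration and are not taken from the video; splitting on spaces is a deliberately naive stand-in for a real tokenizer.

```r
library(stringr)

# Hypothetical example texts (assumption, not from the video)
txt <- c("  Text   analysis in R!  ", "Part 1: PREPROCESSING")

clean <- str_to_lower(txt)                           # lowercase everything
clean <- str_replace_all(clean, "[[:punct:]]", " ")  # replace punctuation with a space
clean <- str_squish(clean)                           # trim ends and collapse repeated whitespace

# Naive tokenization: split each cleaned text on single spaces
tokens <- str_split(clean, " ")
```

Here tokens is a list with one character vector of words per input text. For real analyses a dedicated tokenizer (for example the one in quanteda) is likely the better choice.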

Character encoding:
'What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text' by David C. Zentgraf: https://kunststube.net/encoding/

'String encoding and R' by Kevin Ushey: https://kevinushey.github.io/blog/201...
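
As a small illustration of why encodings matter, the following base R sketch shows the same word stored as different bytes in UTF-8 and in Latin-1. The example word and variable names are assumptions chosen for illustration; only base R functions (charToRaw, iconv, nchar) are used.

```r
# "café", written with a Unicode escape so the source file's own encoding does not matter
x <- "caf\u00e9"

# In UTF-8 the "é" takes two bytes (c3 a9)
utf8_bytes <- charToRaw(x)

# Convert to Latin-1, where the "é" is a single byte (e9)
latin1 <- iconv(x, from = "UTF-8", to = "latin1")
latin1_bytes <- charToRaw(latin1)

n_chars <- nchar(x, type = "chars")  # number of characters
n_bytes <- nchar(x, type = "bytes")  # number of bytes in the UTF-8 representation
```

The character count stays the same while the byte count differs between the two encodings, which is why reading a file with the wrong encoding tends to produce garbled text.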
