Jordan Hiatt

PySpark with Reddit Data

A project that tests the limits of PySpark for handling big data. It uses a Random Forest model to classify Reddit comments by subreddit.
