Project Update
Please obtain a text file (preferably .txt) containing at least 1000 words of text that you find interesting or entertaining. The text could come from a book, a news article, a Wikipedia page, or any other source. Ideally, the text will have a distinctive style that you can recognize.
Some good sources of text include:
- Project Gutenberg (a large collection of free e-books)
- Kaggle (hosts various text data sets, including dumps from Wikipedia and collections of news articles)
- Essentially any website: use select all to highlight all of the text, then copy and paste it into a new .txt file. You’ll likely want to do some cleaning to remove extraneous text (e.g., navigation bars, ads) and formatting (e.g., newlines, tabs).
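If you would like to do the formatting cleanup and the 1000-word check programmatically, the following is a minimal Python sketch. It is only one possible approach, not a required step. The function names `clean_text`, `word_count`, and `check_file` are hypothetical, and the sketch assumes that "words" means whitespace-separated tokens and that the file is UTF-8 encoded. Note that it collapses all newlines, so paragraph breaks are lost, and it does not remove navigation bars or ads, which you would still need to delete by hand.

```python
import re
from pathlib import Path


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace (newlines, tabs, repeated spaces) into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def check_file(path: str, minimum: int = 1000) -> bool:
    """Return True if the cleaned file has at least `minimum` words.

    Assumes the file is UTF-8 encoded; `path` is whatever you named your file.
    """
    cleaned = clean_text(Path(path).read_text(encoding="utf-8"))
    return word_count(cleaned) >= minimum


# Small in-memory example, so no file is needed to try it out.
sample = "Call me\tIshmael.\n\n  Some years   ago"
cleaned = clean_text(sample)
print(cleaned)
print(word_count(cleaned))
```

To check your own file, you could call `check_file` with the path to wherever you saved it.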
Please upload your .txt file to Gradescope and save it somewhere you can find it for class.