Goals:
Introduction to web scraping, or how to efficiently collect large amounts of information from the Internet.
Software:
- wget (https://www.gnu.org/software/wget/), a free software package for retrieving files via HTTP, HTTPS, FTP, and FTPS, the most widely used Internet protocols. It is a non-interactive command-line tool, so it can easily be called from scripts, cron jobs, terminals without X-Windows support, etc.
- NB: on installing wget, see Ian Milligan’s “Automated Downloading with Wget”, https://programminghistorian.org/lessons/automated-downloading-with-wget
- Alternatively, for Windows: https://builtvisible.com/download-your-website-with-wget/
- On Mac (and, possibly, Linux): brew install wget
Class:
- practical examples of working with wget
- single link download
- batch download
- web-page analysis
- extraction of links with regular expressions
- modification of links with regular expressions
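The two regular-expression steps in the outline (extraction and modification of links) can be sketched in Python as an alternative to shell tools. The HTML fragment and the example.org root below are invented for illustration; a real page's markup will differ:

```python
import re

# A made-up HTML fragment standing in for a downloaded index page
html = """
<a href="/hopper/text?doc=1860-01-02">Jan 2, 1860</a>
<a href="/hopper/text?doc=1860-01-03">Jan 3, 1860</a>
<a href="/about.html">About</a>
"""

# 1) Extraction: pull every href value out of the page
links = re.findall(r'href="([^"]+)"', html)

# 2) Modification: keep only issue links and prepend a (hypothetical) site root
issue_links = ["http://example.org" + l
               for l in links if re.search(r'doc=\d{4}', l)]

for l in issue_links:
    print(l)
```

The resulting list of absolute URLs is exactly what `wget -i` expects: one link per line in a text file.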
Sample commands
wget link
wget -i file_with_links.txt
wget -i links.txt -P ./folderYouWantToSaveTo/ -nc
Where:
-P is a folder parameter, which instructs wget where to store the downloaded files (optional).
-nc is a no-clobber parameter, which instructs wget to skip files if they already exist (optional).
NB: there are many other parameters with which you can adjust wget to your needs.
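For the batch command above, file_with_links.txt can be written by hand or generated programmatically. A minimal Python sketch, assuming a made-up article URL pattern (the example.org addresses are placeholders, not real download targets):

```python
# Generate a list of (hypothetical) article URLs, one per line,
# suitable for: wget -i links.txt -P ./folderYouWantToSaveTo/ -nc
base = "http://example.org/articles/article_{:02d}.html"  # invented pattern
urls = [base.format(n) for n in range(1, 16)]  # Article 01 .. Article 15

with open("links.txt", "w") as f:
    f.write("\n".join(urls) + "\n")

print(len(urls), "links written")
```

Running `wget -i links.txt -P ./downloads/ -nc` would then fetch each listed URL into ./downloads/, skipping any file already present.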
Examples for Downloading
Practice 1: very easy
- Article 01
- Article 02
- Article 03
- Article 04
- Article 05
- Article 06
- Article 07
- Article 08
- Article 09
- Article 10
- Article 11
- Article 12
- Article 13
- Article 14
- Article 15
Practice 2: easy-ish
- Article 16
- Article 17
- Article 18
- Article 19
- Article 20
- Article 21
- Article 22
- Article 23
- Article 24
- Article 25
- Article 26
- Article 27
- Article 28
- Article 29
- Article 30
- Article 31
- Article 32
- Article 33
- Article 34
- Article 35
- Article 36
- Article 37
- Article 38
- Article 39
Practice 3 (aka Homework): a tiny-bit tricky
- download issues of the “Richmond Times Dispatch” (years 1860-1865 only!), which are available at: http://www.perseus.tufts.edu/hopper/collection?collection=Perseus:collection:RichTimes
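One way to approach this task is to save the collection page, extract the issue links with a regular expression, filter them by year, and feed the result to wget. The sketch below assumes a simplified href format and uses a stand-in HTML snippet; the actual markup on perseus.tufts.edu will need to be inspected first:

```python
import re

# Stand-in for the downloaded collection page; the real href format on
# perseus.tufts.edu may differ, so treat this fragment as an assumption
html = """
<a href="text?doc=Perseus:text:2006.05.0001">1860-11-09</a>
<a href="text?doc=Perseus:text:2006.05.0002">1866-03-01</a>
"""

root = "http://www.perseus.tufts.edu/hopper/"

# Pair each relative link with its visible date, keep only 1860-1865 issues
pairs = re.findall(r'href="([^"]+)">(\d{4})-\d{2}-\d{2}<', html)
wanted = [root + href for href, year in pairs if 1860 <= int(year) <= 1865]

with open("dispatch_links.txt", "w") as f:
    f.write("\n".join(wanted) + "\n")
# The file can then be fed to wget, e.g.:
#   wget -i dispatch_links.txt -P ./dispatch/ -nc -w 2
# (-w 2 pauses two seconds between requests, to be polite to the server)
```

The year filter is the "tiny-bit tricky" part: the collection also contains post-1865 material, so the links must be filtered, not just extracted.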
Reference Materials:
- Milligan, Ian. 2012. “Automated Downloading with Wget.” Programming Historian, June. https://programminghistorian.org/lessons/automated-downloading-with-wget.
- Kurschinski, Kellen. 2013. “Applied Archival Downloading with Wget.” Programming Historian, September. https://programminghistorian.org/lessons/applied-archival-downloading-with-wget.
- Alternatively, this operation can be done with a Python script: Turkel, William J., and Adam Crymble. 2012. “Downloading Web Pages with Python.” Programming Historian, July. https://programminghistorian.org/lessons/working-with-web-pages.
Homework:
- Scraping the “Dispatch”: download issues of the “Richmond Times Dispatch” (years 1860-1865 only!), which are available at: http://www.perseus.tufts.edu/hopper/collection?collection=Perseus:collection:RichTimes
- Publish a step-by-step explanation of what you have done as a blog post on your website.
- Codecademy’s Learn Python, Unit 7-8.
- GitHub: publish the confirmation screenshot as a post on your new site.