Web Crawling with Node.js and cheerio
If you are a web developer and want to get started with web crawling without experience in a language like Python, Node.js is a good choice because it lets you build on your existing JavaScript knowledge. To start crawling, you need nothing more than an installed Node.js environment and access to a shell. If Node.js is not installed yet, you can download the installer here: https://nodejs.org/en/download/
After installing Node.js, create a new directory. Open your terminal and change your current working directory to the directory you have just created. Now execute the command npm init. Fill in the requested data (every field is optional, so you can simply press ENTER to accept the defaults). The command npm init sets up a new Node.js project by creating a package.json file. For the first crawling project, install two packages: cheerio, for parsing and working with HTML data, and axios, for making the HTTP requests to a site.