
With Puppeteer, you can use (headless) Chromium or Chrome to open websites, fill forms, click buttons, extract data, and generally perform any action that a human could when using a computer. This makes Puppeteer a really powerful tool for web scraping, but also for automating complex workflows on the web.

You don’t need to be familiar with Puppeteer or web scraping to enjoy this tutorial, but knowledge of HTML, CSS, and JavaScript is expected.

To showcase the basics of Puppeteer, we will create a simple scraper that extracts data about GitHub Topics. You’ll be able to select a topic, and the scraper will return information about repositories tagged with this topic. We will use Puppeteer to start a browser, open the GitHub topic page, click the Load more button to display more repositories, and then extract information about each repository.

To use Puppeteer, you’ll need Node.js and a package manager. We’ll use NPM, which comes preinstalled with Node.js. You can confirm that both are available on your machine by running:

node -v && npm -v

To get the most out of this tutorial, you need Node.js version 16 or higher. If you’re missing either Node.js or NPM or have unsupported versions, visit the installation tutorial to get started.

Related ➡️ How to install Node.js properly

Now that we know our environment checks out, let’s create a new project and install Puppeteer.

mkdir puppeteer-scraper && cd puppeteer-scraper
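The Node.js 16 requirement mentioned above can also be checked from inside a script, which may be handy if the scraper is run on machines you don’t control. The snippet below is only a minimal sketch: `isSupportedNode` is a hypothetical helper name, not part of Puppeteer or Node.js, and it compares only the major version number.

```javascript
// Hypothetical helper (not part of Puppeteer or Node.js):
// returns true when the major version is at least `minMajor`.
function isSupportedNode(version = process.versions.node, minMajor = 16) {
  // Accept both "18.17.1" and "v18.17.1" and keep only the major part.
  const major = Number.parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Report on the Node.js version that is running this script.
const current = process.versions.node;
console.log(
  isSupportedNode()
    ? `Node.js ${current} is supported.`
    : `Node.js ${current} is too old; version 16 or higher is needed.`
);
```

Checking only the major version keeps the sketch short; a stricter check would need a proper semantic-version comparison.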
