Web Automation Tutorial #1 - Setup, Button Clicking, Web Scraping

TL;DR

This video introduces a web automation series using Node.js and Puppeteer. It walks through setting up the environment, installing the required packages, and writing basic code to control a browser. The tutorial covers downloading Node.js, installing Puppeteer, and using puppeteer-extra with its stealth plugin to make the automated browser harder to detect. It also demonstrates how to navigate to a website and click a button by injecting JavaScript into the page.

Setting up Node.js and Puppeteer
Installing Puppeteer Extra Plugin for browser masking
Navigating to websites and automating button clicks

Introduction to Web Automation [0:00]

The video introduces a web automation series aimed at all skill levels, from beginners to advanced coders. The series teaches viewers how to control a browser with code in order to click buttons, navigate websites, scrape data, and automate other browser tasks.

Setting Up Node.js and Puppeteer [0:30]

To begin, download Node.js from the official website, which automatically detects your operating system, and install it. Puppeteer, a Node.js library, will be used to control the browser. Create a new folder on your desktop, such as "Web Automation", open it in Visual Studio Code, and create a new file named index.js. Open the terminal in Visual Studio Code and run node --version to check the installation; if Node.js is installed correctly, the version number is printed. Next, run npm init and press Enter at each prompt to accept the defaults, which creates a package.json file. Then install Puppeteer by running npm install puppeteer.
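The setup above can be double-checked from inside Node.js itself. The following is a minimal sketch, not something shown in the video: the file name check-setup.js and the helper isInstalled are hypothetical, and the script only reports what it finds in the current folder.

```javascript
// check-setup.js: a hypothetical helper script, not part of the video.
// Run it from the project folder with: node check-setup.js

// Returns true when the module `name` can be resolved from this folder.
function isInstalled(name) {
  try {
    require.resolve(name);
    return true;
  } catch (err) {
    return false;
  }
}

// process.version holds the same value that `node --version` prints.
console.log(`Node.js version: ${process.version}`);
console.log(`puppeteer installed: ${isInstalled('puppeteer')}`);
```

If the second line reports false, running npm install puppeteer again from the project folder is the first thing to try.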

Basic Puppeteer Code [2:39]

Copy the initial code snippet from the Puppeteer documentation and paste it into index.js. Set headless to false in the launch options so the browser window is visible. Point the script at google.com, making sure the URL is complete, including the https:// prefix. Run the script with node index.js to see the browser open and navigate to Google. To adjust the viewport, use the page's setViewport function, described in the Puppeteer documentation, to set the width and height of the page.
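A script along these lines might look like the sketch below. It is not the exact snippet from the Puppeteer documentation or the video: the helper withProtocol, the viewport dimensions, and the five-second pause before closing are all assumptions added for illustration.

```javascript
// index.js: a sketch of the first script, assuming `npm install puppeteer` has been run.

// Hypothetical helper: page.goto() needs a full URL, so add https:// when the protocol is missing.
function withProtocol(url) {
  return /^https?:\/\//i.test(url) ? url : `https://${url}`;
}

async function run() {
  // Loaded inside run() so a missing install is reported by the catch() at the bottom.
  const puppeteer = require('puppeteer');

  // headless: false opens a visible browser window.
  const browser = await puppeteer.launch({ headless: false });
  try {
    const page = await browser.newPage();
    // Assumed dimensions; choose whatever suits your screen.
    await page.setViewport({ width: 1280, height: 800 });
    await page.goto(withProtocol('google.com'));
    // Keep the window open for five seconds so the result is visible.
    await new Promise((resolve) => setTimeout(resolve, 5000));
  } finally {
    await browser.close();
  }
}

run().catch((err) => console.error('Automation failed:', err.message));
```

Running node index.js should open a browser window, load Google, and close the window after the pause.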

Installing Puppeteer Extra Plugin [4:38]

To make it harder for websites to detect the automated browser, install puppeteer-extra and its stealth plugin by running npm install puppeteer-extra puppeteer-extra-plugin-stealth. The plugin masks the browser so that it looks more like one driven by a real user. After installation, require puppeteer-extra in place of puppeteer and register the stealth plugin with it. If the script then fails with an error, restructure the code: place the Puppeteer code inside an async function named run, then call run. A common cause of such errors is that await cannot be used at the top level of a script that loads its modules with require.
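The integration described above can be sketched as follows, assuming puppeteer, puppeteer-extra, and puppeteer-extra-plugin-stealth are all installed in the project folder. The target URL and the five-second pause are illustrative choices, and the stealth plugin reduces the chance of detection rather than guaranteeing it.

```javascript
// index.js with the stealth plugin: a sketch, not the exact code from the video.

async function run() {
  // Loaded inside run() so a missing install is reported by the catch() at the bottom.
  // puppeteer-extra wraps puppeteer and adds plugin support through use().
  const puppeteer = require('puppeteer-extra');
  const StealthPlugin = require('puppeteer-extra-plugin-stealth');
  puppeteer.use(StealthPlugin());

  const browser = await puppeteer.launch({ headless: false });
  try {
    const page = await browser.newPage();
    await page.goto('https://google.com');
    // Keep the window open for five seconds so the result is visible.
    await new Promise((resolve) => setTimeout(resolve, 5000));
  } finally {
    await browser.close();
  }
}

// Every await lives inside run(), so the script works with require-style modules.
run().catch((err) => console.error('Automation failed:', err.message));
```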

Automating Button Clicks [6:19]

To automate clicking a button, navigate to the desired page, such as a product page on walmart.com. Use the browser's developer tools to inspect the button's HTML and identify a unique selector for it, such as a data automation ID attribute. In the developer tools console, check the selector with document.querySelector to confirm that it returns the button. In your code, store the selector in a variable and use page.waitForSelector to wait for the button to appear on the page. Then use page.evaluate to inject JavaScript into the page that clicks the button with document.querySelector(selector).click(). This triggers the button's click event as if a user had clicked it.
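The click workflow can be sketched as below. Several names are assumptions rather than details from the video: the attribute name data-automation-id, the helpers automationIdSelector and clickButton, and the choice to pass the page URL and ID on the command line. Note that the selector has to be handed to page.evaluate as an argument, because the injected function runs in the browser and cannot see variables from the Node.js script.

```javascript
// index.js: a sketch of the button click, assuming `npm install puppeteer` has been run.

// Hypothetical helper: builds an attribute selector, assuming the button
// carries an attribute named data-automation-id.
function automationIdSelector(id) {
  return `[data-automation-id="${id}"]`;
}

// Waits for the element to appear, then clicks it from inside the page.
async function clickButton(page, selector) {
  await page.waitForSelector(selector);
  // The function passed to evaluate() runs in the browser; `selector` arrives as `sel`.
  await page.evaluate((sel) => document.querySelector(sel).click(), selector);
}

async function run(url, buttonId) {
  // Loaded inside run() so a missing install is reported by the catch() below.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ headless: false });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    await clickButton(page, automationIdSelector(buttonId));
    // Keep the window open for five seconds so the result is visible.
    await new Promise((resolve) => setTimeout(resolve, 5000));
  } finally {
    await browser.close();
  }
}

// Usage: node index.js <product-page-url> <automation-id>
const [url, buttonId] = process.argv.slice(2);
if (url && buttonId) {
  run(url, buttonId).catch((err) => console.error('Automation failed:', err.message));
} else {
  console.log('Usage: node index.js <product-page-url> <automation-id>');
}
```

Both arguments come from your own inspection of the page in the developer tools: the full product page URL and the ID value found on the button.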