How To Make Playwright Undetectable To Anti-Bots
Several techniques can make Playwright harder for websites’ anti-bot mechanisms to detect.
Here are some of the most commonly used methods.
1. Setting User-Agent Strings
The User-Agent request header is a characteristic string that lets servers and network peers identify the application, operating system, vendor, and/or version of the requesting browser.
We can modify the user-agent string to mimic a popular browser, making sure it matches the browser version and operating system we are actually running. We can set the browser's user-agent string in Playwright like this:
const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
});
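Because the user-agent string should match the browser version, it can help to check the two against each other before using a hard-coded string. The following is a minimal sketch, not part of Playwright itself: the helper names chromeMajorFromUserAgent and userAgentMatchesBrowser are hypothetical, and the real version is assumed to be a dotted string such as the one returned by Playwright's browser.version().

```javascript
// Extract the Chrome major version from a user-agent string.
// Returns null when the string has no "Chrome/<version>" token.
function chromeMajorFromUserAgent(userAgent) {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : null;
}

// Compare the spoofed major version with the real one.
// `realVersion` is assumed to be a dotted version string
// (for example, the value returned by browser.version()).
function userAgentMatchesBrowser(userAgent, realVersion) {
  const spoofed = chromeMajorFromUserAgent(userAgent);
  const real = Number(String(realVersion).split(".")[0]);
  return spoofed !== null && spoofed === real;
}

const ua =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36";

console.log(chromeMajorFromUserAgent(ua)); // 91
console.log(userAgentMatchesBrowser(ua, "91.0.4472.124")); // true
console.log(userAgentMatchesBrowser(ua, "124.0.6367.29")); // false
```

In a real script, the second argument would come from await browser.version(), and a mismatch would be a signal to update the hard-coded string.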
2. Enabling WebGL and Hardware Acceleration
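We can make sure WebGL and accelerated canvas rendering are available, as they would be in a typical human-operated browser, by passing extra arguments when launching Chromium. Note that --use-gl=swiftshader selects SwiftShader, a software (CPU-based) renderer, so it makes WebGL available on machines without a GPU rather than enabling true hardware acceleration.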
const browser = await chromium.launch({
  args: [
    '--enable-webgl',
    '--use-gl=swiftshader',
    '--enable-accelerated-2d-canvas'
  ]
});
3. Mimicking Human-Like Browser Environments
We can configure our browser context to reflect common user settings, such as viewport size, language, and time zone.
const context = await browser.newContext({
  viewport: { width: 1280, height: 720 },
  locale: 'en-US',
  timezoneId: 'America/New_York'
});
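If every session uses exactly the same viewport, language, and time zone, that repetition can itself become a pattern. One possible approach is to keep a small pool of internally consistent profiles and pick one per context. The sketch below is only an illustration: the profile values and the pickProfile helper are assumptions, while viewport, locale, and timezoneId are the same context options used above.

```javascript
// A small pool of coherent settings (illustrative values only).
// Each profile keeps locale and time zone plausible together.
const profiles = [
  { viewport: { width: 1920, height: 1080 }, locale: "en-US", timezoneId: "America/New_York" },
  { viewport: { width: 1366, height: 768 }, locale: "en-GB", timezoneId: "Europe/London" },
  { viewport: { width: 1536, height: 864 }, locale: "de-DE", timezoneId: "Europe/Berlin" },
];

// Pick one profile at random and return a copy, so callers can
// modify the result without changing the shared pool.
// `random` is injectable to make the choice reproducible in tests.
function pickProfile(pool, random = Math.random) {
  const index = Math.floor(random() * pool.length);
  const chosen = pool[index];
  return { ...chosen, viewport: { ...chosen.viewport } };
}

const options = pickProfile(profiles);
console.log(options);
```

The returned object can be passed directly as the argument to browser.newContext().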
4. Using Rotating Residential Proxies
Residential proxies are proxy servers that use IP addresses assigned by Internet Service Providers (ISPs) to regular household users. They differ from datacenter proxies, which use IP addresses provided by cloud service providers and are more easily detected and blocked by websites. Because residential proxies use real ISP-assigned addresses, it is harder for websites to detect and block your requests.
You can also use rotating residential proxies, which change your IP address regularly so that you do not send many requests from the same IP, a pattern that can raise red flags.
const browser = await chromium.launch({
  proxy: {
    server: 'http://myproxy.com:3128',
    username: 'usr',
    password: 'pwd'
  }
});
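The snippet above uses a single proxy. When the rotation is handled on our side rather than by the proxy provider, one simple option is a round-robin rotator that hands out the next proxy on each call. This is a minimal sketch under stated assumptions: createProxyRotator is a hypothetical helper, and the hosts and credentials are placeholders, not real endpoints.

```javascript
// Create a round-robin rotator over a list of proxy configurations.
// Each entry has the same shape as the `proxy` option shown above:
// { server, username, password }.
function createProxyRotator(proxies) {
  if (!Array.isArray(proxies) || proxies.length === 0) {
    throw new Error("At least one proxy is required");
  }
  let index = 0;
  return {
    // Return the next proxy and advance, wrapping around at the end.
    next() {
      const proxy = proxies[index];
      index = (index + 1) % proxies.length;
      return { ...proxy };
    },
  };
}

// Placeholder endpoints and credentials (assumptions for illustration).
const rotator = createProxyRotator([
  { server: "http://proxy1.example.com:3128", username: "usr", password: "pwd" },
  { server: "http://proxy2.example.com:3128", username: "usr", password: "pwd" },
]);

console.log(rotator.next().server); // http://proxy1.example.com:3128
console.log(rotator.next().server); // http://proxy2.example.com:3128
console.log(rotator.next().server); // http://proxy1.example.com:3128
```

Each call to rotator.next() returns an object that can be used as the proxy value when launching a new browser, so every new session goes out through a different address.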
5. Mimicking Human Behavior
We can mimic human behavior when interacting with a website to avoid detection. Introducing realistic delays, random mouse movements and natural scrolling is essential. Here are some techniques to achieve this:
const { chromium } = require("playwright");

(async () => {
  // Launch a new browser instance
  const browser = await chromium.launch({ headless: false }); // Set headless to true if you don't need to see the browser
  const context = await browser.newContext();
  const page = await context.newPage();

  // Navigate to the target website and wait until the DOM content has loaded
  await page.goto("https://www.saucedemo.com/", {
    waitUntil: "domcontentloaded",
  });

  // Function to generate a random delay between 50ms and 200ms
  const getRandomDelay = () => Math.random() * (200 - 50) + 50;

  // Type into the username field. The delay is picked once per call
  // and then applied between every key press in that call.
  // ("text" is a placeholder; newer Playwright versions recommend
  // locator.pressSequentially() over page.type().)
  await page.type("#user-name", "text", { delay: getRandomDelay() });

  // Type into the password field in the same way
  await page.type("#password", "text", { delay: getRandomDelay() });

  // Click the login button, holding the mouse button down for a random time before releasing it
  await page.click("#login-button", { delay: getRandomDelay() });

  // Scroll down by one viewport height to simulate reading content after login
  await page.evaluate(() => {
    window.scrollBy(0, window.innerHeight);
  });

  // Move the mouse to a random position to simulate human interaction
  await page.mouse.move(Math.random() * 800, Math.random() * 600);

  // Close the browser
  await browser.close();
})();
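The script above picks one delay per call and moves the mouse in a single jump. A slightly more human-looking variant draws a fresh delay for every key press and moves the pointer along a jittered path. The helpers below are a sketch only: randomInt and mousePath are hypothetical names, and the jitter and step counts are arbitrary assumptions rather than tuned values.

```javascript
// Random integer in [min, max], inclusive. `random` is injectable for tests.
function randomInt(min, max, random = Math.random) {
  return Math.floor(random() * (max - min + 1)) + min;
}

// Build a list of points from `from` to `to` with small random offsets,
// so the pointer does not travel in a perfectly straight line.
// The final point is exactly `to` (no jitter) so the target is still hit.
function mousePath(from, to, steps = 20, jitter = 3, random = Math.random) {
  const points = [];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps;
    const isLast = i === steps;
    const dx = isLast ? 0 : (random() * 2 - 1) * jitter;
    const dy = isLast ? 0 : (random() * 2 - 1) * jitter;
    points.push({
      x: from.x + (to.x - from.x) * t + dx,
      y: from.y + (to.y - from.y) * t + dy,
    });
  }
  return points;
}

const path = mousePath({ x: 100, y: 100 }, { x: 500, y: 300 });
console.log(path.length); // 20
console.log(path[path.length - 1]); // { x: 500, y: 300 }

// Possible use inside a Playwright script (not executed here):
//   for (const point of path) {
//     await page.mouse.move(point.x, point.y);
//   }
//   for (const char of "text") {
//     await page.keyboard.type(char);
//     await page.waitForTimeout(randomInt(50, 200));
//   }
```

Typing one character at a time with its own pause gives an uneven rhythm, which is closer to how people actually type than a fixed interval between keys.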
6. Using Playwright with Real Browsers
We can run Playwright with a full (headed) browser rather than in headless mode to reduce the likelihood of detection. You can launch Chromium or Firefox this way by following the script below: