Deploying web scraping for a mission-critical application takes far more than writing a scraper and running it periodically. To extract e-commerce listings, such as Amazon product data, you need infrastructure that offers scraping as a service and takes end-to-end responsibility, from running the crawler to managing the scraped data.

This post was originally published in Apify.


Use Case: Scraping Amazon Product Data

Problem Statement


Scraping product data from e-commerce websites at scale is difficult: sites restrict the number of requests they accept and block clients that exceed those limits. Websites also employ countermeasures against automated scraping, which makes scrapers harder to run and more likely to be detected.

Realization Approach


A hosted scraping service provides scalable infrastructure and anti-blocking techniques so that developers and data analysts can build and deploy mission-critical web scraping tasks.

Solution Space


With this approach, scrapers for Amazon product data run on a large pool of servers with human-like browser fingerprints, which reduces the chance of being blocked. The crawled data can also be stored and shared, and the scraping process can be monitored for performance over time.

Amazon is one of the most complex websites to scrape. That’s why we built an Amazon scraper you can use on the Apify cloud platform. It provides the infrastructure you need to scrape Amazon.

Approach for Scraping Amazon Product Data

Amazon Product Scraper is one of many ready-made e-commerce scraping tools available on Apify Store. This tool effectively creates an unofficial Amazon scraper API that enables you to get the Amazon product data you need without limits.

Here’s how you can use it to scrape Amazon in 7 simple steps:

Step 1. Go to Amazon Product Scraper on Apify Store

Click on Try for free. If you already have an Apify account, you’ll be taken straight to Apify Console, so you can skip ahead to step 3.

Screenshot of Amazon Product Scraper page on Apify Store
Go to Amazon Product Scraper to start scraping Amazon right away

Step 2. Sign up for a free Apify account

If you don’t have an Apify account, you can sign up for free using your email address, Google, or GitHub.

Screenshot of signup page for Apify scraping platform
It’s easy to sign up for Apify with your email, Google, or GitHub account, and you can scrape Amazon products for free with complimentary credits.

Step 3. Copy and paste the Amazon URL you want to scrape

Once you’re in Apify Console, insert the Amazon category or product URL from which you want to extract data. In the example below, we’ve copied and pasted the URL for the Headphones, Earbuds & Accessories category on Amazon.com. You can click on the + Add button to insert more categories or product URLs.

Screenshot of Amazon Product Scraper on Apify Store with Amazon product category URL
Use Amazon Product Scraper to extract product data from Amazon category URLs

Step 4. Select the maximum number of results you want to scrape

Insert the maximum number of items you want to scrape in the Max items field. In our example, we have set the number low and opted for just 10 results.

Select the maximum number of Amazon product items you want to scrape

Note: You can also enable optional settings to get better results:

Use Captcha solver and Scrape product variant prices

If you enable Captcha solver, the scraper will automatically solve captchas presented by Amazon. This reduces the number of request retries and speeds up the scraper.

However, this option works well only for the ‘.com’ Amazon domain, and even there Amazon omits a few product fields after a captcha is solved (specifically ‘attributes’, ‘manufacturer attributes’, and ‘bestseller ranks’).

Enabling Scrape product variant prices lets you extract a separate price for each variation of a product.

But be warned: this will increase the number of requests and extend the scraping time.
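
The settings from steps 3 and 4 can also be thought of as a single input document. The sketch below is only an illustration of that idea: the function name `build_run_input` and every key name in the returned dictionary are assumptions, not the scraper's confirmed input schema, and the URL is a placeholder. Check the scraper's input tab in Apify Console for the real field names before relying on them.

```python
from typing import Any, Dict, List


def build_run_input(
    urls: List[str],
    max_items: int = 10,
    use_captcha_solver: bool = False,
    scrape_variant_prices: bool = False,
) -> Dict[str, Any]:
    """Collect the choices from steps 3 and 4 into one dictionary.

    All key names are hypothetical placeholders for illustration.
    """
    if not urls:
        raise ValueError("at least one Amazon category or product URL is required")
    if max_items < 1:
        raise ValueError("max_items must be a positive integer")
    return {
        # Step 3: category or product URLs to start from
        "categoryOrProductUrls": [{"url": url} for url in urls],
        # Step 4: upper bound on the number of scraped items
        "maxItems": max_items,
        # Optional settings described in the note above
        "useCaptchaSolver": use_captcha_solver,
        "scrapeProductVariantPrices": scrape_variant_prices,
    }


# Placeholder URL; substitute the category or product page you want.
run_input = build_run_input(["https://www.amazon.com/s?k=headphones"], max_items=10)
```

Both optional settings default to off here, since each one trades something away: the captcha solver drops some product fields, and variant prices add requests and time.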

Step 5. Select the proxy option you want to use

You won’t get far scraping Amazon without a proxy. You can choose proxy groups from specific countries, which matters because Amazon shows the products that can be shipped to the location of the proxy you use. If globally shipped products are enough for you, you don’t need to worry about the country.

The default setting is Residential proxy, as this is the most effective for bypassing anti-scraping technologies. But you can also opt for Datacenter or your Own proxies.

Screenshot of Amazon Product Scraper on Apify Store with proxy configuration set to automatic
Different proxy options will enable you to avoid blocking or change the Amazon country
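
The three proxy choices above (Residential, Datacenter, or your own proxies) can be expressed as a small configuration object. This is a minimal sketch, not the scraper's documented input: the helper `build_proxy_config` is hypothetical, and the key names (`useApifyProxy`, `apifyProxyGroups`, `apifyProxyCountry`, `proxyUrls`) are assumptions modeled on how Apify proxy settings are commonly written, so verify them against the input shown in Apify Console.

```python
from typing import Any, Dict, List, Optional


def build_proxy_config(
    mode: str = "residential",
    country: Optional[str] = None,
    own_proxy_urls: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Return a proxy configuration for one of the three options in step 5.

    Key names are assumptions; confirm them in Apify Console.
    """
    if mode == "own":
        if not own_proxy_urls:
            raise ValueError("own proxies require at least one proxy URL")
        return {"useApifyProxy": False, "proxyUrls": list(own_proxy_urls)}
    if mode not in ("residential", "datacenter"):
        raise ValueError(f"unknown proxy mode: {mode!r}")

    config: Dict[str, Any] = {"useApifyProxy": True}
    if mode == "residential":
        # Assumption: residential proxies are selected by group name.
        config["apifyProxyGroups"] = ["RESIDENTIAL"]
    if country:
        # The proxy country influences which products Amazon shows.
        config["apifyProxyCountry"] = country.upper()
    return config


default_proxy = build_proxy_config()                 # residential, any country
german_proxy = build_proxy_config(country="de")      # residential, Germany
```

Leaving `country` unset matches the case where globally shipped products are enough.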

Step 6. Start Amazon Product Scraper

Now just click Start and wait for your results to come in. Your task will change from Running to Succeeded when it has finished.

Screenshot of Amazon Product Scraper with a completed run
Amazon Product Scraper extracts Amazon data quickly and easily
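
If you drive runs from a script rather than watching the Console, the Running to Succeeded transition described in step 6 amounts to a polling loop. The sketch below assumes you supply your own `get_status` callable (hypothetical; it could wrap any API client) and that statuses are reported as upper-case strings such as `RUNNING` and `SUCCEEDED`, which is an assumption rather than something stated in this article.

```python
import time
from typing import Callable

# Assumed set of states after which a run will not change any more.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the run reaches a terminal state.

    Raises TimeoutError if no terminal state is seen within max_polls checks.
    """
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)  # still running: wait before the next check
    raise TimeoutError(f"run did not finish within {max_polls} polls")


# Usage with a stand-in status source instead of a real API call.
fake_statuses = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
final_status = wait_for_run(lambda: next(fake_statuses), sleep=lambda _: None)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.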

Step 7. Get your data

Go to the Export results tab to see your results. You can preview and download your Amazon data in several formats: HTML table, JSON, JSONL, CSV, Excel, XML, and RSS feed.

Screenshot of Amazon Product Scraper dataset storage format options
The Apify platform offers you a range of different formats for your extracted datasets
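
If you download the results as JSON and later need a CSV with only a few columns, the conversion is straightforward with Python's standard library. This is a minimal sketch: `items_to_csv` is a hypothetical helper, and the two records are made-up placeholders that only show the shape of the data. They are not output from the scraper, and the field names `title` and `price` are assumptions.

```python
import csv
import io
from typing import Any, Dict, Iterable, List


def items_to_csv(items: Iterable[Dict[str, Any]], columns: List[str]) -> str:
    """Render selected fields of JSON-like records as CSV text.

    Fields missing from a record become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow({column: item.get(column, "") for column in columns})
    return buffer.getvalue()


# Placeholder records for illustration only (not real scraped data).
sample_items = [
    {"title": "Example product A", "price": 19.99, "url": "https://example.com/a"},
    {"title": "Example product B", "url": "https://example.com/b"},
]
csv_text = items_to_csv(sample_items, ["title", "price"])
```

The same function works on any list of dictionaries, so it can be reused for other exports.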

Here’s just some of the data from our scraping example in CSV:

Now you can download and keep the data to use it in spreadsheets, reports, or other apps. You can create as many variations on the input parameters as you like and schedule the scraper to extract Amazon product data as often as you need it.

This video goes into more detail on how to use Amazon Product Scraper:

The Legalities of Scraping Amazon Product Data

It is legal to scrape publicly available data on the internet and that includes scraping Amazon. Scraping information such as product descriptions, details, ratings, prices, or the number of reactions to a particular product is perfectly legal. You just need to be careful with personal data and copyright protection.

For instance, you may need to consider these when scraping product reviews, as the name and avatar of the reviewer may constitute personal data, while the text of the review itself may, in some cases, be copyright-protected. Always use extra caution and possibly consult with a lawyer when scraping this kind of data.

Restrictions From Amazon

While scraping publicly available data is legal, Amazon sometimes takes action to prevent scraping by rate-limiting requests, banning IP addresses, and engaging in browser fingerprinting to detect scraping bots.

Amazon generally blocks web scraping in one of two ways: a 200 OK response that requires passing a CAPTCHA, or an HTTP 503 Service Unavailable error with a message to contact sales about a paid API.
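
Because one of the two block responses arrives with a 200 OK status, checking the status code alone is not enough to tell whether a request succeeded. The sketch below shows one way a scraper might classify responses. The function `classify_response` is hypothetical, and the marker string it searches for is an assumption, not Amazon's documented wording, so treat it as a rough heuristic.

```python
# Assumed substrings that hint at a CAPTCHA page; adjust after inspecting
# real blocked responses.
CAPTCHA_MARKERS = ("captcha",)


def classify_response(status_code: int, body: str) -> str:
    """Label a response as 'ok', 'captcha', 'blocked_503', or 'other'."""
    if status_code == 503:
        return "blocked_503"
    if status_code == 200:
        lowered = body.lower()
        if any(marker in lowered for marker in CAPTCHA_MARKERS):
            # A 200 OK that still asks the client to prove it is human.
            return "captcha"
        return "ok"
    return "other"


result = classify_response(200, "<html>Please solve this CAPTCHA to continue</html>")
```

A scraper could retry or switch proxy when the label is `captcha` or `blocked_503`, and only parse the page when it is `ok`.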

There are ways to circumvent these measures, but ethical web scraping can help avoid triggering them in the first place. This includes limiting the frequency of requests, using appropriate user agents, and avoiding excessive scraping that could impact website performance.
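
Limiting the frequency of requests can be as simple as enforcing a minimum gap between consecutive calls. The following is a minimal sketch of that idea. The `Throttle` class is hypothetical, and the two-second interval is an arbitrary example value rather than a recommended limit.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the last request was released

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


# Call throttle.wait() immediately before each request.
throttle = Throttle(min_interval=2.0)
first_delay = throttle.wait()  # the first call never waits
```

The clock and sleep functions are injectable so the behavior can be checked without real delays.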

Ethical web scraping reduces the risk of getting banned or facing legal consequences while still letting you extract useful data at scale from Amazon.

About the author 

Radiostud.io Staff

Showcasing and curating a knowledge base of tech use cases from across the web.
