
Basic Web Scraper

Abstract

This beginner-level project builds a basic web scraper that collects product data from a website and stores it in a CSV file. It uses the requests library to download the page and the BeautifulSoup library to parse the HTML. The target page is https://webscraper.io/test-sites/e-commerce/allinone/phones/touch, and for each product the script collects the name, price, description, review count, rating, and image path.

Prerequisites

  • Python 3.6 or above
  • BeautifulSoup library
  • requests library
  • Text editor or IDE

Before we start

Before we start, install the BeautifulSoup and requests libraries. To install BeautifulSoup, run the following command in the terminal.

command
C:\Users\username>pip install beautifulsoup4

To install the requests library, run the following command in the terminal.

command
C:\Users\username>pip install requests
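
To confirm that both installs worked before writing any scraper code, a quick check like the one below can help. It is only a sketch: the helper name `missing_packages` is made up for this example, and it relies on the standard-library `importlib.util.find_spec` to look for the `bs4` and `requests` modules without importing them.

```python
import sys
from importlib.util import find_spec


def missing_packages(names):
    """Return the module names from `names` that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


# The tutorial asks for Python 3.6 or above.
python_ok = sys.version_info >= (3, 6)
print("Python 3.6+:", python_ok)

# bs4 is the import name of the beautifulsoup4 package.
missing = missing_packages(["bs4", "requests"])
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If a name is reported as missing, rerunning the matching pip command above is the first thing to try.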

Getting Started

Creating a project

  1. Create a folder named basicwebscrapper and open it in the text editor or IDE.
  2. Create a file named basicwebscrapper.py in the basicwebscrapper folder.
  3. Open the basicwebscrapper.py file in the text editor or IDE.

Write the code

  1. Copy and paste the following code into the basicwebscrapper.py file.
Basic Web Scrapper
# Basic Web Scrapper
 
# Importing Libraries
import requests
from bs4 import BeautifulSoup
 
# URL
url = "https://webscraper.io/test-sites/e-commerce/allinone/phones/touch"
 
# Requesting the URL
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
 
# Finding the phones
phones = soup.find_all("div", class_="card-body")
 
# Creating a CSV file
open_file = open("phones.csv", "a")
headers = "Name, Price, Description, Reviews, Rating, Image\n"
open_file.write(headers)
 
# Looping through the phones
for phone in phones:
    name = phone.find("a", class_="title")
    price = phone.find("h4", class_="price")
    description = phone.find("p", class_="description")
    reviews = phone.find("p", class_="float-end review-count")
    rating = phone.find("p", attrs={"data-rating": True})
    image = phone.find("img", class_="img-responsive")["src"]
    
    # Writing to the CSV file
    open_file.write(f'{name.text}, {price.text}, {description.text}, {reviews.text}, {rating["data-rating"]}, {image}\n')
    print(f'Name: {name.text} \nPrice: {price.text} \nDescription: {description.text} \nReviews: {reviews.text} \nRating: {rating["data-rating"]} \nImage: {image} \n')
 
# Closing the CSV file
open_file.close()   
     
  2. Save the file.
  3. Open the terminal in the basicwebscrapper folder.
  4. Run the following command in the terminal.
command
C:\Users\username\basicwebscrapper>python basicwebscrapper.py
Name: Nokia 123 
Price: $24.99 
Description: 7 day battery
Reviews: 11 reviews
Rating: 3
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: LG Optimus
Price: $57.99
Description: 3.2" screen
Reviews: 11 reviews
Rating: 3
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Samsung Galaxy
Price: $93.99
Description: 5 mpx. Android 5.0
Reviews: 3 reviews
Rating: 3
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Nokia X
Price: $109.99
Description: Andoid, Jolla dualboot
Reviews: 4 reviews
Rating: 4
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Sony Xperia
Price: $118.99
Description: GPS, waterproof
Reviews: 6 reviews
Rating: 1
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Ubuntu Edge
Price: $499.99
Description: Sapphire glass
Reviews: 2 reviews
Rating: 1
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Iphone
Price: $899.99
Description: White
Reviews: 10 reviews
Rating: 1
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Iphone
Price: $899.99
Description: Silver
Reviews: 8 reviews
Rating: 2
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Iphone
Price: $899.99
Description: Black
Reviews: 1 reviews
Rating: 1
Image: /images/test-sites/e-commerce/items/cart2.png
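
The values printed above are plain strings: the price keeps its dollar sign, the review count keeps the word "reviews", and the image is a path relative to the site root. If numeric or absolute values are needed later, a small cleanup step could look like the sketch below. The function names are hypothetical and not part of the original script; `urljoin` comes from the standard library.

```python
from urllib.parse import urljoin

PAGE_URL = "https://webscraper.io/test-sites/e-commerce/allinone/phones/touch"


def parse_price(text):
    """Convert a price such as '$24.99' to a float."""
    return float(text.strip().lstrip("$").replace(",", ""))


def parse_review_count(text):
    """Convert a label such as '11 reviews' to an int."""
    return int(text.split()[0])


def absolute_image_url(src, page_url=PAGE_URL):
    """Resolve an image path against the page it was scraped from."""
    return urljoin(page_url, src)


print(parse_price("$24.99"))             # 24.99
print(parse_review_count("11 reviews"))  # 11
print(absolute_image_url("/images/test-sites/e-commerce/items/cart2.png"))
```

Keeping the cleanup in separate functions leaves the scraping loop unchanged and makes each conversion easy to check on its own.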

Explanation

  1. First, we import the requests library and the BeautifulSoup class from the bs4 package.
basicwebscrapper.py
import requests
from bs4 import BeautifulSoup
  2. Then, we assign the URL to the variable url.
basicwebscrapper.py
url = "https://webscraper.io/test-sites/e-commerce/allinone/phones/touch"
  3. Next, we request the URL with requests.get and assign the result to the variable response. Then, we parse the HTML in response.text using Python's built-in html.parser and assign the parsed document to the variable soup.
basicwebscrapper.py
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
  4. After that, we find all the div elements with the class card-body, one per phone, and assign the resulting list to the variable phones.
basicwebscrapper.py
phones = soup.find_all("div", class_="card-body")
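  5. Then, we open a CSV file named phones.csv in append mode, which creates the file if it does not exist, and write the header row to it. Because the file is opened in append mode, each run adds another header row and another set of data rows to the end of the file.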
basicwebscrapper.py
open_file = open("phones.csv", "a")
headers = "Name, Price, Description, Reviews, Rating, Image\n"
open_file.write(headers)
  6. Next, we loop through the phones and find the name, price, description, reviews, rating, and image of each phone.
basicwebscrapper.py
name = phone.find("a", class_="title")
price = phone.find("h4", class_="price")

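The description, reviews, rating, and image are found with similar find calls.

After that, we write the data to the CSV file as one comma-separated line per phone.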

basicwebscrapper.py
open_file.write(f'{name.text}, {price.text}, {description.text}, {reviews.text}, {rating["data-rating"]}, {image}\n')
  7. Finally, we close the CSV file.
basicwebscrapper.py
open_file.close()
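
The parsing steps above can also be sketched as a function that works on an HTML string, which makes the selectors easy to try without a network request. This is an illustration, not the tutorial's script: the `SAMPLE_HTML` markup is a hypothetical, trimmed-down product card, and `parse_phones` and `text_or_empty` are names invented here. The sketch also returns an empty string when an element is missing, whereas the original code would raise an error in that case.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics a single product card on the page.
SAMPLE_HTML = """
<div class="card-body">
  <img class="img-responsive" src="/images/item.png">
  <h4 class="price">$24.99</h4>
  <a class="title" href="#">Nokia 123</a>
  <p class="description">7 day battery</p>
  <p class="float-end review-count">11 reviews</p>
  <p data-rating="3"></p>
</div>
"""


def text_or_empty(tag):
    """Return the stripped text of a tag, or '' when the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else ""


def parse_phones(html):
    """Return one dictionary per element with the class card-body."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="card-body"):
        rating = card.find("p", attrs={"data-rating": True})
        image = card.find("img", class_="img-responsive")
        rows.append({
            "name": text_or_empty(card.find("a", class_="title")),
            "price": text_or_empty(card.find("h4", class_="price")),
            "description": text_or_empty(card.find("p", class_="description")),
            "reviews": text_or_empty(card.find("p", class_="review-count")),
            "rating": rating["data-rating"] if rating is not None else "",
            "image": image["src"] if image is not None else "",
        })
    return rows


rows = parse_phones(SAMPLE_HTML)
print(rows[0]["name"], rows[0]["rating"])
```

To run it against the live page, pass `response.text` from the request step to `parse_phones` instead of the sample string.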

Usage

  1. Open the terminal in the basicwebscrapper folder.
  2. Run the following command in the terminal.
command
C:\Users\username\basicwebscrapper>python basicwebscrapper.py
  3. The data is scraped from the website and stored in the phones.csv file.
phones.csv
Name, Price, Description, Reviews, Rating, Image
Nokia 123, $24.99, 7 day battery, 11 reviews, 3, /images/test-sites/e-commerce/items/cart2.png
LG Optimus, $57.99, 3.2" screen, 11 reviews, 3, /images/test-sites/e-commerce/items/cart2.png
Samsung Galaxy, $93.99, 5 mpx. Android 5.0, 3 reviews, 3, /images/test-sites/e-commerce/items/cart2.png
Nokia X, $109.99, Andoid, Jolla dualboot, 4 reviews, 4, /images/test-sites/e-commerce/items/cart2.png
Sony Xperia, $118.99, GPS, waterproof, 6 reviews, 1, /images/test-sites/e-commerce/items/cart2.png
Ubuntu Edge, $499.99, Sapphire glass, 2 reviews, 1, /images/test-sites/e-commerce/items/cart2.png
Iphone, $899.99, White, 10 reviews, 1, /images/test-sites/e-commerce/items/cart2.png
Iphone, $899.99, Silver, 8 reviews, 2, /images/test-sites/e-commerce/items/cart2.png
Iphone, $899.99, Black, 1 reviews, 1, /images/test-sites/e-commerce/items/cart2.png
  4. The data is also printed in the terminal.
command
C:\Users\username\basicwebscrapper>python basicwebscrapper.py
Name: Nokia 123
Price: $24.99
Description: 7 day battery
Reviews: 11 reviews
Rating: 3
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: LG Optimus
Price: $57.99
Description: 3.2" screen
Reviews: 11 reviews
Rating: 3
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Samsung Galaxy
Price: $93.99
Description: 5 mpx. Android 5.0
Reviews: 3 reviews
Rating: 3
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Nokia X
Price: $109.99
Description: Andoid, Jolla dualboot
Reviews: 4 reviews
Rating: 4
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Sony Xperia
Price: $118.99
Description: GPS, waterproof
Reviews: 6 reviews
Rating: 1
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Ubuntu Edge
Price: $499.99
Description: Sapphire glass
Reviews: 2 reviews
Rating: 1
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Iphone
Price: $899.99
Description: White
Reviews: 10 reviews
Rating: 1
Image: /images/test-sites/e-commerce/items/cart2.png
 
Name: Iphone
Price: $899.99
Description: Silver
Reviews: 8 reviews
Rating: 2
Image: /images/test-sites/e-commerce/items/cart2.png
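
Note that some descriptions in the output above contain commas (for example `Andoid, Jolla dualboot`), so joining fields with `, ` produces rows with more columns than the header. One way to avoid this, shown here as a sketch rather than as the tutorial's code, is Python's built-in `csv` module, which quotes such fields. Opening the file in `"w"` mode instead of `"a"` would also keep the header from being written again on every run. The `write_rows` helper and the sample rows are illustrative.

```python
import csv
import io

HEADER = ["Name", "Price", "Description", "Reviews", "Rating", "Image"]


def write_rows(file_obj, rows):
    """Write the header and the product rows, quoting fields where needed."""
    writer = csv.writer(file_obj)
    writer.writerow(HEADER)
    writer.writerows(rows)


sample_rows = [
    ["Nokia X", "$109.99", "Andoid, Jolla dualboot", "4 reviews", "4",
     "/images/test-sites/e-commerce/items/cart2.png"],
    ["LG Optimus", "$57.99", '3.2" screen', "11 reviews", "3",
     "/images/test-sites/e-commerce/items/cart2.png"],
]

# An in-memory buffer stands in for the file here. With a real file, use
# open("phones.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_rows(buffer, sample_rows)
print(buffer.getvalue())
```

Reading the result back with `csv.reader` returns the original six fields per row, because the quoted commas are no longer treated as separators.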

Next Steps

Congratulations 🎉 You have successfully created a basic web scraper that scrapes data from a website and stores it in a CSV file.

Here are some ideas to get you started:

  1. Scrape the laptops page at https://webscraper.io/test-sites/e-commerce/allinone/computers/laptops and store the data in a CSV file.
  2. Scrape the tablets page at https://webscraper.io/test-sites/e-commerce/allinone/computers/tablets and store the data in a CSV file.
  3. Add a feature that stores the data scraped from https://webscraper.io/test-sites/e-commerce/allinone/phones/touch in a JSON file.
  4. Add a feature that stores the same data in a database.
  5. Create a GUI for the application.
  6. Create a web application for the application.
  7. Add a cron job that scrapes https://webscraper.io/test-sites/e-commerce/allinone/phones/touch and stores the data in a CSV file every 24 hours.
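
For ideas 3 and 4, the standard library already covers both targets. The sketch below assumes each scraped phone has been collected into a dictionary first (the original script writes rows directly, so this list of dictionaries is an assumption), then stores the same rows as JSON and in an SQLite table. The file name `phones.json`, the table name `phones`, and both helper names are placeholders.

```python
import json
import sqlite3

# Stand-in for data collected by the scraper.
phones = [
    {"name": "Nokia 123", "price": "$24.99", "description": "7 day battery"},
    {"name": "LG Optimus", "price": "$57.99", "description": '3.2" screen'},
]


def to_json(rows):
    """Serialize the rows as an indented JSON string."""
    return json.dumps(rows, indent=2)


def save_to_sqlite(connection, rows):
    """Create the table if needed and insert one record per row."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS phones (name TEXT, price TEXT, description TEXT)"
    )
    connection.executemany(
        "INSERT INTO phones VALUES (:name, :price, :description)", rows
    )
    connection.commit()


# To persist the JSON, write json_text to a file such as phones.json.
json_text = to_json(phones)

# ":memory:" keeps the example self-contained; pass a file path to keep the data.
connection = sqlite3.connect(":memory:")
save_to_sqlite(connection, phones)
stored = connection.execute("SELECT name, price FROM phones").fetchall()
print(stored)
connection.close()
```

Named placeholders such as `:name` let `executemany` take the dictionaries as they are, and they keep the scraped text out of the SQL string itself.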

Resources

Conclusion

In this tutorial, we learned how to create a basic web scraper that collects data from a website and stores it in a CSV file. We used the requests library to download the page at https://webscraper.io/test-sites/e-commerce/allinone/phones/touch and the BeautifulSoup library to extract the product data from it. For more information, visit the resources listed above. For more projects like this, visit Python Central Hub.
