
Python Image Downloader: Download Images from a URL


Image Downloader:

Build a program that downloads images from a given URL.

Input values:
The user provides the URL of a website containing images to be downloaded.

Output value:
Images are downloaded from the specified URL and saved to a local directory.

Example:

Input values:
Enter the URL of the website with images: https://example.com/images
Output value:
Images downloaded successfully and saved to the local directory.

Here are two different solutions for an "Image Downloader" program in Python. Each solution takes a URL from the user, finds all images on the specified webpage, and downloads them to a local directory.

Prerequisites:

Before running the code, ensure that the following libraries are installed:

pip install requests beautifulsoup4

Solution 1: Basic Approach Using 'requests' and 'BeautifulSoup'

Code:

# Solution 1: Basic Approach Using `requests` and `BeautifulSoup`

import os  # Provides functions for interacting with the operating system
import requests  # Allows sending HTTP requests
from bs4 import BeautifulSoup  # Used for parsing HTML content

def download_image(url, folder_path):
    """Download an image from a URL to the specified folder."""
    # Get the image name from the URL
    image_name = url.split("/")[-1]
    # Send a GET request to the image URL
    response = requests.get(url, stream=True)
    
    # Check if the request was successful
    if response.status_code == 200:
        # Save the image to the local directory
        with open(os.path.join(folder_path, image_name), 'wb') as file:
            for chunk in response.iter_content(1024):
                file.write(chunk)
        print(f"Downloaded: {image_name}")
    else:
        print(f"Failed to download: {url}")

def download_images_from_url(website_url, folder_path):
    """Download all images from a specified URL."""
    # Send a GET request to the website URL
    response = requests.get(website_url)
    
    # Check if the request was successful
    if response.status_code == 200:
        # Parse the website content with BeautifulSoup
        soup = BeautifulSoup(response.text, 'html.parser')
        # Find all image tags in the HTML content
        img_tags = soup.find_all('img')
        
        # Create the directory if it doesn't exist
        os.makedirs(folder_path, exist_ok=True)

        # Download each image
        for img in img_tags:
            img_url = img.get('src')
            # If the image URL is valid, download the image
            if img_url and img_url.startswith(('http://', 'https://')):
                download_image(img_url, folder_path)
            else:
                print(f"Invalid URL found: {img_url}")
    else:
        print(f"Failed to retrieve the webpage. Status code: {response.status_code}")

# Get the URL and folder path from the user
website_url = input("Enter the URL of the website with images: ")
folder_path = "downloaded_images"  # Folder to save the images

# Download images from the specified URL
download_images_from_url(website_url, folder_path) 

Output:

(base) C:\Users\ME>python test.py
Enter the URL of the website with images: https://www.wikipedia.org/
Invalid URL found: portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png
Downloaded: Wikimedia_Foundation_logo_-_wordmark.svg

Explanation:

  • Uses requests to send HTTP GET requests to the specified URL and BeautifulSoup to parse HTML content and extract image URLs.
  • Defines two functions:
    • download_image(url, folder_path): Downloads an individual image from a given URL and saves it to a local directory.
    • download_images_from_url(website_url, folder_path): Downloads all images from the specified website.
  • Checks whether each img tag contains an absolute URL and downloads only those starting with http:// or https://; relative paths (like the one flagged in the sample output above) are skipped, and a possible fix for them is sketched after this list.
  • Simple and effective, but lacks modularity for future improvements.
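
The sample output above shows that relative image paths (such as portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png) are skipped. If you also want those images, one possible approach, shown here only as a sketch and not part of the original solution, is to resolve each src against the page URL with urllib.parse.urljoin before checking it:

from urllib.parse import urljoin  # Standard library helper for resolving relative URLs

def resolve_image_url(page_url, src):
    """Return an absolute image URL, or None if src is empty."""
    if not src:
        return None
    # urljoin leaves absolute URLs unchanged and resolves relative paths
    # (including protocol-relative ones like //example.com/img.png)
    return urljoin(page_url, src)

# Example: inside download_images_from_url(), the loop could become:
# for img in img_tags:
#     img_url = resolve_image_url(website_url, img.get('src'))
#     if img_url and img_url.startswith(('http://', 'https://')):
#         download_image(img_url, folder_path)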

Solution 2: Using a Class-Based Approach for Better Organization

Prerequisites

Ensure that the following libraries are installed:

pip install requests beautifulsoup4

Code:

# Solution 2: Using a Class-Based Approach for Better Organization

import os  # Provides functions for interacting with the operating system
import requests  # Allows sending HTTP requests
from bs4 import BeautifulSoup  # Used for parsing HTML content

class ImageDownloader:
    """Class to handle image downloading from a given URL."""

    def __init__(self, url, folder_path='downloaded_images'):
        """Initialize the downloader with a URL and a folder path."""
        self.url = url
        self.folder_path = folder_path

    def download_image(self, image_url):
        """Download a single image from a URL to the specified folder."""
        image_name = image_url.split("/")[-1]
        response = requests.get(image_url, stream=True)
        
        if response.status_code == 200:
            with open(os.path.join(self.folder_path, image_name), 'wb') as file:
                for chunk in response.iter_content(1024):
                    file.write(chunk)
            print(f"Downloaded: {image_name}")
        else:
            print(f"Failed to download: {image_url}")

    def download_all_images(self):
        """Download all images from the specified website URL."""
        response = requests.get(self.url)
        
        if response.status_code == 200:
            soup = BeautifulSoup(response.text, 'html.parser')
            img_tags = soup.find_all('img')

            os.makedirs(self.folder_path, exist_ok=True)

            for img in img_tags:
                img_url = img.get('src')
                if img_url and img_url.startswith(('http://', 'https://')):
                    self.download_image(img_url)
                else:
                    print(f"Invalid URL found: {img_url}")
        else:
            print(f"Failed to retrieve the webpage. Status code: {response.status_code}")

# Get the URL from the user
website_url = input("Enter the URL of the website with images: ")

# Create an instance of ImageDownloader and start the download process
downloader = ImageDownloader(website_url)
downloader.download_all_images() 

Output:

(base) C:\Users\ME>python test.py
Enter the URL of the website with images: https://commons.wikimedia.org/wiki/Main_Page
Downloaded: 500px-Castillo_de_Hohenwerfen%2C_Werfen%2C_Austria%2C_2019-05-17%2C_DD_143-149_PAN.jpg
Downloaded: 14px-Magnify-clip_%28sans_arrow%29.svg.png
Downloaded: 100px-Generic_Camera_Icon.svg.png
Downloaded: 45px-Wikimedia-logo_black.svg.png
Downloaded: 32px-Wikipedia-logo-v2.svg.png
Downloaded: 32px-Wikinews-logo.svg.png
Downloaded: 32px-Wiktionary-logo.svg.png
Downloaded: 32px-Wikibooks-logo.svg.png
Downloaded: 32px-Wikiquote-logo.svg.png
... (output truncated)

Explanation:

  • Encapsulates the image downloading logic in an 'ImageDownloader' class, making the code more modular and organized.
  • The '__init__' method initializes the class with the URL and folder path where images will be saved.
  • The 'download_image()' method handles downloading individual images, while 'download_all_images()' manages the overall downloading process.
  • The class-based approach makes the code more reusable, scalable, and easier to maintain or extend with additional features; a small extension sketch follows this list.
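
To illustrate that extensibility, here is a minimal sketch (the subclass name and filtering rule are assumptions for illustration, not part of the original solution) that reuses ImageDownloader but only saves images with certain file extensions:

class FilteredImageDownloader(ImageDownloader):
    """Hypothetical subclass that downloads only selected image types."""

    def __init__(self, url, folder_path='downloaded_images',
                 allowed_extensions=('.png', '.jpg', '.jpeg', '.gif')):
        super().__init__(url, folder_path)
        self.allowed_extensions = allowed_extensions

    def download_image(self, image_url):
        # Skip images whose name does not end with an allowed extension
        if not image_url.lower().endswith(self.allowed_extensions):
            print(f"Skipped (filtered): {image_url}")
            return
        super().download_image(image_url)

# Usage (assumed): download only PNG files from the page
# downloader = FilteredImageDownloader(website_url, allowed_extensions=('.png',))
# downloader.download_all_images()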

Note:
Both solutions effectively implement an image downloader that retrieves images from a given URL and saves them to a local directory, with Solution 1 using a straightforward function-based approach and Solution 2 using a class-based design for better organization and extensibility.
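
Both solutions call requests.get() with default settings. As an optional refinement (an assumption, not part of the original code), passing a timeout and a browser-like User-Agent header makes the requests more robust against slow servers and sites that reject unidentified clients:

import requests  # Same HTTP library used in both solutions

# Assumed values for illustration; adjust as needed
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; ImageDownloader/1.0)"}
TIMEOUT = 10  # seconds

def safe_get(url, **kwargs):
    """Wrapper around requests.get() that adds a timeout and custom headers."""
    try:
        return requests.get(url, headers=HEADERS, timeout=TIMEOUT, **kwargs)
    except requests.RequestException as error:
        print(f"Request failed for {url}: {error}")
        return None

# Example: replace requests.get(url, stream=True) with
# response = safe_get(url, stream=True)
# and check "if response and response.status_code == 200:" before saving.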
