Image downloader for a website
Problem
This code takes a website and downloads all .jpg images in the webpage. It supports only websites that have the `<img>` element whose `src` contains a .jpg link. (Tested here)
```python
import random
import urllib.request
import requests
from bs4 import BeautifulSoup


def Download_Image_from_Web(url):
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")
    raw_text = r'links.txt'
    with open(raw_text, 'w') as fw:
        for link in soup.findAll('img'):
            image_links = link.get('src')
            if '.jpg' in image_links:
                for i in image_links.split("\\n"):
                    fw.write(i + '\n')
    num_lines = sum(1 for line in open('links.txt'))
    if num_lines == 0:
        print("There is 0 photo in this web page.")
    elif num_lines == 1:
        print("There is", num_lines, "photo in this web page:")
    else:
        print("There are", num_lines, "photos in this web page:")
    k = 0
    while k <= (num_lines-1):
        name = random.randrange(1, 1000)
        fullName = str(name) + ".jpg"
        with open('links.txt', 'r') as f:
            lines = f.readlines()[k]
            urllib.request.urlretrieve(lines, fullName)
            print(lines+fullName+'\n')
        k += 1


Download_Image_from_Web("https://pixabay.com")
```
Solution
Unnecessary file operations
This is horribly inefficient:

```python
k = 0
while k <= (num_lines-1):
    name = random.randrange(1, 1000)
    fullName = str(name) + ".jpg"
    with open('links.txt', 'r') as f:
        lines = f.readlines()[k]
        urllib.request.urlretrieve(lines, fullName)
        print(lines+fullName+'\n')
    k += 1
```

This re-reads the same file `num_lines` times, each time just to fetch the k-th line!

By the way, do you really need to write the list of URLs to a file?
Why not just keep them in a list?
Even if you want the URLs in a file,
you could keep them in a list in memory and only ever write that file, never read it.
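
A minimal sketch of the list-based approach, assuming the image URLs have already been collected into a list. The names `save_links` and `download_all`, the `retrieve` parameter, and the sequential file names are illustrative assumptions, not part of the original code (which picks random names that can collide).

```python
import urllib.request


def save_links(urls, path):
    """Write one URL per line; the file is only written, never read back."""
    with open(path, "w") as fw:
        fw.writelines(url + "\n" for url in urls)


def download_all(urls, retrieve=urllib.request.urlretrieve):
    """Download each URL exactly once and return the file names used.

    `retrieve` is a hypothetical hook so the loop can be tried out
    without network access; by default it is urllib's urlretrieve.
    """
    filenames = []
    for index, url in enumerate(urls, start=1):
        filename = f"{index}.jpg"  # assumed naming scheme: 1.jpg, 2.jpg, ...
        retrieve(url, filename)
        filenames.append(filename)
    return filenames
```

Iterating over the list directly replaces both the manual `k` counter and the repeated `readlines()` calls, so the work grows linearly with the number of links.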
Code organization
Instead of having all the code in a single function that does multiple things,
it would be better to organize your program into smaller functions,
each with a single responsibility.
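
One possible way to split the work, shown as a hedged sketch rather than a definitive design. To keep the example free of third-party dependencies it uses the standard library's `html.parser` as a stand-in for BeautifulSoup; the helper names `extract_jpg_links` and `describe_count` are assumptions made for illustration.

```python
from html.parser import HTMLParser


class _ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> tags that have no src attribute
                self.sources.append(src)


def extract_jpg_links(html):
    """Return the src of each <img> whose src contains '.jpg'."""
    collector = _ImgSrcCollector()
    collector.feed(html)
    return [src for src in collector.sources if ".jpg" in src]


def describe_count(count):
    """Build the summary message for the number of photos found."""
    if count == 1:
        return "There is 1 photo in this web page:"
    return f"There are {count} photos in this web page:"
```

Each function can now be tested on its own: parsing needs only an HTML string, and the message needs only a number. The `if src` guard also avoids the `TypeError` that `'.jpg' in image_links` raises in the original when an `<img>` has no `src`.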
Python conventions
Python has a well-defined set of coding conventions in PEP 8,
many of which are violated here.
I suggest reading through that document
and following it as much as possible.
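
As a small illustration of two PEP 8 points, here is a sketch with `snake_case` names (in place of names like `Download_Image_from_Web` and `fullName`) and two blank lines around top-level definitions. The function name `random_file_name` is an assumption for the example; the `__main__` guard is a common Python idiom rather than a PEP 8 rule.

```python
import random


def random_file_name(extension=".jpg"):
    """Return a name such as '417.jpg', using the same range as the original."""
    number = random.randrange(1, 1000)
    return str(number) + extension


def main():
    print(random_file_name())


if __name__ == "__main__":
    main()
```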
Code Snippets

```python
k = 0
while k <= (num_lines-1):
    name = random.randrange(1, 1000)
    fullName = str(name) + ".jpg"
    with open('links.txt', 'r') as f:
        lines = f.readlines()[k]
        urllib.request.urlretrieve(lines, fullName)
        print(lines+fullName+'\n')
    k += 1
```

Context
StackExchange Code Review Q#162123, answer score: 4