HTML tag counter
Problem
What can I do with this program to improve its performance?
#!/usr/bin/env ruby
require 'open-uri'
print "URL: "
add = gets
puts "Info from #{add}"
begin
  open(add) do |f|
    puts "Fetching images..."
    puts "Fetching links..."
    puts "Fetching div tags..."
    puts "Fetching headers..."
    puts "Fetching forms..."
    puts "Processing..."
    img = f.read.scan(/<img/).length
    puts "\t#{img} images"
    f.close
  end
  open(add) do |f|
    links = f.read.scan(/<a/).length
    puts "\t#{links} links"
    f.close
  end
  open(add) do |f|
    div = f.read.scan(/<div/).length
    puts "\t#{div} div tags"
    f.close
  end
  open(add) do |f|
    head = f.read.scan(/<h1/).length
    puts "\t#{head} 'h1 type' headers"
    f.close
  end
  open(add) do |f|
    form = f.read.scan(/<form/).length
    puts "\t#{form} forms"
    f.close
  end
rescue
  puts "An error occured, either you entered an invalid URL or your internet connection is messed up!"
end
Solution
Instead of opening, reading and closing the file all the time, you should read it once and then just use the string multiple times (this will also save time and bandwidth).
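As a rough sketch of the read-once idea, the page body can be fetched a single time and every count taken from the same string. The helper names `fetch_contents` and `count_occurrences` are made up for this illustration, the patterns are the ones from the question, and `URI.open` is assumed as the open-uri entry point on current Ruby versions (the question uses the older `open`). The demonstration runs on an inline sample so that it needs no network access.

```ruby
require 'open-uri'

# Hypothetical helper (not in the original post): download the page body once.
# chomp drops the trailing newline that `gets` leaves on the input.
def fetch_contents(url)
  URI.open(url.chomp) { |f| f.read }
end

# Every count reuses the same in-memory string; nothing is fetched again.
def count_occurrences(contents, patterns)
  patterns.map { |label, pattern| [label, contents.scan(pattern).length] }.to_h
end

# The question's own patterns, keyed by the label used in its output.
patterns = { 'images' => /<img/, 'links' => /<a/, 'div tags' => /<div/ }

# Offline demonstration on an inline sample instead of a real URL.
sample = '<div><img src="a.png"><a href="/">home</a><img src="b.png"></div>'
count_occurrences(sample, patterns).each do |label, n|
  puts "\t#{n} #{label}"
end
```

With a real page, `contents = fetch_contents(add)` would replace the inline sample; the block form of `URI.open` closes the stream on its own, so no explicit `f.close` is needed.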
img = f.read.scan(/<img/).length
This won't necessarily give you an accurate count of `<img>` tags. For example, it will also count occurrences of any tag whose name merely starts with `img`; in the same way, `/<a/` also matches `<abbr>` and `<article>`. A word boundary (`\b`) after the tag name avoids this.
I'd also recommend putting the tag-counting code into a method. This way you don't need to repeat the same code for each tag.
If you're okay with changing the output so that it's uniform for all tags, you might even turn the whole thing into a loop, so there's no repetition at all. Something like this:
%w(img a div h1 form).each do |tag|
  count = contents.scan(/<#{tag}\b/).length
  puts "\t#{count} #{tag} tags"
end
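To illustrate the method suggestion together with the `\b` word boundary, here is a minimal sketch. The method name `count_tag` and the sample string are invented for this example, and `Regexp.escape` is an extra precaution that the answer itself does not use.

```ruby
# Hypothetical helper: count opening tags named `tag` in an HTML string.
# \b keeps <a from also matching <abbr> or <article>; Regexp.escape guards
# against tag names that contain regex metacharacters.
def count_tag(contents, tag)
  contents.scan(/<#{Regexp.escape(tag)}\b/).length
end

sample = '<article><a href="/">home</a> <abbr title="x">y</abbr></article>'

# Without the word boundary, <article> and <abbr> are counted as links too.
puts "naive /<a/ count: #{sample.scan(/<a/).length}"
puts "count_tag count:  #{count_tag(sample, 'a')}"

# The same method drives the uniform loop over all tags of interest.
%w(img a div h1 form).each do |tag|
  puts "\t#{count_tag(sample, tag)} #{tag} tags"
end
```

This is still a textual approximation: tags that appear inside comments or script blocks are counted as well.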
Context
StackExchange Code Review Q#10737, answer score: 5