HTML tag counter

Submitted by: @import:stackexchange-codereview
Tags: counter, html, tag

Problem

What can I do with this program to improve its performance?

#!/usr/bin/env ruby
require 'open-uri'
print "URL: "
add = gets
puts "Info from #{add}"
begin
  open(add) do |f|
    puts "Fetching images..."
    puts "Fetching links..."
    puts "Fetching div tags..."
    puts "Fetching headers..."
    puts "Fetching forms..."
    puts "Processing..."
    img = f.read.scan(/<img/).length
    puts "\t#{img} images"
    f.close
  end
  open(add) do |f|
    links = f.read.scan(/<a/).length
    puts "\t#{links} links"
    f.close 
  end
  open(add) do |f|
    div = f.read.scan(/<div/).length
    puts "\t#{div} div tags"
    f.close
  end
  open(add) do |f|
    head = f.read.scan(/<h1/).length
    puts "\t#{head} 'h1 type' headers"
    f.close
  end
  open(add) do |f|
    form = f.read.scan(/<form/).length
    puts "\t#{form} forms"
    f.close
  end
rescue
  puts "An error occured, either you entered an invalid URL or your internet connection is messed up!"
end

Solution

Instead of opening, reading, and closing the file for every tag, you should read it once and then reuse the resulting string for each count. This also saves time and bandwidth, because the page is downloaded once rather than five times.
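A minimal sketch of the read-once idea follows. The helper names `fetch_contents` and `count_all` are hypothetical and not part of the original script. It assumes Ruby 2.5 or later, where open-uri is called through `URI.open`; the bare `open(add)` used in the question only fetches URLs on older Ruby versions.

```ruby
require 'open-uri'

# Hypothetical helper: download the page once and return its body as a String.
# strip removes the trailing newline that gets leaves on the URL.
def fetch_contents(url)
  URI.open(url.strip) { |f| f.read } # the block form closes the stream for us
end

# Hypothetical helper: run every count against the same in-memory string,
# so no further requests are made. The patterns are the ones from the question.
def count_all(contents)
  {
    images: contents.scan(/<img/).length,
    links:  contents.scan(/<a/).length,
    divs:   contents.scan(/<div/).length,
    h1s:    contents.scan(/<h1/).length,
    forms:  contents.scan(/<form/).length
  }
end
```

With these in place, the script body could shrink to something like `counts = count_all(fetch_contents(add))` followed by the `puts` lines, and the explicit `f.close` calls would no longer be needed.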

img = f.read.scan(/<img/).length


This won't necessarily give you an accurate count of `<img>` tags. For example, it will also count occurrences of `<img` that appear inside comments, as well as any tag whose name merely starts with `img`.
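One way the count can go wrong is easiest to see with the link pattern from the question. The sketch below uses a made-up HTML string and compares `/<a/` with a variant that adds a word boundary (`\b`), which is an assumption about how one might tighten the match.

```ruby
# Made-up sample: one real link plus two other tags whose names start with "a".
sample = '<a href="/">home</a> <abbr title="x">y</abbr> <article>z</article>'

loose  = sample.scan(/<a/).length   # also matches <abbr and <article
strict = sample.scan(/<a\b/).length # \b requires the tag name to end right after "a"

puts "loose: #{loose}, strict: #{strict}"
```

Even the stricter pattern is still a plain text search, so it would continue to count a `<a` that sits inside an HTML comment.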

I'd also recommend putting the tag-counting code into a method, so you don't need to repeat the same code for each tag.
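A sketch of such a method might look like this. The name `count_tag` and the sample string are hypothetical, and the use of `Regexp.escape` and `\b` is an added precaution rather than something the original script does.

```ruby
# Hypothetical helper: count opening tags named `tag` in the HTML string `contents`.
def count_tag(contents, tag)
  # Regexp.escape guards against tag names containing regex metacharacters;
  # \b keeps a short name such as "a" from matching "abbr".
  contents.scan(/<#{Regexp.escape(tag)}\b/).length
end

# Made-up page content, standing in for the string read from the URL.
contents = '<h1>Title</h1><img src="a.png"><img src="b.png">'
puts "\t#{count_tag(contents, 'img')} images"
puts "\t#{count_tag(contents, 'h1')} 'h1 type' headers"
```

Keeping the `puts` lines outside the method lets each tag retain its own label, as in the original output.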

If you're okay with changing the output, so that it's uniform for all tags, you might even turn the whole thing into a loop, so there's no repetition at all. Something like this:

%w(img a div h1 form).each do |tag|
  count = contents.scan(/<#{tag}\b/).length
  puts "\t#{count} #{tag} tags"
end

Context

StackExchange Code Review Q#10737, answer score: 5
