Scraping scores from flashscore.com
Problem
I built a bot with Python to scrape scores from flashscore.com, but the scraped data loads into the listbox very slowly. I was curious about the speed of Selenium, so I made a button that prints all the text. That turned out to be fast, so it must be the if/elif block that is slowing down the program.
```
from tkinter import *
from selenium import webdriver
import threading

def LoadSite():
    lblStatus2.config(text="loading")
    m = 0
    Table = browser.find_elements_by_tag_name('table')
    for tables in Table:
        abc = tables.find_elements_by_class_name('country_part')
        aaa = tables.find_elements_by_class_name('tournament_part')
        C = len(tables.find_elements_by_class_name("padr"))
        for countrys in abc:
            LbCountry.insert(END, countrys.text+aaa[m].text)
            n = 1
            while (n < C):
                LbCountry.insert(END, "")
                n += 1
            m +=1
    Time = browser.find_elements_by_tag_name('td')
    for g in Time:
        if (g.get_attribute('class').find('cell_ab team-home')) != -1 or (g.get_attribute('class').find('cell_ab team-home bold')) != -1:
            LbHome.insert(END,g.text)
            if g.get_attribute('innerHTML').find('rhcard rhcard1') != -1:
                LbHRed.insert(END,"1")
            elif g.get_attribute('innerHTML').find('rhcard rhcard2') != -1:
                LbHRed.insert(END,"2")
            else:
                LbHRed.insert(END,"")
        elif (g.get_attribute('class').find('cell_ac team-away')) != -1 or (g.get_attribute('class').find('cell_ac team-away bold')) != -1:
            LbAway.insert(END,g.text)
            if g.get_attribute('innerHTML').find('racard racard1') != -1:
                LbARed.insert(END,"1")
            elif g.get_attribute('innerHTML').find('racard racard2') != -1:
                LbARed.insert(END,"2")
            else:
                LbARed.insert(END,"")
        elif g.
```
Solution
I do not think the slowness is coming from the if/elif statement; I think it is coming from using Selenium.

Selenium works well as a scraper, but it is fairly slow because it has to open a browser and load the whole webpage before any scraping can happen. The best solution would be to find another web-scraping library that does not rely on opening and loading the page in a browser.

My solution for this would be to use two libraries: urllib2 and BeautifulSoup.

urllib2 - This library will be used to read the HTML document into memory. It comes with Python 2; in Python 3 the same functionality lives in urllib.request.
BeautifulSoup - This library will be used to parse the HTML document. It is a third-party package, so you will have to install it.

See this StackOverflow post for an example.
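As a rough illustration of the suggestion above, here is a minimal sketch that downloads a page with urllib.request (the Python 3 home of urllib2's `urlopen`) and parses it with BeautifulSoup. The class names (`team-home`, `team-away`, `rhcard1`, `racard1`, and so on) are taken from the code in the question. The function names, the `padl` class, and the sample HTML with its team names are made up for illustration. It also assumes the scores are present in the downloaded HTML, which may not hold if the site fills its tables in with JavaScript.

```python
from urllib.request import urlopen  # Python 3 replacement for urllib2.urlopen

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def fetch_html(url):
    """Download a page and return its HTML as text (no browser involved)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def red_cards(cell, prefix):
    """Return "1", "2" or "" depending on the card marker inside a team cell."""
    for count in ("1", "2"):
        if cell.find(class_=prefix + count) is not None:
            return count
    return ""


def parse_teams(html):
    """Collect (team name, red cards) pairs for the home and away cells."""
    soup = BeautifulSoup(html, "html.parser")
    home = [(td.get_text(strip=True), red_cards(td, "rhcard"))
            for td in soup.find_all("td", class_="team-home")]
    away = [(td.get_text(strip=True), red_cards(td, "racard"))
            for td in soup.find_all("td", class_="team-away")]
    return home, away


# Invented sample markup that only mimics the class names used in the question.
SAMPLE_HTML = """
<table><tr>
<td class="cell_ab team-home bold"><span class="padr">Arsenal<span class="rhcard rhcard1"></span></span></td>
<td class="cell_ac team-away"><span class="padl">Chelsea</span></td>
</tr><tr>
<td class="cell_ab team-home"><span class="padr">Leeds</span></td>
<td class="cell_ac team-away bold"><span class="padl">Everton<span class="racard racard2"></span></span></td>
</tr></table>
"""

if __name__ == "__main__":
    home, away = parse_teams(SAMPLE_HTML)
    for (h_name, h_red), (a_name, a_red) in zip(home, away):
        print(h_name, h_red, "-", a_name, a_red)
```

Parsing happens entirely in memory, so each cell is inspected without a round trip to a browser. If the page turns out to build its tables only in the browser, one possible middle ground is to keep Selenium for loading and pass `browser.page_source` to `parse_teams` once, instead of calling `get_attribute` on every cell.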
Context
StackExchange Code Review Q#90251, answer score: 2