
Copy table using Selenium and Python

Submitted by: @import:stackexchange-codereview

Problem

I have some Python code that copies a table from a website using Selenium to create a CSV file. As I haven't used Selenium much, I have an irksome feeling in the back of my mind that there must be a better way. It's also quite a bit slower than I would like. Here's the relevant code:

# ...Navigate to proper page...

table = self.browser.find_element_by_id('data_table')
head = table.find_element_by_tag_name('thead')
body = table.find_element_by_tag_name('tbody')

file_data = []

file_header = []
head_line = head.find_element_by_tag_name('tr')
headers = head_line.find_elements_by_tag_name('th')
for header in headers:
    header_text = header.text.encode('utf8')
    file_header.append(header_text)
file_data.append(",".join(file_header))

body_rows = body.find_elements_by_tag_name('tr')
for row in body_rows:
    data = row.find_elements_by_tag_name('td')
    file_row = []
    for datum in data:
        datum_text = datum.text.encode('utf8')
        file_row.append(datum_text)
    file_data.append(",".join(file_row))

with open(srcFile, "w") as f:
    f.write("\n".join(file_data))

Solution

First off, I see a couple of places where a full-blown for loop can be shortened to a list comprehension. For example, this section:

file_header = []
head_line = head.find_element_by_tag_name('tr')
headers = head_line.find_elements_by_tag_name('th')
for header in headers:
    header_text = header.text.encode('utf8')
    file_header.append(header_text)
file_data.append(",".join(file_header))


can be shortened to the following:

head_line = head.find_element_by_tag_name("tr")
file_header = [header.text.encode("utf8") for header in head_line.find_elements_by_tag_name('th')]
file_data.append(",".join(file_header))


Finally, your other for loop can be shortened in the same way, using a list comprehension or a generator expression. For more on generator expressions, see PEP 289; list comprehensions are described in PEP 202.
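
As a rough sketch of what that second rewrite might look like: the block below collapses the row loop into a list comprehension, with a generator expression feeding each `",".join(...)`. It assumes Python 3, so the `.encode('utf8')` calls are dropped. `FakeElement` and `rows_to_lines` are hypothetical names introduced only so the example runs without a browser; with real Selenium you would pass the `body` element from the question instead.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium element (not part of Selenium)."""

    def __init__(self, text="", children=None):
        self.text = text
        self._children = children or {}

    def find_elements_by_tag_name(self, tag):
        # Mirrors the method name used in the question's code.
        return self._children.get(tag, [])


def rows_to_lines(body):
    """Return one comma-joined string per <tr> found under `body`."""
    return [
        # Generator expression: cells are joined without an intermediate list.
        ",".join(cell.text for cell in row.find_elements_by_tag_name("td"))
        for row in body.find_elements_by_tag_name("tr")
    ]


# A fake two-row <tbody> to exercise the function.
body = FakeElement(children={"tr": [
    FakeElement(children={"td": [FakeElement("a"), FakeElement("b")]}),
    FakeElement(children={"td": [FakeElement("c"), FakeElement("d")]}),
]})

file_data = ["x,y"]  # placeholder header line, built as shown earlier
file_data.extend(rows_to_lines(body))
```

This only shortens the code; it makes the same number of Selenium calls as the original loop, so on its own it is unlikely to change the running time much.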

Context

StackExchange Code Review Q#87901, answer score: 4
