Copy table using Selenium and Python
Problem
I have some Python code that copies a table from a website using Selenium to create a CSV file, but as I haven't used Selenium much, I have that irksome feeling in the back of my mind that there must be a better way. It's also quite a bit slower than I would like. Here's the relevant code:

# ...Navigate to proper page...
table = self.browser.find_element_by_id('data_table')
head = table.find_element_by_tag_name('thead')
body = table.find_element_by_tag_name('tbody')

file_data = []

file_header = []
head_line = head.find_element_by_tag_name('tr')
headers = head_line.find_elements_by_tag_name('th')
for header in headers:
    header_text = header.text.encode('utf8')
    file_header.append(header_text)
file_data.append(",".join(file_header))

body_rows = body.find_elements_by_tag_name('tr')
for row in body_rows:
    data = row.find_elements_by_tag_name('td')
    file_row = []
    for datum in data:
        datum_text = datum.text.encode('utf8')
        file_row.append(datum_text)
    file_data.append(",".join(file_row))

with open(srcFile, "w") as f:
    f.write("\n".join(file_data))

Solution
First off, I see a couple of things that can be shortened to list comprehensions or generator expressions, rather than full-blown for loops. For example, this section:

file_header = []
head_line = head.find_element_by_tag_name('tr')
headers = head_line.find_elements_by_tag_name('th')
for header in headers:
    header_text = header.text.encode('utf8')
    file_header.append(header_text)
file_data.append(",".join(file_header))

can be shortened immensely to the following:

head_line = head.find_element_by_tag_name("tr")
file_header = [header.text.encode("utf8") for header in head_line.find_elements_by_tag_name('th')]
file_data.append(",".join(file_header))

Finally, your other for loop can be shortened in the same way. For more on generator expressions, see PEP 289.

Code Snippets
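The answer does not spell out the rewrite of the second (nested) loop, so here is a rough sketch of how it might look. It assumes Python 3, where `.text` is already a `str`, so the `.encode('utf8')` step from the question is dropped. `row_to_line` and `body_to_lines` are hypothetical helper names made up for illustration, not part of Selenium. The `find_elements_by_tag_name` call is kept as written in the question; recent Selenium 4 releases replace it with `find_elements(By.TAG_NAME, ...)`.

```python
def row_to_line(row, cell_tag='td'):
    """Join the text of every cell in one row element into a comma-separated line."""
    # Generator expression: no temporary list is built before joining.
    return ",".join(cell.text for cell in row.find_elements_by_tag_name(cell_tag))


def body_to_lines(body):
    """Return one comma-separated line per <tr> found under the given element."""
    # List comprehension replacing the outer for loop over body rows.
    return [row_to_line(row) for row in body.find_elements_by_tag_name('tr')]
```

With these helpers, the nested loop in the question would reduce to `file_data.extend(body_to_lines(body))`, and the header line could be produced with `row_to_line(head_line, 'th')`.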
file_header = []
head_line = head.find_element_by_tag_name('tr')
headers = head_line.find_elements_by_tag_name('th')
for header in headers:
    header_text = header.text.encode('utf8')
    file_header.append(header_text)
file_data.append(",".join(file_header))

head_line = head.find_element_by_tag_name("tr")
file_header = [header.text.encode("utf8") for header in head_line.find_elements_by_tag_name('th')]
file_data.append(",".join(file_header))

Context
StackExchange Code Review Q#87901, answer score: 4