Vote count: 0
I already written several lines of codes to pull url from this website. http://ift.tt/2kIDOui
code is below:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
import csv
driver = webdriver.Firefox()
driver.get('http://ift.tt/2kIDOui')
url = []
pagenbr = 1
while pagenbr <= 115:
current = driver.current_url
driver.get(current)
lks = driver.find_elements_by_xpath('//*[@href]')
for ii in lks:
link = ii.get_attribute('href')
if '/info' in link:
url.append(link)
print('page ' + str(pagenbr) + ' is done.')
if pagenbr <=114:
elm = driver.find_element_by_link_text('Next')
driver.implicitly_wait(10)
elm.click()
time.sleep(2)
pagenbr += 1
ls = list(set(url))
with open('US_GeneralHospital.csv', 'wb') as myfile:
wr = csv.writer(myfile,quoting=csv.QUOTE_ALL)
for u in ls:
wr.writerow([u])
And it worked very well to pull each individual links from this website. But the problem is I need to change the page number I need to loop by myself every time.
I want to let this code upgrade to iterate by calculating how many time it need. Not by manually inputting.
Thank you very much.
asked just now
Python and Selenium: I am automating web scraping among pages. How can I loop by Next button?
Aucun commentaire:
Enregistrer un commentaire