Wednesday, 8 February 2017

Python and Selenium: I am automating web scraping across pages. How can I loop using the Next button?

Vote count: 0

I have already written several lines of code to pull URLs from this website: http://ift.tt/2kIDOui

The code is below:

from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import csv

driver = webdriver.Firefox()
driver.implicitly_wait(10)  # set once; applies to every later find_element call
driver.get('http://ift.tt/2kIDOui')
url = []
pagenbr = 1
last_page = 115  # hard-coded page count that has to be changed by hand

while pagenbr <= last_page:
    # collect every link on the current page whose href contains '/info'
    lks = driver.find_elements(By.XPATH, '//*[@href]')
    for ii in lks:
        link = ii.get_attribute('href')
        if link and '/info' in link:
            url.append(link)

    print('page ' + str(pagenbr) + ' is done.')
    if pagenbr < last_page:
        elm = driver.find_element(By.LINK_TEXT, 'Next')
        elm.click()
        time.sleep(2)  # give the next page time to load
    pagenbr += 1

driver.quit()

# drop duplicates and write one URL per row
ls = list(set(url))
# Python 3: open in text mode with newline='' for the csv module
with open('US_GeneralHospital.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for u in ls:
        wr.writerow([u])

This works well for pulling each individual link from the website. The problem is that I have to change the number of pages to loop over by hand every time.

I would like the code to work out for itself how many times it needs to iterate, rather than relying on a manually entered page count.

Thank you very much.
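
One possible way to drop the hard-coded page count is to keep clicking until no 'Next' link is left on the page. The sketch below is only an illustration under assumptions: it assumes the last page has no 'Next' link at all (rather than a disabled one), and the function name collect_info_links, the pause parameter and the max_pages safety cap are hypothetical names introduced here. The locators are passed as plain strings, which are the values behind Selenium's By.XPATH and By.LINK_TEXT constants.

```python
import time

# Locator strategy names; these are the string values behind
# selenium's By.XPATH and By.LINK_TEXT constants.
XPATH = "xpath"
LINK_TEXT = "link text"


def collect_info_links(driver, pause=2.0, max_pages=1000):
    """Follow the 'Next' link until it disappears.

    Returns (sorted unique '/info' URLs, number of pages visited).
    Assumes the driver has already loaded the first page.
    """
    found = set()
    pages = 0
    while pages < max_pages:  # safety cap (an assumption), not a real page count
        pages += 1
        for elem in driver.find_elements(XPATH, '//*[@href]'):
            link = elem.get_attribute('href')
            if link and '/info' in link:
                found.add(link)
        print('page ' + str(pages) + ' is done.')

        # find_elements returns an empty list instead of raising when
        # nothing matches, so an empty result is treated as the last page.
        next_links = driver.find_elements(LINK_TEXT, 'Next')
        if not next_links:
            break
        next_links[0].click()
        time.sleep(pause)  # crude wait for the next page to load
    return sorted(found), pages
```

With a real browser this might be called as collect_info_links(driver) right after driver.get(...), and the returned list written to the CSV file as before. If the site keeps a disabled 'Next' link on the last page, the stopping test would need to check that element's state instead.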

