Sunday, April 19, 2015

Python Beautiful Soup retrieve multiple webpages of info






I am trying to learn web scraping and was wondering how to collect information across multiple web pages. I am practicing on http://ift.tt/1Q6TGAU. I want to retrieve all the teams, follow each team's link (which shows the roster), retrieve each player's info, and then follow each player's personal link to get their stats.


What I have so far is:



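from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Fetch the page that lists all the teams
r = requests.get("http://ift.tt/1Q6TGAU")
soup = BeautifulSoup(r.content, "html.parser")

# Visit every link on the page (this still includes unwanted links)
links = soup.find_all("a", href=True)
for link in links:
    college = link.text
    # Resolve relative hrefs against the final URL after redirects
    collegeurl = urljoin(r.url, link.get("href"))
    c = requests.get(collegeurl)
    campbells = BeautifulSoup(c.content, "html.parser")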


I am lost from there. I know I need a nested for loop, but I don't want to follow certain links, such as terms and conditions or social networks. I am just trying to get each player's info and then their stats, which are linked from their name.
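
One way the nested loop could look is sketched below. It is only a sketch under assumptions: the helper names (fetch_page, same_site_links, crawl) are made up for illustration, and it assumes that team and player links stay on the same host as the start page and share a recognizable path fragment. The actual fragments have to be found by inspecting the real pages. External links such as social networks are dropped by the host check, and pages such as terms and conditions are dropped by the path check.

```python
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def fetch_page(url):
    """Download a page and return (final URL after redirects, HTML text)."""
    response = requests.get(url, timeout=10)
    return response.url, response.text


def same_site_links(html, page_url, must_contain=""):
    """Return {link text: absolute URL} for links on the same host as page_url.

    must_contain is a hypothetical path fragment used to keep only the links
    of interest; the default empty string keeps every same-host link.
    """
    soup = BeautifulSoup(html, "html.parser")
    host = urlparse(page_url).netloc
    found = {}
    for a in soup.find_all("a", href=True):
        url = urljoin(page_url, a["href"])  # resolve relative hrefs
        if urlparse(url).netloc != host:
            continue  # external link, e.g. a social network
        if must_contain not in urlparse(url).path:
            continue  # same host but not a page we want, e.g. terms
        found[a.get_text(strip=True)] = url
    return found


def crawl(start_url, team_hint, player_hint, fetch=fetch_page):
    """Walk team list -> roster pages -> player pages.

    Returns {(team name, player name): player page HTML}. Parsing the stats
    out of that HTML depends on the site's markup and is left out here.
    """
    players = {}
    start_url, html = fetch(start_url)
    for team, team_url in same_site_links(html, start_url, team_hint).items():
        team_url, roster_html = fetch(team_url)
        roster = same_site_links(roster_html, team_url, player_hint)
        for player, player_url in roster.items():
            players[(team, player)] = fetch(player_url)[1]
    return players
```

A call might look like crawl("http://ift.tt/1Q6TGAU", "/team/", "/player/"), where both path fragments are placeholders rather than values taken from the real site. Passing fetch as a parameter makes it possible to try the loop on saved HTML before hitting the live site.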









