I'm getting the following error when trying to log in to GitHub via urllib.request using Python 3.4.2:
urllib.error.HTTPError: HTTP Error 403: Forbidden
My assumption is that the request doesn't look like it comes from a real browser (I tried to address that by adding a User-Agent header). I could use Selenium with PhantomJS, but that's a bit too clunky and slow.
Yes, I could use GitHub's API, but the point is to learn how to log in to a site using urllib, and I just wanted to try it on GitHub.
Here is the code:
import urllib.parse
import urllib.request

from bs4 import BeautifulSoup

# Login info
user_name = 'USERNAME'
password = 'PASSWORD'

# Request the login page to pull the auth token
soup = urllib.request.urlopen('http://ift.tt/1i4HFM7')
soup_content = soup.read()
pretty_soup = BeautifulSoup(soup_content)

# Extract the auth token from the csrf-token meta tag
for tags in pretty_soup.findAll("meta", {'name': 'csrf-token'}):
    auth_token = tags['content']

# Post login and auth token
url = 'http://ift.tt/1kQnj8Y'
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.76 Safari/537.36'
values = {'utf8': '%E2%9C%93',
          'authenticity_token': auth_token,
          'login': user_name,
          'password': password}
headers = {'User-Agent': user_agent}

data = urllib.parse.urlencode(values)
binary_data = data.encode('UTF-8')
req = urllib.request.Request(url, binary_data, headers)
response = urllib.request.urlopen(req)
the_page = response.read()
pretty_page = BeautifulSoup(the_page)
print(pretty_page)
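One thing I noticed about the code above: the two urlopen calls don't share any cookies, so the token pulled from the login page may not match any session by the time the POST arrives. Below is a minimal sketch of how a cookie-aware opener could be set up with the standard library. The helper names build_session_opener and encode_form are made up for illustration, and it is only an assumption that missing cookies are what causes the 403; GitHub may well check other things. The sketch also shows that urlencode percent-encodes a value that is already percent-encoded a second time, which affects the utf8 field.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_session_opener(user_agent):
    """Return (opener, cookie_jar). Hypothetical helper, not part of urllib.

    Cookies received through this opener are stored in cookie_jar and
    sent back on later requests made through the same opener.
    """
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # addheaders replaces the default headers for every request
    # made through this opener.
    opener.addheaders = [('User-Agent', user_agent)]
    return opener, cookie_jar


def encode_form(values):
    """URL-encode a dict of form fields into bytes for a POST body."""
    return urllib.parse.urlencode(values).encode('utf-8')


# Passing the raw check mark lets urlencode percent-encode it exactly once.
print(encode_form({'utf8': '\u2713'}))     # b'utf8=%E2%9C%93'
# A value that is already percent-encoded gets its '%' signs encoded again.
print(encode_form({'utf8': '%E2%9C%93'}))  # b'utf8=%25E2%259C%2593'

# Intended use (not executed here, since it needs network access):
#   opener, jar = build_session_opener(user_agent)
#   login_html = opener.open(login_page_url).read()   # session cookie is stored
#   ... extract auth_token from login_html ...
#   response = opener.open(session_url, encode_form(values))  # cookie is sent back
```

Here login_page_url and session_url stand in for the two URLs used in the code above. The GET and the POST both go through the same opener, so whatever session cookie the login page sets is returned along with the token.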
403 Error Logging in to GitHub via Python Using urllib.request