3D grphique: How to avoid country-based redirects with urlopen or urllib2 in Python

Monday, 31 March 2014

How to avoid country-based redirects with urlopen or urllib2 in Python

Vote count:

0

I am using Python 2.7.

I want to open the URL of a website and extract information from it. The information I am looking for is in the US version of the website (http://ift.tt/1ms3x4f). Since I am based in Canada, I am automatically redirected to the Canadian version of the website (http://ift.tt/1gVtZ6P). I am looking for a way to avoid this.

If I take any browser (IE, Firefox, Chrome, ...) and navigate to http://ift.tt/1ms3x4f, I get redirected. The website offers a menu where visitors can pick the country version of the website they want to view. Once I select United States, I am no longer redirected to the Canadian version of the website. This holds for any new tab within the browsing session. I suspect this has to do with cookie storage.
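If the country choice really is stored in a cookie, one way to test that from Python is to send the cookie along with the request. The following is only a minimal sketch under that assumption: the cookie name `country` and value `US` are hypothetical placeholders (the real ones would have to be read from the browser's cookie store after selecting United States), and the helper names `build_us_request` and `build_cookie_opener` are made up for illustration. The try/except import is there only so the sketch also runs on Python 3.

```python
try:  # Python 2.7
    import urllib2 as request_lib
    import cookielib as cookie_lib
except ImportError:  # Python 3 equivalents
    import urllib.request as request_lib
    import http.cookiejar as cookie_lib

US_URL = 'http://ift.tt/1ms3x4f'


def build_us_request(url=US_URL, cookie_name='country', cookie_value='US'):
    """Build a Request carrying a country-selection cookie.

    The cookie name and value are assumptions, not taken from the site.
    """
    req = request_lib.Request(url)
    req.add_header('Cookie', '%s=%s' % (cookie_name, cookie_value))
    return req


def build_cookie_opener():
    """Build an opener that stores cookies set by the server in a CookieJar."""
    jar = cookie_lib.CookieJar()
    opener = request_lib.build_opener(request_lib.HTTPCookieProcessor(jar))
    return opener, jar
```

Calling `opener.open(build_us_request())` and comparing `response.geturl()` with the requested URL would show whether the redirect still happens. If the site decides by IP address rather than by cookie, this will not help.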

I tried to use the following code to prevent the redirect:


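import urllib2

class RedirectHandler(urllib2.HTTPRedirectHandler):
    """Return the redirect response itself instead of following it."""

    def http_error_302(self, req, fp, code, msg, headers):
        # Wrap the 3xx response in an HTTPError and return it (not raise it),
        # so that opener.open() hands it back to the caller.
        result = urllib2.HTTPError(req.get_full_url(), code, msg, headers, fp)
        result.status = code
        return result

    # Treat the other redirect codes the same way.
    http_error_301 = http_error_303 = http_error_307 = http_error_302

opener = urllib2.build_opener(RedirectHandler())
webpage = opener.open('http://ift.tt/1ms3x4f')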

But it didn't seem to work: the only content that can be extracted afterwards is:


<html><head></head><body>‹</body></html>

One solution to my problem would be to use a proxy while scraping the website, but I was wondering whether there is any way to prevent this kind of redirect using only Python or Python packages.
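For reference, the proxy fallback mentioned above could be sketched as follows. This is an illustration only: the proxy address in the usage note is a hypothetical placeholder, `build_proxy_opener` is a made-up helper name, and a working US-based HTTP proxy is assumed. The try/except import is there only so the sketch also runs on Python 3.

```python
try:  # Python 2.7
    import urllib2 as request_lib
except ImportError:  # Python 3 equivalent
    import urllib.request as request_lib


def build_proxy_opener(proxy_url):
    """Build an opener that sends plain-HTTP requests through proxy_url."""
    proxy_handler = request_lib.ProxyHandler({'http': proxy_url})
    return request_lib.build_opener(proxy_handler)
```

With a real proxy, `build_proxy_opener('http://us-proxy.example.com:8080').open('http://ift.tt/1ms3x4f')` would issue the request from the proxy's location, so a site that redirects by IP address would see a US visitor.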

asked 50 secs ago

LaGuille

72
