Sunday, 12 February 2017

Python Hadoop Streaming -- Error in reading JSON file using JSONValueProtocol

I am new to writing MapReduce code in Python with Hadoop Streaming, and I need to work with JSON files stored in HDFS. As a first step, I followed this link to read a single JSON file and extract the fields I need. Based on the first code snippet in the link, I wrote the following code to read the fields of a single JSON file called json_ex.json.

from mrjob.protocol import JSONValueProtocol

input = open("json_ex.json")
for line in input:
    user = JSONValueProtocol.read(line)[1]
    user_name = user['name']
    print "user_name\t%s" % (user_name)

When I run this code, I get the following error:

unbound method read() must be called with StandardJSONValueProtocol instance as first argument (got str instance instead) 

I then tried creating an instance of the JSONValueProtocol class and calling read() on it, as follows:

jsonvp = JSONValueProtocol()
email = jsonvp.read(line)[1]

But the error persisted. How can I resolve this?
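
For comparison, here is a minimal sketch that reads the same kind of file with only the standard-library json module, which may help separate a parsing problem from a problem with how the protocol class is called. It assumes json_ex.json holds one JSON object per line and that each object has a name field. The helper name extract_user_names and the sample records are hypothetical, and this is not the mrjob API.

```python
import json


def extract_user_names(lines):
    """Yield the 'name' field of each non-empty line, parsed as one JSON object.

    Assumption: the input is newline-delimited JSON (one object per line).
    """
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines rather than failing to parse them.
            continue
        user = json.loads(line)
        yield user["name"]


if __name__ == "__main__":
    # Hypothetical sample records standing in for the lines of json_ex.json.
    sample = ['{"name": "alice"}', '{"name": "bob"}']
    for user_name in extract_user_names(sample):
        print("user_name\t%s" % user_name)
```

To run it against the real file, pass the open file object instead of the sample list, for example extract_user_names(open("json_ex.json")).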
