I am a beginner at writing MapReduce code in Python with Hadoop Streaming. I have to work with JSON files stored in HDFS. To begin learning, I followed this link to try to read a single JSON file and extract the desired fields from it. Using the first code snippet in the link, I wrote the following code to get the fields from a single JSON file called json_ex.json:
from mrjob.protocol import JSONValueProtocol
input = open("json_ex.json")
for line in input:
    user = JSONValueProtocol.read(line)[1]
    user_name = user['name']
    print "user_name\t%s" % (user_name)
When I run this code, I get the following error:
unbound method read() must be called with StandardJSONValueProtocol instance as first argument (got str instance instead)
I tried creating an instance of the JSONValueProtocol class and using it as follows:
jsonvp = JSONValueProtocol()
email = jsonvp.read(line)[1]
But the error persisted. How can I resolve this?
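For comparison, here is a minimal sketch of the same field extraction using only Python's standard json module, which sidesteps the protocol class entirely. It assumes each line of the input holds one complete JSON object with a 'name' key. The helper name extract_user_names and the sample records are hypothetical and used only for illustration.

```python
import json


def extract_user_names(lines):
    """Yield one tab-separated 'user_name' record per JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing on them
        user = json.loads(line)  # assumes one JSON object per line
        yield "user_name\t%s" % user["name"]


# Hypothetical sample input standing in for the lines of json_ex.json.
sample_lines = [
    '{"name": "alice", "email": "alice@example.com"}\n',
    '\n',
    '{"name": "bob"}\n',
]

for record in extract_user_names(sample_lines):
    print(record)
```

To run this against the real file, the sample list could be replaced with the open file object, for example by passing open("json_ex.json") to extract_user_names. This would only work if the file is line-delimited JSON; a single pretty-printed JSON document spanning several lines would need json.load on the whole file instead.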
Python Hadoop Streaming -- Error in reading JSON file using JSONValueProtocol