Step 1: I am working on a social product in which I continuously retrieve tweets from the firehose. The raw data is dumped into a collection, with each tweet stored as one document.
Step 2: I then retrieve these documents and pass them to a sentiment analyser API, which reports whether each tweet is positive or negative.
Step 3: The sentiment of each tweet should be added as another field on its document, and the document updated in the same collection.
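To make steps 2 and 3 concrete, here is a minimal sketch of what I have in mind. It assumes a pymongo-style collection object (with `find`, `limit` and `update_one`), a tweet body stored under a `text` field, and a new field named `sentiment`. The function name `tag_batch` and the `analyse` callable are hypothetical stand-ins, not part of any real API.

```python
def tag_batch(collection, analyse, batch_size=1000):
    """Score up to batch_size tweets that have no sentiment yet.

    collection: a pymongo-style collection (assumption).
    analyse:    hypothetical callable wrapping the sentiment analyser API;
                takes the tweet text and returns e.g. "positive"/"negative".
    Returns the number of documents updated.
    """
    # Fetch only documents that do not carry a sentiment field yet.
    cursor = collection.find({"sentiment": {"$exists": False}}).limit(batch_size)
    updated = 0
    for doc in cursor:
        label = analyse(doc["text"])  # "text" is an assumed field name
        # Add the sentiment as a new field on the same document.
        collection.update_one({"_id": doc["_id"]}, {"$set": {"sentiment": label}})
        updated += 1
    return updated
```

Calling `tag_batch` repeatedly would pick up whatever has not been scored yet, including tweets inserted in the meantime, because the filter depends on the missing field rather than on a position in the collection.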
My questions are: Is it right to insert the raw data, then retrieve it and update the same collection? Since the data volume is huge, I don't know how I would hold the data in an intermediate place instead.
Given the volume of data, if I need to insert and update the documents in batches, how will I know which documents to fetch for the next round of processing, assuming I am using a limit of 1000?
To be more precise: having retrieved the first 1000 documents, should I retrieve the next 1000 by skipping the first 1000, and then the next 1000 by skipping the first 2000 records, and so on?
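The skip-based approach I am describing would look roughly like the sketch below. It assumes a pymongo-style collection whose cursor supports `sort`, `skip` and `limit`, and it sorts on `_id` so that each offset refers to the same documents on every query. The name `iter_batches` is hypothetical.

```python
def iter_batches(collection, batch_size=1000):
    """Yield lists of documents: skip 0, then 1000, then 2000, and so on."""
    offset = 0
    while True:
        # A fixed sort order is needed, otherwise skip offsets are not stable.
        cursor = collection.find().sort("_id", 1).skip(offset).limit(batch_size)
        batch = list(cursor)
        if not batch:
            break  # nothing left at this offset
        yield batch
        offset += batch_size
```

With 2500 documents this would yield batches of 1000, 1000 and 500. My concern is whether this remains correct and efficient while new tweets are still being inserted.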
Kindly help me.
update collection in mongodb while insertion is still going on