I have two files with records and I want to do the following on Hadoop:
(Easy part)
For each record in both files:
    compute some values from the record and store them in an array representing the record

Then (the messy part)
For each record array computed in the previous step from fileA:
    For each record array computed in the previous step from fileB:
        if they have X number of elements in common:
            print to output
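For the easy part, the rough sketch below is how I picture the streaming mapper. compute_values() is just a placeholder for my real per-record computation, and I tag each record with its source file using the map input file environment variable (the exact name depends on the Hadoop version) so a later step can tell fileA records from fileB records:

#!/usr/bin/env python
# mapper.py - sketch of the "easy part" as a Hadoop streaming mapper.
import os
import sys

def compute_values(record):
    # Placeholder for the real per-record computation; here it just
    # returns the distinct whitespace-separated tokens of the record.
    return sorted(set(record.split()))

# Hadoop streaming exports the current input file name as an environment
# variable (map_input_file on older versions, mapreduce_map_input_file
# on newer ones), which lets each record be tagged with its source file.
source = os.environ.get('mapreduce_map_input_file',
                        os.environ.get('map_input_file', 'unknown'))

for line in sys.stdin:
    record = line.strip()
    if not record:
        continue
    values = compute_values(record)
    # Emit: source-file <TAB> comma-separated value array
    print('%s\t%s' % (source, ','.join(values)))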
That is what I am trying to do with Hadoop; however, I have no idea how to do it efficiently without using a single reducer for the nested for loop.
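To make that concrete, the only version I can picture right now is one reducer that buffers every array from both files and runs the full nested comparison in memory, roughly like the sketch below (X is a hard-coded placeholder threshold, the input format is the mapper output above, and I assume the fileA paths contain the string "fileA"):

#!/usr/bin/env python
# reducer.py - the naive single-reducer version I am trying to avoid:
# it buffers every record array from both files and then runs the full
# nested comparison in memory.
import sys

X = 3  # placeholder for the required number of common elements

arrays_a = []
arrays_b = []

for line in sys.stdin:
    source, values = line.rstrip('\n').split('\t', 1)
    value_set = set(values.split(','))
    # Assumes fileA input paths contain the string "fileA"
    if 'fileA' in source:
        arrays_a.append(value_set)
    else:
        arrays_b.append(value_set)

# The O(|A| * |B|) nested loop - the part that will not scale
for vals_a in arrays_a:
    for vals_b in arrays_b:
        if len(vals_a & vals_b) >= X:
            print('%s\t%s' % (','.join(sorted(vals_a)),
                              ','.join(sorted(vals_b))))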
Any suggestions or ideas on how best to go about such a task?
I would prefer to use Python with the Hadoop streaming jar, as in the sketches above.
Thanks