Sunday, 6 July 2014

Permuting elements of a vector 10,000 times - efficiently? (R)






This question is quite straightforward, but the solutions I have found are extremely inefficient in both memory and time. I am wondering whether this can be done in R without grinding one's machine into dust.


Take a vector:



x <- c("A", "B", "B", "E", "C", "C", "D", "E", "A", "C")


This one has 10 elements, of which five are unique. Importantly, some elements are repeated, so every permutation must contain the same number of each type of element. I wish to permute this vector 10,000 times, with every permutation distinct from the others. With my real data, I could be doing this for vectors of up to 1000 elements, which can be very hard to do efficiently.


To get one permutation, you can just do:



sample(x)


or, from the gtools package:



permute(x)


I could write some code to do that 10,000 times, but I am likely to get duplicates. Is there a way of doing this and dropping duplicates until 10,000 unique permutations are reached?
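A minimal sketch of that "sample and drop duplicates" idea, offered as one possible approach rather than a definitive answer. The helper name `unique_perms` and its `max_draws` argument are hypothetical, not from any package. It assumes the elements never contain the separator used to build the lookup key, and it indexes with `sample.int()` so that a length-one numeric vector is not misread by `sample()`.

```r
# Hypothetical helper: keep drawing random permutations of x until n distinct
# ones have been collected, or give up after max_draws attempts.
unique_perms <- function(x, n, max_draws = 100 * n) {
  seen  <- new.env(hash = TRUE)   # hashed environment used as a set of keys
  out   <- vector("list", n)
  found <- 0L
  draws <- 0L
  while (found < n && draws < max_draws) {
    draws <- draws + 1L
    p   <- x[sample.int(length(x))]
    key <- paste(p, collapse = "\r")   # assumes "\r" never occurs in the elements
    if (!exists(key, envir = seen, inherits = FALSE)) {
      assign(key, TRUE, envir = seen)
      found <- found + 1L
      out[[found]] <- p
    }
  }
  if (found < n) stop("fewer than n distinct permutations found within max_draws")
  do.call(rbind, out)   # n x length(x) matrix, one permutation per row
}

x <- c("A", "B", "B", "E", "C", "C", "D", "E", "A", "C")
set.seed(1)
perms <- unique_perms(x, 10000)
```

Only the permutations actually kept are stored, so memory grows with the number requested rather than with the number of possible permutations. The `max_draws` cap is there so the loop stops when more permutations are requested than exist.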


Other similar questions on Stack Overflow and statsoverflow have asked about generating all the unique permutations of a sequence. These questions are here:


Shuffling a vector - all possible outcomes of sample()?


Generating all distinct permutations of a list in R


http://ift.tt/1n6rBMz


These are good, and the suggestions for generating all the unique permutations are great. It would certainly be quite easy to run one of them and draw 10,000 random permutations from the result. However, once a vector goes beyond about 10 elements, generating every unique permutation becomes very memory intensive.
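To make the memory problem concrete: the number of distinct permutations of a vector with repeated elements is the multinomial coefficient n! / (n_1! n_2! ... n_k!), where n_i is the count of each unique element. The sketch below computes it for the example vector, and on a log scale for longer vectors where the value itself overflows a double. The function name `log10_distinct` is hypothetical.

```r
x <- c("A", "B", "B", "E", "C", "C", "D", "E", "A", "C")

# Counts of each unique element: A = 2, B = 2, C = 3, D = 1, E = 2
counts <- as.vector(table(x))

# Multinomial coefficient: 10! / (2! * 2! * 3! * 1! * 2!) = 3628800 / 48
n_distinct <- factorial(length(x)) / prod(factorial(counts))
n_distinct  # 75600

# Hypothetical helper: log10 of the same quantity, computed with lgamma()
# so that it stays finite for vectors with hundreds of elements.
log10_distinct <- function(x) {
  counts <- as.vector(table(x))
  (lgamma(length(x) + 1) - sum(lgamma(counts + 1))) / log(10)
}

log10_distinct(rep(x, 100))  # order of magnitude for a 1000-element vector
```

Even for this 10-element example, 10,000 draws is a sizeable fraction of the 75,600 possibilities. For vectors of hundreds of elements, enumerating all of them is not feasible, although random draws will then almost never collide.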


Any comments about how to do this efficiently for up to 1000 elements would be appreciated. This has me getting very dizzy.








