Thursday, January 29, 2015

R fastest way to do lookup for vector of strings


Vote count:

0




I have a vector of strings x = c("hello", "world") and another vector y = c("hello", "world", "how", "are", "you"). I want to see which elements of x are in y. For small vectors this is easily done with x %in% y. However, I am looking for a more efficient way to do this for large vectors: normally we would sort y first in O(n log n) time, and then for each string in x we could do a lookup in O(log n) time. I am worried that %in% does a full pass over y for each element of x it looks up.


Is there a way to take advantage of sort and binary search in R? Or is there a way to build a hashset from y for fast lookup times?
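To make the hashset idea concrete, here is a minimal sketch in base R that uses a hashed environment as the set built from y. The names y_set, queries and in_set are made up for illustration, and the extra query string "goodbye" is not part of the original data; it is only there to show a miss. This is one possible approach under those assumptions, not a claim about the fastest one.

```r
x <- c("hello", "world")
y <- c("hello", "world", "how", "are", "you")

# Hypothetical query vector: x plus one string that is not in y.
queries <- c(x, "goodbye")

# Baseline from the question.
baseline <- queries %in% y

# Build a hash set: a hashed environment whose variable names are the strings in y.
# Caveat: an empty string "" cannot be used as a variable name, so this sketch
# assumes y contains no empty strings.
y_set <- new.env(hash = TRUE, size = length(y))
for (s in y) assign(s, TRUE, envir = y_set)

# Membership test: one environment lookup per query string.
# inherits = FALSE keeps the lookup from falling through to parent environments.
in_set <- vapply(queries, exists, logical(1),
                 envir = y_set, inherits = FALSE, USE.NAMES = FALSE)

print(in_set)
```

It is worth timing this against the baseline on realistic data before adopting it: base R defines %in% as a thin wrapper around match(), and the per-element R-level loop here may well be slower than that single vectorized call.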



asked 17 secs ago

JCWong

309





