I have a question about the seeds option of trainControl(), which is passed to the train() function from the caret package. This option is supposed to ensure that the random numbers used by the different processes in a parallelized cross-validation are consistent across the workers, so that results are reproducible.
Here is an example of how I create the seeds argument:
library(caret)
# create a list of seeds, with different seeds for each resample
set.seed(123)
seeds <- vector(mode = "list", length = 11)  # length = (n_repeats * n_resamples) + 1
# 3 = number of candidate values of the tuning parameter (mtry for rf), here equal to ncol(iris) - 2
for (i in 1:10) seeds[[i]] <- sample.int(n = 1000, 3)
seeds[[11]] <- sample.int(1000, 1)  # for the final model
# control list; index expects training-set rows, hence returnTrain = TRUE
myControl <- trainControl(method = 'cv', seeds = seeds,
                          index = createFolds(iris$Species, returnTrain = TRUE))
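To make the layout explicit, here is a minimal base-R sketch of the same structure. `make_seeds` is a hypothetical helper name, not a caret function, and `master_seed` is an illustrative parameter. It assumes the layout described on the `trainControl` help page: B + 1 list elements for B resamples, the first B each holding one integer per candidate model, and the last holding a single integer for the final model.

```r
# Hypothetical helper (not part of caret): build a seeds list shaped
# the way trainControl(seeds = ...) is documented to expect.
make_seeds <- function(n_resamples, n_models, master_seed = 123) {
  set.seed(master_seed)
  seeds <- vector(mode = "list", length = n_resamples + 1)
  for (i in seq_len(n_resamples)) {
    # one seed per candidate model (tuning-parameter value) in resample i
    seeds[[i]] <- sample.int(n = 1000, size = n_models)
  }
  # a single seed for the final model
  seeds[[n_resamples + 1]] <- sample.int(n = 1000, size = 1)
  seeds
}

seeds <- make_seeds(n_resamples = 10, n_models = 3)
length(seeds)        # 11 list elements
sum(lengths(seeds))  # 31 integers in total
```

With this shape, each (resample, candidate model) pair has its own integer, which is where the 10*3 comes from; the remaining integer belongs to the last list element.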
I don't understand why so many seeds (10*3+1) are needed to set up the folds. As I see it, the same folds are used by all the models that evaluate each parameter value, and I really don't understand the need for the extra last seed (the +1). How are all these seeds used by the workers? I couldn't find an explanation of this in the documentation.
Thank you.