I have a data frame that contains an 'output', average temperature, humidity, and time (given as a factor with 24 levels, not as a continuous variable) for 100 cities (identified by codes). I want to fit a regression that predicts the output for each city from the temperature, humidity, and time data, so I expect to end up with 100 different regression models. I used the ddply function and, with help from this thread, came up with the following line of code.
df = ddply(data, "city", function(x) coefficients(lm(output~temperature+humidity, data=x)))
This code works for the numeric predictors, temperature and humidity. But when I add the time factor (which becomes 23 dummy variables in the model), I get an error:
df = ddply(data, "city", function(x) coefficients(lm(output~temperature+humidity+time, data=x)))
"Error: contrasts can be applied only to factors with 2 or more levels"
Does anyone know why this is? Here is an example chunk of my data frame:
city temperature humidity time
11 51 34 01
11 43 30 02
11 55 50 03
11 64 54 10
22 21 52 11
22 43 65 04
22 51 66 09
22 51 78 16
05 45 70 01
05 51 54 10
So I would want three models for the three cities here, based on temperature, humidity, and the time factor.
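One thing that may be worth checking: lm() drops unused factor levels within each per-city subset, so this error typically appears when some city has only a single observed level of time. Below is a minimal sketch of that check in base R. The output values and the extra city "33" are made up for illustration and are not part of my real data.

```r
# Toy data modelled on the sample above. The `output` values and the
# extra city "33" are hypothetical, added only to reproduce the problem.
data <- data.frame(
  city        = c("11", "11", "11", "11", "22", "22", "22", "22",
                  "05", "05", "33", "33"),
  temperature = c(51, 43, 55, 64, 21, 43, 51, 51, 45, 51, 40, 47),
  humidity    = c(34, 30, 50, 54, 52, 65, 66, 78, 70, 54, 60, 58),
  time        = factor(c("01", "02", "03", "10", "11", "04", "09", "16",
                         "01", "10", "01", "01")),
  output      = c(1.2, 0.9, 1.5, 1.8, 0.4, 1.0, 1.3, 1.4,
                  1.1, 1.2, 0.8, 1.0)
)

# Count the distinct time levels actually observed within each city.
levels_per_city <- tapply(data$time, data$city,
                          function(t) length(unique(t)))
print(levels_per_city)

# Cities with fewer than two observed levels cannot get contrasts for `time`.
problem_cities <- names(levels_per_city)[levels_per_city < 2]
print(problem_cities)

# Fit the model on one such city and capture the error message, if any.
one_city <- subset(data, city == problem_cities[1])
msg <- tryCatch(
  {
    lm(output ~ temperature + humidity + time, data = one_city)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If a check like this flags cities in the real data, one option might be to skip those cities or to leave time out of their models. Note also that a city with as many distinct time levels as rows will not raise this error, but its fit will have more coefficients than it can estimate and some will come back as NA.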
ddply for regression in R