Tuesday, March 31, 2015

pandas - how to filter "most frequent" Datetime objects






I'm working with a DataFrame like the following:



User_ID Datetime
01 2014-01-01 08:00:00
01 2014-01-02 09:00:00
02 2014-01-02 10:00:00
02 2014-01-03 11:00:00
03 2014-01-04 12:00:00
04 2014-01-04 13:00:00
05 2014-01-02 14:00:00


I would like to filter Users by conditions on the Datetime column, e.g. keep only Users with one occurrence per month, or only Users whose occurrences all fall in summer.


So far I've grouped the df with:



g = df.groupby(['User_ID','Datetime']).size()


obtaining the "traces" in time of each User:



User_ID Datetime
01 2014-01-01 08:00:00
2014-01-02 09:00:00
02 2014-01-02 10:00:00
2014-01-03 11:00:00
03 2014-01-04 12:00:00
04 2014-01-04 13:00:00
05 2014-01-02 14:00:00


Then I applied a mask to filter, for instance, the Users with more than one trace:



mask = df.groupby('User_ID')['Datetime'].apply(lambda g: len(g)>1)
df = df[df['User_ID'].isin(mask[mask].index)]
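For comparison, the same selection can probably be written with `groupby(...).filter`, which takes a function returning True or False per group and keeps the rows of the groups that pass. This is only a sketch: the sample frame is rebuilt from the table above, and the name `multi` is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "User_ID": ["01", "01", "02", "02", "03", "04", "05"],
    "Datetime": pd.to_datetime([
        "2014-01-01 08:00:00", "2014-01-02 09:00:00",
        "2014-01-02 10:00:00", "2014-01-03 11:00:00",
        "2014-01-04 12:00:00", "2014-01-04 13:00:00",
        "2014-01-02 14:00:00",
    ]),
})

# Keep every row belonging to a user that has more than one row.
# The lambda receives each user's sub-DataFrame and must return a bool.
multi = df.groupby("User_ID").filter(lambda g: len(g) > 1)
```

Written this way, the condition is a single function of the per-user group, so swapping in a different condition means swapping only that function.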


This works. What I'm looking for is a function to use in place of lambda g: len(g)>1 that can filter Users under different conditions, as described above, in particular Users with one occurrence per month.
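One possible shape for such functions is sketched below. It rests on two assumptions: "one occurrence per month" is read as "exactly one row in every calendar month in which the User appears", and "summer" is taken to mean June to August. The helper names `one_per_month` and `only_summer` are hypothetical, and the sample frame is rebuilt from the table above.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "User_ID": ["01", "01", "02", "02", "03", "04", "05"],
    "Datetime": pd.to_datetime([
        "2014-01-01 08:00:00", "2014-01-02 09:00:00",
        "2014-01-02 10:00:00", "2014-01-03 11:00:00",
        "2014-01-04 12:00:00", "2014-01-04 13:00:00",
        "2014-01-02 14:00:00",
    ]),
})


def one_per_month(g):
    """Assumed meaning: exactly one row in each month where the user appears."""
    # Map each timestamp to its year-month period, then count rows per period
    per_month = g["Datetime"].dt.to_period("M").value_counts()
    return bool((per_month == 1).all())


def only_summer(g, months=(6, 7, 8)):
    """Assumed meaning: every row of the user falls in June, July or August."""
    return bool(g["Datetime"].dt.month.isin(months).all())


# Each predicate receives one user's sub-DataFrame and returns a bool
monthly = df.groupby("User_ID").filter(one_per_month)
summer = df.groupby("User_ID").filter(only_summer)
```

On the sample data, Users 01 and 02 each have two rows in January 2014, so only 03, 04 and 05 pass `one_per_month`; no User passes `only_summer`, since every row is in January. If "one occurrence per month" should also require that no month in a range is missing, the predicate would need to compare against the full list of expected months.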









