Thursday, 26 February 2015

Python, Pandas error with groupby






I have the following Pandas DataFrame 'df1':



id_client product
client1 product1
client1 product4
client1 product5
client2 product1
client2 product6
client3 product1


First I want to group by id_client and collect the matching products into a list:



id_client product
client1 [product1,product4,product5]
client2 [product1,product6]
client3 [product1]
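For reference, the grouping step on its own can be sketched as follows. This is a minimal sketch, assuming a reasonably recent pandas version; the name `products_per_client` is hypothetical and the sample data is copied from 'df1' above.

```python
import pandas as pd

# Sample data copied from the 'df1' table in the question.
df1 = pd.DataFrame({
    "id_client": ["client1", "client1", "client1", "client2", "client2", "client3"],
    "product": ["product1", "product4", "product5", "product1", "product6", "product1"],
})

# Select the 'product' column of each group and turn it into a Python list.
# `products_per_client` is a hypothetical name used only for this sketch.
products_per_client = (
    df1.groupby("id_client")["product"]
    .apply(list)
    .reset_index()
)
print(products_per_client)
```

Selecting the column with `["product"]` before calling `apply` means the function receives one Series per client, so `list` can be passed directly.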


Then, for each element of each list, I want to add a new row to a new DataFrame 'df2', like this (nb_product is the length of the list the element came from):



product nb_product
product1 3
product4 3
product5 3
product1 2
product6 2
product1 1
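Since every row of 'df2' corresponds to exactly one row of 'df1', the target table can probably also be built without intermediate lists. The following is only a sketch, assuming a pandas version where `transform("size")` is available on a grouped column; it is one possible approach, not necessarily the intended one.

```python
import pandas as pd

# Sample data copied from the 'df1' table in the question.
df1 = pd.DataFrame({
    "id_client": ["client1", "client1", "client1", "client2", "client2", "client3"],
    "product": ["product1", "product4", "product5", "product1", "product6", "product1"],
})

# transform("size") returns, for every row, the number of rows in that row's group,
# so the result lines up with df1 row by row.
df2 = pd.DataFrame({
    "product": df1["product"],
    "nb_product": df1.groupby("id_client")["product"].transform("size"),
})
print(df2)
```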


So first I created a new dictionary:



nb_of_combination = {}
nb_of_combination['product'] = []
nb_of_combination['nb_product'] = []


Then I declared the following function:



def nb_of_combination(my_list):
    nb_comb = len(my_list)
    for row in my_list:
        nb_of_combination['product'].append(row)
        nb_of_combination['nb_product'].append(nb_comb)


Then I grouped 'df1' by the field 'id_client' and applied the function 'nb_of_combination':



df1 = df1.groupby('id_client',as_index=False).apply(lambda x: nb_of_combination(list(x.product)))


But I'm getting the following error:



df1 = df1.groupby('id_client',as_index=False).apply(lambda x: nb_of_combination(list(x.product)))
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 660, in apply
return self._python_apply_general(f)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 667, in _python_apply_general
not_indexed_same=mutated)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 2821, in _wrap_applied_output
v = next(v for v in values if v is not None)


I really don't understand this, since:



df2 = pd.DataFrame(nb_of_combination)


seems to work well.
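A few things in the code as posted look like they could interfere with each other. The dictionary and the function share the name `nb_of_combination`, so the `def` rebinds the name to the function. `x.product` resolves to the `DataFrame.product` method rather than to the 'product' column. And the function returns None for every group, which seems to be what the `next(v for v in values if v is not None)` line in the traceback trips on. The sketch below keeps the same append-to-a-dictionary approach but sidesteps those points. It assumes a recent pandas version, and the names `rows` and `collect_products` are hypothetical.

```python
import pandas as pd

# Sample data copied from the 'df1' table in the question.
df1 = pd.DataFrame({
    "id_client": ["client1", "client1", "client1", "client2", "client2", "client3"],
    "product": ["product1", "product4", "product5", "product1", "product6", "product1"],
})

# Hypothetical name: the dictionary no longer shares a name with the function.
rows = {"product": [], "nb_product": []}

def collect_products(my_list):
    """Append each product together with the size of the list it came from."""
    nb_comb = len(my_list)
    for item in my_list:
        rows["product"].append(item)
        rows["nb_product"].append(nb_comb)

# Iterate over the groups explicitly instead of using apply(), because the
# function is called only for its side effect and returns nothing.
# group["product"] is used because group.product would be the DataFrame method.
for _, group in df1.groupby("id_client"):
    collect_products(list(group["product"]))

df2 = pd.DataFrame(rows)
print(df2)
```

Looping over the groups also guarantees the function runs exactly once per group, which matters here because it mutates `rows`.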


