
Python Groupby With Boolean Mask

I have a pandas dataframe with the following general format: id,atr1,atr2,orig_date,fix_date 1,bolt,l,2000-01-01,nan 1,screw,l,2000-01-01,nan 1,stem,l,2000-01-01,nan 2,stem,l,2000-

Solution 1:

I think this should work:

df['failed_part_ind'] = df.apply(lambda row: 1 if ((row['id'] == row['id']) &
                                                (row['atr1'] == row['atr1']) &
                                                (row['atr2'] == row['atr2']) &
                                                (row['orig_date'] < row['fix_date']))
                                            else 0, axis=1)

Update: I think this is what you want:

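import pandas as pd

def f(g):
    g = g.copy()  # work on a copy so the group handed over by groupby is not mutated
    min_fix_date = g['fix_date'].min()  # earliest fix date in the group; missing values are skipped
    if pd.isna(min_fix_date):
        # no fix date recorded anywhere in this group
        g['failed_part_ind'] = 0
    else:
        g['failed_part_ind'] = g['orig_date'].apply(lambda d: 1 if d < min_fix_date else 0)
    return g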

df = df.groupby(['id', 'atr1', 'atr2'], group_keys=False).apply(f)
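
If the per-group apply turns out to be slow, the same flag can probably be computed without a Python-level function by broadcasting each group's earliest fix date back onto its rows with transform. This is only a sketch: the sample rows below are made up (the question's data is cut off), and it assumes both date columns parse cleanly with pd.to_datetime.

```python
import pandas as pd

# Hypothetical rows in the question's column layout, not the asker's actual data.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2],
    "atr1": ["bolt", "screw", "stem", "stem", "stem"],
    "atr2": ["l", "l", "l", "l", "l"],
    "orig_date": ["2000-01-01", "2000-01-01", "2000-01-01", "2000-01-01", "2000-06-01"],
    "fix_date": [None, None, None, "2000-03-01", None],
})
df["orig_date"] = pd.to_datetime(df["orig_date"])
df["fix_date"] = pd.to_datetime(df["fix_date"])  # missing values become NaT

# Earliest fix date per (id, atr1, atr2) group, repeated on every row of that group.
# A group with no fix date at all gets NaT.
min_fix = df.groupby(["id", "atr1", "atr2"])["fix_date"].transform("min")

# Comparing against NaT yields False, so groups without any fix date end up with 0.
df["failed_part_ind"] = (df["orig_date"] < min_fix).astype(int)
```

The boolean mask `df["orig_date"] < min_fix` can also be kept as-is if a True/False column is more convenient than 0/1.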
