
Pandas Groupby With Bin Sum Aggregation

I have a similar question to this one. I have a dataframe in pandas that looks like this, showing the ages at which different users won awards. Interested in computing total awards fo

Solution 1:

You can define the bins and cuts as follows:

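import pandas as pd

# Bin edges 0, 9, 18, ... up to the first multiple of 9 above the max age
bins = [9 * i for i in range(0, df['age'].max() // 9 + 2)]
cuts = pd.cut(df['age'], bins, right=False)  # left-closed bins: [a, b)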

print(cuts)

0    [18, 27)
1    [18, 27)
2    [54, 63)
3    [27, 36)
4    [45, 54)
Name: age, dtype: category
Categories (7, interval[int64, left]): [[0, 9) < [9, 18) < [18, 27) < [27, 36) < [36, 45) < [45, 54) < [54, 63)]
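
To see what right=False changes, here is a small sketch on made-up ages (26, 27 and 54 are my own picks, not from the question). An age that sits exactly on a bin edge, like 27 or 54, lands in the bin that starts at that edge when right=False, and in the bin that ends at that edge with the default right=True.

```python
import pandas as pd

# Hypothetical ages, chosen only to show what happens on a bin edge
ages = pd.Series([26, 27, 54], name="age")

# Same edge construction as above: 0, 9, 18, ..., 63
bins = [9 * i for i in range(0, ages.max() // 9 + 2)]

left_closed = pd.cut(ages, bins, right=False)  # intervals of the form [a, b)
right_closed = pd.cut(ages, bins)              # default: intervals of the form (a, b]

print(pd.DataFrame({
    "age": ages,
    "right=False": left_closed,
    "right=True": right_closed,
}))
```

The "+ 2" in the edge list matters with right=False: it guarantees there is an edge strictly above the maximum age, so the oldest user still falls inside a bin.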

Then group by id and the cuts, and sum awards within each group to get total_awards. Create age_interval with GroupBy.cumcount(), which numbers the bins 0, 1, 2, ... within each id.

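# observed=False keeps the empty bins explicitly (pandas 2.1+ warns when it is left implicit)
df_out = (df.groupby(['id', cuts], observed=False)
            .agg(total_awards=('awards', 'sum'))
            .reset_index(level=0)    # move id back to a column
            .reset_index(drop=True)  # drop the interval index
         )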
df_out['age_interval'] = df_out.groupby('id').cumcount()

Result:

print(df_out)

    id  total_awards  age_interval
0    1             0             0
1    1             0             1
2    1           250             2
3    1             0             3
4    1             0             4
5    1             0             5
6    1            50             6
7    2             0             0
8    2             0             1
9    2             0             2
10   2           193             3
11   2             0             4
12   2           209             5
13   2             0             6
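
The question's dataframe isn't reproduced here, so the following is only a sketch that runs the steps above end to end. The id, age and awards values are assumptions, picked so that they are consistent with the cuts and totals printed above; the real data may differ.

```python
import pandas as pd

# Hypothetical data, consistent with the printed output but not the original
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "age": [20, 25, 54, 30, 50],
    "awards": [100, 150, 50, 193, 209],
})

# Left-closed 9-year bins covering every age in the data
bins = [9 * i for i in range(0, df["age"].max() // 9 + 2)]
cuts = pd.cut(df["age"], bins, right=False)

# Sum awards per (id, bin); observed=False keeps bins with no rows
df_out = (
    df.groupby(["id", cuts], observed=False)
      .agg(total_awards=("awards", "sum"))
      .reset_index(level=0)
      .reset_index(drop=True)
)

# Number the bins 0, 1, 2, ... within each id
df_out["age_interval"] = df_out.groupby("id").cumcount()

print(df_out)
```

Because every id gets one row per bin, cumcount() lines up with the bin order, so age_interval 2 always means [18, 27) and so on.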

Solution 2:

Pretty sure this covers what you are looking for

import pandas as pd

df = pd.read_clipboard()  # reads the table copied from the question
bins = list(range(0, 100, 9))  # fixed edges 0, 9, ..., 99
results = df.groupby(['id', pd.cut(df.age, bins)])['awards'].sum().reset_index()
print(results)
    id       age  awards
0    1    (0, 9]     NaN
1    1   (9, 18]     NaN
2    1  (18, 27]   250.0
3    1  (27, 36]     NaN
4    1  (36, 45]     NaN
5    1  (45, 54]    50.0
6    1  (54, 63]     NaN
7    1  (63, 72]     NaN
8    1  (72, 81]     NaN
9    1  (81, 90]     NaN
10   1  (90, 99]     NaN
11   2    (0, 9]     NaN
12   2   (9, 18]     NaN
13   2  (18, 27]     NaN
14   2  (27, 36]   193.0
15   2  (36, 45]     NaN
16   2  (45, 54]   209.0
17   2  (54, 63]     NaN
18   2  (63, 72]     NaN
19   2  (72, 81]     NaN
20   2  (81, 90]     NaN
21   2  (90, 99]     NaN
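
If you don't have the table on your clipboard, the same approach can be sketched with a hand-built dataframe. The values below are assumptions chosen to match the totals above, not the original data. Depending on the pandas version, empty bins may come back as NaN or as 0, so the sketch filters on awards > 0, which drops them either way.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_clipboard()
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "age": [20, 25, 54, 30, 50],
    "awards": [100, 150, 50, 193, 209],
})

bins = list(range(0, 100, 9))  # fixed edges 0, 9, ..., 99

# Default right=True gives right-closed bins (a, b]
results = (
    df.groupby(["id", pd.cut(df["age"], bins)], observed=False)["awards"]
      .sum()
      .reset_index()
)

# Keep only the bins that actually contain awards
non_empty = results[results["awards"] > 0]
print(non_empty)
```

Note that the bins here are right-closed, so an age of exactly 54 is counted in (45, 54], whereas Solution 1 puts it in [54, 63).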
