
How To Get The Frequency Of A Specific Value In Each Row Of Pandas Dataframe

I have this pandas DataFrame: df = pd.DataFrame( data=[ ['yes', 'no', np.nan], ['no', 'yes', 'no'], [np.nan, 'yes', 'yes'], ['no', 'no', 'no']

Solution 1:

Use pd.get_dummies, but set dummy_na to True:

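# One indicator column per (column, value) pair; the empty prefix
# leaves the columns named just 'yes', 'no' and 'nan'.
dummies = pd.get_dummies(df, prefix='', prefix_sep='', dummy_na=True)
# Sum the indicator columns that share a label. The transpose avoids
# groupby(axis=1), which is deprecated in pandas 2.1+.
dummies.T.groupby(level=0).sum().T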

       nan  no  yes
ID                 
xyz_1    1   1    1
xyz_2    0   2    1
xyz_3    1   0    2
xyz_4    0   3    0 
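
For reference, here is a minimal self-contained sketch of this approach. The column names col_a, col_b and col_c are assumptions (the question's frame is cut off above), and ID is assumed to be the index, as the output suggests. The transpose stands in for groupby(axis=1), which newer pandas versions deprecate.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; row labels and values follow the question.
df = pd.DataFrame(
    data=[
        ['yes', 'no', np.nan],
        ['no', 'yes', 'no'],
        [np.nan, 'yes', 'yes'],
        ['no', 'no', 'no'],
    ],
    index=pd.Index(['xyz_1', 'xyz_2', 'xyz_3', 'xyz_4'], name='ID'),
    columns=['col_a', 'col_b', 'col_c'],
)

# Empty prefix and separator: every indicator column is named after the
# value alone, so labels repeat across the original columns.
dummies = pd.get_dummies(df, prefix='', prefix_sep='', dummy_na=True)

# Cast to int (get_dummies may return bool or uint8 depending on the
# pandas version), then add up the columns that share a label.
counts_dummies = dummies.astype(int).T.groupby(level=0).sum().T

print(counts_dummies)
```

The printed frame should match the table above, with one row per ID and one column per distinct value.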

Solution 2:

You can combine melt with crosstab:

newdf = df.melt('ID')  # Assumes 'ID' is a column; call df.reset_index() first if it is the index

pd.crosstab(newdf['ID'], newdf['value'].fillna('NaN'))
Out[8]: 
value  NaN  no  yes
ID                 
xyz_1    1   1    1
xyz_2    0   2    1
xyz_3    1   0    2
xyz_4    0   3    0
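
A runnable sketch of the same idea, under the assumption that ID starts out as the index and that the value columns are called col_a, col_b and col_c (hypothetical names, since the question's frame is cut off). reset_index() turns ID into a regular column so that melt can use it as the identifier.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; row labels and values follow the question.
df = pd.DataFrame(
    data=[
        ['yes', 'no', np.nan],
        ['no', 'yes', 'no'],
        [np.nan, 'yes', 'yes'],
        ['no', 'no', 'no'],
    ],
    index=pd.Index(['xyz_1', 'xyz_2', 'xyz_3', 'xyz_4'], name='ID'),
    columns=['col_a', 'col_b', 'col_c'],
)

# Long format: one row per (ID, original column) pair, with the cell
# contents in the 'value' column.
long_df = df.reset_index().melt(id_vars='ID')

# crosstab skips missing values, so give NaN an explicit label first.
counts_crosstab = pd.crosstab(long_df['ID'], long_df['value'].fillna('NaN'))

print(counts_crosstab)
```

Filling NaN with a string label is what keeps the missing values in the table; without it, those cells would be left out of the counts.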

Solution 3:

Use pd.get_dummies, then sum the indicator columns by suffix:

df = df.set_index('ID')  # Run this line only if 'ID' is not already the index

df2 = pd.get_dummies(df, dummy_na=True)

df['no'] = df2.loc[:, df2.columns.str.endswith('no')].sum(axis=1)
df['yes'] = df2.loc[:, df2.columns.str.endswith('yes')].sum(axis=1)
df['nan'] = df2.loc[:, df2.columns.str.endswith('nan')].sum(axis=1)
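
Here is a self-contained sketch of this variant, assuming hypothetical column names col_a, col_b and col_c and ID as the index. It writes the counts to a copy so the original frame stays untouched, and it matches on '_' plus the value, since get_dummies names its columns as original column, underscore, value by default.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; row labels and values follow the question.
df = pd.DataFrame(
    data=[
        ['yes', 'no', np.nan],
        ['no', 'yes', 'no'],
        [np.nan, 'yes', 'yes'],
        ['no', 'no', 'no'],
    ],
    index=pd.Index(['xyz_1', 'xyz_2', 'xyz_3', 'xyz_4'], name='ID'),
    columns=['col_a', 'col_b', 'col_c'],
)

# Indicator columns are named like 'col_a_yes', 'col_a_no', 'col_a_nan'.
df2 = pd.get_dummies(df, dummy_na=True)

result = df.copy()
for label in ['no', 'yes', 'nan']:
    # Boolean mask over the indicator columns belonging to this value.
    matching = df2.columns.str.endswith('_' + label)
    result[label] = df2.loc[:, matching].sum(axis=1).astype(int)

print(result)
```

Including the separator in the suffix is a small safeguard: a plain endswith('no') would also pick up any other value whose name happens to end in "no".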
