Average Entries With Duplicate First Element In 2d Numpy Array
I have an array that looks like this   arr = np.array([[0, 1], [0, 2], [1, 3], [1, 3], [1, 4], [2, 3]])  and I would like to take the average of the 'entries' that have the same fi
Solution 1:
Here's a NumPythonic solution using np.unique and np.bincount for a generic case when the first column is not always sorted -
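import numpy as np
# ID maps each row to the index of its unique key; counts holds each group's size
unqa, ID, counts = np.unique(arr[:, 0], return_inverse=True, return_counts=True)
out = np.column_stack((unqa, np.bincount(ID, arr[:, 1]) / counts))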
Sample run -
In [4]: arr
Out[4]: 
array([[5, 1],
       [5, 2],
       [1, 3],
       [1, 3],
       [5, 4],
       [2, 3]])
In [5]: unqa,ID,counts = np.unique(arr[:,0],return_inverse=True,return_counts=True)
   ...: out = np.column_stack(( unqa , np.bincount(ID,arr[:,1])/counts ))
   ...: 
In [6]: out
Out[6]: 
array([[ 1.        ,  3.        ],
       [ 2.        ,  3.        ],
       [ 5.        ,  2.33333333]])
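To make the intermediate steps visible, here is a minimal sketch of the same np.unique / np.bincount idea wrapped in a function and applied to the array from the question. The function name group_mean and the variable names inside it are only illustrative, not part of the original answer.

```python
import numpy as np

def group_mean(arr):
    """Average the second column over rows that share a first-column value.

    Hypothetical helper; the name and layout are assumptions for illustration.
    """
    keys = arr[:, 0]
    values = arr[:, 1]
    # uniq: sorted distinct keys
    # inverse: for each row, the index of its key within uniq
    # counts: number of rows per distinct key
    uniq, inverse, counts = np.unique(keys, return_inverse=True, return_counts=True)
    # bincount with weights sums the values belonging to each key index
    sums = np.bincount(inverse, weights=values)
    # column_stack promotes the keys to float because the means are float
    return np.column_stack((uniq, sums / counts))

arr = np.array([[0, 1], [0, 2], [1, 3], [1, 3], [1, 4], [2, 3]])
result = group_mean(arr)
print(result)
```

Because np.unique returns the keys sorted, the rows of the result are ordered by key regardless of how the input rows are ordered.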
Solution 2:
You can use a dictionary to group your items, then use np.mean() within a list comprehension to get the expected result:
>>> d = {}
>>> for i, j in arr:
...     d.setdefault(i, []).append(j)
...
>>> d
{0: [1, 2], 1: [3, 3, 4], 2: [3]}
>>> [[i, np.mean(j)] for i, j in d.items()]
[[0, 1.5], [1, 3.3333333333333335], [2, 3.0]]
Or, if you want the means rounded to two decimal places:
>>> [[i,round(np.mean(j),2)] for i,j in d.items()]
[[0, 1.5], [1, 3.33], [2, 3.0]]
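For reference, the dictionary approach can also be written as a self-contained script. This is only a sketch: it swaps setdefault for collections.defaultdict and uses plain Python arithmetic instead of np.mean, and the name group_mean_dict is made up for illustration.

```python
from collections import defaultdict

import numpy as np

def group_mean_dict(arr):
    """Group second-column values by first-column key, then average each group.

    Hypothetical helper; not taken from the original answer.
    """
    groups = defaultdict(list)
    # tolist() converts NumPy scalars to plain Python ints
    for key, value in arr.tolist():
        groups[key].append(value)
    # dicts preserve insertion order, so keys appear in first-seen order
    return [[key, sum(vals) / len(vals)] for key, vals in groups.items()]

arr = np.array([[0, 1], [0, 2], [1, 3], [1, 3], [1, 4], [2, 3]])
result = group_mean_dict(arr)
print(result)
```

Unlike the np.unique version, this keeps the keys in the order they first appear rather than sorting them, and the Python-level loop is likely to be slower on large arrays.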