Average Entries With Duplicate First Element In 2d Numpy Array
I have an array that looks like this: arr = np.array([[0, 1], [0, 2], [1, 3], [1, 3], [1, 4], [2, 3]]), and I would like to take the average of the 'entries' that have the same fi
Solution 1:
Here's a NumPythonic solution using np.unique and np.bincount for the generic case where the first column is not always sorted -
unqa, ID, counts = np.unique(arr[:,0], return_inverse=True, return_counts=True)
out = np.column_stack((unqa, np.bincount(ID, arr[:,1]) / counts))
Sample run -
In [4]: arr
Out[4]:
array([[5, 1],
       [5, 2],
       [1, 3],
       [1, 3],
       [5, 4],
       [2, 3]])

In [5]: unqa,ID,counts = np.unique(arr[:,0],return_inverse=True,return_counts=True)
   ...: out = np.column_stack(( unqa , np.bincount(ID,arr[:,1])/counts ))
   ...:

In [6]: out
Out[6]:
array([[ 1.        ,  3.        ],
       [ 2.        ,  3.        ],
       [ 5.        ,  2.33333333]])
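To make the intermediate arrays easier to follow, here is a minimal sketch of the same np.unique / np.bincount approach wrapped in a function and applied to the array from the question. The function name group_mean and the variable names are hypothetical, chosen only for illustration.

```python
import numpy as np

def group_mean(arr):
    """Average the second column over rows that share a first-column value.

    Hypothetical helper name; the technique is the np.unique + np.bincount
    approach described above.
    """
    # keys:    sorted unique values of the first column
    # inverse: for every row, the index of its key within `keys`
    # counts:  how many rows belong to each key
    keys, inverse, counts = np.unique(
        arr[:, 0], return_inverse=True, return_counts=True
    )
    # With weights, bincount sums the second column per group index
    sums = np.bincount(inverse, weights=arr[:, 1])
    return np.column_stack((keys, sums / counts))

arr = np.array([[0, 1], [0, 2], [1, 3], [1, 3], [1, 4], [2, 3]])
out = group_mean(arr)
print(out)
```

Because np.unique returns its keys sorted, the rows of the result come out ordered by key regardless of the order in the input.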
Solution 2:
You can use a dictionary to group your items, then use np.mean() within a list comprehension to get the expected result:
>>> d = {}
>>> for i,j in arr:
...     d.setdefault(i,[]).append(j)
...
>>> d
{0: [1, 2], 1: [3, 3, 4], 2: [3]}
>>> [[i,np.mean(j)] for i,j in d.items()]
[[0, 1.5], [1, 3.3333333333333335], [2, 3.0]]
Or, if you want the averages rounded:
>>> [[i,round(np.mean(j),2)] for i,j in d.items()]
[[0, 1.5], [1, 3.33], [2, 3.0]]
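For reference, the dictionary approach can be sketched as a self-contained script. This is only one possible arrangement: the function name group_mean_dict is hypothetical, and collections.defaultdict is swapped in for setdefault purely for brevity.

```python
from collections import defaultdict

import numpy as np

def group_mean_dict(arr):
    """Group second-column values by first-column key, then average each group.

    Hypothetical helper name, used only for this illustration.
    """
    groups = defaultdict(list)
    # tolist() yields plain Python ints, so the keys and values stay simple
    for key, value in arr.tolist():
        groups[key].append(value)
    # Dicts preserve insertion order, so keys appear in first-seen order
    return [[key, float(np.mean(values))] for key, values in groups.items()]

arr = np.array([[0, 1], [0, 2], [1, 3], [1, 3], [1, 4], [2, 3]])
result = group_mean_dict(arr)
print(result)
```

Unlike the np.unique version, this keeps the keys in the order they first appear rather than sorting them, and it loops in Python, so it will likely be slower on large arrays.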