Increasing Value Of Top K Elements In Sparse Matrix
I am trying to find an efficient way that lets me increase the top k values of a sparse matrix by some constant value. I am currently using the following code, which is quite slow
Solution 1:
You could probably speed things up quite a lot by directly modifying a.data
rather than iterating over row/column indices and modifying individual elements:
idx = a.data.argsort()[::-1][:1] #k is1
a.data[idx] += 1
This also saves converting from CSR --> COO.
Update
As @WarrenWeckesser rightly points out, since you're only interested in the indices of the k
largest elements and you don't care about their order, you can use argpartition
rather than argsort
. This can be quite a lot faster when a.data
is large.
For example:
from scipy import sparse
# a random sparse array with 1 million non-zero elements
a = sparse.rand(10000, 10000, density=0.01, format='csr')
# find the indices of the 100 largest non-zero elements
k = 100# using argsort:
%timeit a.data.argsort()[-k:]
# 10 loops, best of 3: 135 ms per loop# using argpartition:
%timeit a.data.argpartition(-k)[-k:]
# 100 loops, best of 3: 13 ms per loop# test correctness:
np.all(a.data[a.data.argsort()[-k:]] ==
np.sort(a.data[a.data.argpartition(-k)[-k:]]))
# True
Post a Comment for "Increasing Value Of Top K Elements In Sparse Matrix"