Transform A Set Of Numbers In Numpy So That Each Number Gets Converted Into A Number Of Other Numbers Which Are Less Than It
Solution 1:
What you actually need to do is get the inverse of the sorting order of your array:
import numpy as np
x = np.random.rand(10)
y = np.empty(x.size,dtype=np.int64)
y[x.argsort()] = np.arange(x.size)
Example run (in IPython):
In [367]: x
Out[367]:
array([ 0.09139335, 0.29084225, 0.43560987, 0.92334644, 0.09868977,
0.90202354, 0.80905083, 0.4801967 , 0.99086213, 0.00933582])
In [368]: y
Out[368]: array([1, 3, 4, 8, 2, 7, 6, 5, 9, 0])
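To see why assigning through x.argsort() inverts the sorting permutation, here is a small deterministic sketch. The four values are made up for illustration, and the brute-force count is only there as a cross-check:

```python
import numpy as np

x = np.array([0.3, 0.1, 0.4, 0.2])   # illustrative values, all distinct

order = x.argsort()                  # indices that would sort x: [1, 3, 0, 2]
y = np.empty(x.size, dtype=np.int64)
y[order] = np.arange(x.size)         # the element at sorted position k gets rank k

# Cross-check: count, for each value, how many elements are smaller.
brute = np.array([(x < v).sum() for v in x])

print(y)      # [2 0 3 1]
print(brute)  # [2 0 3 1]
```

The rank equals the count of smaller elements only when all values are distinct, which is effectively always the case for np.random.rand output.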
Alternatively, if you want the number of elements greater than each corresponding element in x, you have to reverse the sorting from ascending to descending. One option is to simply reverse the values assigned through the index:
y_rev = np.empty(x.size,dtype=np.int64)
y_rev[x.argsort()] = np.arange(x.size)[::-1]
Another, as @unutbu suggested in a comment, is to map the original array to the new one:
y_rev = x.size - y - 1
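As a quick sanity check, the two ways of building y_rev can be compared against a brute-force count. This is only a sketch: the names y_rev_a, y_rev_b and brute, and the seed, are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed, for reproducibility only
x = rng.random(10)

# Number of elements less than each element (ascending rank).
y = np.empty(x.size, dtype=np.int64)
y[x.argsort()] = np.arange(x.size)

# Option 1: assign the ranks in reverse order.
y_rev_a = np.empty(x.size, dtype=np.int64)
y_rev_a[x.argsort()] = np.arange(x.size)[::-1]

# Option 2: derive the descending rank from the ascending one.
y_rev_b = x.size - y - 1

# Brute force: count the elements greater than each value.
brute = np.array([(x > v).sum() for v in x])

print(np.array_equal(y_rev_a, y_rev_b), np.array_equal(y_rev_a, brute))
```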
Solution 2:
Here's one approach using np.searchsorted:
np.searchsorted(np.sort(x),x)
Another one, mostly based on @Andras Deak's solution, uses argsort() twice:
x.argsort().argsort()
Sample run:
In [359]: x
Out[359]:
array([ 0.62594394, 0.03255799, 0.7768568 , 0.03050498, 0.01951657,
0.04767246, 0.68038553, 0.60036203, 0.3617409 , 0.80294355])
In [360]: np.searchsorted(np.sort(x),x)
Out[360]: array([6, 2, 8, 1, 0, 3, 7, 5, 4, 9])
In [361]: x.argsort().argsort()
Out[361]: array([6, 2, 8, 1, 0, 3, 7, 5, 4, 9])
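One caveat the sample run does not exercise, since its values are all distinct: the two expressions can disagree when x contains duplicates. A minimal sketch with a made-up array, passing kind='stable' so that ties are ordered deterministically:

```python
import numpy as np

x = np.array([2, 1, 2, 3])           # illustrative array with a repeated value

# searchsorted (default side='left') counts the elements strictly less than
# each entry, so equal values get the same result.
counts = np.searchsorted(np.sort(x), x)

# Double argsort assigns a distinct rank to every position, so equal values
# are separated by their order of appearance.
ranks = x.argsort(kind='stable').argsort()

print(counts)  # [1 0 1 3]
print(ranks)   # [1 0 2 3]
```

If the goal is literally "how many other numbers are less than this one", the searchsorted variant is the one that keeps that meaning in the presence of ties.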
Solution 3:
In addition to the other answers, another solution using boolean comparisons could be:
sum(x > i for i in x)
For your example:
In [10]: x
Out[10]:
array([ 0.62594394, 0.03255799, 0.7768568 , 0.03050498, 0.01951657,
0.04767246, 0.68038553, 0.60036203, 0.3617409 , 0.80294355])
In [11]: y = sum(x > i for i in x)
In [12]: y
Out[12]: array([6, 2, 8, 1, 0, 3, 7, 5, 4, 9])
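The same pairwise comparison can also be written without the Python-level generator, using broadcasting. This is a sketch rather than a recommendation: it builds an n-by-n boolean array, so memory grows quadratically with the length of x. The names pairwise and y_loop are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed, for reproducibility only
x = rng.random(10)

# pairwise[k, i] is True when x[k] > x[i]; summing along each row counts
# the elements that are smaller than x[k].
pairwise = x[:, None] > x
y = pairwise.sum(axis=1)

# The generator version from the answer, for comparison.
y_loop = sum(x > i for i in x)

print(np.array_equal(y, y_loop))
```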
Solution 4:
I wanted to contribute to this post by providing some testing of @Andras Deak's solution versus applying argsort again.
It would appear that argsort again is quicker for short arrays. The simple idea is to find the array length at which the balance shifts.
I'll define three functions:
construct, which is Andras Deak's solution
argsortagain, which is obvious
attempted_optimal, which trades off at len(a) == 400
functions
import numpy as np

# In all three functions, s is the result of a.argsort() for some array a.
def argsortagain(s):
    return s.argsort()

def construct(s):
    u = np.empty(s.size, dtype=np.int64)
    u[s] = np.arange(s.size)
    return u

def attempted_optimal(s):
    return argsortagain(s) if len(s) < 400 else construct(s)
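Before timing them, it may be worth confirming that the two core functions return the same ranks for the same input. A minimal sketch; the functions are repeated so the snippet runs on its own, and the seed and array length are arbitrary:

```python
import numpy as np

def argsortagain(s):
    return s.argsort()

def construct(s):
    u = np.empty(s.size, dtype=np.int64)
    u[s] = np.arange(s.size)
    return u

rng = np.random.default_rng(0)       # arbitrary seed
a = rng.random(1000)
s = a.argsort()                      # s is a permutation of 0..999

# Both compute the inverse permutation of s, so they should agree exactly.
same = np.array_equal(argsortagain(s), construct(s))
print(same)  # True
```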
testing
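import pandas as pd
from timeit import timeit

results = pd.DataFrame(
    index=pd.RangeIndex(10, 610, 10, name='len'),
    columns=pd.Index(['construct', 'argsortagain', 'attempted_optimal'], name='function'))
for i in results.index:
    a = np.random.rand(i)
    s = a.argsort()
    for j in results.columns:
        # DataFrame.set_value was removed in pandas 1.0; .at sets a single cell.
        # The setup string assumes this runs as a script or interactive session.
        results.at[i, j] = timeit(
            '{}(s)'.format(j),
            'from __main__ import {}, s'.format(j),
            number=10000)
# The frame starts out with object dtype, so convert before plotting.
results.astype(float).plot()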
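If the 'from __main__ import ...' setup string is awkward in your environment, a pandas-free variant can pass callables to timeit instead. This is only a sketch: the lengths, the repeat count and the timings dictionary are choices made here, not part of the original test, and the measured numbers will vary by machine.

```python
import numpy as np
from timeit import timeit

def argsortagain(s):
    return s.argsort()

def construct(s):
    u = np.empty(s.size, dtype=np.int64)
    u[s] = np.arange(s.size)
    return u

rng = np.random.default_rng(0)       # arbitrary seed
timings = {}
for n in (10, 100, 1000):            # illustrative lengths
    s = rng.random(n).argsort()
    row = {}
    for f in (argsortagain, construct):
        # timeit accepts a zero-argument callable as the statement to time.
        row[f.__name__] = timeit(lambda: f(s), number=1000)
    timings[n] = row

for n, row in timings.items():
    print(n, row)
```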
conclusion
attempted_optimal does what it's supposed to do. But I'm not sure it's worth it for the marginal benefit gained in a range of array lengths (below 400) where it hardly matters. I'd advocate fully for construct only.
This analysis helped me reach this conclusion.