
How to Bin a Series of Float Values into a Histogram in Python?

I have a set of float values (always less than 0) that I want to bin into a histogram, i.e. each bar in the histogram covers a range of values such as [0, 0.150). The data I have looks like this:

Solution 1:

When possible, don't reinvent the wheel. NumPy has everything you need:

#!/usr/bin/env python
import numpy as np

a = np.fromfile(open('file', 'r'), sep='\n')
# [ 0.     0.005  0.124  0.     0.004  0.     0.111  0.112]

# You can set arbitrary bin edges:
bins = [0, 0.150]
hist, bin_edges = np.histogram(a, bins=bins)
# hist: [8]
# bin_edges: [ 0.    0.15]

# Or, if bins is an integer, you can set the number of bins:
bins = 4
hist, bin_edges = np.histogram(a, bins=bins)
# hist: [5 0 0 3]
# bin_edges: [ 0.     0.031  0.062  0.093  0.124]
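If the goal is a run of equal-width bins of 0.150 each (as the question seems to ask), one option is to build the edges yourself and pass them to np.histogram. The following is only a sketch: the helper name fixed_width_histogram is made up for illustration, it assumes the values are non-negative and the bins start at 0, and the sample array is the one shown in the comment above.

```python
import numpy as np


def fixed_width_histogram(values, width):
    """Histogram with bins [0, width), [width, 2*width), ... covering all values.

    Hypothetical helper; assumes every value is >= 0 and width > 0.
    """
    values = np.asarray(values, dtype=float)
    # Enough bins that the last edge lies strictly above the maximum value.
    n_bins = int(np.floor(values.max() / width)) + 1
    # Multiply instead of adding repeatedly so rounding error does not accumulate.
    edges = width * np.arange(n_bins + 1)
    return np.histogram(values, bins=edges)


a = np.array([0.0, 0.005, 0.124, 0.0, 0.004, 0.0, 0.111, 0.112])
hist, bin_edges = fixed_width_histogram(a, 0.150)
print(hist, bin_edges)
```

With this sample every value falls below 0.150, so there is a single bin; data reaching higher would get further bins of the same width.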

Solution 2:

from pylab import *
data = []
inf = open('pulse_data.txt')
for line in inf:
    data.append(float(line))
inf.close()
#binning
B = 50
minv = min(data)
maxv = max(data)
bincounts = []
for i in range(B + 1):
    bincounts.append(0)
for d in data:
    b = int((d - minv) / (maxv - minv) * B)
    bincounts[b] += 1

# plot histogram
plot(bincounts,'o')
show()
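The loop above allocates B+1 counters because the maximum value maps to index B exactly. A variant of the same idea, sketched here as a standalone function without plotting, clamps the maximum into the last bin instead, so you get exactly the number of bins you ask for. The function name bin_counts and the handling of the all-equal case are my own assumptions, not part of the answer.

```python
def bin_counts(data, num_bins):
    """Count values into num_bins equal-width bins spanning [min(data), max(data)].

    Hypothetical helper; the maximum value is placed in the last bin.
    """
    minv, maxv = min(data), max(data)
    counts = [0] * num_bins
    if maxv == minv:
        # All values identical: avoid dividing by zero, use the first bin.
        counts[0] = len(data)
        return counts
    for d in data:
        b = int((d - minv) / (maxv - minv) * num_bins)
        # d == maxv gives b == num_bins, so clamp it into the last bin.
        b = min(b, num_bins - 1)
        counts[b] += 1
    return counts


sample = [0.0, 0.005, 0.124, 0.0, 0.004, 0.0, 0.111, 0.112]
print(bin_counts(sample, 4))
```

Clamping mirrors how np.histogram treats its last bin, which includes the right edge.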

Solution 3:

The first error is:

Traceback (most recent call last):
  File "C:\foo\foo.py", line 17, in <module>
    diffCounts[ str(getBin(diff)) ] += 1
TypeError: list indices must be integers

Why are you converting an int to a str when an int is needed? Fix that, then we get:

Traceback (most recent call last):
  File "C:\foo\foo.py", line 17, in <module>
    diffCounts[ getBin(diff) ] += 1
IndexError: list index out of range

because you've only made 5 buckets. I don't understand your bucketing scheme, but let's make it 50 buckets and see what happens:

6
Traceback (most recent call last):
  File "C:\foo\foo.py", line 21, in <module>
    maxBin = max(maxdiff)
TypeError: 'int' object is not iterable

maxdiff is a single value out of your list of ints, so what is max doing here? Remove it, now we get:

6
Traceback (most recent call last):
  File "C:\foo\foo.py", line 28, in <module>
    print binStr + '\t' + '\t'.join(map(str, (diffCounts[i])))
TypeError: argument 2 to map() must support iteration

Sure enough, you're using a single value as the second argument to map. Let's simplify the last two lines from this:

    binStr = '[' + str(lo) + ',' + str(hi) + ')'
    print binStr + '\t' + '\t'.join(map(str, (diffCounts[i])))

to this:

    print "[%f, %f)\t%r" % (lo, hi, diffCounts[i])

Now it prints:

6
[0.000000, 1.000000)    3
[1.000000, 3.000000)    0
[3.000000, 7.000000)    2
[7.000000, 15.000000)   0
[15.000000, 31.000000)  0
[31.000000, 63.000000)  0
[63.000000, 127.000000) 3

I'm not sure what else to do here, since I don't really understand the bucketing you are hoping to use. It seems to involve binary powers, but isn't making sense to me...
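For what it's worth, the edges in that output (0, 1, 3, 7, 15, ...) follow the pattern 2**k - 1. If that is the scheme you are after, here is a rough Python 3 sketch of how the lookup could work. This is a guess at the intent, not your original getBin: the names power_of_two_edges and get_bin and the sample values are all made up for illustration.

```python
from bisect import bisect_right


def power_of_two_edges(num_bins):
    """Edges 0, 1, 3, 7, ... (2**k - 1), giving num_bins half-open bins."""
    return [2 ** k - 1 for k in range(num_bins + 1)]


def get_bin(value, edges):
    """Index i such that edges[i] <= value < edges[i + 1]."""
    if not edges[0] <= value < edges[-1]:
        raise ValueError("value %r is outside the binned range" % (value,))
    return bisect_right(edges, value) - 1


edges = power_of_two_edges(7)        # [0, 1, 3, 7, 15, 31, 63, 127]
sample = [0, 0, 1, 5, 6, 64, 100]    # invented values, not the asker's data
counts = [0] * (len(edges) - 1)
for v in sample:
    counts[get_bin(v, edges)] += 1   # the index is an int, never a str

for lo, hi, c in zip(edges, edges[1:], counts):
    print("[%f, %f)\t%r" % (lo, hi, c))
```

Sizing the counts list from the edges keeps the two in step, which avoids the IndexError seen earlier, and raising on out-of-range values makes a too-small bucket count obvious instead of silently miscounting.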
