
Frequency Of Items Within List Of Lists

I have a list of lists called bi_grams and I want the frequency of each bigram. The length of bi_grams is 23087 so I might need a loop (?) bi_grams= [[('ABC', 'Memorial'), ('Memori

Solution 1:

You have a nested list which you can flatten with itertools.chain.from_iterable.

Apart from that complication the problem boils down to a simple application of collections.Counter because a Counter has no problem with counting tuples.

>>> from collections import Counter
>>> from itertools import chain
>>> bi_grams = [[('ABC', 'Memorial'), ('Memorial', 'Hospital')], [('ABC', 'Memorial'), ('Memorial', 'Clinic')]]
>>> Counter(chain.from_iterable(bi_grams))
Counter({('ABC', 'Memorial'): 2,
         ('Memorial', 'Clinic'): 1,
         ('Memorial', 'Hospital'): 1})
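
Counter can count tuples because tuples are hashable. As a minimal sketch (the name pairs_as_lists is made up for illustration), the following shows what happens if the bigrams were stored as lists instead, and one way to work around it:

```python
from collections import Counter

# Tuples are hashable, so Counter can use them as keys directly.
print(Counter([('a', 'b'), ('a', 'b')]))  # Counter({('a', 'b'): 2})

# Hypothetical data: the same pairs stored as lists, which are not hashable.
pairs_as_lists = [['a', 'b'], ['a', 'b']]
try:
    Counter(pairs_as_lists)
except TypeError as exc:
    print('TypeError:', exc)

# Converting each inner list to a tuple first makes it countable.
fixed = Counter(tuple(p) for p in pairs_as_lists)
print(fixed)  # Counter({('a', 'b'): 2})
```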

There's also a pretty straightforward solution with a for loop:

>>> c = Counter()
>>> for x in bi_grams:
...     c.update(x)
...
>>> c
Counter({('ABC', 'Memorial'): 2,
         ('Memorial', 'Clinic'): 1,
         ('Memorial', 'Hospital'): 1})
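
Once the counts exist, the most frequent bigrams or relative frequencies can be read off the Counter. A short sketch, assuming the same sample data as above (the names counts, top, and rel_freq are only illustrative):

```python
from collections import Counter
from itertools import chain

# Same sample data as in the answer above.
bi_grams = [[('ABC', 'Memorial'), ('Memorial', 'Hospital')],
            [('ABC', 'Memorial'), ('Memorial', 'Clinic')]]

counts = Counter(chain.from_iterable(bi_grams))

# most_common(n) returns the n highest-count (item, count) pairs.
top = counts.most_common(1)
print(top)  # [(('ABC', 'Memorial'), 2)]

# Relative frequency: each count divided by the total number of bigrams.
total = sum(counts.values())
rel_freq = {bigram: n / total for bigram, n in counts.items()}
print(rel_freq[('ABC', 'Memorial')])  # 0.5
```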

Solution 2:

chain.from_iterable, as suggested by @timgeb, is probably the way to go, but you could also flatten your list with a nested generator expression and pass it straight to Counter:

>>> from collections import Counter
>>> bi_grams = [[('ABC', 'Memorial'), ('Memorial', 'Hospital')], [('ABC', 'Memorial'), ('Memorial', 'Clinic')]]
>>> Counter(i for x in bi_grams for i in x)
Counter({('ABC', 'Memorial'): 2, ('Memorial', 'Hospital'): 1, ('Memorial', 'Clinic'): 1})

Solution 3:

You can use the chain(*iterable) idiom too:

>>> from itertools import chain
>>> from collections import Counter
>>> Counter(chain(*bi_grams))
Counter({('ABC', 'Memorial'): 2, ('Memorial', 'Hospital'): 1, ('Memorial', 'Clinic'): 1})

chain(*iterable) unpacks the outer list and chains the inner lists together, so a list of lists of tuples becomes a single flat sequence of tuples, e.g.

>>> x = [[(1,2), (3,4)], [(5,6)], [(7,8)]]

>>> list(chain(*x))
[(1, 2), (3, 4), (5, 6), (7, 8)]

Counter simply counts what's in the flattened list:

>>> x = [[(1,2), (3,4)], [(5,6)], [(7,8)]]

>>> Counter(chain(*x))
Counter({(1, 2): 1, (3, 4): 1, (5, 6): 1, (7, 8): 1})
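
One difference between the two chain forms: chain(*iterable) unpacks the whole outer iterable into arguments before chaining starts, while chain.from_iterable pulls the inner lists one at a time. For a list already in memory, as in the question, both should give the same counts; the distinction mostly matters when the outer iterable is very large or produced lazily. A small sketch, where rows() is a made-up generator standing in for such a source:

```python
from collections import Counter
from itertools import chain

def rows():
    # Hypothetical generator standing in for bigram lists produced lazily.
    yield [(1, 2), (3, 4)]
    yield [(1, 2)]

# chain(*rows()) exhausts the generator up front to build the argument list.
star = Counter(chain(*rows()))

# chain.from_iterable(rows()) requests each inner list only when needed.
lazy = Counter(chain.from_iterable(rows()))

print(star == lazy)  # True
print(lazy[(1, 2)])  # 2
```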
