I am reading a file and computing the frequencies of its 100 most common words. That part works and produces the following list:
[('test', 510), ('Hey', 362), ("please", 753), ('take', 446), ('herbert', 325), ('live', 222), ('hate', 210), ('white', 191), ('simple', 175), ('harry', 172), ('woman', 170), ('basil', 153), ('things', 129), ('think', 126), ('bye', 124), ('thing', 120), ('love', 107), ('quite', 107), ('face', 107), ('eyes', 107), ('time', 106), ('himself', 105), ('want', 105), ('good', 105), ('really', 103), ('away',100), ('did', 100), ('people', 99), ('came', 97), ('say', 97), ('cried', 95), ('looked', 94), ('tell', 92), ('look', 91), ('world', 89), ('work', 89), ('project', 88), ('room', 88), ('going', 87), ('answered', 87), ('mr', 87), ('little', 87), ('yes', 84), ('silly', 82), ('thought', 82), ('shall', 81), ('circle', 80), ('hallward', 80), ('told', 77), ('feel', 76), ('great', 74), ('art', 74), ('dear',73), ('picture', 73), ('men', 72), ('long', 71), ('young', 70), ('lady', 69), ('let', 66), ('minute', 66), ('women', 66), ('soul', 65), ('door', 64), ('hand',63), ('went', 63), ('make', 63), ('night', 62), ('asked', 61), ('old', 61), ('passed', 60), ('afraid', 60), ('night', 59), ('looking', 58), ('wonderful', 58), ('gutenberg-tm', 56), ('beauty', 55), ('sir', 55), ('table', 55), ('turned', 54), ('lips', 54), ("one's", 54), ('better', 54), ('got', 54), ('vane', 54), ('right',53), ('left', 53), ('course', 52), ('hands', 52), ('portrait', 52), ('head', 51), ("can't", 49), ('true', 49), ('house', 49), ('believe', 49), ('black', 49), ('horrible', 48), ('oh', 48), ('knew', 47), ('curious', 47), ('myself', 47)]
After getting this list, I want to draw a histogram using matplotlib. I tried the code below, but I cannot get a proper histogram.
My question: how do I pass the frequencies to the graph? Right now all of my bars have the same height, and the bin centers are not correct either. How should I pass data to the ax.hist method in the code below? I am adapting the example from http://matplotlib.org/1.2.1/examples/api/histogram_demo.html.
totalWords = counts.most_common(100)
print(totalWords)
for z in range(len(totalWords)):
    words.append(totalWords[z][0])
x = np.arange(len(words))
#print x
i, s = 100, 15
fig = plt.figure()
ax = fig.add_subplot(111)
n, bins, patches = ax.hist(x, 50, normed=1, facecolor='green', alpha=0.75)
bincenters = 0.5*(bins[1:]+bins[:-1])
y = mlab.normpdf(bincenters*1.00, i, s)
l = ax.plot(bincenters, y, 'r--', linewidth=1)
ax.set_xlabel('Words')
ax.set_ylabel('Frequency')
ax.set_xlim(50, 160)
ax.set_ylim(0, 0.04)
ax.grid(True)
plt.show()
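For reference, here is a minimal sketch of what I think I actually need, assuming the goal is one bar per word with the bar height equal to that word's count. As far as I understand, ax.hist counts how often each value occurs in the data it is given, and x = np.arange(len(words)) contains every index exactly once, which would explain why all the bars come out at the same height. Since the counts are already computed, ax.bar may be the better fit. The helper names split_pairs and plot_word_frequencies, the top parameter, and the sample text are all made up for illustration and are not part of my real script.

```python
from collections import Counter


def split_pairs(pairs):
    """Split [(word, count), ...] into parallel lists of words and counts."""
    words = [word for word, _ in pairs]
    freqs = [count for _, count in pairs]
    return words, freqs


def plot_word_frequencies(pairs, top=20):
    """Draw one bar per word; each bar's height is that word's count."""
    # Imported here so split_pairs stays usable without matplotlib installed.
    import matplotlib.pyplot as plt

    words, freqs = split_pairs(pairs[:top])
    positions = list(range(len(words)))  # one x position per word

    fig, ax = plt.subplots()
    ax.bar(positions, freqs, facecolor='green', alpha=0.75)
    ax.set_xticks(positions)
    ax.set_xticklabels(words, rotation=90)  # label each bar with its word
    ax.set_xlabel('Words')
    ax.set_ylabel('Frequency')
    ax.grid(True)
    fig.tight_layout()
    return fig, ax


# Hypothetical sample text standing in for the real file contents.
counts = Counter("take take take live live hate".split())
total_words = counts.most_common(3)
```

With the real data this would be called as plot_word_frequencies(totalWords), followed by plt.show() (after importing matplotlib.pyplot as plt in the calling script). I left out the normal-curve overlay and the fixed set_xlim/set_ylim values from the demo, since they were tuned for that example's random data rather than for word counts.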