I have data from a control and treatment group. Is matplotlib able to create a bar chart where the bar height is the mean of each group overlaid with the individual data points from that group? I'd like to visualize the spread of the actual data points, similar to what is displayed here.
I've thought about using a combination of boxplots and scatter, but my attempts have not succeeded.
Here is a solution doing exactly what you mention: overlay a bar graph with a scatter plot.
Of course, you can tweak the plot further: title, axis labels, colors, bar width, marker shape of the scatter plot, and so on.
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(123)

w = 0.8     # bar width
x = [1, 2]  # x-coordinates of your bars
colors = [(0, 0, 1, 1), (1, 0, 0, 1)]  # corresponding colors
y = [np.random.random(30) * 2 + 5,     # data series
     np.random.random(10) * 3 + 8]

fig, ax = plt.subplots()
ax.bar(x,
       height=[np.mean(yi) for yi in y],
       yerr=[np.std(yi) for yi in y],  # error bars
       capsize=12,                     # error bar cap width in points
       width=w,                        # bar width
       tick_label=["control", "test"],
       color=(0, 0, 0, 0),             # face color transparent
       edgecolor=colors,
       # ecolor=colors,  # error bar colors; setting this raises an error for whatever reason.
       )

for i in range(len(x)):
    # distribute scatter randomly across whole width of bar
    ax.scatter(x[i] + np.random.random(y[i].size) * w - w / 2, y[i], color=colors[i])

plt.show()
It will yield this graph
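If you need this for more than one figure, the same idea can be wrapped in a small helper. The following is only a sketch: the function name `bar_with_points`, its parameters, the `"x"` marker, and the title and axis label are my own choices for illustration, not anything matplotlib provides. It also leaves out the error bars and uses the Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np


def bar_with_points(ax, groups, labels, colors, width=0.8, seed=0):
    """Hypothetical helper: one mean-height bar per group plus jittered points."""
    rng = np.random.default_rng(seed)
    x = np.arange(1, len(groups) + 1)
    means = [np.mean(g) for g in groups]
    bars = ax.bar(x, means, width=width, tick_label=labels,
                  color=(0, 0, 0, 0),  # transparent face so the points stay visible
                  edgecolor=colors)
    for xi, g, c in zip(x, groups, colors):
        # spread the points uniformly across the width of the bar
        jitter = rng.uniform(-width / 2, width / 2, size=len(g))
        ax.scatter(xi + jitter, g, color=c, marker="x")
    return bars, means


# same kind of made-up data as in the example above
data_rng = np.random.default_rng(123)
groups = [data_rng.random(30) * 2 + 5, data_rng.random(10) * 3 + 8]

fig, ax = plt.subplots()
bars, means = bar_with_points(ax, groups, ["control", "test"], ["blue", "red"])
ax.set_title("Control vs. test")  # illustrative title and label
ax.set_ylabel("measured value")
```

Call `fig.savefig(...)` to write the figure to a file, or drop the `matplotlib.use("Agg")` line and finish with `plt.show()` as in the original snippet.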