Generate thumbnail for arbitrary audio file

2024/9/8 10:01:46

I want to represent an audio file in an image with a maximum size of 180×180 pixels.

I want to generate this image so that it gives some representation of the audio file. Think of something like SoundCloud's waveform (an amplitude graph).

[Image: screenshot of SoundCloud's player]

I wonder if any of you have something for this. I have been searching around for a bit, mainly for "audio visualization" and "audio thumbnailing", but I have not found anything useful.

I first posted this to ux.stackexchange.com; this is my attempt to reach any programmers working on this.

Answer

You could break the audio up into chunks and measure the RMS (a measure of loudness) of each chunk. Let's say you want an image that is 180 pixels wide.

I'll use pydub, a lightweight wrapper I wrote around the standard library's wave module:

from pydub import AudioSegment
# first I'll open the audio file
sound = AudioSegment.from_mp3("some_song.mp3")
# break the sound into 180 even chunks (or however
# many pixels wide the image should be)
chunk_length = len(sound) / 180  # len(sound) is the duration in milliseconds
loudness_of_chunks = []
for i in range(180):
    start = i * chunk_length
    end = start + chunk_length
    chunk = sound[start:end]
    loudness_of_chunks.append(chunk.rms)

The for loop can be written as the following list comprehension; I just wanted it to be clear:

loudness_of_chunks = [
    sound[i * chunk_length : (i + 1) * chunk_length].rms
    for i in range(180)
]
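To make the RMS idea concrete without any audio library, here is a minimal sketch of the same per-chunk measurement on a plain list of sample values. The helper names `rms` and `chunk_rms` and the sample data are made up for illustration; this is not pydub's implementation.

```python
import math


def rms(samples):
    """Root mean square of a sequence of numeric samples (0.0 for empty input)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def chunk_rms(samples, n_chunks):
    """Split samples into n_chunks nearly equal pieces and return each piece's RMS."""
    size = len(samples) / n_chunks  # may be fractional; boundaries are truncated
    return [
        rms(samples[int(i * size):int((i + 1) * size)])
        for i in range(n_chunks)
    ]


# Hypothetical data: a quiet half followed by a loud half.
samples = [1, -1] * 4 + [3, -3] * 4
print(chunk_rms(samples, 2))  # [1.0, 3.0]
```

The louder half produces the larger value, which is what ends up as a taller bar in the thumbnail.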

Now the only thing left to do is scale the RMS values down to a 0-180 range (since you want the image to be 180px tall):

max_rms = max(loudness_of_chunks)
scaled_loudness = [(loudness / max_rms) * 180 for loudness in loudness_of_chunks]
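One case worth guarding against is a completely silent file, where the maximum RMS is 0 and the division fails. A small sketch of the scaling step with that guard, also rounding to whole pixels; the function name `scale_to_height` and the input values are hypothetical:

```python
def scale_to_height(values, height=180):
    """Scale non-negative values so that the largest one maps to `height` pixels."""
    peak = max(values, default=0)
    if peak <= 0:
        # Silent (or empty) input: every bar has zero height.
        return [0] * len(values)
    return [round(v / peak * height) for v in values]


print(scale_to_height([50, 100, 200], height=180))  # [45, 90, 180]
```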

I'll leave the drawing of the actual pixels to you; I'm not very experienced with PIL or ImageMagick :/
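For the drawing step, one option that needs no imaging library is to write a binary PGM (Netpbm grayscale) file, which tools such as ImageMagick or Pillow should be able to convert to PNG. The sketch below assumes the bar heights are already integers between 0 and the image height; `render_bars`, `write_pgm`, the tiny demo data, and the output path are all illustrative names, not part of pydub.

```python
import os
import tempfile


def render_bars(heights, height=180):
    """Return pixel rows (top row first): 0 = black bar, 255 = white background.

    Each entry in `heights` becomes a one-pixel-wide bar rising from the bottom.
    """
    rows = []
    for y in range(height):
        # Row y, counted from the top, lies inside a bar when that bar
        # reaches at least (height - y) pixels up from the bottom edge.
        rows.append(bytes(0 if h >= height - y else 255 for h in heights))
    return rows


def write_pgm(path, rows):
    """Write rows of gray values as a binary PGM (P5) image."""
    width = len(rows[0]) if rows else 0
    with open(path, "wb") as f:
        f.write(b"P5\n%d %d\n255\n" % (width, len(rows)))
        f.write(b"".join(rows))


# Tiny hypothetical example: a 3x3 image with bars of height 0, 1 and 3.
rows = render_bars([0, 1, 3], height=3)
out_path = os.path.join(tempfile.gettempdir(), "waveform.pgm")
write_pgm(out_path, rows)
```

For the real thumbnail you would pass the 180 scaled loudness values (rounded to integers) with `height=180`, giving a 180×180 image.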

https://en.xdnf.cn/q/72700.html
