From Hum to Algorithm: Music Recognition Through the Ages

by TrueTrackID

At some point in your life, you've had a song stuck in your head and absolutely no idea what it was. Maybe you caught a few seconds of it on TV, or heard it drifting out of a shop, and then it was gone — just a fragment of a melody you couldn't name. That feeling is ancient. And people have been trying to solve it for a very long time.

the pre-recording era: hum it and hope

Before recorded music existed, identifying a song meant tracking down its sheet music. If you could hum the melody to a musician, they might recognise it and write down the title. Some libraries and catalogues indexed works by their opening melodic phrases, a system sometimes called a "melodic index." It worked, sort of, as long as you could carry a tune and found the right person to ask.

This wasn't really music recognition in any modern sense. It was just knowledge, passed between people who cared about music enough to memorise a lot of it.

the jukebox era: recognition by catalogue

The phonograph and the jukebox turned music into a physical object: a disc you could hold, label, and catalogue. Recognition became a matter of matching a sound to a catalogue entry. Radio DJs became unofficial music encyclopedias. Record stores employed staff who could name almost anything you could hum.

One of the first automated approaches arrived in the 1990s with CDDB, a database where software could look up a CD's album and track titles from the disc's timing information: how many tracks it had and where each one started. It wasn't identifying the music itself, just the layout of the disc. But it was one of the first times a machine could tell you what you were listening to without a human in the loop.
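
To make the "timing information" idea concrete, here is a minimal sketch of the commonly documented CDDB-style disc ID calculation. The function names and the sample offsets are made up for illustration, and real clients also send the full list of track offsets to the server, so treat this as a sketch rather than a complete lookup.

```python
FRAMES_PER_SECOND = 75  # CD audio addresses time in 1/75-second frames


def digit_sum(n):
    """Sum the decimal digits of n (e.g. 598 -> 5 + 9 + 8)."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def disc_id(track_offsets, leadout_offset):
    """Build a CDDB-style 8-hex-digit ID from track start offsets in frames.

    Layout assumed here: checksum byte, disc length in seconds (16 bits),
    then the track count (8 bits).
    """
    starts = [offset // FRAMES_PER_SECOND for offset in track_offsets]
    checksum = sum(digit_sum(s) for s in starts) % 255
    length = leadout_offset // FRAMES_PER_SECOND - starts[0]
    return f"{(checksum << 24) | (length << 8) | len(track_offsets):08x}"


# Hypothetical three-track disc; offsets are invented, not from a real CD.
print(disc_id([150, 15000, 30000], 45000))  # 08025603
```

Nothing in that calculation touches the audio, so any disc with the same track layout produces the same ID. That is why this counts as a metadata lookup rather than recognition of the music itself.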

the Shazam breakthrough

Shazam was founded in 1999, and its chief scientist, Avery Wang, developed the audio fingerprinting algorithm that made the service possible. The core idea (converting audio into a spectrogram, extracting stable peaks, hashing pairs of peaks) was robust enough to survive the noisy, low-quality audio of a phone call, and compact enough that a server could search millions of tracks quickly.
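
The three steps named above can be sketched in a few dozen lines of plain Python. This is a toy under loud assumptions, not Shazam's code: every function name is hypothetical, the spectrogram uses a slow naive DFT on tiny non-overlapping frames, and only the single loudest bin per frame is kept as a peak (real systems keep many local maxima). The matching step at the end, voting on time offsets, goes one step beyond the summary above and is included only so the sketch runs end to end.

```python
import cmath
import math
from collections import Counter

FRAME = 64  # samples per frame (assumed; real systems use far larger windows)


def spectrogram(samples, frame=FRAME):
    """Magnitude spectrogram from a naive DFT over non-overlapping frames."""
    spec = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        spec.append([
            abs(sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                    for n, x in enumerate(chunk)))
            for k in range(frame // 2)
        ])
    return spec


def peaks(spec):
    """One peak per frame: (frame index, loudest frequency bin)."""
    return [(t, max(range(len(mags)), key=mags.__getitem__))
            for t, mags in enumerate(spec)]


def pair_hashes(peak_list, fan_out=3):
    """Pair each anchor peak with the next `fan_out` peaks.

    Each hash is (anchor bin, target bin, frame gap), stored alongside the
    anchor's frame index so matches can be lined up in time later.
    """
    out = []
    for i, (t1, f1) in enumerate(peak_list):
        for t2, f2 in peak_list[i + 1:i + 1 + fan_out]:
            out.append(((f1, f2, t2 - t1), t1))
    return out


def fingerprint(samples):
    return pair_hashes(peaks(spectrogram(samples)))


def best_offset(track_fp, clip_fp):
    """Vote on the frame offset where the clip's hashes line up with the track."""
    index = {}
    for h, t in track_fp:
        index.setdefault(h, []).append(t)
    votes = Counter(t_track - t_clip
                    for h, t_clip in clip_fp
                    for t_track in index.get(h, []))
    return votes.most_common(1)[0] if votes else None


def tone_sequence(bins, frame=FRAME):
    """Synthetic test audio: one pure tone per frame at the given bins."""
    return [math.sin(2 * math.pi * b * n / frame)
            for b in bins for n in range(frame)]


track = tone_sequence([5, 9, 14, 7, 11, 3, 20, 6])
clip = track[2 * FRAME:6 * FRAME]  # frames 2 to 5 of the track
print(best_offset(fingerprint(track), fingerprint(clip)))  # (2, 6)
```

The printed pair says the clip lines up with the track two frames in, backed by six agreeing hashes. Hashing pairs of peaks rather than single peaks is the design choice that matters: a lone frequency bin is shared by countless songs, while two bins plus the gap between them is far more distinctive.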

When Shazam launched as a phone service in 2002, you'd dial a short number, hold your phone up to the speaker for about 30 seconds, and then receive a text message with the song name. It felt miraculous. The underlying database at launch had around 3 million songs. By the time the smartphone app launched in 2008, that number had grown to tens of millions.

the smartphone decade

Once Shazam was on smartphones, music recognition went mainstream. Competitors appeared. SoundHound took a different approach, building a system that could match hummed or sung melodies rather than just recordings. Google built on-device recognition into its Pixel phones as the Now Playing feature, which identifies music playing nearby without being asked. Apple eventually acquired Shazam in 2018 for a reported $400 million.

The problem shifted from "can we do this at all" to "how do we handle billions of queries." Modern music recognition systems process hundreds of millions of searches per month. The databases now contain hundreds of millions of tracks. The algorithms have been refined to work on clips as short as five seconds even in very noisy environments.

where it stands today

Music recognition is now so good and so fast that it's almost invisible. It's baked into operating systems, smart speakers, and streaming platforms. Shazam will tell you what's playing in a restaurant. Google can identify a song from a hummed tune. TikTok automatically labels the sound used in a video.

The interesting frontier now is live stream recognition — identifying what's playing in real-time video streams, where the audio is compressed, subject to overlapping voices and sound effects, and constantly changing. That's exactly what TrueTrackID was built to handle.

The ancient problem of "what song is this" has never had better answers. But it's also never been more complicated, with more music being created and distributed every day than ever before in history.