Tremendous Value in Big Data Analytics and AI, Beatles Fans Say

By: Lee Rickwood

September 12, 2018

Statistical analysts, data scientists, and fans of the best music in the world are singing from the same hymn book, praising the tremendous value of big data analytics and AI, saying the technology helped solve a major mystery and pressing social issue: who wrote one of the best Beatles songs ever?

While the authorship of most of The Beatles’ catalog is well known and well established, some questions and even disagreements remain about who wrote what. Among the dozens of hit Lennon-McCartney songs, just which Beatle composed the music for “In My Life” is disputed.

Image: The Beatles pictured on the “In My Life” record sleeve. Fair use image: record cover.

It’s no trivial matter: the track, from the 1965 album Rubber Soul, comes in at #23 on the Rolling Stone list of The 500 Greatest Songs of All Time.

Mojo magazine calls it the greatest song ever!

Not surprisingly, there is some disagreement over authorship. Lennon has been quoted as saying the song is essentially his, and lyrically there is little doubt. But McCartney says he composed much of the music, and he has acknowledged a disagreement with Lennon about the composition.

Ironically, the song itself poignantly addresses issues of memory, clarity, and accuracy:

“There are places I’ll remember
All my life, though some have changed
Some forever, not for better …”

The decades-old musical disagreement remained unresolved until a recent presentation at the 2018 American Statistical Association conference, one of the largest gatherings of statisticians and data scientists in the world. It was held in Vancouver, B.C., and wrapped up about a month ago.

Three of those data scientists (Mark Glickman, a Harvard statistician; Harvard professor of engineering Ryan Song; and Dalhousie University mathematician Jason Brown) presented a paper called “Assessing Authorship of Beatles Songs from Musical Content: Bayesian Classification Modeling from Bags-of-Words Representations”.

Image: Statistical Association conference poster.

Not quite as lyrical as “In My Life”, but almost as revealing: the trio used data analysis techniques to try to figure out what was going on in the song.

It’s called stylometry: the use of statistical techniques to determine authorship (hello, anonymous Op-Ed writers everywhere). Computer horsepower helps analysts track not only unique or unusual word and theme choices, but also habitual, repeated ones that can therefore be matched to a particular source.

A certain amount of machine learning and AI (artificial intelligence) was involved. Computers were trained to recognize certain elements of any Beatles tune: more than 70 songs or portions thereof were deconstructed into five main categories or representations, and then into some 150 individual characteristics or components.

This descriptive dataset of musical features includes commonly appearing individual musical chords (and appearances of unusual or uncommon chords); frequency of chord sequences and transitions; elements of melody and the nature of counterpoint in timing and rhythm, as well as other details. All can be used to distinguish different musical styles and composition traits.
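To make the "bags-of-words" idea concrete, here is a minimal sketch of how chords and chord transitions in one song could be turned into simple counts. The function name `bag_of_features` and the chord progression are assumptions made up for illustration; this is not the authors' code, and the progression is not a transcription of any Beatles song.

```python
from collections import Counter


def bag_of_features(chords):
    """Count single chords and chord-to-chord transitions in one song.

    Order beyond adjacent pairs is ignored, which is what makes this a
    "bag" representation.
    """
    chord_counts = Counter(chords)
    # Pair each chord with the one that follows it.
    transition_counts = Counter(zip(chords, chords[1:]))
    return chord_counts, transition_counts


# Hypothetical progression, invented for this example.
progression = ["A", "E", "F#m", "A7", "D", "Dm", "A", "E"]

chord_counts, transition_counts = bag_of_features(progression)
print(chord_counts.most_common(3))
print(transition_counts.most_common(3))
```

Counts like these, collected for every song of known authorship, are the kind of input a classifier can learn from.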

“Consider the Lennon song ‘Help!’,” Glickman said in an ASA release. “It basically goes, ‘When I was younger, so much younger than today,’ where the pitch doesn’t change very much. It stays at the same note repeatedly, and only changes in short steps. Whereas with Paul McCartney, you take a song like ‘Michelle,’ and it goes, ‘Michelle, ma belle. Sont les mots qui vont très bien ensemble.’ In terms of pitch, it’s all over the place.”
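The contrast Glickman describes can be put into numbers. The sketch below summarizes how far a melody moves from note to note, using MIDI pitch numbers where one unit is one semitone. The function `interval_profile` and both note lists are assumptions invented for illustration; they are not transcriptions of “Help!” or “Michelle”.

```python
def interval_profile(midi_pitches):
    """Summarize how much a melody moves between consecutive notes."""
    # Absolute distance in semitones between each note and the next.
    steps = [abs(b - a) for a, b in zip(midi_pitches, midi_pitches[1:])]
    return {
        "mean_step": sum(steps) / len(steps),
        "repeated_share": steps.count(0) / len(steps),
        "range": max(midi_pitches) - min(midi_pitches),
    }


# Invented melodies: one hovers on a single note, one leaps around.
static_line = [61, 61, 61, 61, 61, 61, 63, 61]
wide_line = [65, 72, 70, 68, 65, 60, 63, 67]

static_profile = interval_profile(static_line)
wide_profile = interval_profile(wide_line)
print(static_profile)
print(wide_profile)
```

A melody that stays on one note shows a high share of repeated notes and a narrow range, while a melody that is "all over the place" shows larger average steps and a wider range.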

By plugging lots of data into an analytical model that was built from the ground up and trained on Beatles music (now there’s a tough assignment), the paper’s authors felt confident they could determine who wrote what, citing a sampled classification accuracy of 80 per cent for songs with known authorship.

Their verdict: “In My Life” is a John Lennon song. The statistical probability that McCartney wrote it is less than two per cent (.018, in fact). Sir Paul’s quoted recollections on the matter are, well, less than relevant now.
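For readers curious how a model arrives at a probability like that, here is a rough sketch using a plain Bernoulli naive Bayes classifier. This is a simple stand-in chosen for illustration, not the model from the paper. The feature names, the six toy "songs" and the functions `train` and `posterior` are all assumptions invented for this example.

```python
import math


def train(songs, features, alpha=1.0):
    """Estimate P(feature present | author) with Laplace smoothing."""
    model = {}
    for author in {a for a, _ in songs}:
        rows = [present for a, present in songs if a == author]
        model[author] = {
            "prior": len(rows) / len(songs),
            "p": {
                k: (sum(k in r for r in rows) + alpha) / (len(rows) + 2 * alpha)
                for k in features
            },
        }
    return model


def posterior(model, present, features):
    """Return P(author | observed features) for every author."""
    log_scores = {}
    for author, m in model.items():
        score = math.log(m["prior"])
        for k in features:
            p = m["p"][k]
            score += math.log(p if k in present else 1 - p)
        log_scores[author] = score
    # Normalize in log space so the probabilities sum to one.
    top = max(log_scores.values())
    weights = {a: math.exp(s - top) for a, s in log_scores.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Toy training set: (known author, set of features present). Invented data.
features = ["small_steps", "wide_leaps", "unusual_chord"]
songs = [
    ("Lennon", {"small_steps"}),
    ("Lennon", {"small_steps", "unusual_chord"}),
    ("Lennon", {"small_steps"}),
    ("McCartney", {"wide_leaps"}),
    ("McCartney", {"wide_leaps", "unusual_chord"}),
    ("McCartney", {"wide_leaps", "small_steps"}),
]

model = train(songs, features)
result = posterior(model, {"small_steps"}, features)
print(result)
```

The disputed song is scored against what the model learned from songs of known authorship, and the output is a probability for each candidate author rather than a flat yes or no.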

(Personally, I remember the song as one of the prettiest by The Beatles: to me, the melody does more than not “change very much”, and I think the harpsichord solo by George Martin picks up on some clever chord changes in the middle eight. That, and the fact I have seen The Beatles perform live more times than any AI algorithm ever, fuels my recollections.)

But as John wrote in the song, “memories lose their meaning”.

Until big data comes along.

Image: The Beatles pictured in a black-and-white photo, circa 1965.

Certain individual aspects of their songwriting style are well known and much appreciated. Lennon’s melody lines, for example, may not vary much: think of “Help!”, “She Said She Said” or “Strawberry Fields Forever”. McCartney’s tunes are generally more melodic: “Michelle”, “Here, There and Everywhere” and “Let It Be”. That’s why big data analysis can help determine who wrote what. Creative Commons image.



