Where does "data-driven storytelling" in music go wrong?

By Cherie Hu • Issue #33
Happy Wednesday! This week, the U.S. celebrated National Radio Day, which is dedicated to celebrating the work of local and community radio stations. Yet Nicki Minaj’s “Queen Radio” show on Apple Music’s Beats 1 stole most of the radio-related headlines. Does this mean Beats 1 is the new “community radio”? *scratches head*
I’m headed to the DIY Musician Conference in Nashville tomorrow, where I’ll be leading an hour-long workshop on social media strategy for artists. Let me know if you’ll be in town and want to grab coffee and/or food!
On the writing side, I’ve made it a priority to expand my portfolio across a wider range of publications, in addition to my regular, core work with Billboard and Forbes. Last week, I published my first-ever articles for Music Business Worldwide and Variety, and am excited to share them with you all—scroll down to the “My writing” section below to check them out!
• • •
Today’s newsletter aims to unpack one of the buzziest phrases in the digital age: data-driven storytelling.
Why exactly is that concept so buzzy? We’ve loved stories for as long as humanity has existed, but there are so many of them flooding the Internet today that we often find ourselves at a loss for where to start as readers, and how to convince people to read our stories as authors. As data and algorithms increasingly govern our daily lives, they also increasingly act as a credibility filter on our own narratives.
But sometimes we end up leaning on data filters in the wrong ways. One useful framework for understanding the drawbacks of data-driven storytelling comes from the academic paper “All forest, no trees? Data journalism and the construction of abstract categories,” written by the University of Alabama’s Wilson Lowrey and Jue Hou.
The researchers analyzed nearly 200 data journalism projects published between 2011 and 2016 to study where exactly those projects got their data, how they incorporated data into their arguments, and to what extent they publicly discussed the limitations of these datasets.
The study found that 40% of these data journalism projects involved “abstract constructs” and quantitative indices that the authors either invented themselves or adopted from previous authors in order to help explain complex issues. An increased use of abstract constructs also coincided with a decline in anecdotal, qualitative reporting, with the number of data-journalism projects incorporating qualitative methods decreasing from 44% of projects in 2011 to just 26% of projects in 2016.
More importantly, just one-third of the projects explained the limitations of the datasets they used. As Lowrey and Hou argued, while statistical categories and metrics may be appealing and digestible, they are ultimately “social, political, and economic constructs and not objective reflections of the world,” and should be treated with greater scrutiny accordingly in discussions of findings and conclusions.
This is an invaluable lesson that I think anyone in music can learn, especially as our industry culture becomes more tech-, data- and platform-driven: quantification does not always mean the elimination of blinders. While analyzing metrics over time can lead to more clarity in certain situations, over-reliance on data as the only source of authority can often make you more blind and more prone to making faulty assumptions and decisions, not less.
Drawing from the aforementioned paper, there are three questions that people can ask themselves when they’re either consuming or crafting a data-driven story:
  1. Are the authors building any “abstract” or self-made data constructs or indices in order to drive their argument forward? Would their argument become invalid if they used a different construct?
  2. How much anecdotal or qualitative reporting are the authors including to imbue the data with more context, particularly around social, political and cultural issues?
  3. Are the authors realistic and open about the limitations of their datasets? If not, why not? What might that reveal about the gaps in their arguments?
Now, I’ll dive into two examples from the music industry—one on the aggregate industry level, and another on the individual artist level—to understand where this framework can come in handy.
• • •
The example from the industry side is Citigroup’s sprawling report about the future of the music business, titled “Putting the Band Back Together.” Clocking in at 88 pages, the report is undoubtedly one of the most comprehensive introductions to a complex industry for an outsider audience that I’ve seen over the past few years. What’s more, Citi’s approach seems as data-driven as you can get: it pulls financial figures from several SEC filings and IFPI/RIAA industry reports, plus granular contract- and royalty-level data from reputable books like All You Need to Know About the Music Business.
But the opening pages of the report present a controversial claim, gleaned from both self-made and adopted data constructs, that the Citi authors leverage to drive their argument forward: artists captured just 12% of the music industry’s revenues in 2017, with significant “value leakage” going to labels, concert promoters, digital distributors, internet radio services and other intermediaries. (Interestingly, the authors also concluded that more vertical and horizontal integration, not less, would have to take place in the music industry in order to increase artists’ share of the overall pie beyond that 12%.)
Most media outlets ran with the 12% figure as an authoritative, “data-driven” statement on the current imbalance in the music business that must be fixed. Almost none of these outlets clarified the report’s methodology or audience, nor did they investigate any limitations of the analysts’ data sources.
Again, Citi’s report is arguably the first of its kind for an outsider audience in terms of its level of detail and amalgamation of multiple sectors within the music industry, which deserves respect in its own right. But going back to the data-journalism framework I outlined above, the following aspects of the report should give readers pause:
  1. Reliance on already-made data constructs, at the expense of qualitative reporting and cross-checking. Aside from the few qualitative “Expert Interviews” featured at the end of the report, the authors didn’t consult directly with any senior execs or artists in the music industry to give more context to their purely quantitative sources—instead resorting to financial documents alone as authoritative “enough.” This led to several inaccuracies around royalty rates and record contract terms that may have falsely skewed the authors’ eventual conclusions (e.g. many record contracts today are actually much more flexible and friendlier for artists than the report makes them out to be, in terms of giving artists a larger share of recording and ancillary revenue). What’s more, the 12% number should not be confused with an individual artist’s profit margin, which is often much higher.
  2. Limitations tied to audience: The primary audience of the report is not current or aspiring music-industry professionals, but rather Citi’s client base of institutional investors, who are arguably more interested in high-level financial documents, stock performance and M&A activity than in industry-specific jargon and purely anecdotal information. This point is directly tied to #1 above, in that intended audience often has a significant impact on methodology, for better or for worse.
I invite you to read my full article for Billboard outlining the industry’s criticisms of the report, as well as responses from the Citi analysts themselves. I would also love to hear what you thought about the report and whether or not you agree with its conclusions.
• • •
Now for the second example, on the artist level.
Going back to the idea of data as a credibility filter, artists are increasingly relying on streaming and social-media metrics to vouch for their own credibility in the face of agents, bookers, A&R execs, press outlets, fans and others. If you’ve pitched for a festival slot, playlist placement, late-night TV performance or brand partnership recently, you’ve likely engaged in “data-driven storytelling” as a promotional tool—pointing to quantifiable milestones in your growth, in an effort to persuade potential business partners to get on your moving bus.
I want to focus on playlists, because playlist placement has become one of the most influential “plot devices” in artists’ data-driven stories about themselves. There’s significant income at stake: the European Commission recently found that placement on top streaming playlists, such as Spotify’s RapCaviar and Today’s Top Hits, can lead to five or even six figures in additional revenue for an artist.
Usually, “playlist-as-plot-device” comes up in the form of an artist or manager talking about getting a track placed on a playlist with so-and-so many followers that generated so-and-so many streams in a given amount of time, which then makes that artist “worthy” of additional promotion or support via other channels.
Going back to our data-journalism framework, constructs and metrics such as follower/stream count are far from “abstract” or “self-made”: they are concrete and regularly updated, in accordance with consumer activity. But stripped of context, they are also extremely limited.
For one, follower count is a wildly incomplete and insufficient metric for understanding actual engagement on streaming playlists. I wrote an article for Billboard last month about how some of the alleged “top” Spotify playlists in terms of follower count alone, such as EDM-focused mint, actually generate less engagement for the average track than other playlists with only around 20% to 30% of the followers (e.g. Pop Rising and Young, Wild & Free).
Yet, many artists and managers view placement on playlists with larger followings as the most important “plot device” in their promotional story, without realizing that follower numbers might leave crucial information about more meaningful interactions out of the picture.
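To see how a follower ranking and an engagement ranking can disagree, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two playlists are made up, the numbers are invented, and "average streams per track divided by followers" is just one possible way to approximate engagement, not the metric used in the Billboard article.

```python
# Hypothetical playlists and figures, invented purely for illustration.
playlists = {
    "Big Playlist": {"followers": 4_000_000, "avg_streams_per_track": 200_000},
    "Small Playlist": {"followers": 1_000_000, "avg_streams_per_track": 300_000},
}


def streams_per_follower(stats: dict) -> float:
    """Rough engagement proxy (an assumption): average track streams per follower."""
    return stats["avg_streams_per_track"] / stats["followers"]


# Rank the same playlists two ways: by raw following, then by the proxy.
ranked_by_followers = sorted(
    playlists, key=lambda name: playlists[name]["followers"], reverse=True
)
ranked_by_engagement = sorted(
    playlists, key=lambda name: streams_per_follower(playlists[name]), reverse=True
)

print("By followers: ", ranked_by_followers)
print("By engagement:", ranked_by_engagement)
```

In this toy setup the smaller playlist has only 25% of the followers but comes out on top once streams are measured against audience size, which is the kind of information a follower count alone leaves out.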
Stream counts are also far from authoritative as standalone metrics. I was chatting with someone at a major label this week, and he told me that an artist gaining a million streams in a day can be really terrible, and an artist gaining just 15,000 streams in a day can be the best thing ever—it all depends on context. How much $$ did you have to shell out to get to that number of streams? What is your true return on marketing spend? Is this level of activity among your listeners both sustainable and repeatable?
Vying for market share, major labels understandably prioritize metrics around volume and reach, and lean on those metrics to evaluate performance. Yet you can be a major-label artist telling a potential business partner that you got “five million streams on your hit single in a week,” which sounds amazing—except you had to spend three times as much money as your competitor to get there, and your overall stream:listener ratio was only around 2:1, meaning the average listener only stuck with your song twice before moving on.
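To make that arithmetic concrete, here is a minimal Python sketch of two context metrics: stream-to-listener ratio and marketing cost per stream. The five million streams, the 2:1 ratio and the threefold spend echo the example above. The dollar amounts and the competitor's listener count are hypothetical, chosen only so the comparison can be computed.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """One week of a single's performance. All values used below are hypothetical."""

    name: str
    streams: int
    unique_listeners: int
    marketing_spend: float  # dollars, assumed for illustration

    def streams_per_listener(self) -> float:
        # How many times the average listener played the track.
        return self.streams / self.unique_listeners

    def cost_per_stream(self) -> float:
        # Marketing dollars spent for each stream generated.
        return self.marketing_spend / self.streams


# Same headline number (5M streams), very different stories underneath.
yours = Campaign("your single", 5_000_000, 2_500_000, 150_000.0)
rival = Campaign("competitor", 5_000_000, 1_000_000, 50_000.0)

for c in (yours, rival):
    print(
        f"{c.name}: {c.streams_per_listener():.1f} streams per listener, "
        f"${c.cost_per_stream():.3f} per stream"
    )
```

Both campaigns can truthfully claim "five million streams in a week," yet under these assumed numbers one costs three times as much per stream and holds each listener for fewer plays.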
In these types of cases, it helps to be realistic about the limitations behind your data and your bottlenecks to deeper understanding, and the resources, partnerships and infrastructures you need to put in place to resolve those bottlenecks and write a better next chapter in your data-driven story. Otherwise, you could be in danger of failing to deliver on the promises you make on the basis of surface-level metrics, which I think hurts everyone involved.
• • •
A last note to synthesize all of these thoughts: by nature and by necessity, data storytellers are data gatekeepers.
Gatekeepers aren’t all bad; they’re filters for helping us navigate an ever-noisier world. But for a data-driven story, there are specific motivations (benevolent or otherwise) behind which data sources get to pass through the gate, and which ones get rejected or overlooked. What’s more, as the historical music industry knows all too well, gatekeepers often fall short on transparency around these motivations.
Demanding more of this transparency around motivation and audience, realizing that even the most comprehensive data-driven stories have some sort of built-in gatekeeping mechanisms, and being realistic about the limitations of the data we take for granted—I think all of these mindsets can lead to a healthier future for all industries, including but not limited to music.
Are there any other examples that come to mind for you of where data-driven storytelling in music is done wrong, or right? I’d love to hear from you—simply reply to this email and let’s chat!

My writing
What happens when artists and record labels build and buy their own media companies?
CONFERENCES: It’s SXSW PanelPicker season! I’m part of two SXSW panel proposals for 2019, both about music and tech: The Future of (meta)Data and Licensing That Works: A 21st-Century Framework. Would greatly appreciate it if you gave both panels an upvote, simply with one click of a button—the voting portal is open until August 31. (Special thanks to Music Ally for including the licensing panel in their roundup of SXSW proposals to watch!)
MEDIA: I had the pleasure of being a guest on CNBC’s Fortt Knox again last week, chatting with anchor Jon Fortt about the rise of streaming/telco deals and their implications for the music industry’s future growth. Here’s a link to the video on YouTube—I start talking around the 20-minute mark.
I was also invited as a guest on the Music Biz Weekly podcast with Michael Brandvold and Jay Gilbert, where we talked about the importance of transparency and creating a level playing field for artists when it comes to digital and tech literacy. You can watch / listen to the episode here!
Good reads
Listening for Silence With the Headphones Off
Obligatory potato
Droughts and heatwaves have shriveled up around one-third of the bintje potato crop in Belgium, the primary variety used to make local frites (frieten). Prices have shot up by over 150% as a result, hurting local frite stands (friteries) and raising fears about turning away customers at the onset of a crucial season. Says local friterie association head Bernard Lefèvre: “Frites are essential. It is vital. It is part of our culture. It’s more than a product—it’s a symbol of Belgium.” Stay strong…
Cherie Hu

