This week I’m sharing my current POV on the third wave of audio. The first wave was RSS feeds + people in their garages + iPods. The second wave was powered by audio streaming, driven by improvements in cell coverage and battery life, the connected car, and to some degree smart speakers. So what’s the third wave? It’s a combination of what’s missing today in the sound ecosystem, and what’s coming in terms of technology improvements and adoption:
- Discovery - Now there are a lot of podcasts. How do people hear about new ones and decide what to listen to? People used to browse the iPhone app store for new apps, but now they have the apps they want, so it’s harder to get their attention for your app. The same thing is happening for podcasts, so platforms and new companies will try to figure out how to help new podcasts get discovered.
- Monetization - Ads are one way to monetize, and some podcasts use “listener support.” More interesting, though, is that lots of other audio products have been able to monetize by reframing themselves not as podcasts but as utilities. Think about what Headspace did…they effectively launched a library of podcasts, called it a meditation app, and charged $15/mo. What are the other ways audio products will be able to monetize beyond “listener support”? I wrote a bit about this here and am looking forward to seeing which monetization strategies listeners end up gravitating toward.
- AirPods/earbuds first - If you haven’t tried AirPods yet, there are two interesting things to note. One is a feature: they can act like an Alexa in your ear, meaning you can talk back and forth with them. The other is a consumer behavior: people seem to leave them in by default, rather than taking them out immediately after finishing listening or ending a call. What does that mean for the types of audio apps people may be interested in using?
- Synthetic Voice - I wrote a while back about how I was listening to written articles by using the Siri voice (my original post here). Anyone who’s tried this knows robot voices aren’t pleasant to listen to for long periods of time (Amazon doesn’t even allow developers to have Alexa speak more than 90 seconds of content, presumably for this reason). But synthetic voice is getting better. Google Duplex is pretty good, and some other startups are using advances in transfer learning to make the tone more emphatic and easier to listen to (e.g., Resemble.ai, a Betaworks investment).
So here’s what I’m reading this week and how I’m applying it to the framework above:
> Spotify advertisers can now target listeners by what podcasts they stream.
Why is this interesting? On the surface, it’s interesting from a monetization perspective: theoretically it means more ad revenue will be driven to podcasts in a more targeted way, which hopefully trickles down from Spotify to the podcasters themselves, e.g., by Spotify paying podcasters to create new content since it can better monetize them through targeted ads. More interesting, however, is that it’s a potential way to address discovery.
Until now, podcasters promoted their podcasts by being guests on other podcasts whose audiences might be interested. That has been a reasonably successful strategy. A problem is that podcast episodes tend to get listened to and then never surface again, so if you did a bunch of guest appearances in January, more episodes of those podcasts have come out by June, and you’re probably not seeing as much conversion in June as you did in January. If I were a podcaster (am I?), I’d use this ad inventory to promote my podcast as a complement to appearing on others’ shows. It means podcasters can have that guest-appearance effect constantly, but also presumably measure which podcasts convert best, promote relevant episodes on an ongoing basis, and keep changing the messaging, versus being a guest on a podcast once and hoping it goes well.
> Stanford Team Aims at Alexa and Siri With a Privacy-Minded Alternative - The New York Times
On one hand, I don’t think of data privacy as part of the future of audio because, if anything, there’s not enough data in podcasting right now (podcasters rarely can even tell how many subscribers they have). When it comes to virtual assistants, however, the privacy problem is real. I did a section recently on data privacy and smart speakers (link to that issue here). I like the angle of this article for two reasons. First, it doesn’t focus on whether assistants should or shouldn’t have more data, but rather on the fact that government scrutiny simply hasn’t gotten to them yet because the market is still small. Second, the article is about a company trying to train assistants in a way that is privacy-first.
One of the biggest privacy issues people have right now is that Alexa needs to take your voice and run it by a human being who can help tag the data so that it can get smarter for everyone. That doesn’t feel great, but also (per my prior writing), seems necessary in order to train the AI to listen only when actually summoned (versus to *think* it hears “Alexa” when you really said “Schmashmeksa” or something). The article is about a new service, called Almond. Here’s a snippet:
> The system from Dr. Lam’s group is called Almond. In a recent paper, they argued for an approach in which virtual assistant software is decentralized and connected by programming standards that will make it possible for consumers to choose where their information is stored and how it is shared.
I took a look at the site, and it’s still unclear to me how they train the machine without using people’s data. My best guess right now is that they do use your data, but you control whether they use it. And the *real* value is that since it’s open source, no single company will have a monopoly on it, so you’ll have more control over how (or even whether) your data is monetized. It’s an interesting approach, but I’ll need to learn more. They do have a pretty unique video about life with Almond, though: