Clubhouse. Twitter Spaces. Mark Cuban’s Fireside. What is happening in social audio now, with Jeremiah Owyang

Social audio is having a moment. A big moment.

We’re seeing an explosion of players in the space: up to 30 at analyst and thought leader Jeremiah Owyang’s last count, and soon to be over 100. Owyang is an analyst at Kaleido Insights, and shared his data on the emerging space on his personal site, Web Strategist.

And that’s just the beginning. We’ll see an explosion of hundreds of supporting and ecosystem apps in social audio shortly, says Owyang.

I caught up with Jeremiah recently on TechFirst to chat about social audio: why it’s hot, what’s happening, what’s driving this trend, who the key players are, why it’s here to stay, and what innovation this new sector will give birth to.

Scroll down for video, audio, and a full transcript.

Click here for the Forbes story based on this interview …

Watch: social audio, with Jeremiah Owyang

(Subscribe to my YouTube channel so you’ll get notified when I go live with future guests, or see the videos later.)

Read: what is happening in social audio right now?

(This transcript has been lightly edited for length and clarity.) 

John Koetsier:  Welcome, Jeremiah! 

Jeremiah Owyang: Hey, John. So good to see you, at least from a distance. 

John Koetsier: It is so good to see you too. It’s been a long time. It’s been over a year since we’ve seen each other in person. I am so pumped to chat with you because audio is crazy hot right now. And you are kind of— 

Jeremiah Owyang: But we’re doing video. 

John Koetsier: I know, we’re doing video too but [laughter] … we’re talking about audio and it’s crazy hot right now, and you’re kind of really on the cusp of what’s happening there. You’ve got your finger on the pulse. You’ve just released a major report as well. And I guess I’ll start here: what on earth is going on with audio right now?

Jeremiah Owyang: Well, we are all segmented in our homes and that has been the main driver here. You know, we are really just trying to reconnect with each other.

And what we found is during the quarantine, is that text messaging just doesn’t give us the emotion and the nuance that the human condition requires, especially during isolation. On the flip side — aside from this amazing show — Zoom calls and video shows are just too taxing on people.

Like right now, I’m staring at a camera because I want to look good to everybody who’s watching, but that can be straining over time to always just look into the camera, because your face is actually a few inches over. And then you have to make sure do you look good even though you feel horrible inside for all the reasons that we know. Is my background clean? You know, are there kids around?

So in between is what we call the Goldilocks medium — not too much, not too little — and that’s social audio. Real-time audio. 

John Koetsier: That is maybe the best explanation I’ve heard so far. And I totally get what you’re saying about video, because it’s funny, we video conference and we stare at our own faces. We’re video conferencing, we’re looking — do I look good? Do I look stupid? [laughing]

Jeremiah Owyang: It’s the vanity instinct, you know? Like, oh no, right [pretend preening] — everybody’s doing that, and nobody’s looking at anybody else. They’re just checking their own face. 

John Koetsier: Exactly. Exactly. So audio is having a moment. You did a major report on that. 

Jeremiah Owyang: Thanks.

John Koetsier: We see a lot of players … I think you calculated something like 30 different players here.

Jeremiah Owyang: 30 players.

So the category is called social audio. The leader in the space is Clubhouse, but Twitter Spaces also has a beta product out and I’ve been testing it. Facebook also announced that they will have their own social audio product. Mark Cuban announced his own, called Fireside.

And there’s actually 30 total companies and more coming on my list.

Some of them are for specific verticals, like there’s one that emerged called Angle that’s really focused on the Swiss-German market. There’s one that’s called Saga, which is supposed to be for you and your parents to communicate and to even record stories for the future. There’s another one out called — is it Seventh Avenue, I think, which is for Black Americans to communicate with each other. So there’s variations across this — it is a growing category, and frankly well-funded in that the huge tech companies are in here now. 

John Koetsier: So I love what you said earlier about, you know, video doesn’t really cut it because we’re so focused on our appearance and what’s going on there. Also we have to kind of look good, you can’t just let your default ‘resting B face’ be out there that I typically have as well. Right?

So, but audio, you can do a few other things. You can do things, it does take time though, right? It does take time. It’s a very different feeling than a text-based or a mixed media social network like a Facebook or a Twitter or something like that. What’s different about it, particularly? 

Jeremiah Owyang: You come as you are.

You don’t have to be wearing your business suit. You don’t have to be wearing clothes, frankly. Some people don’t, apparently. It doesn’t matter, nobody’s going to see you. You come as you are and you tell your story, like you can hear that in the voice and the emotion. And you can really communicate in a way that text doesn’t.

And so really that’s why we see this huge trend of all of these companies and people want to connect with strangers. People want to reconnect with old friends. We are longing to be with each other. Like John, if I saw you again, like we’ve been at many conferences, I’d give you a big hug right now if it was safe to do, right?  

John Koetsier: Super uncomfortable. [laughing]

Jeremiah Owyang: Yeah, don’t cough on me.

John Koetsier: I’m just joking. 

Jeremiah Owyang: But you can see. I mean, we just — it’s the human condition to be with our clan, and social audio is the best thing we can do. 

John Koetsier: One thing that Chris Voss said — and Chris runs a bunch of different podcasts and he’s getting into Clubhouse big time — is that you don’t have to be super hot, super good-looking by conventional standards. You can just be yourself. It’s your voice and what you say— 

Jeremiah Owyang: Your ideas. 

John Koetsier:  And the quality of your ideas. Agree? 

Jeremiah Owyang: Yes. And how you treat other people — that comes out. Do you interrupt people, right? 

John Koetsier: That’s a really interesting point. I heard that on Clubhouse yesterday. I’ve tuned in a variety of times. I’m not huge on the platform, but I’ve tuned in a variety of times.

I heard it last night, and somebody was saying you get a nuance to voice that you don’t get via text. So you don’t know … am I joking right now? Am I laughing right now? Am I concerned? Am I angry? Am I, you know, unless you’re going all caps and using lots of emojis, you really can’t get that nuance with texts, can you? 

Jeremiah Owyang: You just can’t. And really, I mean, this isn’t new. We’ve had party lines, and phones, and conference calls before, right? But now we’re meeting with strangers and having these sometimes deep conversations around topics that are so meaningful to us; or business conversations to talk about what the future of the industry is; or just casual chit-chat, like you would have hanging out with your friends at the bar. And people are actually drinking martinis while they’re hanging out with their actual friends, or the new friends that they just met, or friends they haven’t met yet that are in the room.

So there are many different use cases that really all map to the human condition, to us as social creatures.

John Koetsier: Let’s ping in on the big kahuna, or the one that you said is leading the category, which is Clubhouse.

How has Clubhouse done it — right place, right time? I mean, there’s a bit of that in every startup, right? And maybe a set of features that works and not too many, not too much. What has really done it for Clubhouse? Why has that one taken off so much? 

Jeremiah Owyang: Yes, the timing was right.

They launched in March. Quarantine started in … March for many places. So their timing was perfect and they started with some tech influencers and they invited in VCs. Then they invited in the Black entertainment community who brought in lots of hands and many stars. And then it grew out from there, and so that was a really brilliant set of plays that they did. Intentional? Happenstance? Yeah. A little bit of both.

The features are super easy to use. It’s just very simple to use it. There’s a mute button, you can go to different rooms and change your profile. You know, there’s no texting, there’s no emojis. You can communicate with the mute button or changing your profile photo, but mostly, 90% of it is your voice. And so that has been the main thing for that. 

John Koetsier: There’s been a draw to the exclusivity of it as well, right? 

Jeremiah Owyang: That’s true.

John Koetsier: I mean, because you get invites when you use the system and then you are able to give those away, and people say, ‘I’ve got invites, who wants them?’ And everybody’s like clamoring. That’s kind of an interesting — we’ve seen that in other places as well.

It’s also iOS only. I mean, that’s kind of shocking. We’ve actually seen kind of clones come up on Android which are hacking into their API and distributing those conversations on Android as well. 

Jeremiah Owyang: So the team doesn’t mean to exclude Android, it’s that they’re just, they were just like three, four people for months. They just only had, you know, two iOS engineers.

They didn’t have an Android developer until last week.

John Koetsier: Wow!

Jeremiah Owyang: Okay. And if you know development, iOS is easier to develop on because there’s a constrained set of devices with very specific specifications, made only in approved Apple factories, versus the Android ecosystem which is two thirds of the internet, or even broader three fourths … there are thousands of permutations of hardware. 

John Koetsier: Yeah. 

Jeremiah Owyang: So it’s really around being efficient with resources as a small startup. They also just hired a new Android developer, a Black woman from Medium today. So we’re all celebrating her addition to the team.

So they need to hurry though, because Twitter Spaces mentioned that they will have an Android version next month. So that’s in the next three, four weeks and they could sweep the market underneath Clubhouse by getting Android out there if they can do that and have a scalable platform. 

John Koetsier: Well, that’s a great segue, because I’m kind of intrigued by Twitter Spaces. I’m not on it yet. I’ve participated in a couple of those sessions, and what’s neat there is it’s tied to an existing large network. It’s a newsy/social network, right? And Twitter is shutting down Periscope. Your thoughts there? 

Jeremiah Owyang: One other really endearing feature, which is the hallmark of Twitter, is that your visual tweets are integrated into that actual Twitter Space. 

John Koetsier: Wow.

Jeremiah Owyang: So you can actually go to a tweet and hit the ‘share’ button to your space, and there’s a carousel at the — so imagine this is the screen, oh, I can’t show you right now.

There’s the screen, at the top there’s a row of your tweets, and you can slide them and people will talk about the tweets in real time. So, imagine breaking news happening, photos emerging, and they’ve also promised that you can have a million people listening to a space, versus Clubhouse says— 

John Koetsier: 5,000.

Jeremiah Owyang: Well, it’s like seven now, but yes, the point is — Twitter knows how to scale. 

John Koetsier: Yes.

Jeremiah Owyang: Twitter, I mean, they had their Fail Whale years and they’ve learned from that.

So I think there’s going to be a massive battle between that, and the celebrities and the brands and the marketers are already on Twitter. Their credit cards are already plugged in and their follower base is there. They’re already verified. So there’s going to be that battle.

But I just want to make really one important point: there’s room for at least three social audio players in your phone for you to use every day. It’s not about one. Just think about all the social networks you have, how many different email accounts you have, how many different messaging platforms you have. You have at least three, and typically I see them in the following: first, biggest, and different.

So Clubhouse has got first. Who’s biggest? It’s going to be Twitter probably, or Facebook. And then different, we’ll see some of those vertical ones. Like we might see a social audio just for cool people in Vancouver, as an example.

John Koetsier: Mm-hmm. Super, super, super interesting. So, I asked in the prep doc, you know, who should we really be watching? You said enterprise social audio and social analytics. Can you expand on that a little bit? 

Jeremiah Owyang: Sure. So in my post — you can check out my pinned tweet and see the whole post, and you’ll see there’s the — I made a number of predictions.

But there’s two product categories that are going to be emerging from this market. We can already see little, little tiny startups that are emerging that are doing these. One of them is called social audio analytics and, John, this is not new, you know social media analytics.

John Koetsier: Yes. 

Jeremiah Owyang: Remember Radian6 was acquired for $300 million by Salesforce? Ten years ago, they were scraping and getting API access to text-based social networks. Well, there’s going to be another startup that emerges that does the same thing for social audio streams, whether it’s in the terms of service or they just get a partnership for the API.

The second category is what we call enterprise social audio, and that’s the same features that we’ll see for your workplace. And here’s an example — I think you know this play — we saw Facebook and Twitter emerge and LinkedIn, then all of a sudden, Yammer, Chatter, and Socialcast, and then all the variations added to existing enterprise software. We’re going to see—

John Koetsier: Facebook for work.

Jeremiah Owyang: Thank you. Yeah. We’re going to see all of that happen, you know, Webex might have social audio. Salesforce Slack is going to have social audio. It’s going to be everywhere. So in those cases where we have existing communication platforms, social audio will be a feature. But for companies like Clubhouse, social audio is the product.

John Koetsier: Yeah. Yeah. Very, very interesting. I was wondering, as you’re listening to more and more social audio, does that cut into podcasts? Does that cut into podcast time? I’ve seen some people who are trying to do a podcast and the Clubhouse at the same time, and broadcast their podcast, like us talking, broadcasting on Clubhouse. Where do you see that going? 

Jeremiah Owyang: Yeah, it has cut into my podcast time. It really has. But the smart folks do integrate it. So Mitch Joel, my friend, he records his live podcast. We were doing this on Clubhouse, and then he replays it on Clubhouse, and then he has conversations about it on Clubhouse. So he’s repurposing the content, but he also brings in the crowd to ask questions of us, versus in a podcast … that wasn’t happening. 

John Koetsier: Yeah. 

Jeremiah Owyang: So he’s trying — that’s a hybrid — so he’s really blending them together in a very innovative way, and really one of the first guys, not the first, but one of the first guys to try to do that.

John Koetsier: Super interesting, because that was one of the things I was wondering about when I was getting interested in Clubhouse initially is, hey, I have these one-off conversations. Where does it go? Can I use that later? Can I save that? Can I extract the insights in that conversation by going speech-to-text? Those sorts of things, and so obviously that’s all coming. 

Jeremiah Owyang: Yes. Those things are coming as well. So, natural language processing, and you can see there’s real-time translations already in Twitter Spaces. You can see some people are speaking and it’s coming out a second later in English, which is fascinating to think about, like the AI capabilities that can harvest that information. Plus, of course, the accessibility for those that are hearing impaired. So there’s a lot of amazi— oh, and translation, instant translation country to country. 

John Koetsier: Yes. 

Jeremiah Owyang: Can you bring the world together? Like that’s an amazing concept to think about. Yes, I know text-based translation is not new, but there’s something more intimate when we have our voice to do that. 

John Koetsier: And it’s important. I mean, I have some Twitter followers who are blind, you know they can listen, that’s great. And there are some who can’t type, right? Differently abled. And so being able to do that with voice is great. This is really interesting for me, because talking — like we’re doing right now — is kind of the original technology for communication, right? Now it’s returning in digital form. You said, ‘Nothing new is new.’ 

Jeremiah Owyang: So, here’s what I think about which technology will be adopted:  if it’s something that’s already innate to the human condition and tribal behavior, then you’ll get a high level of adoption. You and I were tracking social media from the early days. People want to talk and tell stories, right? This is… 

John Koetsier: Yeah. 

Jeremiah Owyang: It’s core to us. You and I were both there tracking the sharing economy, you know, villages would do this. This is how you would survive, to share resources. This is why we saw Airbnb and, in some cases, ride-sharing take off. And this is part of what we do. So if you see an existing behavior that’s amplified by technology, augmented by technology — and in this case, this is just giving us rich profiles, friends that we can find, conversations that are by different topics. So this is different, right?

It’s like, imagine you can go to 10 different conferences or festivals or coffee shops within the next two hours. That gives us instant access to all those social venues that we already know we want to do, but we can’t. Then it really helps to reinforce that we’re going to have that rapid adoption. 

John Koetsier: It just makes you wonder as well where conferences are going. I mean, assuming COVID passes at some level at some point and people want to do in-person conferences again, I can imagine doing that.

I can imagine wanting to do that. I can imagine wanting that travel, that in-person experience … but I can also imagine that there’s some challenges with competition, really. I mean, you can hear Marc Andreessen live, “in person” quote/unquote, in audio on Clubhouse most nights, it seems like. He’s always, always on the service, right? And you don’t have to go to Las Vegas or somewhere else to the right conference for $3,000 to hear Marc Andreessen anymore. 

Jeremiah Owyang: Yes. In addition, those speakers though, may choose to go to the higher venue, higher paid speaking gigs. So they may pull off the platform when quarantine starts to end, resulting in this economic and societal change.

So I’m going to make a guess that we’ll see maybe a 30% reduction in social audio when the world can reopen again. And I think that’s, frankly, a good thing. We want to get back to seeing each other physically.

John Koetsier: Mm-hmm. Excellent. Jeremiah, for those who are seeing you for the first time, or maybe know of you, have heard of you, but don’t know too much about you … you’ve got an amazing backdrop right there. Tell everybody real quick where you are and how you’ve created your own office for quarantine times. 

Jeremiah Owyang: Yeah. So many people will think that it is a digital backdrop and I get that question a lot — and you probably will never believe me, and that’s okay too — but I do work in an Airstream, in my backyard, which I procured a number of years ago as a business expense. And since then, I’ve actually become an Airstream ambassador, even helping them think about the future of work. Because I’m not the only one to have an Airstream in their backyard and I won’t be the only — there are many people that are doing that now. So, yes, I’m in an Airstream in my backyard. 

John Koetsier: That is amazing. You combine business and pleasure and opportunity like no one else. That is wonderful. 

Jeremiah Owyang: Thanks.

John Koetsier: Maybe to finish off our conversation on social audio … at TechFirst, this podcast is about tech that’s changing the world and innovators who are shaping the future. Where do you see social audio in a year or two, and then maybe stretch your gaze a little bit and go out to a decade? 

Jeremiah Owyang: Sure. So the next phase we’ll probably see obviously Android, and there will be a battle between Twitter and Facebook and all the other players I mentioned. After the quarantine is released and people go about the world we’ll see a shakeout in the market.

And — oh, I forecast there’ll be 100 players on my list by the end of the year. 

John Koetsier: Wow.

Jeremiah Owyang: One hundred. The cost of getting it going is easy. You can get a subscription with Agora, which is one of the underlying platforms for social and live video as well, so that’s why these companies can spin these instances up very quickly.

So we’ll see a rapid pullout from the market and a shakeout, and that’ll be a good thing. At that point, we’ll also see these platforms offering APIs. Twitter already has an API for text-based tweets. Why wouldn’t they do that for social audio?

And at that point, we’ll see an explosion of thousands of apps, just like we do with Twitter apps, and Facebook apps, and Google apps. And so we’ll see that start to happen. At that point, we’ll also see that the embedding of social audio everywhere around the internet will start to happen, and I call that the colonization phase.

When, imagine you’re reading an article — tell me your favorite hobby, John. 

John Koetsier: I’ll say … working out. 

Jeremiah Owyang: Fitness. So you’re reading and you just read an awesome article around doing landmines, which is an exercise I know you like, and it’s talking about good form.

Imagine right there it says, ‘Click to listen to your friends talking with fitness instructors around how to do proper landmines. They’re doing it right now, do you want to join in, John?’ [laughter] Of course you’re going to join, John.

John Koetsier: Take my credit card!

Jeremiah Owyang: Yes. So the social audio experience will be embedded — might be recording too, right? Of the top fitness instructors and athletes doing and talking about doing landmines, which is a fun exercise. So it’s going to be decoupled from the app and it will be spread around the internet just as we saw likes and comments and emoticons from Facebook or Twitter embedded in other websites as well.

That same thing will happen in that radical future. So it’ll be decoupled in a way in this colonization. So there’s just a few predictions that I have.

Also, of course, we will be in social audio platforms talking to Alexa and Siri, they’ll be right next to us. It will be normal for us to have conversations with AI bots and that’ll be something that we get used to. I mean, we do it in our home. Why wouldn’t we do it in a social setting as well? So there’s just a few ideas to sink your teeth into. I have more ideas about the future, but that’s just a few. 

John Koetsier: Excellent. One thing that came to mind as you were talking there, you mentioned Twitter again. Twitter has been rumored to come out with some kind of subscription or some kind of paid level of service, and I can totally see how they’re going to work the audio into that as well, right, and have some paid level of service — follow people, connect with people.

I think that’s going to be a whole other revenue stream for them and possibly a significant growth driver as well, because we’ve seen all that growth going away to various different Patreon type sites, OnlyFans type sites, you know, Subscript, that sort of thing … and Twitter can get in on that action. 

Jeremiah Owyang: Yes, and/or they could share the advertising revenue with the creators as well. So there’s another way to generate revenue, plus there’s crypto coins and personal coins, and I am looking at that model for creators as well.

So yeah, there’s actually over 15 ways that social audio companies can generate revenue. You can even have premium features for users and those revenues could go to the creator. So there’s many ways to do this. 

John Koetsier: Yep. Excellent. Jeremiah, where can people find you? I mean, you’re a Google away, but…

Jeremiah Owyang: Yeah. You can just find me on Twitter @jowyang, you can see the spelling of my name. And then my company is

John Koetsier: Wonderful. Great to have you. Thank you so much! 

Jeremiah Owyang: Thank you, sir.

The TechFirst with John Koetsier podcast is about tech that is changing the world, and innovators who are shaping the future. Guests include former Apple CEO John Sculley. The head of Facebook gaming. Amazon’s head of robotics. GitHub’s CTO. Twitter’s chief information security officer. Scientists inventing smart contact lenses. Startup entrepreneurs. Google executives. Former Microsoft CTO Nathan Myhrvold. And much, much more.
