Twitter, LinkedIn, Facebook, Instagram, YouTube, and TikTok are primarily visual: people post text, photos, and videos. Social audio platforms offer a radically different medium of communication because they are primarily audio and, crucially, are conversational and deliver content in real time.
Before I delve into 10 ways I believe such social audio platforms provide unique experiences compared to the existing social platforms, and therefore 10 reasons why the medium is not a passing fad – let me identify two important aspects of audio that one should always keep in mind.
Time linearity: Unlike visual artefacts (text and photos), audio is time linear. To consume audio the way its creators meant it to be consumed, one has to listen to it one second after another. With text, one can easily scan and jump around; with audio, one can go back and re-listen to something that was said 10 seconds ago, but only at the cost of significantly disrupting the experience. In photos, the time dimension just does not come into play.
Ephemerality: Unlike text and photos, audio by definition does not persist. You can’t behold audio. Once spoken, what is spoken is gone.
With that said, here are 10 reasons that I believe make social audio compelling, and why I don’t believe that social audio is a fad.
1. You can simply start listening to strangers talk to each other
The majority of the people who are attending a Clubhouse room or a Twitter space are not talking: they are listening. Some of them are listening more attentively than others, but they are all more or less listening. Eavesdropping on a conversation that strangers are having with each other is not something that we can easily or frequently do in real life.
Where else, and how else, can you drop into a conversation on a whim and begin to listen to strangers talking with one another? And, crucially, where else are you not only able to drop in but welcome to come on in and listen?
2. People are paying attention to what you are saying
Being on stage and speaking while other people – mostly strangers – are politely listening to you is empowering, even cathartic. Again, where else can you do that in your day and on a fast whim?
Compare this to regular social media: you don’t really know how many people have actually read what you wrote, let alone whether they read all of it. Views and likes are an indication, but a pale one: people can view a tweet, or like a Facebook or a LinkedIn post, without reading it.
Maybe they like you, did a quick scan, saw that what you wrote was interesting enough, and so they liked it. Or maybe they want to ingratiate themselves with you and so they liked it without even reading it. Neither of those types of likes means that they actually read what you wrote. In linear social audio, you can bet that a good portion of the audience is listening to what you are saying. And that’s something powerful.
3. You can’t easily reduce the human to a cartoon
More than your face, or your style of writing, or the types of ideas that you express, the sound of your voice and your style of speaking are for some reason a deep manifestation of who you are: male or female, young or old, happy or sad, confused or purposeful, scared or confident, humble or arrogant, considerate or self-centered. People listening to you are not simply parsing your words: they are parsing your personality and your emotional state.
For this reason, it is much harder to reduce a human being to a caricature (favorable or otherwise). One can certainly weaponise their prejudices if one is, for instance, an overt racist or a closet bigot, but anyone who is listening in good faith is less likely to reduce a human being speaking in the fullness of their voice to a cartoon figure.
4. You can qualify people much more accurately
You can tell a lot by how a person behaves during a conversation. Do they hog the stage? Do they acknowledge what other people said? Do they have interesting things to say? Do they repeat themselves? Do they say hackneyed, superficial things? Are they curious? How do they react to someone who contradicts one of their points: do they let it be or do they react defensively? Whether or not I want to pull a person into my life can be highly informed by the answers to those questions.
5. It forces you to exercise your attention span muscles
In a world of posts and tweets, photos and short videos, an hour-long, linear, one-word-after-another medium is a place where I can go and engage in some attention-span muscle building. Personally, while attending a Clubhouse room or a Twitter Space, I rarely multitask: I sit down, put on my big headphones, listen, and take notes. I know some people (maybe most) are engaged less attentively, but the opportunity is there to use the session as a way to firm up your attention span.
6. Anything can happen
Unlike podcasts, live social audio happens in real time. One of the most compelling aspects of this is that anything can happen. The elements of surprise and serendipity are real. You don’t always know where the conversation will go next.
7. A communal experience
As we voice our thoughts, we are in a real sense riding time, and those who are listening to us are riding that same time together with us. We are in a concrete way experiencing the human condition collectively. This is how TV was consumed for the longest time, until the rise of on-demand TV. It enables you to chat later on about what happened with other people who attended the session, and, crucially, to chat about it as an event rather than as static content. Such togetherness can only bring us closer to each other.
8. The fear of missing out
A social audio session is an event: it is time-boxed, and if you don’t attend it, you will miss it. In cases where the session is recorded, you may be able to listen to the content afterwards. But it is still the case that you missed the opportunity of experiencing the event in real time. It is the difference between watching a football game live and watching a rerun of that game.
9. No evidence left behind
At least for those sessions that are not recorded, you can participate, blunder, say something stupid that you regret having said, and when the session is over, you don’t have to worry about people experiencing the ugliness again (and you hope that the memory of your blunder will fade away with time). This will give you a bit of pluck to step up to the mic and speak up. To be sure, this aspect of social audio – no evidence left behind – can be abused: people can be mean, insulting, demeaning, behave in a bigoted way, or worse.
Most people, however, are kind and considerate, and hence the medium works far more as a stimulator of folks to speak up and express themselves than as a tool to be abused by those who mean to do harm.
10. You can be part of the show
Unlike with a podcast, you, simply an anonymous listener, can become an active participant: very much like a magic act where members of the audience are invited on stage to be part of the show. In fact, while on stage, you have the power to take the conversation somewhere else, if that somewhere else is interesting to the moderators and the audience. I will never tire of saying this: Marshall McLuhan was right, and he keeps being proved right: The medium is the message.