Building a feature no one asked for
We recently created a feature in Intercom that none of our customers were asking for – video replies.
I explained why at a recent Intercom event about product building. What follows is a lightly edited transcript of my talk, along with some of the presentation slides I used on the night.
A big part of Intercom is our in-app messenger. This is embedded into our customers’ websites and allows them to have conversations with their users, right there in the context of their app. The messenger works a little bit like WhatsApp. You can communicate with text, emojis, stickers, etc.
Recently, I was working on adding another communication medium for our users, video. Messaging with video is a brand new feature that we’ve just added to Intercom. So recently, in fact, that it’s not fully released yet. It lets you respond in a conversation by recording a video right there in the Intercom app, which the person you’re talking to sees straightaway, like the image below.
As well as being new, it’s also an interesting feature from a development point of view. The road map of what we build at Intercom comes from five different strands.
- We iterate on our current products.
- We add new features to solve customer problems.
- We improve the quality of our existing features.
- We work on scalability.
- And we make sure to find time to innovate and experiment with brand new ideas.
These are bets we make on new concepts based on how we believe communication is evolving, and video messaging was one of these ideas. Our customers weren’t asking for it. No one thought it was an obvious thing that we needed to have. Instead, the idea came from asking a question, the same question that drives all of our product development. How can we make internet business more personal?
Ask The Right Questions Early
This time, we started by asking ourselves, “How can we bring face-to-face interactions to the web?” Of course, lots of products give you face-to-face interaction with video calls, but those are high friction. Both people have to be online at the same time. It’s something people typically arrange ahead of time, rather than a casual interaction.
What we’re imagining is a little different: video messaging from one browser to another. It’s as personal as a call, but it works either synchronously or asynchronously. This is core to the way we think internet communication is evolving. Part of a conversation can happen in real time when both people are online, and part of it can occur asynchronously over minutes, hours, or days. All of our features are designed with this in mind.
So one of our designers thinks up an idea that illustrates how this might feel – hold down your space bar and say, “Hey, thanks,” and then let go to send. Then we go to work, figuring out how we can make this concept real.
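As a rough sketch of that hold-to-record interaction, the logic can be modelled as a tiny state machine. The class and callback names here are hypothetical, not Intercom's actual code; the only real browser detail assumed is that keyboard events expose `code === "Space"` and a `repeat` flag while a key is held down.

```typescript
// Minimal shape of the keyboard events we care about (a subset of KeyboardEvent).
interface KeyEventLike {
  type: "keydown" | "keyup";
  code: string;
  repeat?: boolean;
}

// Hypothetical controller: hold Space to record, release to send.
class HoldToRecord {
  private recording = false;
  private onStart: () => void;
  private onSend: () => void;

  constructor(onStart: () => void, onSend: () => void) {
    this.onStart = onStart;
    this.onSend = onSend;
  }

  handle(event: KeyEventLike): void {
    if (event.code !== "Space") return; // ignore every other key

    if (event.type === "keydown" && !this.recording && !event.repeat) {
      // First press: start recording. Auto-repeat keydowns are ignored.
      this.recording = true;
      this.onStart();
    } else if (event.type === "keyup" && this.recording) {
      // Release: stop recording and send the message.
      this.recording = false;
      this.onSend();
    }
  }
}
```

In a page, the `handle` method would be wired to the document's `keydown` and `keyup` listeners, with the callbacks starting the camera capture and sending the finished video.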
The first thing we do is build a really simple prototype. It’s half mock-up, half tech demo. We want this to answer two questions: Does this actually feel cool? And is it technically possible?
Navigating Technical Roadblocks
Recording video in the browser is not a well-trodden path. It actually varies widely between different browsers. Firefox implements the latest HTML5 MediaRecorder APIs, which means the browser itself does the heavy lifting of recording a video file for us – but most of our users are on Chrome. For once, Chrome is behind the curve. It doesn’t implement video recording at all. We can get access to the camera and the mic streams without a problem, but to turn this into a video file, we need to build it ourselves from the raw camera stream.
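One way to picture that browser split is a small feature check that picks a recording strategy. This is only a sketch: `MediaRecorder` is the real browser API name, while the function, the type names, and the strategy labels are assumptions made for illustration.

```typescript
// Hypothetical labels for the two paths described above.
type RecordingStrategy = "native-media-recorder" | "manual-frame-capture";

// Anything that may or may not expose a MediaRecorder constructor,
// for example the page's global object.
interface BrowserLike {
  MediaRecorder?: unknown;
}

// Use the browser's own recorder when it exists; otherwise fall back to
// building the video ourselves from the raw camera stream.
function chooseRecordingStrategy(env: BrowserLike): RecordingStrategy {
  return typeof env.MediaRecorder === "function"
    ? "native-media-recorder"
    : "manual-frame-capture";
}
```

Passing the environment in as an argument, rather than reading a global directly, keeps the check easy to exercise outside a browser.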
It would’ve been really nice to come back in a year, when all the browser support is there. But we have a go anyway to see if we can work around this and still record video natively in the browser. We found out, with a little work, it’s very possible.
We add a JavaScript recorder in the background, which snapshots the video and audio streams in real-time, collects up all the frames, and then compiles them into binary media files. We soon have a prototype working and start recording some video. It quickly becomes clear that, yes, this does actually feel cool.
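The collect-then-compile idea can be sketched in a few lines. Everything here is an assumption for illustration: the `FrameCollector` name, the `snapshot` callback (which in a browser might copy the current video frame off a canvas), and especially the output format, which is a simple length-prefixed concatenation standing in for a real media container.

```typescript
interface Frame {
  timestampMs: number;
  data: Uint8Array;
}

// Hypothetical recorder: grabs a snapshot on demand and keeps every frame.
class FrameCollector {
  private frames: Frame[] = [];
  private snapshot: () => Uint8Array;

  constructor(snapshot: () => Uint8Array) {
    this.snapshot = snapshot;
  }

  // Called on a timer while recording.
  capture(timestampMs: number): void {
    this.frames.push({ timestampMs, data: this.snapshot() });
  }

  get frameCount(): number {
    return this.frames.length;
  }

  // Packs all frames into one binary buffer as
  // [4-byte big-endian length][frame bytes] repeated.
  // A real recorder would write a proper video container here instead.
  compile(): Uint8Array {
    const total = this.frames.reduce((sum, f) => sum + 4 + f.data.length, 0);
    const out = new Uint8Array(total);
    const view = new DataView(out.buffer);
    let offset = 0;
    for (const frame of this.frames) {
      view.setUint32(offset, frame.data.length); // big-endian by default
      offset += 4;
      out.set(frame.data, offset);
      offset += frame.data.length;
    }
    return out;
  }
}
```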
The Difference Between Product and Prototype
A prototype is great, but it’s not the same thing as a usable product. It’s still an experiment at this point. We might miss something that makes it technically impossible, and we still don’t ultimately know if it’s even a good idea to build – but we believe the best way to find that out is to go and build it.
So we start trying to turn this into something real, but of course, hit some more snags along the way. You might think making a video file available is a lot like any other type of file. The video gets uploaded to Intercom, and Intercom makes it available for users to watch. It turns out it’s a little more complicated. Videos come in different encodings, and to get reliable playback across different browsers, we can’t just send out the raw recorded file that someone has put together. It has to be transcoded to the right encodings first.
Every service working with video deals with this problem. It’s so common that there’s a whole sector of third-party services out there that just do transcoding. We fully expected to use one of these services. It makes sense. We prefer to run less software. If we can buy it from someone else rather than building it ourselves, that’s a better option.
But video transcoding isn’t usually something that needs to happen in real time. So those third-party services all operate on a different model. You give them some work to do, and then some time later – maybe it’s 30 seconds, maybe it’s 60 seconds, maybe it’s two minutes – they get back to you with the results. This doesn’t work for us.
Our videos can be part of a real-time conversation. You can be instant messaging with someone, record a video to send them, and then follow up straightaway with some more IMs. If the video takes even 30 seconds to deliver, that breaks this flow. It no longer feels like a real-time conversation.
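To make the problem concrete, here is a toy model of delivery order. The types and the `deliveryOrder` function are hypothetical, and the timings are illustrative numbers rather than measurements: each message is delivered once its processing time has elapsed, so a slow video ends up arriving after the text messages that were sent later.

```typescript
interface OutgoingMessage {
  label: string;
  sentAtSec: number;     // when the sender hit send
  processingSec: number; // e.g. transcoding delay; 0 for plain text
}

// Returns message labels in the order the recipient would see them.
function deliveryOrder(messages: OutgoingMessage[]): string[] {
  return messages
    .map((m) => ({ label: m.label, deliveredAt: m.sentAtSec + m.processingSec }))
    .sort((a, b) => a.deliveredAt - b.deliveredAt)
    .map((m) => m.label);
}

// A video followed quickly by two text messages.
const conversation = (videoDelaySec: number): OutgoingMessage[] => [
  { label: "video", sentAtSec: 0, processingSec: videoDelaySec },
  { label: "text 1", sentAtSec: 5, processingSec: 0 },
  { label: "text 2", sentAtSec: 10, processingSec: 0 },
];
```

With a 30 second delay, `deliveryOrder(conversation(30))` puts the video last; with a 2 second delay it stays first, which is the ordering the sender intended.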
So at this point, it’s getting a little hard to achieve our original vision. We could’ve compromised on the user experience here and said, “Videos aren’t like other messages. They’re not real-time. They’re not quite consistent with how everything else in Intercom works.” But sacrificing the user experience like that is not something we’re ever comfortable doing.
Easing Into Implementation
This could have been an off-ramp. It’s a point where we could’ve easily said, “We flushed out all these technical problems, and found out it’s not practical to have this in Intercom.” Maybe stopping here would have been the pragmatic choice, because, after all, we don’t even know if anyone will use this. But we have an explicit budget for experimentation, to try to make new things work, and we’re going to use it.
So we keep going and start looking at whether we can build real-time transcoding ourselves. We have a couple of advantages compared to third-party services.
- Our videos are always short.
- We don’t care a ton about high-def quality, so we can optimize for faster turnarounds.
- Critically, we don’t need to do a better job than these companies who do transcoding for a living. We just need a stripped-down version that does a better job for our really specific use case.
So we stand up a simple video processing server, and with a little bit of tweaking, we manage to get transcoding times down to just a couple of seconds. Critically, that’s fast enough to feel real-time. We ship this transcoding service, and then we get back to shipping the main feature.
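The talk doesn't say which tools the processing server uses, so the following is purely an assumption: a sketch of how short clips could be transcoded quickly with the ffmpeg command-line tool, trading quality for speed. The function and option names are hypothetical; the ffmpeg flags themselves are standard ones.

```typescript
interface FastTranscodeOptions {
  inputPath: string;
  outputPath: string;
  targetHeight: number; // output height in pixels, e.g. 480
}

// Builds an ffmpeg argument list tuned for fast turnaround over quality.
function buildFfmpegArgs(opts: FastTranscodeOptions): string[] {
  return [
    "-y",                                     // overwrite the output if it exists
    "-i", opts.inputPath,                     // raw recorded file
    "-vf", `scale=-2:${opts.targetHeight}`,   // resize, keeping aspect ratio with an even width
    "-c:v", "libx264",                        // H.264 video
    "-preset", "ultrafast",                   // fastest x264 preset, larger files
    "-crf", "30",                             // lower quality, less work to encode
    "-c:a", "aac",                            // AAC audio
    "-movflags", "+faststart",                // let playback begin before the full download
    opts.outputPath,
  ];
}
```

The resulting array would be handed to whatever process runner the server uses. Short clips and a relaxed quality target are what make settings like these reasonable.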
And that’s worth calling out: every step along the way here, from the very first week, we were shipping this to production daily. It was hidden from anyone outside of Intercom, but the code was deployed and usable. That’s how we always like to ship features.
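A common way to ship code daily while keeping it hidden is a feature flag that only switches on for internal users. The sketch below is an assumption about how such a gate might look, not a description of Intercom's flag system; the types, the function, and the `staff.example` domain are all hypothetical.

```typescript
interface Viewer {
  email: string;
}

interface FeatureFlag {
  name: string;
  allowedEmailDomains: string[]; // lower-case domains that may see the feature
}

// True only when the viewer's email domain is on the flag's allowlist.
function isEnabledFor(flag: FeatureFlag, viewer: Viewer): boolean {
  const at = viewer.email.lastIndexOf("@");
  const domain = at >= 0 ? viewer.email.slice(at + 1).toLowerCase() : "";
  return flag.allowedEmailDomains.indexOf(domain) !== -1;
}

// Deployed to production, but visible to staff only.
const videoReplies: FeatureFlag = {
  name: "video-replies",
  allowedEmailDomains: ["staff.example"],
};
```

Opening the feature to beta customers later then becomes a change to the allowlist rather than a new deploy.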
We first deployed the most basic possible version. It had no sound, and it attached as a file, which you had to click to open in a new tab. We could send messages to ourselves internally at this point. It wasn’t close to a real feature yet. Then some changes later, we shipped sound support. Next we added new rendering to make it play back in-line, rather than just being a link that you click on.
Things slowly came together. We had a little additional work to get a smoother player, but it started to look pretty decent.
Test, Learn and Evolve
After more continuous shipping, we added a nice recording experience, slick enough for our real customers to use. That brings us to the place we’re at right now, where we get to find out how people use it. How does it feel to video message real people in a conversation? How do our customers use it? Our challenge is to start learning about video messaging in business communications, the same way we learned about how people use email, IMs, emoji, and everything else to communicate.
This feature has been released to a couple hundred of our beta users, and we’re already getting a ton of feedback on how they use it. Some people are sending fun, whimsical messages. Others are sending quick responses, narrating live demos or delivering hyper-personalized messages to their VIPs.
Although it’s very early in the life of this feature, we think video messaging could become a really exciting new way for businesses to communicate. This is a concrete example of our mission here at Intercom: to create more personal mediums for businesses to talk to their customers.
Innovation takes dedicated time and effort. It takes a willingness within a whole organization to experiment with something that might completely fail, as well as a bit of stubbornness to not give up when things get tricky. But the reward that we get when this works out is a whole new medium for people to talk to their customers. We don’t know exactly what this will evolve into, or what kind of new interactions it’s going to allow, but we’re really looking forward to finding out.