Can AI combine data from hundreds or thousands of smartphones simultaneously to make great videos?
IMINT algorithms are in hundreds of millions of devices globally from smartphone manufacturers like Huawei, Vivo, Oppo, Sharp, Motorola, Asus, and more. (“Most of the world’s smartphone manufacturers,” according to IMINT.)
In this episode of TechFirst with John Koetsier, we’re chatting with Johan Svensson, CTO of IMINT, about the company’s newest product: a collaborative video system that will auto-create movies from the best clips of hundreds or even thousands of people at a single event.
Or, scroll down for full audio, video, and a transcript of our conversation …
Subscribe to TechFirst: creating movies automagically from multiple phones with AI
Watch: AI to create movies from multiple smartphones
Subscribe to my YouTube channel so you’ll get notified when I go live with future guests, or see the videos later.
Read: AI creates movies from hundreds of smartphones automagically
(This transcript has been lightly edited for clarity.)
John Koetsier: Can AI combine data from hundreds or even thousands of smartphones simultaneously to make great videos?
Welcome to TechFirst with John Koetsier. Maybe you’re watching the big game or you’re at a great party and you want a memory. So guess what? You take a picture, maybe take a video. What if you could get a video that includes the best parts from everybody — the best parts from everybody’s phone that happened to be taking a video?
That’s what IMINT is working on, and to get the scoop, we’re chatting with Johan Svensson, who’s the CTO of IMINT. Welcome, Johan!
Johan Svensson: Thank you very much. Nice being here.
John Koetsier: It is wonderful to have you here. Give us a preview. What’s the tech that you’re building right now?
Johan Svensson: We’re building a system that, as you said in the intro, lets a normal, average smartphone user combine material or videos from several smartphones.
Kind of if you’re at this event — me, myself, I can be at some event with my wife and kids, and when I get home, it’s like I really want to combine my videos because someone else got a better angle at something.
But I never get to the point where I actually upload all this to my computer and I start my editing software and do that. So, it’s kind of where we figured out that this could actually be done in a more automatic way — basically, completely automated now with [the] entrance of AI and stuff.
John Koetsier: We kind of need automated solutions for that, right? Because, I mean, if we think about it and we’re honest, we take a ton of video, we take a ton of pictures, and mostly they kind of stay on our phone, or stay in the cloud somewhere. We share one or two, but so many of them, I mean, literally tens of thousands of them, for many of us, just never actually get used.
Johan Svensson: No, I agree. I mean, it’s been like that for like, since video was brought to normal people, even during the camcorder era.
But I think it’s because video is so much harder to consume in a way that makes it attractive. I mean, take a photograph from your smartphone: it’s all there, you get what you see. But video is much more difficult. And compared with professionally produced TV shows and Hollywood movies, smartphone video is still pretty far behind, so far.
John Koetsier: I missed some of what you’re saying there, but what I heard, I think, is that you’re promising us Hollywood-style videos with your technology 🙂
So, that sounds great, that sounds excellent.
I want to get into what you’re doing and we’ll get serious about what you’re actually promising. I want to take a look at how you’re building it, and what we can expect from it when it’s ready. But let’s chat just a little bit briefly about IMINT. Your tech is currently in hundreds of millions of devices, your website says. So you’ve got some pretty significant proven expertise. What tech do you build? What kinds of customers do you have?
Johan Svensson: I mean, we build video enhancement solutions, basically for smartphone manufacturers. So we’re not currently an app in the phone that you download, we are pre-integrated in the phone when you buy it.
And our first product to the market was a pretty standard video stabilization, and it’s still our biggest product because it’s so important in smartphones. And so we kept improving that one for the last, I don’t know, five years or something. So it’s still very important. But in addition to that, we have a wide range of more products as well.
Our brand name for consumer phones is Vidhance, which is kind of a … yeah, video enhancement. So the main purpose is to reduce the gap between smartphone video and professional video, overall.
John Koetsier: And what kinds of customers are you talking about for stuff like that?
Johan Svensson: I would say we have most of the world’s smartphone manufacturers on our client list. Not all of them, but … soon maybe.
John Koetsier: Wow, that’s impressive.
Johan Svensson: We’re working on that, but yeah, awful lot of them.
John Koetsier: Impressive. Okay, let’s go back to what you’re talking about now in innovation, the multi-camera system where AI can pick the best bits of video. What’s that look like? What’s that feel like?
Do I and all my group of friends install an app on our phones, we connect it in some way, shape or form, we take a few videos — maybe on our standard camera, our standard video, or maybe in this app — and then the AI kind of sees it all and puts it all together?
What’s that look like? How’s it work?
Johan Svensson: Yeah, I mean, you’re pretty close there. I would say at these early steps, you need to — prior to using this — you need to kind of form a group of smartphones that will enter this production.
But otherwise, if it’s already a Vidhance phone then it’s easier, then you would probably just use your native camera app. And if it’s not a Vidhance phone, then you probably need to download an app that can — so that we can access some metadata that we need later in the production.
John Koetsier: Cool. Interesting. What kind of data are you using? How’s the software work? I mean, how does it know what it’s looking at? Is it looking for common features from different angles? Does it understand objects? How does it work?
Johan Svensson: Yeah, I mean, it works kind of like that. It identifies what kinds of smartphones or cameras, in general, are at a certain venue. Then we upload a lot of metadata from that, starting with the basics, like GPS position and stuff like that. But we also get more data, like sensor resolution and the actual available resolution, which is different from the actual stream that comes out of the phone.
John Koetsier: Okay.
Johan Svensson: And then we can use our … we have very accurate motion estimation used in our video stabilization, so by combining that with the smartphone’s magnetometer, we know which directions the phones are pointing in.
And using all that data, it’s pretty easy to know which objects — similar objects — can be found from different smartphones. That’s one input that we use in our scenario. And then, of course, we overlay that with object recognition and stuff like that.
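To make the idea concrete: combining each phone’s GPS position with its magnetometer heading tells you roughly where it’s aimed, so you can test whether several phones are filming the same spot. The sketch below is purely illustrative, not IMINT’s actual algorithm; the function names and the 15° tolerance are hypothetical.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(x, y)) % 360.0

def facing_same_target(phone_a, phone_b, target, tol_deg=15.0):
    """True if both phones' compass headings point (within tol_deg) at target.

    phone_a / phone_b: (lat, lon, heading_deg), heading from the magnetometer.
    target: (lat, lon) of a candidate shared object.
    """
    def angular_error(phone):
        lat, lon, heading = phone
        wanted = bearing_deg(lat, lon, target[0], target[1])
        # wrap the difference into [-180, 180] before taking the magnitude
        return abs((heading - wanted + 180.0) % 360.0 - 180.0)

    return angular_error(phone_a) <= tol_deg and angular_error(phone_b) <= tol_deg

# Example: one phone south of a stage pointing roughly north, another east
# of it pointing roughly west -- both count as filming the same target.
stage = (59.001, 17.0)
print(facing_same_target((59.0, 17.0, 5.0), (59.001, 17.002, 265.0), stage))
```

In a real system, the per-phone motion estimation Johan mentions would refine this coarse compass-and-GPS grouping before any object recognition runs.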
John Koetsier: Cool. So let’s assume that I’m doing this. It’s after COVID — actually, 2020 never happened — there’s no COVID, and we’re going to the mountains, there’s a lake, we’re all jumping in the lake, we’re on boats, we’re partying by the shore … all this stuff.
We have your app, or we have phones that have your technology in it. We take a bunch of videos. What happens? Is it automatically going to the cloud? Is it automatically then kind of processing and sending me a message: ‘And here’s your video.’ Is that how it works?
Johan Svensson: Kind of how it works, yeah. I mean, there are two aspects to this. What you’re describing now is more of an extended event, right? Where you really want to pick out the highlights, and maybe on some occasions there are parallel video streams at hand.
But then we will use the same algorithms to kind of find the peak moments and highlights of these videos, and then just compile it into a decent, or a better selection than we’ve seen so far.
I mean, we’re not first on that market, but I think what’s really interesting is when you actually use multiple phones at the same time, that we can actually handle parallel streams in a way more similar to like a professional production crew at some kind of set.
John Koetsier: So you’re creating a video, but interestingly — because let’s say you have 50, maybe 150, maybe 15, whatever it is, vantage points/viewpoints that you’re taking, getting videos from — you could potentially, at some point, create some kind of experience that is deeper than sort of a two dimensional video, correct? Is that something you’re thinking of?
Maybe some kind of VR experience or something that you can move around in? Or what are your thoughts there?
Johan Svensson: Yeah, I mean, it’s definitely possible to do like a map of the world using all this data. And we’ve been looking at that as well, but for now, I think there are other solutions, with matrices of 360 cameras, that are more suitable for that.
I think we’re kind of aiming to start with the small group of people that have their existing smartphones already, and use the capability that’s already there to make something kind of disruptive to how you imagine video with your smartphone.
John Koetsier: Right, right, right. So you’ve got obviously a bunch of challenges here. One of the core challenges is that you are essentially a B2B brand, right? You’re in a lot of phones. You work with a lot of smartphone manufacturers, and now you’re trying to get to consumers to some degree, right? How do you plan to make that leap?
Johan Svensson: Yeah. I mean, you’re right. That’s — we’re not too strong on the consumer side, but we think that this is really appealing for a lot of the big players out there as well. And I think they can help us in getting — since, I mean, we’re making video content more attractive and more viewable.
John Koetsier: Yup.
Johan Svensson: Which would probably be a big enough reason, so that we can get some traction on the consumer side as well.
John Koetsier: Yeah, interesting. I’m pretty sure you can get some sponsorship and involvement from the cloud providers as well, because the more video we create, the more we have to store somewhere.
Johan Svensson: Yeah. Yeah, but I mean, there are a lot of players in this that are actually looking to the future: the data society, the 5G network. Everything is kind of driven by video consumption and uploading and downloading video.
Of course, I mean, that’s one of the things we call a drawback of this system, and maybe one of the factors that hasn’t made it really accessible before: it really consumes a lot of data.
And if you’re on a wifi network at that kind of venue, that’d be fine. But in order to have it for the masses, I think this is the tipping point where that’s going to happen.
John Koetsier: Okay, interesting. So give us a date, or at least a range of timeframes. When can we expect to be able to access a system like this, an experience like this? What are we talking — a month? a quarter? When do you expect to have it out in the market?
Johan Svensson: I think it’s more than a month, and I don’t think it’s going to be that kind of a big launch date. I think it’s going to be more and more functionality: we’re going to launch a kind of smaller version to start off with, and then build on that to see what actually sticks and how it’s actually used.
John Koetsier: Okay.
Johan Svensson: Because we know from looking at others that most of these kind of disruptive tools — if I’m allowed to say like that — but it’s often that it’s used in another way than it was intended—
John Koetsier: Yes.
Johan Svensson: To be used by the makers. So we’re going to plant a tool out there and we’re going to monitor how it’s actually used, and then we’re going to strengthen those areas more or less.
John Koetsier: Excellent. Excellent. You Swedes, always so disruptive.
Johan Svensson: Or just careful and no risk.
John Koetsier: Socially responsible, wonderful. Excellent. Well, thank you so much. It’s been interesting hearing about what you’re working on and what you’re planning. I look forward to seeing it come out.
Johan Svensson: Yeah. Great. Thanks for having me.
John Koetsier: Excellent. For everybody else, thank you for joining us on TechFirst, my name is John Koetsier. Really appreciate you being along for the show. You’ll be able to get a full transcript of this quite often within a few days, maybe a week, at JohnKoetsier.com. Full video is always available at YouTube. And thanks for joining, maybe share with a friend.
Until next time … this is John Koetsier with TechFirst.