Machine learning: Intelligence everywhere, with Amazon ML head Larry Pizette


Can anyone build smart systems? Use machine learning to learn and grow faster?

AI is getting more and more critical to business success, whether it’s built into your product, runs part of your marketing, or helps you make strategic decisions. But how do you get started?

Welcome to the first official episode of The AI Show, brought to you by VentureBeat. (Check out the VentureBeat post here.) In this episode, we’re chatting with Larry Pizette, Head of Data Science for Amazon’s Machine Learning Solutions Lab.

You can also watch the interview on YouTube

What we talk about

  • Everyone wants to build smart systems. Everyone wants their products and services to evolve at internet speed.
    • But AI is complicated … how do you get started?
    • Should you kick off with long-term strategic projects? Small trials?
    • What about training people … your engineers. What kind of ramp-up are we talking about here, and what do they need to do?
    • Most people think about training as something you do for your engineers. You’ve said it needs to be for your execs too, because running an ML system is very different than operating a traditional software install. Why?
    • Some people think: I can’t do AI … I don’t have a huge training dataset. What should they do?
    • Data quality … what are the typical issues? And fixes?
  • Let’s say I’m a developer just looking to get my feet wet in AI. Where should I start?
    • Same question, but I’m a company.
  • What are the top 3 mistakes you see companies making when they first try to implement AI in their processes, products, or services?
  • Everything moves in stages. In mobile development everything was hand-built for a few years, and then frameworks and development tools emerged to make lower-level stuff simpler. Same in enterprise development, where we’re starting to see no-code applications. Where are we on that continuum for machine learning and AI?
  • You’ve built a lot of technology to help companies implement AI … talk to me about some of the most innovative uses of machine learning that you can recall. Maybe brands that we wouldn’t think of when we think of AI?
    • How are they using ML to improve customer experiences?
    • How are they using ML to make the world a better place?
  • How does implementing AI in a product or service change it? Change the organization’s learning speed?
  • Many solutions in AI seem to be very narrow right now … specific to a vertical, a component, a piece of an application.
    • Do you see that widening and generalizing over time?
    • How?

And … here’s a full transcript of our conversation

Larry Pizette, head of data science for Amazon’s machine learning solutions lab

John Koetsier: Can anyone build smart systems? AI is getting more and more critical to business success, whether it’s built into your product, runs part of your marketing, or helps you make strategic decisions. But how do you get started? Welcome to the very first official episode of the AI Show, brought to you by VentureBeat.

My name is John Koetsier, and today we’re chatting with Larry Pizette, he’s the Head of Data Science for Amazon’s Machine Learning Solutions Lab. Welcome, Larry. 

Larry Pizette: I really appreciate you having me on, very much looking forward to talking with you and sharing some thoughts with your audience.

John Koetsier: Wonderful. Before we get into everything, tell us a little bit about yourself, your career journey. 

Larry Pizette: Sure, John. I started with Amazon about eight years ago with leading solutions architecture teams, various organizations that are helping customers with adopting AWS workloads. So providing customer facing technical resources to help customers along their journey to the cloud. And in the journey to the cloud, I saw that there was significant amounts of ideation and a lot of different things that we could help our customers with in doing the work they were doing to get into the cloud and experience all of the benefits of the cloud.

A couple of years ago, I saw that opportunity in machine learning and had this opportunity to lead the data science team for the Machine Learning Solutions Lab, a global program to help our customers with adopting machine learning workloads on AWS. And so I leapt at that opportunity. It’s been super exciting, we’ve done more than 175 engagements so far, and I’m looking forward to continuing it.

John Koetsier: Cool. Everybody wants to build smart systems today of course. Everybody wants their products and their services to evolve at internet speed, but AI can be super complicated. How do you get started? 

Larry Pizette: Sure. The way that I recommend getting started is how we do it with customers. We pull together the business leaders, the technical folks, and the people that own the data into an ideation session when we start. This is a little bit different than I experienced in my prior role, where you would typically need the business owners and the technical folks, but you didn’t really need to get the data owners in there.

And sometimes in large enterprises, large organizations, the data owners can be different than the business owners or the IT owners. And so it’s really important to get those data folks in there to not only tell you what data they have, but the quality of the data, and then we ideate on what’s possible, what they’re looking to achieve with their business objectives, and what is possible based upon the data that’s available. And then we find some opportunities where we can start moving forward quickly, where we can start proving the value of machine learning to the organization. And then after we do those ideation sessions, we start executing. And by executing, the organization learns and one of the things about the cloud is that you have the opportunity to try out and if something doesn’t work to try it again. That’s especially important in machine learning where the roadmap to getting there isn’t always known, and data scientists need to iterate on the path to getting there. 

John Koetsier: That’s kind of interesting because as we prepped for this call, we talked a little bit, you know, should you start with a big audacious project? You’re an executive, you’re thinking AI, I can save the world, create some huge new product or new value. Is that the right way to start, the right way to think, or should you start on something a little bit small? 

Larry Pizette: Yeah, I think having a high-level vision is helpful and machine learning can do so much to help an organization, help a business. Having that vision I think is important, but starting smaller where you can prove out the value, get the organization going and what it means to use machine learning in terms of the data, in terms of how you keep machine learning models up to speed, it makes sense in order to get going quickly. If you spend too much time in analysis there’s too much opportunity that’s missed.

And this is one of those things where if you start out quickly and learn, there’s so much learning that can be done in those initial proofs of concept or initial workloads that go into production, that it’s much better to get going than to try to do a multi-year plan on it. Having that strategy is important. I’m not saying it’s not, but getting going with small incremental steps is the way to get going.

John Koetsier: Sounds like waterfall is out the window. 

Larry Pizette: You know, some organizations make use of that, but agile and other techniques have been growing in popularity. Certainly with machine learning, getting going quickly and with data scientists that can take that data, understand and move that into production or to do a proof of concept to prove it out, I think makes sense. And that’s where we’ve seen our customers be able to move quickly. 

John Koetsier: So let’s talk about those engineers for a second. Let’s say that our development organization, maybe our IT organization hasn’t done machine learning in the past. What’s that ramp-up look like? What’s that training curve look like?

Larry Pizette: Sure. So to answer your question, it’s different for different organizations, and some organizations want to employ data scientists, some want to employ developers, and others have to train up a lot of business folks. So I’ll take it and break it down just a little bit. So first of all, in order to use machine learning, you don’t necessarily have to be a data scientist or a machine learning expert. You can use machine learning by using what we call our Artificial Intelligence layer of services, such as Amazon Translate and Amazon Transcribe, two examples where you can get the benefit of machine learning just by calling an API, and so the software developers can use it at that level.
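To make "just calling an API" concrete, here is a minimal Python sketch. It assumes the boto3 SDK's Amazon Translate client, whose `translate_text` call takes `Text`, `SourceLanguageCode`, and `TargetLanguageCode` and returns a `TranslatedText` field. The wrapper function `translate` and the `FakeTranslateClient` stand-in are hypothetical names added so the example runs without AWS credentials; they are not part of any AWS library.

```python
def translate(client, text, source_lang="en", target_lang="es"):
    """Translate text with a pre-trained service model; no training data needed."""
    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,
        TargetLanguageCode=target_lang,
    )
    return response["TranslatedText"]


class FakeTranslateClient:
    """Hypothetical stand-in that mimics the response shape for local runs."""

    def translate_text(self, Text, SourceLanguageCode, TargetLanguageCode):
        return {
            "TranslatedText": f"[{SourceLanguageCode}->{TargetLanguageCode}] {Text}",
            "SourceLanguageCode": SourceLanguageCode,
            "TargetLanguageCode": TargetLanguageCode,
        }


# With real credentials this would be: client = boto3.client("translate")
client = FakeTranslateClient()
print(translate(client, "Hello, world"))
```

The point of the sketch is that the developer supplies only text and language codes; the model itself sits behind the service.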

To go down to custom models, organizations typically will want to train starting with classical machine learning and then going into deep learning. But there’s always this learning curve and technologists, I find, will find a way to learn those skills that they need and they are self learners. The part that I think gets missed frequently is teaching the business folks, because people always think about the data scientists and the software developers learning about these skills. The business folks need to learn as well. And to give an example of that, in the old days when we were procuring a system that is a rule-based system, like think of an accounting system, it’s got configurations but it can be defined by rules.

People are very used to acquiring those systems, whether it’s done through a waterfall methodology like you mentioned a moment ago, or an agile methodology, it’s very well known. This is the system and this is what we’re getting. With machine learning it’s a little bit different for business owners to say, I am going to be acquiring a system that’s making predictions, and how do those predictions affect my business? And then what happens if those predictions stop being as accurate as I need them to be? So let’s say you’re predicting home purchases but interest rates change. If your model is trained on some assumptions and now something changes in the future, you have to retrain your model. So training the business folks so they understand what they’re getting into and how to best procure it, I think is super important.

There’s this huge urgency for machine learning and businesses know they want to go down that path, but having this understanding can save them a lot as they’re procuring the systems. 

John Koetsier: That’s interesting because you’re not procuring a static tool, you’re kind of procuring a process, right? 

Larry Pizette: In many ways, in many ways. So there’s something we call ‘concept drift’ and I was mentioning before, if interest rates start moving away from where you trained your model, you can see that happening but there’s other types of situations that happen. For instance, if the seasons change or the weather changes, it could affect your predictive analytics. So there’s other types of underlying changes. And so, like you were saying, John, you need a process. You can go ahead and use something like our tool Amazon SageMaker, which has new capabilities for monitoring concept drift and also has a built-in capability for doing A/B testing. So if you have a new trained model, you can see if the new trained model is performing better than your old model. So I think you’re right, there is this new process that you have to think about with acquiring these systems.
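The two ideas in that answer, watching for drift and checking a retrained model against the old one, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not how SageMaker implements either feature: the drift check only measures how far a single input has moved from its training range, the threshold of 3.0 is arbitrary, the rates and toy models are invented, and the comparison scores both models on one holdout set instead of splitting live traffic as a production A/B test would.

```python
from statistics import mean, stdev


def drift_score(training_values, recent_values):
    """How many training standard deviations the recent mean has shifted."""
    baseline_mean = mean(training_values)
    baseline_std = stdev(training_values)
    return abs(mean(recent_values) - baseline_mean) / baseline_std


def accuracy(predict, examples):
    """Fraction of (feature, label) examples the model predicts correctly."""
    return sum(predict(x) == y for x, y in examples) / len(examples)


# Hypothetical interest rates (%) seen at training time vs. in production.
training_rates = [3.1, 3.0, 3.2, 2.9, 3.1, 3.0]
recent_rates = [4.8, 5.1, 5.0, 4.9]

if drift_score(training_rates, recent_rates) > 3.0:  # threshold is an assumption
    print("Input drift detected: consider retraining")

# Toy "will this person buy a home" models with different learned rate cutoffs.
old_model = lambda rate: rate < 3.5
new_model = lambda rate: rate < 5.5
holdout = [(4.8, True), (5.1, True), (5.9, False), (3.0, True)]

print("old:", accuracy(old_model, holdout), "new:", accuracy(new_model, holdout))
```

If the retrained model does not score better on recent data, the older model stays in place, which is the "process, not a static tool" point made above.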

John Koetsier: Sure, sure. So let’s say I am an executive of a company, whether it’s in software or maybe some more traditional space. I’ve heard a lot about AI, I want to get into AI, I want to do it but I feel like AI requires huge datasets, huge training datasets, and I haven’t really acquired a lot of data. I don’t really know where it is, maybe it’s not very structured or anything like that. How do I start? 

Larry Pizette: Yeah. So the way that Amazon and AWS view machine learning is we want to provide capabilities for people where they want to use it. So we want to make it available to data scientists and software developers so they have all the choices. And the top layer of our machine learning offerings we call the ‘AI Layer,’ and I mentioned before Translate and Transcribe. There’s others, Comprehend for instance, finds insights in data and texts. So there’s many different machine learning capabilities that people can use without having to train models, without having to have any of their own data, they can just take advantage of the models that we’ve trained. So that’s first at the top of the AI layer.

At the next level down, we call it the ‘Machine Learning Layer,’ where people use custom models such as SageMaker, which is a managed service to remove much of the heavy lifting and blockers that developers and data scientists typically encounter. If they’re doing it with SageMaker there’s techniques that data scientists can use in order to make use of smaller data sets. They can generate synthetic data. They can use a technique called K-folds where they’re looking at using different data for training and for validation and then rotating that around. There’s different techniques, but your point is important that it is important to gather data and to have those folks in the room that provide the data. But it shouldn’t stop people from looking at opportunities to take advantage of machine learning, given that there are these techniques for the data scientists and there may be opportunities for them just to call services at our AI layer. 
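The "K-folds" technique mentioned here is usually called k-fold cross-validation: the data is cut into k parts, and each part takes one turn as the validation set while the rest is used for training. A minimal sketch in plain Python follows; the function name `k_fold_splits` is hypothetical, and in practice a data scientist would more likely use a library implementation and shuffle the data first.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


for train, validation in k_fold_splits(n_samples=6, k=3):
    print("train:", train, "validate:", validation)
```

Every sample is used for validation exactly once, so a small dataset yields k performance estimates without setting aside a separate large validation set.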

John Koetsier: Right, right. Let’s shift focus a second to an actual developer who may be a super smart developer, maybe amazing, has looked at all the latest languages, is working in Python, whatever, you name it, hasn’t really dipped his or her toes into AI. What’s the best way to start in your opinion? 

Larry Pizette: So we have a program that we just launched called ‘Embark’ for helping out our customers with dipping their toes and getting going as developers into machine learning. And this is a really great opportunity to be brought through lots of different concepts in machine learning. That’s one way, and I certainly love the Embark program, and a unique part about the Embark program is that data scientists on my team will be teaching that. So it won’t be just instructors, it will be data scientists that they can learn from. But also there’s tools such as … I mentioned Amazon SageMaker a moment ago, which is at our machine learning layer, which we consider the middle layer in our three tier stack of machine learning offerings. And SageMaker has many built in algorithms that are really easy to get started with. 

So because SageMaker takes care of much of the heavy lifting, the developer that’s just starting to get used to it doesn’t have to worry about things like scaling out multiple servers to train their model. All of that is taken care of for them in the background. When they want to go into production, it’s a single click to move it into production. And so what we’re looking to do is to make machine learning much more accessible to people so they can start taking advantage of it if they’re not a researcher at a university. We take care of them too, but we want to make it so that everybody can use it.

John Koetsier: Excellent, great. So we’ve started, we’ve dipped our toes, we’re getting into it. What are the top three mistakes that you see companies make when they start getting into AI? 

Larry Pizette: So the mistakes that people make are typically more around the human element than the technical element. And one I mentioned before is not having the data people in the room when they’re thinking through what they want to do. And I’ve seen projects start where the business owner had a vision, but the data wasn’t available or the data had quality issues, such as many missing values, negative numbers where they’re supposed to be positive numbers. When you have too many of these it can really throw off what you’re trying to accomplish.

Now, if you just have a few missing values, data scientists are amazing people and they can figure out what to do about the few missing values, those types of outliers and such. So that’s one of them. 
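The data quality issues named here, missing values and negative numbers where only positive ones make sense, are cheap to check before a project starts. The sketch below is one simple way to do it in plain Python; the `listings` data, the function names, and the choice of median imputation are all illustrative assumptions, and a real project would pick a fill strategy to suit the data.

```python
from statistics import median


def quality_report(rows, column):
    """Count missing and negative values in one numeric column."""
    values = [row.get(column) for row in rows]
    missing = sum(v is None for v in values)
    negative = sum(v is not None and v < 0 for v in values)
    return {"rows": len(rows), "missing": missing, "negative": negative}


def fill_missing_with_median(rows, column):
    """Return copies of the rows with missing values replaced by the column median."""
    present = [row[column] for row in rows if row.get(column) is not None]
    fill = median(present)
    return [
        {**row, column: fill if row.get(column) is None else row[column]}
        for row in rows
    ]


# Hypothetical home listings: one missing price, one impossible negative price.
listings = [{"price": 250.0}, {"price": None}, {"price": -10.0}, {"price": 310.0}]

print(quality_report(listings, "price"))
print(fill_missing_with_median(listings, "price"))
```

A report like this gives the data owners in the room something concrete to react to: a handful of gaps can be filled, while a column that is mostly missing or invalid signals that the project needs different data.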

The second one is when you mentioned before about doing a multi-year strategy. I’ve seen some organizations want to do so much planning that it keeps them from getting going, which can be a challenge for them. So getting going quicker rather than just waiting years to get going, I think is really important. 

And then being open to understanding the different parts of how machine learning works, because machine learning is something different in terms of prediction as compared to the rule based systems that we talked about before.

I think organizations need to have a good understanding of that, and if they don’t understand that they seem to get stuck wanting to specify in a very rigid kind of way, and having some uncertainty about how to move forward. And that’s why I go back to having the big vision I think is definitely important. Important to do, but getting going quickly I think it helps organizations understand what this means to work in this new way of doing it with machine learning. 

John Koetsier: That makes a ton of sense. It’s interesting, as we look back at kind of development over the years and the decades, we see kind of everything moving in stages, right? In mobile development, for example, everything was hand built for a few years, and it was very challenging to have an app for every platform, and there were three at one point that mattered, not just the two that matter right now.

And then we’ve got frameworks and development tools that emerge and make the lower level stuff simpler. We see the same in enterprise development. We’re starting to see no-code applications, that sort of thing. Can you kind of characterize where we are on that continuum in terms of AI?

Larry Pizette: Well, I really like the question, John, because as a person that started out as a software engineer a long time ago in client server systems, and then moved to the web and saw all of these different changes, and then the mobile ones that you were just talking about, there’s certainly gonna be a lot going on. And we’re investing heavily in machine learning within AWS. And I’m seeing that what’s happening is at the AI layer, where there’s APIs where there’s uses for those that our customers are asking for. So there’s … Kendra is a new service we just released, for instance, for helping enterprises to be able to search the information in their enterprises.

So there’s these solutions that are coming out that help out with these specific use cases, and I see us continuing to invest in those so that our customers will have these types of solutions. Then there’s the middle tier, which is SageMaker, which will allow customers to innovate and to be able to do things that they hadn’t thought about perhaps a year ago. They could be on the leading edge, where there is no service for it. These are general tools and if we look at the types of technologies that we’ve been releasing for SageMaker, we had 75 new features and capabilities in 2019 alone for SageMaker. And some of the things we did is put in SageMaker Studio. So it’s almost like an IDE that a developer would be accustomed to. We did something called … we released Jupyter Labs, which is our Notebooks service, which provides notebooks and these notebooks scale up and scale down very quickly, taking away many of the challenges that people had before with notebooks.

And these are the environments that data scientists and developers use for these custom models. So while we’re doing lots of very specific applications at the AI layer, at SageMaker in the ML layer, we’re building out the tools so that data scientists and developers that want to work on general solutions at that layer will have the ability to continue to innovate. It will make it easier and easier and easier for them. For example, Coinbase, they were implementing fraud and compliance models. In doing that they’re able to substantially reduce the amount of time it took to train their models. I think the number was 20 hours down to 10 minutes. And so there’s just …

John Koetsier: That’s significant!

Larry Pizette: Yeah, there’s huge improvements that we’ve been able to do and provide for our customers in that area. And then at the lowest level of the stack we are providing the data scientists, the researchers, the expert practitioners, what we call the ‘AWS Deep Learning Amazon Machine Image’ that they can instantiate on machine learning compute instances. We call them our P2/P3 instances, and they have access to all of the frameworks that developers, researchers, data scientists want to use.

So we don’t tell them which one to use. We differentiate ourselves by that, by providing all of the solutions that they want to use so they can choose what they want to use so they have the right tool for the right job. So while you’ll continue to see more innovation at the AI layer with really some specific solutions, at the machine learning layer, you’ll see us continue to push out with much more capabilities to make it much faster for developers and data scientists to innovate.

And then of course, at the frameworks layer, which is the lowest level of our three tier stack, we’ll continue to provide all of the solutions and frameworks that those expert practitioners want to use. 

John Koetsier: Interesting, interesting. So let’s get specific. You built a ton of technology to help companies implement AI. We talked about a lot of it right now. Talk to me about some of the more innovative uses of machine learning that you can recall. Maybe some brands that we wouldn’t think of that are using AI? 

Larry Pizette: Well, one that I love which is really cool, is working with the NFL. So the NFL has been a great partner for working with for player health and safety. When we first started talking to them about player health and safety, we of course had a history with them, working with them, collaborating with them on Next Gen Stats, which provides excitement for their fans, their audience. And I love every time I see it on TV, I love seeing it. But player health and safety is a really novel way of looking at a novel use case for machine learning, and with the NFL being the experts in football and having vast troves of data, and AWS being experts in machine learning and cloud computing, we are able to come together and think of ways where we could use machine learning to help with understanding what’s going on with injuries, help reduce the occurrence of injuries, see where we can affect game rules, and then if injuries do happen where we can help with recovery and rehabilitation.

With that particular use case, we’re going to use computer vision to identify when helmets hit, when the collisions happen, we’ll look to help out with lower extremity injuries after we’ve worked on the part with the helmets. And there might be many things we can do there, such as game rules, the way the stadiums are put together, there’s lots that could be done. I think that’s just incredibly innovative and the NFL has just been a fantastic partner in coming up with this. And one last part is that we’re going to do simulations with them.

John Koetsier: Yes.

Larry Pizette: And we’ll be able to do simulations and be able to see what the effect of potential rules changes and other types of things like that are on the game.

John Koetsier: That’s super interesting. I mean, I wouldn’t have imagined the NFL and machine learning AI. It’s funny because I was talking to the CTO of NASCAR about half a year ago, and he was talking about how they were using AI and how they were incorporating it into their e-sports and a variety of other things. You’re seeing AI in a lot of different places right now in areas that you thought were very, very traditional. Maybe we’ll see it even as a referee, I don’t know, we’ve had so many controversies lately, especially in the Seattle area and others, so maybe we’ll see something like that. We know MLB is looking at robot umpires and I’m sure there’s some machine learning that’s going on in there. Any other brands that you can think of that, wow, they did something really cool?

Larry Pizette: Cerner’s a really cool use case, and we work with Cerner using Amazon SageMaker. Cerner’s the world’s largest healthcare IT company and they want to help their customers have better health, and help clinicians be more efficient. And using anonymized data, we’re able to look at patient history and we’re able to better predict the onset of congestive heart failure before … I think it was 15 months before clinical results were showing up. And I think that’s just super innovative and fantastic to be able to use machine learning to look at this data to be able to predict a health outcome like that. I think that’s just really, really amazing.

There’s been other use cases we’ve worked with in several other industries. For instance, Formosa Plastics is a large petrochemical company in Taiwan and they wanted to use machine learning to be able to improve quality and lower the costs of that. And we went through with the Machine Learning Solutions Lab, doing the ideation sessions, the proof of concept implementing for them, and they were able to reduce their labor cost by 50% for this and increase their quality. So there’s just so many different innovative uses. It’s hard for me to imagine any business areas where there wouldn’t be, including health care we’ve talked about before and we talked about sports. But it’s in manufacturing and restaurants, there’s just so many different ones, every industry is looking at ML now. 

John Koetsier: Super interesting that you mentioned healthcare because obviously that’s top of mind right now. Coronavirus is still spreading significantly. Healthcare costs in the US are skyrocketing, we’re wondering, hey, can we continue to pay for things. And having AI present in healthcare in some way, shape or form, maybe making it out to the fringes, to the edges as well on our personal devices, the wearables that we’re starting to wear. I know Amazon came out with a ring recently, kind of a tester product. You know, there’s so much that can be done there and so much that needs to be done there. So I look forward to seeing all that. 

Let’s talk a little bit about what implementing AI and machine learning in an organization does. You’ve seen a lot of organizations do what you just mentioned, a few of the results including one, they cut labor costs and other things like that, but what does it do in terms of speed of product development, speed of improvement of levels of service, those sorts of things?

Larry Pizette: Yeah, it’s fantastic, certainly for customers. We’ve had hundreds of thousands of customers doing machine learning on AWS and many of those have been to improve customer outcomes, customer service, and each of these customers are innovating in different ways.

And so there’s … the sports business is big, I think you mentioned the Seahawks a moment ago. I’m in the Boston area so I have to confess I am a Patriots fan, but the Seahawks are using machine learning … they’re using Amazon Rekognition, which is our video and image analysis service, to identify where players are on the field. And they’re using SageMaker in order to help out with quarterback performance and they’re going to be using it in the future with their video analytics platform in order to help out with making data-based decisions in the future.

And so with each of these areas, healthcare as you mentioned before, it’s a huge one, we’re doing a lot of work in healthcare with companies like Cerner, Celgene, Beth Israel Deaconess Medical Center near where I live in Boston. 

Really cool use case for machine learning is with Beth Israel Deaconess Medical Center, they wanted to make sure that they did a better job of getting patient consent data. And if a patient’s scheduled for surgery and there’s not proper consent, that could be extremely inefficient for the hospital.

So they used TensorFlow on AWS to be able to go through and read free text in faxed patient consent forms, and then to be able to access those within electronic health records which is of course better for the patient, but it’s also much more efficient for the hospital and clinicians and practitioners and physicians and such. So really cool uses. So there’s cool uses in terms of medical discovery right? But there’s really neat uses in terms of the operations in healthcare as well.

John Koetsier: Talk about new meets old. I’ve heard recently that faxing is still important in the healthcare system. You have faxes meeting AI, so that sounds wonderful.

One thing that’s interesting is a lot of solutions in AI right now are very narrow, they’re very specific. They’re vertical specific and not just in a vertical, it’s a specific thing, a process or task. In my future39 podcast, I talk a little bit more about broader artificial general intelligence, that sort of thing, but do you see what we have right now in AI that companies are implementing widening and generalizing over time? How’s that happening? 

Larry Pizette: So I do think there’s opportunity for that in machine learning. And a couple examples are our Personalize service and our forecasting service. So in some sense, personalization capabilities where we can help out with identifying recommendations and recommendation engines. You could view that as one capability, on the other hand it has broad capability across so many different use cases.

And so does forecasting. We have a forecast service and forecasting is needed across many, many, many different industries. And so these are API based services in our AI layer where customers do not need to have any machine learning knowledge in order to use these, but they can take advantage of this and incorporate technology that we’ve built for Amazon over the years. Use the approaches that we’ve built for Amazon over the years and incorporate that into their business.

So, going back to our conversation a little while ago, there will be some very specific use cases that we get from our customers where we will develop capabilities and services for them in our AI layer for example. We will keep building out general services in our ML layer, general capabilities so that our customers can innovate on top of AWS and keep providing at the lowest layer those capabilities for those researchers and deep practitioners that want to control every last bit of it. And so as we keep investing in machine learning you will start seeing more and more of that API layer where we call it our ‘AI layer’, where you can call machine learning capabilities through an API.

We’ll see more at the machine learning layer where we break down the barriers to help data scientists and software engineers that don’t necessarily have that machine learning background or they just want to be more efficient. We’ll help them with providing capabilities for them in those areas. So there will be some very specific services for some very specific capabilities for customers, but at the same time, when I look at Personalize, or personalization or recommendation, that’s just very, very broad and customers can implement that in so many different ways into their websites.

John Koetsier: Yeah, super interesting. I see some of that in MarTech as well, marketing technology where you see some AIs that are starting to come in that are not just telling you what might happen, or giving you insights on what is going on, but are actually controlling and allocating budget and spend between different channels, and doing something a little more general. So it’s quite interesting. Well, it’s been great chatting with you. Is there anything else that you wanted to say?

Larry Pizette: We are super excited about machine learning and I’m very excited about continuing to work with our customers at all three layers of our technology stack. And I really appreciate you having me on the show, it’s been a pleasure. I’ve enjoyed our conversations and thank you so much. 

John Koetsier: Well, thank you so much. It’s been a real pleasure as well. And for those listeners and viewers now and in the future, thank you for joining us on the AI show. Whatever platform you’re on, please like it, subscribe it, share or comment. And if you’re on the podcast later on, like it, rate it, review it, however you like. Thank you so much. Until next time … this is John Koetsier with the AI show.