Multi-Cloud Control Planes: The Cloudcast Interview Transcript
Our CEO Niall Dalton sat down with Brian Gracely of The Cloudcast to talk about Seaplane and the future of cloud computing. You can read the interview transcript below, or listen to the full podcast on The Cloudcast website.
Brian Gracely: There's always a couple of things that we really enjoy doing throughout the year. If you've listened to the show for a while, you know we love to do it year after year, which is to go out and find not only interesting new companies, companies that are getting started and are just starting to expand in the market, but also to look at people that kind of look at problems in a different way. People that look at, you know, some of the challenges that have been going on for a while and maybe haven't kind of found a sweet spot for being solved, and really are looking at things from a different perspective.
And so today we're going to do both those things! We get a chance to talk to a company that is fairly new, a couple of years old, but also looking at “how do we deploy applications on a global basis, make it easier for our developers, make it easier for operations teams.” And so I’m really excited to have Niall Dalton, who is co-founder of Seaplane, Seaplane IO. Niall, welcome to the show, great to have you on.
Niall Dalton: Thanks, Brian. Wonderful to be here.
Brian Gracely: We're going to dive into Seaplane in all sorts of wonderful detail, but before we get started, give us a little bit of your background. You're not only the founder of Seaplane, but you've been around a number of things as an entrepreneur and engineer. So give us a little of your background and then ultimately what led you to found Seaplane?
Niall Dalton: Sure. I'm one of those people that wandered around a little bit kind of up and down the stack. I usually describe myself as a systems technologist just to take the emphasis off any particular type of technology. Over the years just thinking about it, stack wise, working bottom up, I've done things like working at the hardware level. You know, as an architect working on say CPU cores or, you know, building very large scale ARM server clusters. So, building a custom ARM server chip and then building large scale clusters out of them, you might want to put a thousand of those in a rack and then some number of racks to get a really big cluster that you can carve up into lots of little ARM servers.
Kind of working bottom up from there (that makes me sound like a real hardware person), I've actually spent most of my career in software. Everything from system software around hypervisors to binary translators and JIT compilers; anything that essentially looks at a running program and tries to improve it. You can improve it in lots of ways, whether it's to make it more secure, make it run faster, make it run on a system it was never meant to run on. There're all these fun tricks we can play when we start thinking about, okay, someone's trying to build some application logic here; how do we make it run better on whatever the systems we're trying to map it onto? I spent a whole chunk of my career doing that.
Further up the stack again, [I did] a lot of work on scale-out analytics of various sorts, but always with an emphasis on low-latency decision-making. Take a modest scale system, maybe it's just a couple of thousand machines or something, and you're trying to keep every piece of financial data in the world in memory for today and make some decision-making in real time, say pre or post trades or something like this. How do you make it really productive for people that are domain experts to make use of that kind of a system? Again, trying to let people concentrate on their application logic and then map it onto a real-time decision-making system.
I should probably point out I spent a chunk of time building applications myself rather than just being an infrastructure person building the layers beneath them. That could be anything — I spent a while in high-frequency trading, some time working on travel lifestyle search systems many years ago for a company called lastminute.com over in Europe, and a while building cloud-scale storage systems for companies like Upthere. So kind of up and down the stack.
If that sounds like “hold on, he just wandered around quite a bit!” Yeah, that's exactly what happened. All this kind of “what's changing in the world and how do we help people consume these crazy systems that we can build?”
Brian Gracely: Yeah, you're sort of the unique combination of a full stack engineer in the truest sense. And then combine that with a kind of a curious wanderer’s mindset and it leads you down a lot of different roads. I know that a lot of folks that we get to talk to on the show have been maybe not completely down that path, but you know, their path to get to where they are was not necessarily a straight line. It's always interesting to find out how people evolved to where they were. So all that background, all those, you know, zigs and zags and things, how did that get you to sort of say “hey, there's a problem space out here for this company. I'm going to start Seaplane.io to go after it”?
Niall Dalton: Totally. So, I guess it's not too shocking to hear that we didn't originally set out to solve this kind of a problem. When I was sitting around with some friends, figuring out what are the cool things we could build to really help people, we originally thought we would be cloud consumers. There was a whole bunch of interesting problems to solve where we'd consume this amazing cloud capacity out there and go solve some problems. But, you know, being an infrastructure person looking bottom up you kind of understand how all of the pieces work under the covers, right? So you can kind of look at somebody’s abstractions and understand that, okay, I see what you're doing with the networking there.
I've worked in software defined networks, I've worked on networking hardware and I’ve built out interesting networks — same on storage, same on compute — and so when I look at some of the infrastructure, sometimes you can kind of see that stuff bleeding through. You kind of look at the cloud infrastructure and you start thinking to yourself “okay, I can tell a surprising amount about how you do some of these things, and the organization behind this, just from looking at some of these abstractions.”
But when you start looking at it more from the opposite point of view to say “actually, I just want to build this application” you start getting a little bit more frustrated with some of those things, right? There're a lot of paper cuts. There're a lot of rough edges that you run into and an amazing amount of complexity that kind of comes out at you.
We were talking with lots of folks who are working at the application level and helping us understand how we might address their problems and it kept coming back to the same set of challenges where the cloud was making everything possible for them, but nothing was particularly easy. And the more successful they started to become, the harder it got. The complexity just exploded in their faces as they try to scale out a little bit. It's fun and games when you're just deploying in a single zone, and there're lots of really nice ways to do things, but as you start going multi-zone, or multi-region, or multi-geo trying to do your best to serve your end users, suddenly all of this complexity screams at you and makes life hard.
So what we figured out essentially was “hey, there's an interesting problem to try to solve, how do we take all of this amazing stuff and really make it consumable for people who are building applications and trying to scale out?” That was basically the genesis of Seaplane, just kind of stepping back and going, “okay, what would it take to let most people, most application developers, concentrate on building their application and then have something else do the heavy lifting behind the scenes to help them deliver that to their global audience?”
Brian Gracely: We've covered this space and a lot of aspects of this space for a number of years, and I feel like there's been a trend for the last, I don’t know, four or five years, where a set of people said “we need to engage more than just sort of hardcore developers so we're going to create technologies like low code.” Those things have ebbed and flowed. Then on the other side, there're folks who said “well, I'd like to deploy some applications, but I don't really want any operations at all so I'm going to create this thing called serverless and that'll make things easy because there're no ops to it.”
But I feel like there's a gap there, which says low code is sort of for a niche set of applications, and serverless has its own niche, but if I really just wish that, like you said, there were fewer paper cuts, there was less running with scissors, there was less complexity to deploying applications—
Especially when you start saying “hey, every time there's a cloud outage you see companies that should be more intelligent have outages” and you go "boy, people haven't quite figured out how to do multi-cloud, how to do multi-region, how to do multi-zone types of deployments." It feels like you guys are looking at that piece of the puzzle, that piece of the challenge, and really trying to take that head on.
Niall Dalton: Yeah, exactly right. I'm personally a huge fan of the whole low code movement, for example, because there should be as many people as possible able to build things. That's just an article of faith. When you think about something like Excel, just imagine the number of people that could effectively do application development without really thinking of themselves as developers in that sense. You can put something together.
So as you say the serverless thing is a partial answer where I think there're a whole bunch of upsides. Where you don't really have to think in as much detail about, okay, where is my code running? Like, where is it running? When is it running? Why is it running at that time for which user? All of that stuff gets taken care of for you.
There's a lot of complexity that's under the covers that you're being protected from, but the trade-off that you're making with these services is that you have to kind of rethink how you're developing this application, right? So for the huge numbers of people out there who are working in more traditional development environments, or have existing bits of software, or are building complex applications out of a whole bunch of existing pieces of software, it's really, really hard for them to be able to take full advantage of that serverless world.
Even though they can kind of look on with envy to think, okay, why do I have to micromanage servers? Why do I have to micromanage where and when these things are running? Why do I have to micromanage the wiring if I want to put two regions together and have some computation that's happening in the European union with data that must live there, but I've got pieces that need to interact and so on. And suddenly they find themselves spending way too much time micromanaging all of the interconnections there and the different services you have to layer up to make that work.
So if you think about “okay, how can we deliver a similar upside without making them micromanage that stuff,” you start thinking about…why do you have to think about where the load balancers are? Why do you have to think about these gateways? Why do you have to think about these wires? Like, is it actually essential, or is it essentially an accident of how the clouds have been kind of built bottom-up? To me it starts looking an awful lot more like it's just an accident of history where people have gotten bogged down in, okay, why do I have to think about “I've got this geo DNS thing going on at the top level, then that gets you to some kind of regional load balancer, or maybe I've got some anycast thing going on to optimize where I'm arriving as a client,” like, why am I having to do all of that stuff?
Effectively what's missing, and you can tell we enjoy cheesy puns around here, from our point of view what's missing is the control plane, the “C” plane that says okay, you're trying to run this application. You're trying to get it to run where and when it's needed in response to your traffic. You're trying to get it to respect some business rules that you have about, you know, “I want to deploy this thing in the EU or make sure this data side is geo-locked to the EU.” You should be able to do that, express your intent to some system, but then have the system itself take care of all of that assembly of the underlying infrastructure for you, dealing with the failures, dealing with edge optimizations, whatever it is that's not essential for your application.
It's really just micromanaging the infrastructure. And that felt like the wrong layering, essentially. It felt like, okay, maybe we shouldn't be in this situation as a typical application developer.
Brian Gracely: Cool, cool. So let's say I'm a developer. I'm listening to this and I go yeah, I want that world Niall just talked about where I wrote my code, I sort of know what the business logic looks like, and I can kind of explain it in business sense. I want it to be available, here're some constraints I have. What does my interaction now look like with Seaplane in terms of saying “it's day one, I want to go deploy my application, what do I have to do, what do you guys do” kind of walk me through what that looks like.
Niall Dalton: Sure. So we can jump into the middle on day one, but I think there's even a little bit upstream on day zero where you're really just thinking about the design and figuring out, okay, what do I have to take responsibility for before I go deploy? I think even there, what I see with a lot of application development teams is, cloud architecture stuff starts to infect their design in ways that it really shouldn't. I think people really should just be concentrating on their core application logic, containerizing it as usual, and then as they head towards the deployment stage—
Right, so it's day one. We're trying to push that thing out there. Really from first principles what would we imagine the right answer would be? You've got your CI/CD pipelines, you want to do some scripting, whatever it is. So I think the right answer is: you're going to want an API. Let's say a REST API. You're also going to want some SDKs, and some command line tooling built around that API for you to be able to express “hey, I want to run this collection of containers.”
Right? You know, my application is made up of a few different types of containers. They should be able to scale independently as they need to, depending on end user traffic, and I just want to be able to express to you that here's a collection of things — let's call it a formation of containers — and that formation internally should be able to balance how many replicas you're running of these things, where and when those things are running, and I just want to be able to push that to you to say, “hey, I've built a container, it's coming out of my CI/CD pipeline” or however you choose to do it. And I just want to tell you, “hey, can you take responsibility for running this and delivering it to my end user?”
So from first principles I think that's the right thing you want to do. That's exactly what we've gone and implemented. Here's a nice, small REST API that lets you express that intent and then we take responsibility for running and scaling that thing. Now by default your traffic could be coming from anywhere and we will globally make the right decisions about where and when to run your application. Whether that's all the way out in some edge locations in our edge network, or whether it's a little further back in regional metal, or inside the centralized public cloud, we'll take responsibility for that.
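To make the idea of "expressing intent" concrete, here is a minimal sketch of what a formation of independently scaling containers might look like as a request body. Every name in it (`Container`, `Formation`, the field names, the `registry.example.com` image references) is a hypothetical illustration and is not taken from Seaplane's actual API.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Container:
    # Hypothetical fields: an image reference plus replica bounds, so each
    # container type can scale independently of the others.
    name: str
    image: str
    min_replicas: int = 1
    max_replicas: int = 10


@dataclass
class Formation:
    # A named collection of containers that together make up one application.
    name: str
    containers: list[Container] = field(default_factory=list)

    def to_request_body(self) -> str:
        """Serialize the declared intent as JSON for an (assumed) REST call."""
        return json.dumps(asdict(self), sort_keys=True)


formation = Formation(
    name="storefront",
    containers=[
        Container(name="web", image="registry.example.com/web:1.4.2", max_replicas=50),
        Container(name="checkout", image="registry.example.com/checkout:2.0.1"),
    ],
)

# This string is what a CI/CD pipeline would hand to the control plane.
body = formation.to_request_body()
```

The payload says what should run and within which bounds, and nothing about load balancers, gateways, or regions. Those decisions are left to whatever system receives it.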
So then the next thing is, you probably have some other opinions. Like, I use the European Union as the running example. You might have some opinions about what you want to do there. And so you might want to layer in a couple of business rules, right? It's completely up to you. Just say, “hey, let's do something slightly different when we're running this collection of containers. This formation of containers should be geo-fenced in North America” for whatever business reason. Or “hey we're over in Australia. I really want to use Azure as some backing capacity there. I have a little business rule around that, so hey Seaplane, use Seaplane, but when you need cloud capacity, it's good to have Azure in the mix in that particular deployment.”
Really we want to keep you at the level where you're thinking in terms of your application, your business needs, without having to worry “okay, what does that turn into” and all kinds of complicated wiring under the covers.
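The two kinds of business rule mentioned here, a geo-fence and a provider preference, can be sketched as a filter and a sort over candidate locations. This is an illustration under assumed names (`Location`, `PlacementRules`, `eligible`, and the provider and region labels are all hypothetical), not a description of how Seaplane's scheduler is implemented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Location:
    provider: str  # illustrative labels such as "aws", "azure", "edge"
    region: str    # coarse geography such as "na", "eu", "au"


@dataclass
class PlacementRules:
    # None means "no geo-fence": any region is allowed.
    allowed_regions: Optional[set[str]] = None
    # Soft preference: region -> provider that should be tried first there.
    preferred_provider: dict[str, str] = field(default_factory=dict)


def eligible(locations: list[Location], rules: PlacementRules) -> list[Location]:
    """Drop locations outside the geo-fence, then put preferred providers first."""
    fenced = [
        loc for loc in locations
        if rules.allowed_regions is None or loc.region in rules.allowed_regions
    ]

    def rank(loc: Location) -> int:
        return 0 if rules.preferred_provider.get(loc.region) == loc.provider else 1

    # sorted() is stable, so non-preferred locations keep their original order.
    return sorted(fenced, key=rank)


candidates = [
    Location("aws", "na"),
    Location("gcp", "eu"),
    Location("aws", "au"),
    Location("azure", "au"),
    Location("edge", "na"),
]
rules = PlacementRules(allowed_regions={"na", "au"}, preferred_provider={"au": "azure"})
result = eligible(candidates, rules)
```

The geo-fence is a hard constraint (the EU location is removed), while the Azure-in-Australia rule only changes ordering, which matches the "good to have Azure in the mix" phrasing.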
Brian Gracely: That makes a lot of sense and obviously the idea that, again, you're sort of abstracting or taking away having to think about a lot of that behind the scenes capacity and complexity is awesome. Day zero, we're thinking the right way in terms of the application. Day one, we interact with the API, and the Seaplane control plane gets it out there, takes care of things. Walk me through a little bit about what the control plane is doing on day two. I log in one morning, I go to Twitter, and a whole bunch of people are going, “hey is AWS us-east down?” Or “hey is something going on, something’s wrong somewhere on the cloud.” What is your system doing then to sort of say, “hey I've detected an outage. This outage may impact you in some way.” There's some behind the scenes stuff that you guys do that sort of makes sure that my life doesn't have to be as concerned about that. Right?
Niall Dalton: Yeah, totally. There's the traditional kind of day two concerns around, you know, maintaining these deployments, doing the monitoring, doing the optimization, dealing with all these failures. That's the core job of the control plane, actually. So one way to think about it is, even though I kind of describe it as “on day one, you're doing this deployment” or whatever, the best intuition I could give you about that is, that's like introducing an application that isn't running as well as it should be into the system on day one.
So from the control plane’s point of view, what it's really realizing is “oh, there's this user, they have the intent that this application should be running in this nicely, continuously optimized way that doesn't get affected by failures or whatever for availability reasons.” And so suddenly the control plane, what it actually realizes is, it has a little bit of an “oh crap” moment. It kind of says, “oh, hold on a second, they’ve expressed this intent. I'm noticing that thing isn't running the right way.” So it's going to start deploying replicas; it’s going to start sensing the traffic and figuring out what to do.
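The "day one is just an application that isn't running as it should" intuition is the familiar reconcile pattern: compare declared intent with what is observed and emit the difference. A minimal sketch follows, assuming replica counts keyed by container name. The function name `reconcile` and the data shapes are illustrative and not Seaplane's.

```python
def reconcile(desired: dict[str, int], observed: dict[str, int]) -> dict[str, int]:
    """Return the replica change needed per container.

    Positive numbers mean "start this many", negative mean "stop this many".
    Containers already at their desired count are omitted.
    """
    actions: dict[str, int] = {}
    for name in desired.keys() | observed.keys():
        delta = desired.get(name, 0) - observed.get(name, 0)
        if delta != 0:
            actions[name] = delta
    return actions


intent = {"web": 3, "checkout": 2}

# Day one: nothing is running yet, so everything is "wrong" and must be started.
day_one = reconcile(intent, {})

# Day two: a failure took out two web replicas; the same function repairs it.
day_two = reconcile(intent, {"web": 1, "checkout": 2})

# Steady state: observed matches intent, so there is nothing to do.
steady = reconcile(intent, intent)
```

The same code path handles the first deployment and later failures, which is why the control plane does not need a separate "deploy" mode.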
On day two, it's effectively still in that mode of “I'm continuously monitoring this. I'm continuously optimizing this because the developer has still expressed this intent that this application should be running and I'm responsible for it now.” So the control plane is kind of looking at a whole bunch of different signals. You mentioned we’ll have our annual AWS outage when the haunted zone goes down, and what we will end up noticing from the Seaplane side is the control plane will see: okay, I'm continuously monitoring six or seven different clouds. I've got all these regional bare metal things. I've got all of these edge deployments. I'm continuously sensing where, and when there's a problem.
Now, to be honest, a hard failure is actually the easier case. Like if you suddenly lose a whole chunk of capacity, like someone yanked the power cable from a data center, that's actually the easier case to spot and react to. The uglier version of it is when things start flapping around, or you have some kind of gray failure or something where it's not clear exactly what's happening. You just know that you're affecting the operation of the application. But no matter whether you're in a hard stop failure mode or a more gray mode, the control plane sees that as normal life.
In other words, I have some partial failure in some infrastructure globally and what I'm doing is just adjusting continuously the scheduling of these things, right? So that's one set of signals just based on the application, the infrastructure that it needs, and the failures we're observing. Then on the other side of it, it's kind of continuously sensing, “okay, what's the demand for this application? Is it heavy in Australia? Is it heavy in the U.S. right now? And where should I be adjusting that deployment?”
So you get this kind of continuous monitoring, continuous optimization, and that is the core operation of the underlying control plane. It’s to keep you alive, keep you optimized, and it does it with this continuous monitoring of many different signals.
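The two signal families described here, infrastructure health and end-user demand, can be combined in a small scheduling sketch. It assumes health is judged from a window of recent probe results (so a flapping, "gray" location is excluded just like a dead one) and that replicas are split in proportion to demand using largest-remainder rounding. The threshold, function names, and data shapes are all assumptions made for illustration.

```python
def healthy(probe_results: list[bool], min_success_rate: float = 0.9) -> bool:
    """Usable only if enough recent probes succeeded; no data counts as unhealthy."""
    if not probe_results:
        return False
    return sum(probe_results) / len(probe_results) >= min_success_rate


def allocate(
    total_replicas: int,
    demand: dict[str, float],
    probes: dict[str, list[bool]],
) -> dict[str, int]:
    """Split replicas across healthy locations in proportion to their demand."""
    usable = {loc: d for loc, d in demand.items() if healthy(probes.get(loc, []))}
    total_demand = sum(usable.values())
    if total_demand == 0:
        return {}
    shares = {loc: total_replicas * d / total_demand for loc, d in usable.items()}
    plan = {loc: int(share) for loc, share in shares.items()}
    # Hand out replicas lost to rounding, largest fractional remainder first.
    leftover = total_replicas - sum(plan.values())
    by_remainder = sorted(shares, key=lambda loc: shares[loc] - plan[loc], reverse=True)
    for loc in by_remainder[:leftover]:
        plan[loc] += 1
    return plan


demand = {"au": 600.0, "us": 300.0, "eu": 100.0}   # requests/sec, made-up numbers
probes = {
    "au": [True] * 10,
    "us": [True, False] * 5,   # flapping: half the probes fail
    "eu": [True] * 10,
}
plan = allocate(7, demand, probes)
```

Running `allocate` again whenever probes or demand change gives the continuous adjustment described above; a hard outage and a gray failure both simply show up as a location dropping out of the usable set.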
Brian Gracely: Interesting. And with Seaplane, you guys take care of all that. It’s a SaaS offering, so you handle all of that for people, correct?
Niall Dalton: Exactly, exactly. You just push the application to us and then that's what our control plane does continuously for you. So you get a lot of things that kind of pop out almost for free from our control plane that does that for you. Because it starts spreading your risk in ways that are generally, for most application teams, way too hard. Imagine you want to spread your risk by using two different public clouds. For a lot of people, that's just too hard.
Brian Gracely: Right, right. We've seen things in the past, for a lot of our listeners they lived through sort of VMware and things like VMotion and stuff that happened that was pretty amazing technology at the time. It had some options where it said, “look, by default we can take care of all this stuff for you and we'll monitor things and we'll move things around.” And for a lot of people that was like “whoa, hold on a second I don't know how much control I want to give up!”
Now we've evolved to 2022, people have seen a lot of things. How much do you guys do automatically? How much do you still have to say “there are some interesting changes that could happen here, some optimizations. Mr. and Mrs. Customer, do you want to do that?” Is it still kind of a push and pull sort of thing? Is everything automatic? How's the sort of ongoing interaction between you and your customers?
Niall Dalton: It defaults to fully automatic. That's the least friction path. You kind of come in and say “look, do whatever you think best. You know what my budget looks like, you know what my traffic looks like, you know what you're coping with in terms of infrastructure — do whatever the scheduler thinks best.”
As you start thinking about “I don't want to give up all of this control” I think the right trade-off is you don't want to have to micromanage the infrastructure. What you really want to do is have policies that let you express the controls that you want to enforce. I think the right thing to do for our customers is to say we're going to be transparent about the fact that it's not Seaplane or these clouds. We're not trying to convince you that we've got special infrastructure that's unique to us and it doesn't use any of the edge locations or bare metal or public clouds or anything.
We're very upfront about the fact that we're a control plane that's delivering you across that infrastructure. So if you want to come in and express opinions like “hey it really should be backed by AWS over in Europe, but for business reasons it's GCP over in the U.S.” then you should be able to express that. So we support those types of policies from a geo perspective, from a provider perspective, and so on.
If you wanted to, you could actually get super specific and micromanage that — “I'm going to enforce that you're only allowed to have three replicas of this container, and I'm going to enforce that it’s on AWS, and I'm going to enforce that it's in Germany.” Like, okay, that's a very restrictive way to use the control plane, but the control plane will kind of shrug and go “okay, we could do that” and do the deployment. So you can get very specific, but you don't have to.
Brian Gracely: Well if you've worked with enough different companies and enough groups, you sort of realize there's always sort of the best practice that makes sense on whiteboards and PowerPoint presentations then you realize…sometimes there's just a reason why they want to do stuff. Sometimes it’s personal, sometimes it's technical, and sometimes it's financial.
We’re not going to have enough time to go into all the details of what you do; folks can obviously go out to the website, and we'll put links in the show notes for doing that. What are some of the things that you guys are thinking about, that are maybe kind of coming down the pipe in terms of how the control plane can expand? What are some of the interesting things that you're seeing in the market where you're going, “ah, we can do new and interesting things, or do more than we're doing today”?
Niall Dalton: Sure! I think it always comes back to data. So we spent a good chunk of time talking about compute, but everything that we've been talking about with compute really should apply to data as well. Once you make, say, a multi-region deployment of compute super easy the real questions start to percolate up about, okay, that's great, it makes perfect sense that my compute’s over here—where's the data that thing is accessing? Can I apply the same kinds of rules to that data location, or which provider is holding that data, or whether it's even allowed inside the clouds or whether I need Seaplane to keep it all at the edge for me, or whatever the design that makes sense for you.
I think the future for Seaplane is really bringing those same benefits across the kind of “most needed” data services that any modern or cloud native application really needs. You can imagine doing the same kind of thing for SQL service. I want to be able to set up some tables and then do my usual SQL queries over these tables, but exactly the same set of concerns come up, right?
I don't want to have a single database server that is sitting behind this “wonderfully autoscales, I don't have to worry about anything” compute layer, but now I've got this database server sitting behind it and all the same concerns come up. Like, is it in the right place or not? Is it available or not? Those types of questions. So you really want to apply exactly that same application-oriented, higher level view to say what you really want is a database service. You know “service” rather than “serverless” just to put the emphasis on you want this thing to be always up and available and responding not necessarily booting up just because you have an access.
You want this global service and you want those same autoscaling, auto-piloting features. I think over time from Seaplane you'll see us apply the kind of secret sauce behind the scenes in the control plane to the most commonly needed data services, whether it's SQL or object storage or various types of messaging, queuing and so on that you need.
Now that said, of course there's a long tail of a lot of those services and there're also incredibly great services out there from other vendors that specialize in certain types of data services. So just like we always say “it’s Seaplane plus those clouds” it's going to be “Seaplane plus these other data things.” Even though we can do data services natively, we will make sure that our customers can always interact with whichever data services they choose in the most optimized way. Whether that's something they're running privately themselves inside some VPC in a cloud, or whether it's a cloud service from another vendor.
But really the goal for us is to say for the most typical kind of modern applications that are getting built for most of the market, there's an obvious set of things that they need. And people need it to respond in that same way and be kind of co-optimized so that you don't have dueling schedulers and you don't have things like “hey my compute is doing the right thing, except the data layer is doing the opposite thing and these things don't talk properly.” You never want people in that situation.
Brian Gracely: Good stuff. It’s good to hear that you guys are taking the lessons you've learned over the last couple of years about how to build these models and you're going to start applying them to not only compute, but data and probably other things down the road. Very, very cool. Niall thank you so much. It's like I said, we love being able to dig in to not only get to know new and emerging companies, but also people that are kind of thinking about problems that are pretty common to everybody, but really just haven't found a great solution yet. So thank you for helping us dig into Seaplane and learning how you guys deliver this multi-cloud control plane.
If folks want to reach out to you or engage with your team, what's the best way to do that?
Niall Dalton: You can find us on Twitter @seaplane_io, on LinkedIn at Seaplane, or just go to our website and email us via the forms there. We love to talk with people. Obviously we're very user-driven so any kind of interesting applications, or if people think “hey, you can't deliver that type of a platform for my type of application!” We'd love to talk to you and dig in a little bit and see if that's something we can help with.
Brian Gracely: Good stuff. And you guys are globally based so this is applicable to any of our listeners around the world. Thank you again for the time on behalf of Aaron and myself.
Some quotes in this transcript have been edited for grammar and clarity.