Brief summary
Nearly ten years after the first edition of Infrastructure as Code was published by O'Reilly, Kief Morris is publishing a third edition of the book. But why a new edition now? What's changed in technology and business over the last decade?
Quite a lot, as it happens. To talk about what's new — both in the infrastructure world and in the book itself — Kief Morris joins host Ken Mugrage on the Technology Podcast. They discuss each edition and what's new in this one, and dive into the infrastructure challenges and issues that need to be tackled in 2025, from tooling and deployment to maintenance and infrastructure evolution.
Learn more about the third edition of Infrastructure as Code.
Episode transcript
Ken Mugrage: Hi, everybody. Welcome to another edition of the Thoughtworks Technology Podcast. My name is Ken Mugrage. I'm one of your regular hosts. I have with me today, Kief Morris. Kief, you want to introduce yourself?
Kief Morris: Yes, thanks, Ken. I am a distinguished engineer for Thoughtworks around infrastructure engineering in particular. I've been with Thoughtworks for about 15 years now. I've basically been working with infrastructure, automation, infrastructure as code for, gosh, probably 25 years now since before it was called that. Yes, that's my rough background.
Ken: We're here specifically to talk about a book you wrote called Infrastructure as Code. We're actually on the third edition right now. When was the first edition published? When did it first come out?
Kief: The first edition came out in 2016. The second edition was in 2020, just in time for lockdown. That was a little bit difficult in terms of then getting out and talking to people so much. Third edition is coming out next month as we talk now. That'll be March, towards the end of March and early April, depending on which edition and where you are.
Ken: Cool. I guess first off, just what is the importance of infrastructure as code? Why should people be paying attention to this as an area, other than because of your awesome book?
Kief: [chuckles] I guess a lot of it is around, as you get a lot of infrastructure and especially as you're using cloud, you need to be able to manage it in a way that keeps it consistent and keeps it well managed. How do you make sure that stuff is updated with the latest versions and kept patched? How do you make sure that you have consistency across environments, especially when you're doing things like your software delivery path to production? How do you make sure that when you deploy software in one environment, it's going to work the same way in the next? Code is really, as of now at least, the most effective way we have of making sure that we can define what we want and then reuse that and reapply it and change it in a really manageable way.
Ken: Then just at a high level for our listeners, how do you define infrastructure?
Kief: That's a good one because it's like a platform where it means different things to different people. To a lot of people, it means the stuff that I don't build that I have available to me to deploy on or whatever, or to build on. For the purposes of infrastructure as code, I define infrastructure as the resources that are provided by an infrastructure as a service platform like your AWS, Azure, and so on, or data center. It's low-level resources that you need to assemble and put together in order to build and provide higher-level stuff.
When you think about, say, engineering platforms and a lot of the stuff like that, that's out there these days, I think of that as a layer on top of using your infrastructure to build your platform. That is what, say, developers use to build and people use it to test and deploy and run software on top of. I see three layers, really.
Ken: Cool. Since the first edition, or actually, I guess, since the second edition came out, what should people, if they've already read that edition, do they still get the third one? Is there enough different? What's changed in four years, five years?
Kief: I think a lot has changed. It's funny because when O'Reilly approached me to say, "I think it'd be a good time to do a new edition," at first I was thinking, well, it didn't feel like as much had changed as it did between the first and second. The first edition focused a lot on servers and how to build and configure servers: things like Chef and Puppet and that level of stuff. Things like Terraform and CloudFormation were out, and they were in the first edition of the book, but they weren't as much of the focus.
The second edition then moved a lot more into-- less around servers, although again, that's still in there, and more on how do you provision cloud infrastructure and make that work. I wasn't thinking there would be as much to do in this one other than some tidying up and maybe presenting things more nicely, since I'd obviously learned stuff in the years in between about how to talk about these things. I found that there's actually a fair number of things.
I would say the third edition has three sections, and each of those has an area that I would say is fairly new or has a much stronger emphasis now in the third edition. In the foundational section, I've always talked about things like how change is important. It's important to be able to manage change to your infrastructure reliably, and that's one of the fundamental purposes of it. In the third edition, I talk a fair bit more around thinking about the business value.
In other words, not thinking of infrastructure as something that's generic that you lay out and then it's up to somebody else to worry about how to use it in your organization, but thinking about how do we make sure that we are building our infrastructure and managing it in a way that makes it easy for the business to grow in whatever way it needs to grow, and to manage costs, especially these days. I think one of the big things that's changed in the industry is how much more focus there is now, with the changes in the economy and everything else, around, we need to consolidate.
Up until, say, 2020 and even '21 or so, a lot of the emphasis was on, we need to build stuff and digitize and do all that stuff and get out there. A lot of stuff was built willy-nilly. Now it's a lot more, we got to consolidate. We got to figure out how to manage things. It's thinking a lot more around what are the business things that you need to be able to support and then what does that mean in terms of infrastructure and how should you think about your infrastructure. That's one big thing. Then the second section is really around design of infrastructure.
The thing that struck me as I was working on it is, so the second edition focused a lot on stacks. That is the idea of what is the deployable unit of infrastructure? How should you structure that? How should you break that up? If it gets too big, how do you integrate them and so on? Again, the third edition has a lot about that because that's still really important. Something that I think is getting more emphasis these days, and if you look at tool vendors and so on, it's that level above that of how do you put stuff together?
I've talked about the component model for infrastructure. I talk about modules and libraries at one level, stacks at another level, and now infrastructure compositions, being how do you put those things together and manage them as groups. You see a lot more tooling out there, I think, things like what they call the TACOS tooling, the Terraform automation control, something like that. Sorry, I can't remember what all the letters stand for, but it's these tools and services that are out there to deploy your infrastructure.
Those are focused a lot on how do you assemble this stuff? In the old days, and it's still what a lot of people are doing, it was writing really custom scripts to put that infrastructure together and manage that orchestration and composition of infrastructure. I think we're moving towards realizing that we can't all-- so many infrastructure teams spend all their time wrestling with that stuff with their custom scripts. There's got to be more standardization of how to do that. I talk more about that.
Ken: I'm curious on that topic. We always talk about do tools enable things, or do they define things? The tools in this space, if you buy tool A, do you need to do things in tool A's way?
Kief: That's a big issue because I'm thinking something I do touch on a bit in the book is-- I talk a lot about, there are different patterns and practices and ways of doing things, different ways, not just like, "Here's the one way to do it." That's not what my book is about. It's about here's different ways that are out there. I do show my opinions on like, "I think some ways are better than others." Most of the tools that are out there tend to have their opinions about this is how you should organize your stuff.
Also, the third thing that's new in the book is very related, which is around how do you deploy infrastructure? Again, these tools and services tend to do a mix of two things, depending on the tool. One is how do you structure infrastructure code and orchestrate it? Then how do you trigger and manage deployments of your infrastructure to different environments and so on? Again, they're all opinionated on that. I don't feel like as an industry, we have settled on the right way to do it. I think most of the tools tend to work around the limitations of the tools they work with.
Terraform is the 800-pound gorilla in the space: probably the majority of users tend to be using it, and even for those that don't, Terraform tends to influence thinking around these kinds of things. I think that these tools, the way they are right now, are often working around the limitations of the design decisions made in Terraform and other tools. This is a long meandering answer to your question, but, yes, these tools all have their opinions and you do end up having to build your stuff around what their opinions are. I think one of the risks with all of this is that I don't think it's settled.
We don't know which of these tools is going to end up still being around in a few years' time and which approaches are going to win out. I think as a cautionary thing, you need to have a really clear eye on how you're using these tools and being ready to adapt and be ready to move as things change.
Ken: Cool. Thanks. That was the first two areas of change. What's the third part of the book?
Kief: The third was that deployment. I snuck it in as I was talking around that. There's things like, can people apply concepts of GitOps to infrastructure as code? People talk about infrastructure as data, which is basically GitOps for infrastructure as code, where you have your infrastructure either within, say, a Kubernetes cluster or something like that, or a similar tool that runs a control loop. Basically, rather than me running a command, terraform apply, to push changes out to the environment, you put your code somewhere and then something sees how the code has changed and applies that to the environment. Maybe it even does a continuous reconciliation on that.
Again, this is one of those areas where there are a lot of different ways to do it and it's not necessarily all that mature, but I think there is this overall thing of, we need to think more about how we deploy infrastructure code, especially if you componentize it. Thinking about it much more in a pull way is more of an emergent thing than a dominant thing in the industry right now. I also talk about application-driven infrastructure design and deployment. This is where you think about, so what does my application need, as an application developer?
How can an application developer specify the infrastructure they need? Then when the application gets deployed, the right infrastructure is automatically provisioned for the application, rather than the old-school way, which is where your infrastructure team builds an environment as a big single thing. There it is. Now application developers can go and figure out how to get their applications running in it. It's much more, I guess, vertical thinking, you could look at it that way. I think that's one of the big emerging trends.
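Editor's sketch: the pull-style control loop described in this answer can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not how any particular GitOps tool works: `Environment` and `reconcile` are hypothetical names, and an in-memory dictionary stands in for both the code repository and the real cloud environment.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """Hypothetical stand-in for a real environment: resource name -> config."""
    resources: dict = field(default_factory=dict)


def reconcile(desired: dict, env: Environment) -> list:
    """Make env match the desired state and return the actions taken."""
    actions = []
    # Create anything that is missing and update anything that has drifted.
    for name, config in desired.items():
        if name not in env.resources:
            env.resources[name] = dict(config)
            actions.append(f"create {name}")
        elif env.resources[name] != config:
            env.resources[name] = dict(config)
            actions.append(f"update {name}")
    # Remove anything that is no longer declared.
    for name in list(env.resources):
        if name not in desired:
            del env.resources[name]
            actions.append(f"delete {name}")
    return actions


# Desired state as it might be read from a repository (assumed data).
desired = {"network": {"cidr": "10.0.0.0/16"}, "queue": {"size": 10}}
env = Environment(resources={"queue": {"size": 5}, "old-bucket": {}})

print(reconcile(desired, env))  # ['create network', 'update queue', 'delete old-bucket']
print(reconcile(desired, env))  # [] because nothing has changed since the last pass
```

A real controller would call `reconcile` on a schedule or whenever the repository changes; the second call returning an empty list is what continuous reconciliation looks like when the environment already matches the code.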
Ken: Something I'm hearing, and please correct me if I'm wrong, because I know at Thoughtworks we've been talking big A agile for literally decades and that sort of thing, is that just in time isn't good enough for this, that they really do have to plan this and think about it. Am I accurate on that? How does that--
Kief: Just in time and planning for which?
Ken: For infrastructure in general, if you're working on your cat meme app, I can say, here's the user stories that are important, this sprint or whatever, and work on those. I hear you talking about much more deliberate planning with infrastructure as code and thinking about long-term effects and your decisions and that sort of thing. How do you work that into an agile way of working?
Kief: I think the key is, again, it's thinking about those applications or your cat meme application or whatever, what does it need? Especially if you're building that in an agile way, in an incremental way, okay, what do the first iterations of that need? I think this is where it really helps when you have good communication going, where either the same team has got the people building the applications and infrastructure, or if it is separate teams, they're communicating a lot, to where it's like, "Okay, the first iteration, maybe we don't need database storage. We're just going to show how it works and what have you."
You don't have to build all the infrastructure upfront that you're going to need at the end, but what's the infrastructure we need right now for the first iteration of the software? Then even where you do start building things like say a database or what have you, maybe you can increase the fidelity as you go. One example I like to use is monitoring systems. You can say, "We're going to deploy and run our own Prometheus clusters, and we're going to have our ELK stack or what have you for log management and all these kinds of groovy things." It's like, "That's a lot of work to build. Maybe the first iteration, we just use what comes out from the cloud provider. We use CloudWatch and things like that. Then iterate over time and build that up."
I think that's part of how you get the agile approach to it. You don't have to think about the full infrastructure and have it all fully designed and built before you can even start deploying your simple versions of your application, if that makes sense.
Ken: Yes. Shifting gears a little bit, I'm going to put you on the spot just a tad. What's next? This is the third edition. What didn't make the third edition? What's in the fourth edition? No, I don't mean to scare you. What have you done for me lately, Kief? No, not that. What are the things that are on the bleeding edge that you didn't want to put in a book yet but that people should be thinking about?
Kief: Yes. I actually touch a little bit on it, where I talk about types of infrastructure code, and so on. Then I think there's some interesting things going on that'll be interesting to see where they end up. It's one of those where it's like in a couple of years' time, you may look back on what I wrote in the book and say, "Well, wow, that didn't hit. That's disappeared." One is, there's infrastructure from code. This is the idea that in your application code, you maybe make annotations or references to whatever.
It's like what I was just saying around, okay, this part of the code, I'm going to need a database. You specify there, I need a database. These are the things that I care about from a code point of view. You might say things like, this is the kind of data that I'm going to store. It's personal data or it's transitional data, ephemeral data, what have you. Then when that application is deployed, whatever handles the application deployment pulls the right infrastructure and builds the right infrastructure to meet those needs. There's a couple of things out there. There's, I think, Winglang and Darklang. I know there's others which are playing in this space.
I think that's quite interesting. I'm a little bit skeptical, I guess, of whether that could really define all of the infrastructure and all the aspects of things that you need. To me, what's important about it is that it's showing how we could empower developers, again, moving away from the developer having to raise a ticket: "Can you make me a database?" Then several rounds of back and forth where the database doesn't work, it's not doing what I need it to do, blah, blah, blah. Just say, how can the developer say what they need and get it?
That's one cool thing. Another is what you might call infrastructure as a model or infrastructure as a graph, let's say, or graph-driven infrastructure. System Initiative is one company that's working on this. There's another called ConfigHub. There's a few others rattling around out there. These are people who've said, "Hey, maybe code isn't the thing anymore. Maybe there's a lot of limitations and mess that comes from the way we manage code and worry about the repositories and branches and all that stuff."
If you think about what happens with infrastructure code, when you execute it, the tool builds this model of desired state, "this is what the infrastructure should be." Maybe it even has a state file where it represents what it thinks the current infrastructure is out there in the cloud, and then does the reconciliation and all of that. These things, what they do is bring that to the center and open it up so that it becomes something you can interact with. You can write code, but you're attaching it to things within that.
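Editor's sketch: the infrastructure-from-code idea described above, where application code states that it needs a database and what kind of data goes in it, might look roughly like this in Python. Everything here is a hypothetical illustration: the `needs_database` decorator, the `REGISTRY` list, and the policy in `plan_infrastructure` (encrypt personal data, skip backups for ephemeral data) are assumptions for the example, not the behavior of Winglang, Darklang, or any other tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseNeed:
    """What the application says it needs, in application terms."""
    name: str
    data_class: str  # e.g. "personal" or "ephemeral"


# Needs collected from annotations as the application code is loaded.
REGISTRY = []


def needs_database(name, data_class):
    """Hypothetical annotation: record a database requirement for a function."""
    def decorator(func):
        REGISTRY.append(DatabaseNeed(name, data_class))
        return func
    return decorator


@needs_database("profiles", data_class="personal")
def save_profile(profile):
    ...


@needs_database("sessions", data_class="ephemeral")
def save_session(session):
    ...


def plan_infrastructure(needs):
    """Turn declared needs into resource definitions (assumed policy)."""
    return [
        {
            "resource": "database",
            "name": need.name,
            "encrypted": need.data_class == "personal",
            "backups": need.data_class != "ephemeral",
        }
        for need in needs
    ]


for resource in plan_infrastructure(REGISTRY):
    print(resource)
```

The point of the sketch is the direction of flow: the developer states a need next to the code that has it, and whatever handles deployment decides what infrastructure satisfies it, with no ticket in between.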
It's like a graph basically of here's all the infrastructure. Then you can build up, this is what I think the infrastructure should be. You can use code for that, or you can use an interface for that or you can have it dynamically happen in response to events. There's all kinds of possibilities of what you can do when you open it up that way and expose that graph and make that something that you can interact with. I find that pretty exciting. I'm wondering whether the next edition of the book really will be infrastructure as code, or will it be infrastructure automation, [laughs] where code is one of the options, but maybe there's other options which are really gaining ground by that point. We'll see.
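Editor's sketch: the graph idea in this answer, a model of desired state that is compared with what currently exists, can be shown with Python's standard-library `graphlib`. This is a minimal sketch with assumed data; the resource names, the `depends_on` and `config` keys, and `plan_changes` are hypothetical and do not reflect how System Initiative, ConfigHub, or Terraform represent their models.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Desired state as a graph: each node lists the nodes it depends on.
desired = {
    "network": {"depends_on": [], "config": {"cidr": "10.0.0.0/16"}},
    "subnet": {"depends_on": ["network"], "config": {"cidr": "10.0.1.0/24"}},
    "database": {"depends_on": ["subnet"], "config": {"size": "small"}},
}

# What is believed to exist right now (the role a state file plays).
actual = {
    "network": {"cidr": "10.0.0.0/16"},
    "subnet": {"cidr": "10.0.9.0/24"},
}


def plan_changes(desired, actual):
    """Return the names of out-of-date nodes, dependencies first."""
    graph = {name: node["depends_on"] for name, node in desired.items()}
    order = TopologicalSorter(graph).static_order()
    return [name for name in order if actual.get(name) != desired[name]["config"]]


print(plan_changes(desired, actual))  # ['subnet', 'database']
```

Because `desired` is ordinary data, it could be edited by code, by a user interface, or by an event handler, and `plan_changes` would work the same way each time.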
Ken: Yes. One of the things that you mentioned briefly in the book too was the importance of the concept of day two requirements. What is that?
Kief: Yes. That's the idea that I think a lot of times when we think about building infrastructure, including when we think about building it as code, we think about building it. We think about, "Okay, this is the infrastructure I want. Fine." Then it's like, "Well, what do you do next? How do you patch that infrastructure? How do you make fixes to it when you find out something is broken? How do you make improvements to it? How do you expand it when it's like we need to add new things to it?"
I think that's where a lot of the ways that people build infrastructure because of the way the tools are-- the mindset of the tools, I guess, and the way that we're taught when we look at guides and stuff for how to build infrastructure as code, I don't think it really handles that very well. We end up where that becomes really, really hard, and teams spend a lot of their time trying not to break things.
That's where a lot of what I talk about can really help: some of the things that I mentioned earlier around componentization, but also things that I've been talking about since the first edition, using pipelines and automated tests for your infrastructure code. As we've learned over the past, I don't know, 25 years or however long with software, having test-driven development, continuous integration, and continuous delivery for software code makes it easier to make changes. You can be more confident, and you can work faster, and the refactoring tools and those kinds of things have all made it that much easier. I don't think we've really brought that into our infrastructure code so much yet.
I think those are really the key things, even though we've been talking about it. Like I said, it was in the first edition of the book, nine years ago now. It's still not really the norm. That has a lot to do with the tooling, which still doesn't make it easy to do that. I think our habits are still a little bit old school. It's still not necessarily all that different from when we were doing stuff with physical servers in the data center. We're using code for that, but we're doing it in the same way, which is, again, very day zero focused, day one focused rather than day two.
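Editor's sketch: one small form an automated test for infrastructure code can take is a check that two environments built from the same definition differ only where they are meant to. This is a minimal sketch under stated assumptions: `build_stack`, its fields, and the `ALLOWED_DIFFERENCES` set are hypothetical, and a plain dictionary stands in for whatever a real infrastructure tool would produce.

```python
def build_stack(environment: str, instance_count: int) -> dict:
    """Hypothetical stack definition shared by every environment."""
    return {
        "name": f"app-{environment}",
        "network": {"cidr": "10.0.0.0/16", "public": False},
        "servers": {"count": instance_count, "image": "base-image-v1"},
    }


# The only fields environments are allowed to disagree on (assumed policy).
ALLOWED_DIFFERENCES = {"name", "servers.count"}


def flatten(definition: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted paths, e.g. 'servers.count'."""
    flat = {}
    for key, value in definition.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def differing_fields(a: dict, b: dict) -> set:
    """Return the dotted paths whose values differ between two definitions."""
    flat_a, flat_b = flatten(a), flatten(b)
    return {k for k in flat_a.keys() | flat_b.keys() if flat_a.get(k) != flat_b.get(k)}


def test_environments_differ_only_where_allowed():
    staging = build_stack("staging", instance_count=1)
    production = build_stack("production", instance_count=3)
    assert differing_fields(staging, production) <= ALLOWED_DIFFERENCES


test_environments_differ_only_where_allowed()
```

Run in a pipeline on every change, a check like this turns "keep environments consistent" from a habit into something that fails loudly when someone hand-edits one environment's definition.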
Ken: I did wait till we're more than 20 minutes in, at least from a recording time perspective. We'll see what the editors do. I have to ask the question, doesn't AI just make this all magic? Can I just ask my AI overlords for infrastructure?
Kief: I think AI can help. I think it has a role, as it does with software, in helping us to understand. Some of the cool things I've seen done with AI, and some that our colleagues are doing, are like looking at legacy code bases, where the AI gives you a way of understanding and comprehending the code, of what do I need to change and what needs to be involved, guiding you, and it amplifies. I don't see us at the stage where you can just say, "Hey, build me an environment for my application," and it's going to do it necessarily in the way that you would want. Especially when you then say, "Okay, now I need a testing environment, a production environment," and they'll all end up messy and different and so on.
I think it's a space to watch. One of the things that I've found in using the co-pilot-y type tools is that a lot of them are focused more on software than on infrastructure. I think they're useful, but they don't know infrastructure as well as they know software. As well, as we were just talking about, we haven't learned in infrastructure as much around how to do code well, in the same way that we have in software. Again, with TDD and testing and all that, the AI tools don't have anything-- They learn based on what's out there, and there's not a lot of great stuff out there, great examples of how to do this really well. I think infrastructure code needs to progress further and faster, and the AI is going to follow what we humans do.
Ken: Yes, I think that admittedly was a little bit of a softball question, probably. We talk about with AI that you have to know what good looks like, and what it's trained on, frankly.
Kief: We humans don't know what good looks like with infrastructure yet. We're still figuring that out.
Ken: Yes, and that's the thing. I've heard from more than one person that, "Oh, it's DevOps. We're just going to have our developers run an AI tool that creates their infrastructure because they don't need to know about that." And danger, Will Robinson.
Kief: Yes, it's good to be able to know about it. It's what I always say about platforms and layers of abstraction and all that. It can be helpful to not have to think about those details until you do need to think about those details, and then you need to be able to. If you've had AI build it because you don't understand it, and what AI builds is hard to understand as well, yes, that can be a problem.
Ken: When reading the book, what kind of book is it? Do I read it cover to cover? Are there reference chapters? How do people consume the work?
Kief: One thing I guess I would say is that it is somewhat high-level in the sense that it focuses on design patterns and practices. It's not specific to particular tools and particular cloud platforms. The examples are pseudocode-level type examples. This isn't like, learn how to build AWS infrastructure with Terraform from scratch. This is something that you're going to use either when you've already been working with these kinds of tools and you have an understanding already and you're thinking about, well, how can I organize things a bit better and get further with it, or in conjunction with some material that goes into the specifics of the cloud platforms and infrastructure tools that you're using.
I think I would skim the full thing, even if you don't go into details and everything, just to know what's in there and then go back and dip into the areas as you need them, I would say. Because it covers design topics in the middle, how to break infrastructure apart. Then when you do that, what are different ways to configure infrastructure? What are different ways to integrate different pieces of infrastructure? The last section is on delivery and that's pipelines and testing and a little bit on ways of organizing teams. I guess it's less that you can leave some chapters till later and more some parts of the chapters might be-- you might just need some part of the chapter.
The one where it talks about, well, our organization is like this. We have a team that is small and does infrastructure and software together and so there are other parts of that chapter, which are then like, what do you do when you have a bigger team and lots more moving parts? Maybe you don't need that part yet, that kind of thing.
Ken: Did you notice, as you were writing or editing at any point, other key actionable takeaways? Was there a theme anywhere? Like, "Gosh, it seems like we keep mentioning the importance of X." Does anything like that jump out at you?
Kief: Generally I talk about it as applying software engineering practices to infrastructure code. Most of the book, really that's a lot of what it is. When we talk about design, it's like, well, think about principles and practices that we use for designing software, coupling and cohesion, and things like that. Then with delivery, it's things like, again, TDD, continuous integration, pipelines. I think the theme is really just to be aware and think about what we've learned in software engineering and software delivery and how we can apply that to infrastructure code.
Ken: Cool. I guess, final thoughts: anything I didn't ask you that I should have? What should our listeners know about Kief or the book or infrastructure or whatever you want?
Kief: I really think the important thing, and the important thing with this edition, and the important thing as I'm working with clients these days, is really making sure to think about things from the point of view of what the infrastructure is used for, and starting from there. Thinking about the applications and workloads that run on your infrastructure, making sure you have an understanding of that, and designing starting from there down. I think that's different from the way that a lot of people naturally think and the way that a lot of organizations are designed, where they really put these things into silos. I think let's just have more conversations and more collaboration across the layers of the stack.
Ken: Cool. As always, thank you very much for your time. Thanks, everybody, and have a nice day. Bye-bye.