Platform engineering has emerged as a practice to address the continuing challenges organisations face in both creating technology applications and operating them at scale over the long term.
"Platform Engineering: A Guide for Technical, Product and People Leaders", by Camille Fournier and Ian Nowland, is a timely explanation of what platform engineering is, drawing on the authors’ experience of implementing it across several firms, and of the benefits and traps that organisations can expect.
The "Over-General Swamp"
The authors start by describing how we arrived at what they call the "over-general swamp." The explosion of choice made possible by cloud computing and open source software has led to huge diversity in organisations' technology stacks. This diversity has created significant problems in operating, supporting, and maintaining these stacks.
While the DevOps movement seeks to address the disconnect between developers who create systems and the operations teams that maintain them in production, challenges persist despite years of effort. There are fundamental differences in skill sets between developers and operations teams. True “full stack developers” are few and far between. Similarly rare are developers who are skilled in operations, and vice versa.
Add to that the growing pressure on technology teams to develop and implement new systems and features, along with demands for greater scale and 24/7 operations, and the pace of the treadmill seems to increase all the time.
The authors acknowledge the value of DevOps but note that few companies have successfully executed the "you build it, you run it" model, the idea that developers can both create new software and operate it in production. Having developers on call outside working hours is also difficult to manage and unpopular with developers themselves.
Similarly, the authors contend that Site Reliability Engineering (SRE) practices created by Google, while very successful at Google itself, have achieved mixed results when applied in other organisations.
Platform Engineering as a Solution
The authors propose platform engineering as a means to clear the over-general swamp through several key strategies:
- Limiting Primitives: Restricting the building blocks that developers can work with while minimising overhead
- Reducing Per-Application Glue: Minimising the connections required for each application's inputs and outputs
- Centralising Migration Costs: Reducing diversity of open source software and cloud systems in use
- API Encapsulation: Wrapping open source software and vendor systems with APIs
- Creating Observability: Providing visibility into platform usage
- Ownership Clarity: Giving ownership of operating systems and cloud systems to teams with developers
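To make a few of these strategies concrete, here is a minimal Python sketch of limiting primitives, API encapsulation and usage visibility working together. It is an illustration under stated assumptions, not an example taken from the book: `BlobStore`, `PlatformBlobStore`, `InMemoryBackend` and `save_report` are hypothetical names, and the in-memory backend merely stands in for whatever vendor or open source storage system a platform team might wrap.

```python
from collections import Counter
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical platform API: the only storage primitive offered to app teams."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryBackend:
    """Stand-in for a vendor or open source storage system (assumption for this sketch)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def write_object(self, name: str, payload: bytes) -> None:
        self._objects[name] = payload

    def read_object(self, name: str) -> bytes:
        return self._objects[name]


class PlatformBlobStore:
    """Wraps the backend so its interface never leaks into application code."""

    def __init__(self, backend: InMemoryBackend, team: str) -> None:
        self._backend = backend
        self._team = team
        # Simple call counter, giving the platform team visibility into usage.
        self.usage: Counter[str] = Counter()

    def put(self, key: str, data: bytes) -> None:
        self.usage["put"] += 1
        self._backend.write_object(f"{self._team}/{key}", data)

    def get(self, key: str) -> bytes:
        self.usage["get"] += 1
        return self._backend.read_object(f"{self._team}/{key}")


def save_report(store: BlobStore, report_id: str, body: str) -> None:
    """Application code: depends on the platform API only, not on the backend."""
    store.put(f"reports/{report_id}", body.encode("utf-8"))


backend = InMemoryBackend()
store = PlatformBlobStore(backend, team="payments")
save_report(store, "2024-q1", "quarterly summary")
print(store.get("reports/2024-q1").decode("utf-8"))
print(dict(store.usage))
```

Because `save_report` only knows about `put` and `get`, replacing the backend would mean changing `PlatformBlobStore` alone, which is one way to read the idea of centralising migration costs in the platform team rather than spreading them across every application.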
Building a Platform Engineering Practice
Using the analogy of the three little pigs, the authors emphasise foundational thinking over quick wins. The four pillars of platform engineering are taking a curated product approach, developing software abstractions like APIs, serving developers with self-service interfaces, and maintaining platform reliability through solid operations.
Organisations often struggle with platform engineering due to common pitfalls, including lacking the right blend of people, poor platform operations, costly migrations that burn goodwill, and failing to communicate value to stakeholders. The book focuses extensively on building great platform teams and operating platforms as products with clear service level objectives and customer-focused delivery.
As with any change initiative, successfully adopting platform engineering requires going beyond the technical elements. Organisations often fail to achieve the benefits because they have the wrong mix of team members, operate the platform poorly, or fail to achieve buy-in from key stakeholders. The authors discuss the types of roles that organisations should consider when assembling platform engineering teams, and the stakeholders across the organisation who need to be engaged and won over.
Platform Engineering vs Enterprise Architecture
In reviewing this book I was struck by its similarities to enterprise architecture. Enterprise architects reading it will recognise familiar concepts, if expressed differently. The two disciplines share principles but differ in focus.
Both platform engineering (as described in this book) and enterprise architecture emphasise standardisation. For platform engineering, limiting primitives and encapsulating systems behind APIs are ways to reduce complexity; for enterprise architecture, the benefit is described as increasing the re-use of existing assets and ensuring consistency. For both platform engineers and enterprise architects, APIs increase asset re-use and hide the complexity of internal systems so that developers in other teams don’t need to take time to understand them in detail. Enterprise architects appreciate the added benefit of looser coupling between systems, which reduces the impact of change in one system on another.
Platform engineering focuses on the developer experience of using the building blocks provided by the platform. Enterprise architecture seeks to balance standardisation with business flexibility and to codify governance in Architecture Decision Records (ADRs).
For organisations that have already embraced enterprise architecture, implementing platform engineering as explained in this book requires some mapping between the two frameworks, and deciding the organisation’s focus areas.
Conclusion
This book provides valuable insights for technical leaders looking to implement platform engineering practices. It offers practical guidance on addressing the complexity challenges facing modern technology organisations while acknowledging that successful implementation requires careful attention to both technical and organisational change management aspects.