At Flywire, we have been using the blameless postmortem technique for more than three years. During that time we have modified, evolved, and adapted the process to our company's needs. Today, postmortems at Flywire are a tool that helps us improve our understanding of our systems, train the on-call team, be transparent with stakeholders about what happened during an incident, improve our procedures for on-call and firefighting situations, and dig into the root causes of incidents so we can act on them and avoid similar incidents in the future. We want to share the principles of blameless postmortems, how we apply them at Flywire, and how to start applying them in your company.
Building your infrastructure on bare-metal servers is an interesting and cost-effective option, especially in this container era, where we can leverage the power of orchestrators like Kubernetes. Bare metal is of course less flexible, and it comes with tasks long forgotten in our cloud-based days: boot management and configuration. CoreOS is a good and popular choice as a base OS for a container-based system. It is also interesting because of the tools it provides to make booting and OS configuration much simpler. In this talk I'm going to describe two of them, Matchbox and Ignition, and how we can integrate them into our Terraform infrastructure description using dedicated providers. This is part of a general approach towards immutable infrastructure provisioning.
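
To give a flavour of what Ignition consumes, here is a minimal sketch in Python that builds a small Ignition config as JSON. It assumes the Ignition 2.2.0 config layout; the helper name `build_ignition_config`, the hostname, and the SSH key are placeholders made up for illustration, not something taken from the talk. In a real setup a document like this would typically be served to machines at boot time by Matchbox rather than printed.

```python
import json
from urllib.parse import quote


def build_ignition_config(hostname, ssh_public_key):
    """Return a minimal Ignition config as a dict (assumes the 2.2.0 spec layout)."""
    return {
        "ignition": {"version": "2.2.0"},
        # Authorize one SSH key for the default "core" user.
        "passwd": {
            "users": [
                {"name": "core", "sshAuthorizedKeys": [ssh_public_key]}
            ]
        },
        # Write /etc/hostname on first boot; file contents are passed as a data URL.
        "storage": {
            "files": [
                {
                    "filesystem": "root",
                    "path": "/etc/hostname",
                    "mode": 0o644,  # serialized as decimal 420 in JSON
                    "contents": {"source": "data:," + quote(hostname)},
                }
            ]
        },
    }


if __name__ == "__main__":
    # Placeholder values: replace with a real hostname and public key.
    config = build_ignition_config("node1", "ssh-ed25519 AAAA... user@example")
    print(json.dumps(config, indent=2))
```

Because the config is plain data, it can be generated, reviewed, and versioned like any other part of an infrastructure description, which is what makes it a natural fit for an immutable provisioning workflow.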