Incident analysis

Our website is down! Customers are complaining! The sky is falling on our heads! What can we do to improve the situation? If we are managing a team, should we start cutting off heads? Otherwise, should we start looking for a new job? Do not despair; we are not alone, we have many decades of incident analysis behind us. In this talk we will explain several techniques that may help us face the situation: root-cause analysis, the five whys and blameless postmortems. We will also review some leadership attitudes and how they affect future incident response. Finally, we will briefly analyze a couple of real catastrophes from recent history that will give us a fresh perspective on incident analysis.

23/09/2020 English
Microcontainers: Reducing the size of your containers

One of the challenges of container-based development is reducing infrastructure consumption in the datacenter and achieving higher density. Relatedly, smaller containers also mean a smaller attack surface and better application security. In this talk we will look at these topics and at the work different teams are doing toward this goal.

05/02/2020 Spanish
Secrets as Code

At Adhara we need to build very different environments every week: for internal development of sub-products, for customer pilots, for phoenix environments for testing… dozens of them. Everything was nicely automated with Terraform, leveraging Kubernetes to handle the complexity of the deployments. Everything except the database credentials. The Kubernetes secrets were created manually, and their contents were then backed up somewhere like Keybase. Not pretty. This is the story of how Adhara’s Platform team moved from that situation to the current fully automated creation of databases and their passwords, following GitOps practices and using tools like Atlantis, Flux and Terraform. Now we create new deployments every day without ever seeing the database passwords ourselves. We don’t even have access to them!

05/02/2020 Spanish
Non-trusted shared Kubernetes at K8Spin

K8Spin offers Kubernetes namespaces as a SaaS. What does this mean? We give everyone the possibility of having a space within a shared Kubernetes cluster. Is Kubernetes ready for this? Are the typical tools of the Kubernetes ecosystem prepared? We will look at the current possibilities and the PoCs that we are carrying out at K8Spin.

27/11/2019 Spanish
Postmortems at Flywire
by Omar López (SRE) & Andoni Alonso (SRE)

At Flywire, we have been using the blameless postmortem technique for more than three years. During that time we have modified, evolved and adapted the process to our company's needs. Today, postmortems at Flywire are a tool that helps us understand our systems better, train the on-call team, be transparent with stakeholders about what happened during an incident, improve our procedures for on-call and firefighting situations, and dig into the root causes of incidents so we can act on them and avoid similar incidents in the future. We want to share the principles of blameless postmortems, how we apply them at Flywire, and how to start applying them in your company.

27/11/2019 Spanish
Talk: Events, Services and 'hives'

Imanol González, Tech Lead, and Pedro Díaz, SRE Lead, will share what they have developed at MercadonaTech that is not visible to their users: logistics. Every piece of software running in their warehouses (aka hives) was developed internally. They decided not to go multitenant, and certain critical services are deployed directly in the hive; they also didn’t rule out running a local instance of Kubernetes within the hives themselves. If you want to know more about their architecture, how they manage deployments across more than one warehouse, the reasoning behind some decisions and every step they took to reach this point, we encourage you to come!

PS: they promise to show some numbers about what all this machinery is doing while you sleep!

23/10/2019 Spanish
Seedtag's microservices and infrastructure

In this talk we'll dive into the infrastructure that seedtag has built for its online advertising product. Developed as a microservices-based architecture on top of Kubernetes, it handles billions of pageviews every month on a globally distributed platform that has to respond within tight millisecond-level requirements. We'll go into the mix of technologies that we are using (Kafka, Mongo, Spark, Druid, ...) and some of the challenges that we have faced with them.

18/08/2019 English
Anatomy of an object storage migration plan

Managed services in the cloud allow us to develop our projects quickly while virtually taking away the challenges of our product’s growth. One example is object storage, especially when used for big data purposes, because only usage is billed, not capacity.

However, these services are also built by humans, and their scalability is based on tradeoffs, some of which will hit you unexpectedly.

In this talk I’ll cover the design decisions, the obstacles encountered and the lessons learned from planning the migration of 39 million objects inside a maintenance window.

18/0/2019 Spanish
CPU profiling in Python

We had a server with high CPU load, and when we looked into it, the culprit was a Python process. The load was also in user mode, so we couldn't blame the OS. The process ran periodically and took minutes to finish, so the fun part was figuring out what it was doing.

18/06/2019 Spanish
Resilience patterns (server edition)

We can't control how clients will use our services, and most of the time our servers are not protected against failures and cannot recover or keep serving. In this session, Xabier Larrakoetxea will explain the problems we face on the server side, why we need to protect our servers, and how we can design and implement more resilient and robust systems by applying some of the resilience patterns most used at big companies like Netflix or Facebook. The session will focus only on the server side, leaving client-side resilience for a session of its own.

20/03/2019 Spanish
Kubernetes journey at Tuenti

"All of this used to be open fields." We have been using Kubernetes since 1.0, which presented a lot of challenges that we managed to work out, from automated provisioning to advanced scheduling settings in a large and complex hybrid infrastructure.

09/01/2019 Spanish
Azure Cosmos DB: how a planetary scale database was designed and implemented

With the arrival of large cloud computing providers (hyperscalers), the first wave of managed database services arrived. These services mimicked what we had seen before in our data centers, our servers, and even our development devices: managed SQL Server, MySQL, PostgreSQL, MariaDB, Redis... But now a new generation of database services has arrived, and they only make sense at these mega-providers. Horizontal autoscaling, replication across dozens of availability zones, minimal latencies, challenges to the CAP theorem... and, on top of that, multi-schema databases. This talk explains how Cosmos DB was born out of Microsoft's massive scaling needs with Xbox, the design decisions made and their impact, and how it compares with what competitors do. Finally, it shows an example of how my globally deployed side project Apility.io (with no link to Microsoft) could benefit from Cosmos DB.

09/01/2019 Spanish
Service Mesh for your Service Mess

Building microservices is easy; operating a microservice architecture is hard. Many companies are successfully using tools like Kubernetes for deploys, but they still face runtime challenges when they have to handle routing, monitoring or security. Having a mess of tens, hundreds or even thousands of services communicating in production is a job only for the very tough ones... Service Mesh is an architectural pattern that aims to solve these problems with a simple, clean approach.

23/03/2018 Spanish
CoreOS Container Linux baremetal provisioning with Terraform, Ignition and Matchbox

Building your infrastructure on baremetal servers is a very interesting and cost-effective option, especially in this container era, where we can leverage the power of orchestrators like Kubernetes. Baremetal is of course less flexible, and it comes with tasks long forgotten in our cloud-based days: boot management and configuration. CoreOS is a good and popular choice as a base OS for a container-based system. It is also interesting because of the tools it offers to make booting and OS configuration much simpler. In this talk I'm going to describe two of them, Matchbox and Ignition, and how we can integrate them into our Terraform infrastructure description using their specific providers. This is part of a general approach towards immutable infrastructure provisioning.

04/04/1984 English
My first steps climbing stairs

A parents' guide to what to do in the crucial moments when a customer's web application is slower than loading a Flash website from Africa over a fax modem.

04/04/1984 Spanish
High-impact refactors keeping the lights on

How to join a project that has been running for years and take on both rewrites of critical components and new pieces while changing the existing business flows as little as possible, applying design patterns such as parallel change, event bus, or event sourcing. And, along the way, improving automation and the development and testing workflows.

27/09/2017 Spanish
Autoscaling with Ladder

Have you ever thought about autoscaling your platform, only to conclude that it is an overly complex process that takes too much work? Or maybe you don't know where to start? If you have asked yourself these questions, this talk is for you.

21/05/2017 Spanish
Why REST APIs are dead and we should use GraphQL-based APIs

The most popular APIs we use today are RESTful APIs or some ad hoc pseudo-standard over HTTP. But the need to move fast on increasingly complex products that go beyond simple CRUD has pushed a change in how we interact with APIs. This is where GraphQL comes in, a strong candidate to replace REST in many applications, especially in the mobile app ecosystem. What is wrong with REST? Nothing in its original conception and in the context where it emerged, but the way we interact with APIs has changed since it was defined.

We will review the reasons why we should rethink traditional RESTful APIs in favor of GraphQL.

21/05/2017 Spanish
CI/CD with Kubernetes, Helm and Wercker

He will give a brief introduction to Kubernetes and Gudog's dockerized architecture and then move on to how to achieve continuous integration and continuous delivery to multiple environments, using Wercker (a Docker-based CI platform) and Helm (a Kubernetes package manager).

30/11/2016 English
Made by @agonzalezro