Site Reliability Engineer in Constanţa

Docker (nice to have), Kubernetes (nice to have), Microservice Architecture (nice to have), Terraform (nice to have), English (regular), Agile (advanced), DevOps (advanced), SRE (advanced)

Sonalake is a software partnering company that helps our clients realise their product roadmaps. Product design and engineering are at the heart of our business. Our engineering teams work with clients right across the stack: UX, UI design, frontend, backend, analytics, infrastructure, operations, and everything else that goes into delivering great products.

We thrive on variety and are highly adaptable. Our teams are exposed to domains as varied as telecom billing, ad tech, securities-based lending, travel tech analytics, and many more.

Innovation is central to our mission: anticipating future client needs, analysing emerging technologies, and developing new products and services.

We are now seeking to grow our team. That's where you come in!

You will:
- Identify and create service level indicators (SLIs) using historical data
- Create realistic service level objectives (SLOs) in order to meet SLAs
- Establish monitoring tools and the system-wide observability necessary to support rapid response during service level interruptions
- Participate in on-call activities, high-priority incident response, and disaster recovery activities
- Participate in incident retrospectives to improve overall resolution times
- Bring an "automate first" mentality to incident response, infrastructure, monitoring, and playbooks
- Provide technical leadership through mentoring, a commitment to technical excellence, accountability, transparency, and skills development
- Stay up to date with the latest application SRE developments and trends to continually improve internal processes and tooling
- Assess current applications and architecture to determine where Site Reliability Engineering methodologies can be meaningfully applied
- Create an open, honest, accountable, and collaborative team environment, providing timely and meaningful feedback

You may be a fit for this role if you have:
- 5+ years of experience with agile and site reliability engineering practices
- Demonstrable operational experience successfully supporting large-scale cloud deployments, including areas such as incident management, on-call, metrics and monitoring, and general observability
- Deep understanding of networking: Global Server Load Balancing, BGP, network redundancy, DNS, routing algorithms
- Deep operational experience managing multi-region, multi-cloud, large-scale infrastructure
- Experience implementing scalability, availability, and resiliency principles
- A track record of applying SRE principles and tools used for cloud-native applications (e.g. Terraform, microservice architecture, Kubernetes, Docker, Istio, Envoy, Skaffold, Spinnaker)

We take pride in being a people-oriented company. Openness and opportunity are really important to us. We build teams that span from experienced leaders to bright graduates, and we work to develop all of us within our coaching culture.



Listing expired