Observability Operations Engineer
Bridge351 is a tech company focused on excellence, innovation and tailored solutions, operating across Europe in areas like Cloud, Cybersecurity, Data and Advanced Development.
Location
Remote (Europe) with occasional travel to Germany
Language Requirements
German: C1 or higher (mandatory)
English: C1 or higher (mandatory)
About the Role
We are looking for an experienced Observability Operations Engineer to support and operate enterprise-scale platform environments. The successful candidate will be responsible for ensuring the reliability, performance, and observability of critical systems running in Kubernetes-based environments.
You will work closely with platform, infrastructure, and development teams to improve monitoring capabilities, operational excellence, and service reliability across complex enterprise environments.
Key Responsibilities
Operate and support Kubernetes-based production environments.
Manage and optimize observability platforms and monitoring solutions.
Configure and maintain logging, metrics, and tracing solutions.
Support incident, problem, and change management processes.
Define and monitor SLIs, SLOs, and SLAs.
Create and maintain operational runbooks and documentation.
Collaborate with engineering teams to improve platform reliability and performance.
Contribute to automation and continuous improvement initiatives.
Required Skills & Experience
Minimum 3 years of experience operating Kubernetes environments in production.
Strong experience with observability and monitoring platforms such as:
Prometheus
Grafana
Datadog
Loki
Mimir
OpenTelemetry
Strong understanding of networking concepts, load balancing, and security principles.
Experience with CI/CD tools and processes:
GitLab
Jenkins
ArgoCD
Tekton
Argo Workflows
Knowledge of ITSM processes:
Incident Management
Change Management
Problem Management
Understanding of Site Reliability Engineering (SRE) practices.
Experience documenting operational procedures and maintaining runbooks.
Nice to Have
Experience in enterprise-scale environments.
Cloud-native platform experience.
Infrastructure automation knowledge.
Experience working in regulated industries.
What We Offer
Fully remote work model.
International projects for the German market.
Modern cloud-native technology stack.
Long-term opportunities.
Collaborative and highly skilled engineering teams.
What can you expect from us?
Mind-blowing workplace culture. You will be integrated into a professional, dynamic and collaborative team.
100% Remote opportunities
We want you to have the flexibility to work where you feel most comfortable and productive.
International Career
You can expect professional growth and to be connected with the world.
We are represented in Portugal, Belgium, Luxembourg, and Denmark.
We also have projects in many countries, including the Netherlands, Luxembourg, Singapore and the United States of America (with more to come).
Extra Benefits & Perks
If you wish to work with us and you are outside the European Union, good news: we are a Tech Visa company, and we will help!
As a plus, we provide health and life insurance.