For a client located in Aartselaar, we are currently looking for a Cloud Site Reliability Engineer.
Cloud Site Reliability Engineer
LOCATION: Aartselaar
ONSITE WORK: 40%
REMOTE WORK: 60%
START DATE: 08/08/2022
END DATE: 31/12/2022
MANDATORY LANGUAGES: English
Request Description
Important
Thorough and proven experience in cloud site reliability/disaster recovery in Azure
Experience in writing frameworks or implementing processes
Understanding of the correlation between applications and infrastructure
Experience in application management
Place of work: Wilrijk or Prague. We expect at least 2 days of onsite presence.
Remark: we have adjusted the scope of the position slightly, as we have not yet found the right candidate.
Our company has a large, global Azure cloud platform that will expand considerably over the next 3 years. Different Business Areas/Divisions build applications themselves on this cloud platform. From the global Solution & Architecture team, we therefore create and maintain a platform that makes this easy for our business, and we operate centrally managed components. Currently about 400 applications run in our cloud environment, but some lack operational excellence. We need you to change that.
Roles & Responsibilities
You will be joining our Cloud Center of Excellence of the global Solution & Architecture team.
You will be responsible for defining how operational excellence and reliability can be achieved for globally used applications (governance, tooling, setup, monitoring, disaster recovery, business continuity, …) using the Azure cloud environment and additional tooling.
You will be responsible for selecting and setting up the monitoring platform for the cloud platform and its applications.
You will support the setup of monitoring for the centrally managed cloud platform.
You will support, advise, and train IT staff in setting up monitoring and SRE practices within their environments and in building scalable, performant applications.
Experience
General experience in Azure Cloud
Experience building reliable applications or making existing applications reliable
Experience operating cloud applications (keeping them running)
Experience with performance/scaling/caching/multi-regional setups
Experience with business continuity/disaster recovery in a cloud environment
Knowledge of monitoring setups for the different aspects of an application (application, infrastructure, network, …)
Experience in explaining concepts to and training functional and technical colleagues
Technical Skills
Azure platform knowledge
Understanding of different capabilities of monitoring tools
Experience in automation (e.g. of remediation actions)
Experience with performance testing tools is a plus
Knowledge of different monitoring tools is a plus
Knowledge of business continuity (BC) and disaster recovery (DR)
Non-technical Skills
A person with a passion for the job
Goal-oriented, self-organized, and self-driven
Strong ability to learn
Team player
Creative problem solver
Security minded
Ability to explain/train people
Strong verbal and written English language skills are a must