On behalf of a client, we are seeking a Linux System Administrator to join their team on a long-term basis, starting ASAP. The role involves administering Linux HPC clusters, both in a local datacenter and on the AWS cloud, as well as the related Linux workstations.
Responsibilities:
- Administer Linux HPC clusters and related Linux workstations.
- Coordinate software and hardware upgrades for HPC clusters.
- Troubleshoot software/configuration and hardware issues, including coordination with suppliers.
- Monitor system performance and implement continuous improvements.
- Analyze and implement user requests to enhance system functionality.
Qualifications:
- Proven experience with HPC applications for CAE, AI, machine learning, and rendering.
- Proficiency in Red Hat Enterprise Linux (RHEL).
- Experience with storage solutions such as NetApp, DDN, Dell, HP.
- Familiarity with filesystems including Lustre, BeeGFS, Ceph, GPFS.
- Knowledge of schedulers such as Slurm, PBS/Torque, SGE.
- Experience with Bright Cluster Manager and HPC networking.
- Ability to support datacenter operations, including installation and maintenance of servers, disks, and connectivity solutions.
- Knowledge of HPC on the AWS cloud, as well as Terraform and OpenStack.
- Experience with Linux workstation administration.
Desired Skills:
- Excellent problem-solving skills.
- Strong coordination and communication abilities.
- Ability to work independently and as part of a team.
Language Proficiency:
Excellent command of English, both written and spoken, is required.