System Administrator HPC - ML
6791 Aberdeen Blvd, Aberdeen Proving Ground, MD 21005
The Squires Group has a multi-year contract opportunity for a Sr. System Administrator of High-Performance Computing and Machine Learning Systems. In this role, you will perform a wide range of system administration and management duties for the IBM Power9 High Performance Computing and Machine Learning (HPC-ML) system at Aberdeen Proving Ground, MD. You will serve as a Subject Matter Expert skilled in system installation, configuration, and acceptance testing leading up to full system acceptance. The ideal candidate will have experience working with multiple operating environments and an expert-level understanding of deploying and developing within multiple Linux distributions.
Per our federal government contract, candidates must be U.S. citizens with an active Top Secret clearance.
- Provide maintenance and tuning of the Red Hat operating system (OS), IBM HPC-ML software stack, and IBM Spectrum Scale file systems
- Implement Information Assurance (IA) required functionality and perform periodic Comprehensive Security Assessment (CSA) scanning
- Obtain, manage, derive, and analyze accounting, auditing, performance, and utilization data
- Perform capacity and migration planning for new software and hardware products
- Perform software and firmware upgrades as directed and appropriate
- Use IBM Fix Central to search for, select, order, and download fixes for the system. These fixes update software, licensed internal code, and machine code to resolve known problems, add new functions, and keep the system, software, and hardware management console operating efficiently
- Perform shell scripting to automate and streamline system administration tasks
Skilled and experienced with the following:
- High Performance Computing and Machine Learning (HPC-ML) systems
- Hardware and software environments supporting machine learning and deep learning frameworks
- IBM HPC servers, or equivalent
- IBM GSS/ESS storage, or equivalent
- IBM Spectrum Scale solutions, or equivalent
- IBM Spectrum MPI, or equivalent
- IBM Spectrum LSF, or equivalent
- IBM Extreme Cluster Administration Toolkit (xCAT)
- IBM Cloud Private (ICP), Red Hat OpenShift, or equivalent
- Singularity, Kubernetes, or other container and orchestration software solutions
- TensorFlow, Caffe, and other machine learning and analytics frameworks
The Squires Group, Inc. is an Equal Opportunity Employer M/F/Vets/Disabled.