Posted: Tuesday, November 7, 2017 2:32 AM
Job Duties

The Big Data and HPC Engineer Lead will report to the DGIT Director of Research Computing and work with the Research Computing team on the development and improvement of our high-performance and Big Data-intensive (e.g., Hadoop/Spark) compute infrastructure and parallel systems environment. The DGIT Research Computing team is currently developing cloud-based infrastructure for data-ingestion pipelines and analytics platforms to support research activities and precision-health initiatives. This position will directly contribute to the evolution of our infrastructure design toward a next-generation system that leverages cloud and Big Data technologies.

The position is also responsible for technical systems management, administration, and support of the cloud-based and on-premises high-performance computing (HPC) cluster environments. This includes all configuration, authentication, networking, storage, interconnect, and software installation and usage for the HPC cluster(s). The role is highly technical, directly impacts the daily operation of these environments, and is responsible for installing, configuring, patching, and upgrading software, as well as tuning, optimizing, proactively monitoring, and securing services. The position works directly with academic constituents at all levels, including faculty, researchers, and students, supporting and promoting use of the HPC cluster(s).

Job Qualifications

• Bachelor's degree in computer science, computer engineering, or a related field, or an equivalent combination of education and related experience.
• AWS Certified Solutions Architect, Developer, and SysOps Administrator certifications required.
• Experience with AWS/cloud computing design, provisioning, and tuning.
• Significant experience with Linux/Unix systems, including installation, configuration, networking, backups, updates and patching, and system security.
• Experience with Linux cluster resource allocation, job scheduling, InfiniBand networks, MPI communications, and cluster monitoring.
• Experience installing, testing, configuring, and administering HPC clusters/servers and their software.
• Experience deploying clusters in an AWS environment using AMIs and CloudFormation templates.
• Significant experience with and working knowledge of Big Data technologies such as Hadoop, Spark, NiFi, Storm, HDFS, NFS, Lustre, Presto, Hive, AWS Redshift, and AWS Athena.
• Experience with Big Data warehouse and ETL design and implementation, including technologies such as Presto, Hive, AWS Athena, AWS EMR, and Redshift.
• Advanced knowledge of scripting and programming languages such as C/C++, Java, Perl, Python, Ruby, and bash/csh/ksh.
• Substantial experience in one or more of the following advanced areas: local, parallel, and distributed file systems; NAS platforms; container orchestration frameworks; SQL/NoSQL database systems; and IaaS technologies.
• Significant experience with software container technologies such as Docker, CoreOS, and/or Singularity.
• Extensive knowledge of Red Hat, CentOS, and Ubuntu Linux, as well as Windows.
• Significant experience supporting multiple independent but interrelated systems and software packages, with a demonstrated advanced ability to provide innovative solutions to broadly defined tasks and problems and to interact with system developers and vendors.
• Excellent customer service skills, working directly with customers to troubleshoot and resolve technical issues and requests.
• Advanced verbal and written communication skills necessary to collaborate effectively in a team environment, present and explain technical information, and provide advice to management.
• Expert analytical, problem-solving, and decision-making skills to develop creative solutions to complex problems.
• Expert communication, facilitation, and collaboration skills necessary to present to, explain to, and advise senior management and/or external sponsors.
• Ability to learn and adopt new technology.
• Self-directed individual with a strong desire to learn and contribute in a team of technical peers.
• Ability to apply troubleshooting techniques to resolve complex, cross-functional issues.
• Experience in an academic or research community environment.
• Experience with any or all of the following technologies/products: Slurm, PBS Pro, Moab, Ganglia, Lustre, GPFS, InfiniBand, MPICH, OpenMPI.
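As a purely illustrative sketch of the CloudFormation experience listed above, a minimal template that launches a single compute node might look like the following. The AMI ID, instance type, key pair name, and tag values are placeholders for illustration, not values used by DGIT:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal sketch of a single HPC compute node (illustrative only).
Resources:
  ComputeNode:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-00000000000000000   # placeholder AMI ID
      InstanceType: c5.4xlarge         # placeholder compute-optimized type
      KeyName: hpc-admin-key           # placeholder key pair name
      Tags:
        - Key: Role
          Value: hpc-compute
```

A real cluster deployment would add networking, shared storage, and scaling resources on top of a template like this, but the fragment shows the basic structure the posting refers to.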
• Location: Susanville
• Post ID: 8513287