User News

Research Cloud Administrator Position at University of Michigan Advanced Research Computing - Technology Services

Posted by Hannah Remmert on 11/30/2017 14:49 UTC

Research Cloud Administrator

Job ad: http://careers.umich.edu/job_detail/142372/research_cloud_administrator_intermediate

Job Summary

The Advanced Research Computing – Technology Services (ARC-TS) organization has an exciting opportunity to hire a Research Cloud Administrator.

This position will be part of a team working on a novel platform for research computing at the university, serving data science and high performance computing. The primary responsibility will be to develop a novel resource-sharing environment that enables University of Michigan researchers to execute Data Science and HPC workflows using containers. The position will explore the maturity of various resource-sharing frameworks (Mesos, Kubernetes, Rancher, etc.) for inclusion in a research system, in conjunction with object storage services in the same framework. This person will also be responsible for deploying existing data science applications such as YARN, Spark, Impala, Presto, and others, and then for developing High Performance Computing (HPC) resource management under the resource scheduler framework. This position will work with guidance from senior and technical lead staff as part of a larger team.

Advanced Research Computing – Technology Services (ARC-TS) is the University of Michigan research IT provider specializing in High Performance Computing (HPC), Big Data (Hadoop, Spark, etc.), high-speed networking, storage, and other technologies to accelerate the research mission of the institution. For more information about ARC-TS, visit our website: http://arc-ts.umich.edu.

Responsibilities

-20% Object and Block Storage Service Development: Evaluate and set up a cluster file system with object storage support. Integrate automated provisioning of block storage into resource scheduler framework services.
-30% Resource Scheduler Service Development: Evaluate the maturity of different resource scheduler environments. Implement resource scheduler automation and deployment on a large-scale cluster. Write internal documentation on how to use and deploy new offerings on the scheduler.
-25% HDFS/Spark Service Development: Develop Spark-based services. Set up and maintain the HDFS file system.
-20% User Support for system-related issues: Support pilot researchers on the developed systems, in concert with Data Science support staff on the novel platform. Develop documentation and training. Work with other ARC-TS and ARC-TS affiliated staff to support computational research around the University.
-5% Development of Self: Stay abreast of application technology trends in scientific hardware and environments (Computers, accelerators, system management methods, etc.). This can include: on-the-job training, attending technical courses or conferences, reading, researching, and testing.

Organizational Competencies:

While not limited to the following, in this role the successful candidate will be expected to demonstrate the following organizational competencies:

-Advancing the Mission: Demonstrates knowledge of the primary mission of the University and Health Systems. Demonstrates awareness of the diversity of constituency groups and their roles, purposes, and issues.
-Creative Problem Solving / Strategic Thinking: Demonstrated ability to give problems of different levels the attention they need, often multitasking to solve moderate-level problems. Defines problems, analyzes causes, identifies possible solutions, selects the best solution, and develops action plans. Generates new ideas and goes beyond the status quo. Demonstrated ability to use creative thinking to improve processes and solve complex problems.
-Development of Self and Others: Demonstrated initiative in participating in growth opportunities for continuous development and improvement. Demonstrated ability to apply new skills/knowledge to the job and serve as a training resource to less experienced staff.
-Quality Service: Demonstrated ability to establish and maintain effective relationships with internal and external customers in a manner that consistently meets the organization’s expectations for exemplary customer service. Demonstrates the ability to see issues from the customer’s perspective, assesses urgency of requests and responds accordingly. Demonstrated focus on fulfilling expectations by seeking insight into customer needs and developing solutions that provide value for the customer.

Required Qualifications

-Bachelor’s degree in computer science, engineering or an equivalent combination of education and experience.
-Minimum of two (2) years of experience supporting at least one (1) of the following types of deployments:
---One of the following data science applications: Yarn, Spark, Impala, Presto or other data science application.
---Docker, and one of any number of container orchestration services (Rancher, Mesos, Kubernetes, OpenShift, Swarm).
---Ceph or another Object Storage Service.
---One of the following cloud environments: Amazon Web Services, Microsoft Azure or Google Compute Engine or equivalent cloud platform.
-HPC Scheduling systems: TORQUE, SLURM, Platform LSF.
-Demonstrated ability with Linux, bash/shell, and Perl or Python.
-Demonstrated ability to communicate technical concepts effectively, both verbally and in writing, to teams and customers.
-Ability to manage priorities in the face of multiple requests and projects.
-Demonstrated ability to work in a self-directed manner, skillfully manage complex projects, stay up-to-date with the latest industry developments and best practices, and apply that knowledge in the workplace.
-Demonstrated ability to troubleshoot difficult issues, and problem-solving skills with a focus on process improvement and/or automation.

Desired Qualifications

-Knowledge and experience supporting at least two (2) of the following types of deployments:
---One of the following data science applications: Yarn, Spark, Impala, Presto or other data science application.
---Docker, and one of any number of container orchestration services (Rancher, Mesos, Kubernetes, OpenShift, Swarm).
---Ceph or another Object Storage Service.
---One of the following cloud environments: Amazon Web Services, Microsoft Azure or Google Compute Engine or equivalent cloud platform.
-HPC Scheduling systems: TORQUE, SLURM, Platform LSF.
-Experience with xCAT, Kickstart, Salt, Ansible, or other provisioning and configuration management tools.
-Knowledge of golang (>=1.5).
-Demonstrated ability with Linux, bash/shell, and Perl or Python.

Additional Information

Some development may be applicable to open source projects. In addition, there may be opportunities to speak at relevant conferences regarding work done on these endeavors.

Diversity, Equity and Inclusion

The University of Michigan Information and Technology Services seeks to recruit and retain a diverse workforce as a reflection of our commitment to serve the diverse people of Michigan, to maintain the excellence of the University and to offer our students richly varied disciplines, perspectives and ways of knowing and learning.

Comprehensive Benefits

The University of Michigan is committed to offering a high-quality benefits package to support faculty, staff and their families. Learn more at hr.umich.edu/benefits-wellness

GO BLUE!

-The University of Michigan is ranked the No. 2 public university in the United States and 27th overall in a survey announced 09/27/2017 by The Wall Street Journal and Times Higher Education.
-The University of Michigan maintained its ranking as the No. 4 public university in U.S. News & World Report’s 2018 annual list of the nation’s best undergraduate colleges and universities.
-The University of Michigan was featured as one of the “Great Colleges to Work For” in the 2017 Chronicle of Higher Education.
-The University of Michigan is ranked No. 3 by Money Magazine’s “Best Colleges for Your Money 2017/2018,” which evaluated 711 higher education institutions on 27 factors within three broad categories: educational quality, affordability and alumni success.

Application Deadline

Job openings are posted for a minimum of seven calendar days. This job may be removed from posting boards and filled anytime after the minimum posting period has ended.

U-M EEO/AA Statement

The University of Michigan is an equal opportunity/affirmative action employer.