Infrastructure Engineer

Company: ByteDance
Job Type: Full Time


This job is no longer accepting applications.


Job Description - Infrastructure Engineer

Responsibilities

Team Introduction: The scheduling team is responsible for the company's internal cluster resource management and scheduling. It supports many core businesses, such as recommendation, data warehouse, search, and advertising, and manages YARN and K8S clusters that are industry-leading in cluster scale, scheduling throughput, resource utilization, and business complexity. Because Douyin, Toutiao, and other company products depend heavily on recommendations, the team has deeply customized the scheduler to support scenarios such as streaming (Flink) training and GPU training. To further improve cluster resource utilization, the team has also started large-scale online and offline co-location, and scheduling systems such as YARN and K8S are expected to be further integrated in the near future.

Job Responsibilities:
1. Build an efficient and stable cluster resource management system, and optimize resource isolation and resource utilization
2. Continuously solve the technical and business problems caused by growth in scale, and take responsibility for cluster availability, stability, and performance optimization
3. Design and implement a more suitable self-developed system architecture for scenarios unique to the company, solving common business problems
4. Take responsibility for resource scheduling and system integration in large-scale online and offline hybrid deployment scenarios

Qualifications

1. Consider yourself a technical geek and have strong problem-solving ability
2. Proficient in one or more programming languages such as Java, C++, or Go
3. Have a solid foundation in computer science theory, data structures, and algorithms
4. Able to develop and optimize large-scale distributed systems

Candidates who meet the following conditions will be given preference:
1. In-depth understanding of systems such as YARN, Kubernetes, Spark, or Flink, or code contributions to the related open-source communities
2. In-depth understanding of containerization technologies such as Docker and LXC
3. In-depth understanding of the Linux kernel
4. In-depth research on and experience with machine learning training frameworks such as Submarine and Kubeflow
5. Practical experience managing large distributed systems, or a strong passion for industry trends in computing infrastructure
Original job Infrastructure Engineer posted on GrabJobs.


Location: China

