May.

10

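Since December 10, 2018, Data Application Lab has been running a new column, long in preparation, that brings you one internal referral opportunity every day: exclusive resources and more direct job matching.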

Internal resources

Released regularly, only to our readers

Day

208

Job Title

Senior Engineer

Job ID

726857

Job Type

Contractor

Location

Santa Clara CA

Hours per Week

40

Background Check Requirement

Yes

Competitive Salary

Responsibilities:

- Measure and analyze the performance and parallel scalability of traditional HPC applications (e.g., WRF, NAMD, OpenFOAM), and emerging AI/Machine Learning frameworks (TensorFlow, PyTorch)

- Profile applications to identify architectural and algorithmic bottlenecks, with a particular emphasis on emerging many-core and accelerator usage

- Enhance and develop communication middleware to respond to today’s application and architectural challenges (accelerator use, artificial intelligence, virtual machines, containerized workloads)

- Propose remedies to the identified bottlenecks via software restructuring and/or architectural improvement with comprehensive understanding of any trade-offs in design, cost, and software engineering effects

- Assess emerging technologies in architecture, algorithms, parallel programming paradigms, and languages to provide input for HPC technology roadmaps out past the next decade.

- Define and enhance communications middleware and libraries for improved scalability, robustness, and application performance for emerging application patterns (MPI, GPUDirect RDMA, accelerators, SR-IOV)

Qualifications:

- BS/MS degree in Computer Science, Computer Engineering, or Electrical Engineering, along with 5+ years of relevant experience in networking (TCP/IP, MPI, RDMA technologies such as InfiniBand and/or RoCE, and other high-speed interconnect technologies) and hands-on parallel and distributed code development in C/C++ and parallel programming environments and libraries.

- Fast learner able to work independently as well as in a team environment with good written and verbal communication skills.

- Real-world, outcome-oriented problem-solving skills and the experience to define workable solutions in ambiguous conditions.

The successful candidate will have experience in several of the following technologies:

- Application benchmarking and performance optimization across a variety of codes

- Experience with microbenchmarks, and the ability to write microbenchmarks that exhibit the same performance characteristics as the full application code

- Detailed understanding of state-of-the-art tools used to program, profile, and debug parallel MPI, PGAS, OpenMP, and hybrid-parallel codes written in C/C++ and Fortran 77/90

- Code parallelization and optimization with MPI

- Experience in benchmarking, code instrumentation, and performance analysis of parallel applications, with emphasis on emerging multicore and many-core architectures

- Experience with the use of languages and system utilities such as shell scripts and Python

- Experience working with open source projects and the Linux operating system environment, and writing/maintaining large programs in C/C++ and/or Fortran

- Proven record of working effectively in a team, seeing projects through to completion, meeting deadlines, interacting with users, and thorough documentation of contributions

Preferences:

- Experience with large-scale distributed application deployments

- Familiarity with emerging AI and DL training workloads and application trends

- Proven track record of HPC communications enhancements and contributions to middleware, libraries and applications

- Understanding of system and networking vendor roadmaps

- Understanding of hardware capabilities such as RDMA, TCP offload engines (TOE), SR-IOV, and SmartNICs

- Experience with virtual machines and containers in an HPC environment

- Local candidates/willingness to relocate to the San Francisco Bay Area preferred

If you are interested in this position,

please send your latest resume and a brief self-introduction

to [email protected]

Subject Line: Job#ID+Name
