Published 1 week ago

Manager - Network (Ref# MN250903W)

HONG KONG CYBERPORT MANAGEMENT CO LTD

Job Description

Key Responsibilities:

  • Report to Head of Information and Communication Technology (ICT).
  • Design and build best-in-class technical infrastructure for delivering AI supercomputing systems that meets the required technical architecture, performance, resilience, security, energy efficiency and service levels.
  • Configure, operate and maintain the network and communication infrastructure of the AI supercomputing systems.
  • Develop and maintain network and communication infrastructure policy, standards and procedures of AI supercomputing systems to align with industry standards, frameworks and good practices.
  • Collaborate with network and communication infrastructure suppliers to optimise AI frameworks and their operational environment and libraries to maximise performance and effectiveness for adopting the AI supercomputing systems.
  • Manage technical infrastructure outsourcing services and their contractual agreements; implement effective measures to improve the effectiveness, compliance and quality of service delivery; and manage the relationship with outsourcing partners.
  • Define network and communication infrastructure technical requirements and interconnect capabilities to optimise the operation and performance of AI supercomputing systems.
  • Conduct performance analysis, benchmarking and modelling to identify performance bottlenecks, optimise system parameters and guide technical infrastructure enhancements.
  • Evaluate emerging AI supercomputing technologies, including GPU processors, network fabrics, storage and interconnects, for the continuous advancement of AI supercomputing systems based on technical requirements, performance characteristics and cost considerations.
  • Collaborate with subject domain experts to understand the specific requirements of scientific research and data-intensive workloads for using the AI supercomputing systems and propose appropriate technical infrastructure enhancements to manage the workloads.
  • Provide necessary support to conduct risk assessment, evaluate infrastructure control effectiveness and mitigate associated risks.
  • Monitor and analyse technical infrastructure events and alerts; identify and respond to infrastructure related risks, incidents and breaches.
  • Stay up-to-date with the latest advancements in AI supercomputing hardware, software and industry trends to guide future infrastructure design and technology adoption.
  • Provide appropriate technical guidance, training and support for effective use of the AI supercomputing systems.
  • Prepare management information, key metrics and reports for continuous improvement.

Requirements:

  • 5+ years of proven experience as a network and communication infrastructure specialist, preferably in an AI supercomputing environment.
  • Experience in AI supercomputing network and communication infrastructure design and implementation.
  • Experience in designing and optimising AI supercomputing infrastructure and systems for business, scientific, research, or data-intensive applications.
  • Experience in adopting hardware acceleration technologies such as GPU and NPU.
  • Understanding of performance analysis and optimisation techniques for parallel computing, including profiling, tracing and performance counters.
  • Familiarity with industry-standard interconnects and network fabrics and their impact on performance of AI supercomputing systems.
  • Knowledge of parallel programming models and frameworks and their application to AI supercomputing workloads.
  • Understanding of AI supercomputing software stack components, such as compilers, runtime systems, job schedulers, and development libraries.
  • Understanding of deep learning frameworks such as TensorFlow, PyTorch and Caffe.
  • Understanding of programming languages such as Python, Java, C++, R and CUDA for building and implementing AI systems.
  • Good problem-solving abilities and the ability to analyse and address complex performance and scalability challenges.
  • Ability to adapt to a fast-paced and rapidly evolving technological landscape.
  • Strong communication and collaboration skills to work effectively with cross-functional teams and subject domain experts.
  • Proficiency in written and spoken English and Chinese.
  • Passion for AI technical architecture, infrastructure and systems.

Job Particulars

Job source: CPjobs
Job reference: Joo-4229595
Date published: 28 Nov 2025
Job keywords: Information Technology