MLOps Engineer (LLM Infrastructure)

Київстар, Kyiv, 2025-08-14

Description

We are hiring an MLOps Engineer specializing in Large Language Model (LLM) infrastructure to design and maintain the platform on which our AI models are developed, deployed, and monitored. As an MLOps Engineer, you will build the backbone of our machine learning operations, from scalable training pipelines to reliable deployment systems, ensuring that our NLP models (including LLMs) can be trained on large datasets and served to end users efficiently. This role sits at the intersection of software engineering, DevOps, and machine learning, and is crucial for accelerating our R&D in the Ukrainian LLM project. You’ll work closely with data scientists and software engineers to implement best-in-class infrastructure and workflows for the continuous delivery of AI innovations.

About us

[...] is a Ukrainian hybrid IT company and a resident of [...]. We are a subsidiary of Kyivstar, one of Ukraine's largest telecom operators.

Our mission is to change lives in Ukraine and around the world by creating technological solutions and products that unleash the potential of businesses and meet users' needs.

More than 500 specialists work daily in various areas: mobile and web solutions, as well as design, development, support, and technical maintenance of high-performance systems and services.

We believe in innovations that bring real improvements in quality, and we constantly challenge conventional approaches and solutions. Each of us shares an entrepreneurial culture, which allows us never to stop, to evolve, and to create something new.

What you will do

• Design and implement modern, scalable ML infrastructure (cloud-native or on-premises) to support both experimentation and production deployment of NLP/LLM models. This includes setting up systems for distributed model training (leveraging GPUs or TPUs across multiple nodes) and high-throughput model serving (APIs, microservices).
• Develop end-to-end pipelines for model training, validation, and deployment. Automate the ML workflow from data ingestion and feature processing to model training and evaluation, using technologies like Docker and CI/CD pipelines to ensure reproducibility and reliability.
• Collaborate with Data Scientists and ML Engineers to design MLOps solutions that meet model performance and latency requirements. Architect deployment patterns (batch, real-time, streaming inference) that are appropriate for various use cases (e.g., a real-time chatbot vs. offline analysis).
• Implement and uphold best practices in MLOps, including automated testing of ML code, continuous integration/continuous deployment for model updates, and rigorous version control for code, data, and model artifacts. Ensure every model and dataset is properly versioned and reproducible.
• Set up monitoring and alerting for deployed models and data pipelines. Use tools to track model performance (latency, throughput) and accuracy drift in production. Implement logging and observability frameworks to quickly detect anomalies or degradations in model outputs.
• Manage and optimize our Kubernetes-based deployment environments. Containerize ML services and use orchestration (Kubernetes, Docker Swarm, or similar) to scale model serving infrastructure. Handle cluster provisioning, health, and upgrades, possibly using Helm charts for managing LLM services.
• Maintain infrastructure-as-code (e.g., Terraform, Ansible) for provisioning cloud resources and ML infrastructure, enabling reproducible and auditable changes to the environment. Ensure our infrastructure is scalable, cost-effective, and secure.
• Perform code reviews and guide other engineers (both MLOps and ML developers) on building efficient and maintainable pipelines. Troubleshoot issues across the ML lifecycle, from data processing bottlenecks to model deployment failures, and continuously improve system robustness.

Qualifications and experience needed

Experience & Background:
• 4+ years of experience in DevOps, MLOps, or ML Infrastructure roles
• Strong foundation in software engineering and DevOps principles as they apply to machine learning
• Bachelor’s or Master’s in Computer Science, Engineering, or a related field is preferred

Cloud & Infrastructure:
• Extensive experience with cloud platforms (AWS, GCP, or Azure) and designing cloud-native applications for ML
• Comfortable using cloud services for compute (EC2, GCP Compute, Azure VMs), storage (S3, Cloud Storage), container registry, and serverless components where appropriate
• Experience managing infrastructure with Infrastructure-as-Code tools like Terraform or CloudFormation

Containerization & Orchestration:
• Proficiency in container technologies (Docker) and orchestration with Kubernetes
• Ability to deploy, scale, and manage complex applications on Kubernetes clusters; experience with tools like Helm for Kubernetes package management
• Knowledge of container security and networking basics in distributed systems

CI/CD & Automation:
• Strong experience implementing CI/CD pipelines for ML projects
• Familiarity with tools like Jenkins, GitLab CI, or GitHub Actions for automating testing and deployment of ML code and models
• Experience with specialized ML CI/CD (e.g., TensorFlow Extended (TFX), MLflow for model deployment) and GitOps workflows (Argo CD) is a plus

Programming & Scripting:
• Strong coding skills in Python, with experience in writing pipelines or automation scripts related to ML tasks
• Familiarity with shell scripting and one or more general-purpose languages (Go, Java, or C++) for infrastructure tooling
• Ability to debug and optimize code for performance (both in data pipelines and in model inference code)

ML Pipeline Knowledge:
• Solid understanding of the machine learning lifecycle and tools
• Experience building or maintaining ML pipelines, possibly using frameworks like Kubeflow, Airflow, or custom solutions
• Knowledge of model serving frameworks (TensorFlow Serving, TorchServe, NVIDIA Triton, or custom Flask/FastAPI servers for ML)

Monitoring & Reliability:
• Experience setting up monitoring for applications and models (using Prometheus, Grafana, CloudWatch, or similar) and implementing alerting for anomalies
• Understanding of model performance metrics and how to track them in production (e.g., accuracy on a validation stream, response latency)
• Familiarity with concepts of A/B testing or canary deployments for model updates in production

Security & Compliance:
• Basic understanding of security best practices in ML deployments, including data encryption, access control, and handling sensitive data in compliance with regulations
• Experience implementing authentication/authorization for model endpoints and ensuring infrastructure complies with organizational security policies

Team Collaboration:
• Excellent collaboration skills to work with cross-functional teams
• Experience interacting with data scientists to translate model requirements into scalable infrastructure
• Strong documentation habits for outlining system designs, runbooks for operations, and lessons learned

A plus would be

LLM/AI Domain Experience:
• Previous experience deploying or fine-tuning large language models or other large-scale deep learning models in production
• Knowledge of specialized optimizations for LLMs (such as model parallelism, quantization techniques like 8-bit or 4-bit quantization, and use of libraries like DeepSpeed or Hugging Face Accelerate for efficient training) will be highly regarded

Distributed Computing:
• Experience with distributed computing frameworks such as Ray for scaling up model training across multiple nodes
• Familiarity with big data processing (Spark, Hadoop) and streaming data (Kafka, Flink) to support feeding data into ML systems in real time

Data Engineering Tools:
• Some experience with data pipelines and ETL
• Knowledge of tools like Apache Airflow, Kafka, or dbt and how they integrate into ML pipelines
• Understanding of data warehousing concepts (Snowflake, BigQuery) and how processed data is used for model training

Versioning & Experiment Tracking:
• Experience with ML experiment tracking and model registry tools (e.g., MLflow, Weights & Biases, DVC)
• Ensuring that every model version and experiment is logged and reproducible for auditing and improvement cycles

Vector Databases & Retrieval:
• Familiarity with vector databases (Pinecone, Weaviate, FAISS) and retrieval systems used in conjunction with LLMs for augmented generation is a plus

High-Performance Computing:
• Exposure to HPC environments or on-prem GPU clusters for training large models
• Understanding of how to maximize GPU utilization, manage job scheduling (with tools like Slurm or Kubernetes operators for ML), and profile model performance to remove bottlenecks

Continuous Learning:
• Up-to-date with the latest developments in MLOps and LLMOps (Large Model Ops)
• Active interest in new tools or frameworks in the MLOps ecosystem (e.g., model optimization libraries, new orchestration tools) and a drive to evaluate and introduce them to improve our processes

What we offer

• Office or remote: it’s up to you. You can work from anywhere, and we will arrange your workplace
• Remote onboarding
• Performance bonuses
• Employee training, with opportunities to learn through the company’s library, internal resources, and programs from partners
• Health and life insurance
• Wellbeing program and corporate psychologist
• Reimbursement of expenses for Kyivstar mobile communication

Similar vacancies

  • Machine Learning Engineer (LLM Focus)

    Intelliarts, Lviv, 21 hours ago
    ... work on cutting-edge ML LLM applications in a compliance-heavy domain. Together with 4 Full Stack Engineers and another ML Engineer (all based in Ukraine within ... )Familiarity with cloud-based ML LLM infrastructure (AWS GCP Azure) and containerized ...
    ua.talent.com
  • Security Infrastructure Engineer / DevSecOps (Kyiv,Lviv)

    LotusFlare, Inc., Drohobych, 18 days ago
    ... getnomad.app. Overview: As Security Engineer on the Infrastructure Team at LotusFlare you will ...
    ua.talent.com
  • Senior Data Scientist/NLP Lead

    Київстар, Kyiv, 9 days ago
    ... processing solutions for our Ukrainian LLM project. You will lead our ... -end development of NLP and LLM models - from data exploration and ... for model training and evaluation.MLOps & Infrastructure:• Hands-on experience with containerization ( ...
    ua.talent.com
  • Senior Electrical Engineer – Sewerage Infrastructure

    Jadeer, Ukrainka, 23 days ago
    ... one of our esteemed international infrastructure clients for a major project ... to a high-impact infrastructure initiative in Africa, apply now ... are a highly skilled electrical engineer with experience in infrastructure, wastewater, or utility projects, and ...
    ua.talent.com
  • Data Engineer (NLP-Focused)

    Київстар, Kyiv, 9 days ago
    ... , including data augmentation labeling with LLM as teacher.Set up and manage cloud-based data infrastructure for the project. Configure and ... of experience as a Data Engineer or in a similar role, ...
    ua.talent.com
  • AI QA Engineer

    Київстар, Kyiv, 9 days ago
    ... are seeking an AI QA Engineer with specialization in LLM NLP model quality assurance to ... the context of our Ukrainian LLM project and other products.About ... is a must, as our LLM is oriented towards Ukrainian – you ...
    ua.talent.com
  • Senior Software Engineer

    Ciklum, 3 days ago
    ... , analysts and product owners, we engineer technology that redefines industries and ... a highly skilled Kubernetes Migration Engineer to support the transition from ... , migration strategy, and cloud-native infrastructure, with deep knowledge of AWS ...
    ua.talent.com
  • Senior Golang Software Engineer

    Ciklum, 11 days ago
    ... standard for AI infrastructure with a cloud and AI- ... and experienced Software Engineer with deep expertise in cloud infrastructure, networking, and security, alongside a ... security configurations Cloud infrastructure expertise: Hands-on experience designing ...
    ua.talent.com
  • Staff Data Engineer/Data Architect

    Trimble, 14 days ago
    ... a highly skilled Staff Data Engineer Data Architect for our Data ... comprehensive cloud solutions.Oversee cloud infrastructure management, including monitoring, maintenance, and ... proven experience as a Data Engineer with current focus on data ...
    ua.talent.com
  • Lead DevOps Engineer with AWS experience (#3623)

    N-iX, 21 days ago
    ... for a Lead DevOps Engineer to join our Cloud Solutions ... for a hands-on engineer with a strong technical foundation ... platform engineering, DevOps, or infrastructure roles Proficiency in at least ... with at least one Infrastructure as Code tool (e.g., ...
    ua.talent.com
  • IT Infrastructure & Security Manager (IR-428)

    Intellectsoft, Kyiv, 30 days ago
    ... hands-on, proactive IT Infrastructure & Security Manager to lead and ... a IT help desk engineer system admin or network engineer Experience with internal network management: ... and compliance trends, propose infrastructure improvements, and help assess third- ...
    ua.talent.com
  • Senior Go Software Engineer

    TechBiz Global, Ukraine, 11 days ago
    ... seeking a Senior Go Software Engineer to join one of our ... in Python and familiarity with infrastructure containerization, DevOps & MLOps tools (preferably GCP or AWS). ...
    ua.talent.com
  • Snr AI Security Engineer (Detection)

    Zoom, Dnipro, 4 days ago
    ... experienced, hands on AI Security Engineer (Detection)with proven experience in AI and LLM Security, to join the Threat ... years experience as a Security Engineer with a focus on AI Security, especially LLM security.Solid understanding of application ...
    ua.talent.com
  • Senior DevOps Engineer

    TechMagic, Lviv, Ukraine, 10 hours ago
    ... and passionate Senior DevOps Engineer with 4+ years of experience ... with Terraform or similar infrastructure automation tools.Experience with development ... in the team.ResponsibilitiesCloud infrastructure strategy and architecture.Designing efficient ...
    www.techmagic.co
  • Site Reliability Engineer ID38563 ($3,000 signing bonus)

    AgileEngine, Dnipro, 6 days ago
    ... their security services.- Experience with infrastructure-as-code tools like Terraform, Ansible, or CloudFormation.- Certifications in cloud technologies (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer Expert).- Familiarity with CI CD ...
    ua.talent.com
  • Site Reliability Engineer ID38563 ($3,000 signing bonus)

    AgileEngine, Odesa, 6 days ago
    ... their security services.- Experience with infrastructure-as-code tools like Terraform, Ansible, or CloudFormation.- Certifications in cloud technologies (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer Expert).- Familiarity with CI CD ...
    ua.talent.com
  • Site Reliability Engineer ID38563 ($3,000 signing bonus)

    AgileEngine, Kyiv, 6 days ago
    ... their security services.- Experience with infrastructure-as-code tools like Terraform, Ansible, or CloudFormation.- Certifications in cloud technologies (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer Expert).- Familiarity with CI CD ...
    ua.talent.com
  • Site Reliability Engineer ID38563 ($3,000 signing bonus)

    AgileEngine, Sokil'nyky, 6 days ago
    ... their security services.- Experience with infrastructure-as-code tools like Terraform, Ansible, or CloudFormation.- Certifications in cloud technologies (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer Expert).- Familiarity with CI CD ...
    ua.talent.com
  • Site Reliability Engineer ID38563 ($3,000 signing bonus)

    AgileEngine, Kharkiv, 6 days ago
    ... their security services.- Experience with infrastructure-as-code tools like Terraform, Ansible, or CloudFormation.- Certifications in cloud technologies (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer Expert).- Familiarity with CI CD ...
    ua.talent.com
  • DevOps Engineer (AWS)

    Andersen Ukraine, 21 days ago
    ... , competency building, and peer code infrastructure reviews.RequirementsExperience as a DevOps Engineer for 4-5+ years in ...
    people.andersenlab.com
  • AI Data Engineer

    OpenBet, Lviv, a month ago
    ... looking for a Data Engineer AI Platform Specialist to build and optimize the infrastructure powering our AI ambitions. In ... cloud services, and implement infrastructure-as-code to enable secure, ... expertise in cloud infrastructure (AWS, Azure, or GCP), data ...
    ua.talent.com
  • Field Service Engineer

    GEA, Bila Tserkva, 3 days ago
    ... & Experience:Higher technical education (Mechanical Engineer, Power Engineer, Automation Engineer, Electrical Engineer, etc.)3+ years of experience as a service engineer, repair engineer, or operations engineer (students with relevant skills are ...
    ua.talent.com
  • Senior/Middle Data Scientist (Data Preparation & Pre-training)

    Київстар, Kyiv, 5 days ago
    ... and transformation steps for LLM training datasets, including cleaning and ... data augmentation labeling with LLM as teacher.• Analyze large-scale ... advantage given our project’s focus.MLOps & Infrastructure:• Hands-on experience with containerization ( ...
    ua.talent.com
  • Senior/Middle Data Scientist (Benchmarking & Alignment)

    Київстар, Kyiv, 9 days ago
    ... other failure modes in LLM outputs.• Develop pipelines for synthetic ... Techniques: • Prior work on LLM safety, fairness, and bias mitigation.• ... given our project’s focus.MLOps & Infrastructure: • Hands-on experience with containerization ( ...
    ua.talent.com
  • LLM Engineer (Remote)

    Lion Group, 18 hours ago
    ... world-class AI professionals — from LLM Engineers and AI Architects to ... , Milvus)Designing scalable APIs for LLM inferenceExperimenting with prompt engineering and ... on state-of-the-art LLM projectsCollaborative remote-first environmentWork on ...
    ua.talent.com
  • REMOTE Automation Specialist - N8N & LLM Expert

    SalesGenius, Ukraine, 18 hours ago
    ... connect our various systems.Implementing LLM-powered solutions to automate content ... :Opportunity to shape our technical infrastructure from the ground up.Continuous ... tools (Zapier, Make Integromat, etc.)LLM Implementation: Experience working with and ...
    ua.talent.com
  • Automation Specialist - N8N & LLM Expert

    SalesGenius, Ukraine, 18 hours ago
    ... connect our various systems.Implementing LLM-powered solutions to automate content ... :Opportunity to shape our technical infrastructure from the ground up.Continuous ... tools (Zapier, Make Integromat, etc).LLM Implementation: Experience working with and ...
    ua.talent.com
  • Automation Specialist - N8N & LLM Expert

    Snaphunt, Ukraine, 18 hours ago
    ... Automation Specialist - N8N & LLM ExpertThe JobWhat Your ... systems.Implementing LLM-powered solutions to automate content ... our technical infrastructure from the ground up.Continuous ... Integromat, etc).LLM Implementation: Experience working with and ...
    ua.talent.com
  • REMOTE Automation Specialist - N8N & LLM Expert

    Snaphunt, Ukraine, 18 hours ago
    ... Automation Specialist - N8N & LLM ExpertThe JobWhat Your ... systems.Implementing LLM-powered solutions to automate content ... our technical infrastructure from the ground up.Continuous ... Integromat, etc.)LLM Implementation: Experience working with and ...
    ua.talent.com

Vacancy card:

  • Position: MLOps Engineer (LLM Infrastructure)
  • Posted: 2025-08-14
  • City: Kyiv
  • Salary:
  • Company: Київстар