




**AI MLOPS ENGINEER**

Automating the end-to-end lifecycle of GPU-accelerated LLM services for a governance, risk and compliance (GRC) AI-powered platform.

**Location:** Remote, with visits to the office in Alexandria, Egypt (San Stefano) for team meetings.
**Job Type:** Full-time, contract
**Reports to:** AI Systems Architect

AI MLOPS ENGINEERING

Do you thrive at the intersection of machine learning and DevOps, turning notebooks into reproducible, secure, low-latency services? We are hiring MLOps Engineers to build and maintain the pipelines, versioning, monitoring and automated rollout mechanics that keep our models (such as Llama 3 and Mixtral 8x7B) healthy in production. You will ensure every fine-tuned model that leaves the lab is traceable, auditable and resilient across GPU clusters.

BACKGROUND

At Allendevaux & Company we combine deep regulatory expertise with cutting-edge AI to create an intelligent GRC platform intertwined with data-protection and compliance workflows. You will join a lean AI cohort in Alexandria, collaborating with ML/LLM Engineers, DevOps specialists and security peers to ship trustworthy, well-governed inference services.

EXPERIENCE

Ideal candidates have hands-on experience implementing CI/CD for ML (GitOps, Argo CD, or similar), orchestrating containerised GPU workloads (Kubernetes, Kubeflow, MLflow, or Airflow), and instrumenting model-health dashboards (Prometheus + Grafana). Familiarity with secure model-registry practices, data-lineage tracking, and role-based access control (e.g., OPA, Kyverno) is highly desirable.

PERSONAL CHARACTERISTICS & SKILLS

* Clear written and spoken English; able to document pipelines and incident playbooks.
* Systems thinker who anticipates edge cases and rollback scenarios.
* Collaborative, partnering with ML engineers on experiment tracking and with DevOps on infrastructure as code.
* Security-first mindset: treating secrets management, encryption, and audit logs as part of "done."
* Continuous learner who keeps pace with the rapidly evolving MLOps tooling landscape.

RESPONSIBILITIES

* Building Git-driven CI/CD pipelines that automate training, validation, security scanning and promotion of Docker images.
* Managing model registries and artifact stores; versioning checkpoints and metadata for lineage and rollback.
* Orchestrating GPU workloads via Kubernetes, Triton Inference Server or vLLM with autoscaling policies.
* Implementing drift, latency and cost-metrics dashboards; alerting on SLA/SLO breaches.
* Securing secrets, service accounts and network policies in line with ISO 27001 controls.
* Developing IaC modules (Terraform, Helm, kustomize) for reproducible environment provisioning.
* Collaborating with Prompt-Engineering and UX teams to enable A/B or canary testing of new prompt or model versions.
* Writing runbooks and conducting post-incident reviews to drive continuous improvement.

MINIMUM QUALIFICATIONS

* Bachelor's degree in Computer Science, Data Engineering or a related field; **master's degree preferred**.
* 3+ years in DevOps or SRE roles, with at least 1 year focused on ML or data-pipeline operations.
* Proficiency with Docker, Kubernetes and at least one MLOps framework (Kubeflow, MLflow, Metaflow, or similar).
* Experience automating GPU driver/firmware management and monitoring GPU utilisation.

PREFERRED QUALIFICATIONS

* Experience with Argo Workflows, Airflow or Dagster for pipeline orchestration.
* Familiarity with NVIDIA Triton, TorchServe or Hugging Face TGI deployment stacks.
* Knowledge of policy-as-code tools (OPA, Kyverno) and supply-chain security (Cosign, SLSA).
* Exposure to standards and frameworks (ISO 27001, ISO 42001, SOC 2) or GRC data sets.
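To give candidates a concrete sense of the canary-testing and promotion responsibilities above, here is a minimal sketch of a promotion gate: a candidate model version is promoted only if its error rate and p95 latency stay within tolerance of the baseline. The metric names, thresholds, and `should_promote` helper are illustrative assumptions, not part of our actual stack.

```python
# Minimal sketch of a canary promotion gate (illustrative only).
# A candidate model is promoted when its error rate and p95 latency
# stay within configured tolerances of the baseline deployment.

from dataclasses import dataclass


@dataclass
class CanaryMetrics:
    error_rate: float       # fraction of failed requests during the canary window
    p95_latency_ms: float   # 95th-percentile request latency in milliseconds


def should_promote(baseline: CanaryMetrics,
                   candidate: CanaryMetrics,
                   max_error_delta: float = 0.01,
                   max_latency_ratio: float = 1.10) -> bool:
    """Return True if the candidate version may replace the baseline."""
    error_ok = candidate.error_rate <= baseline.error_rate + max_error_delta
    latency_ok = candidate.p95_latency_ms <= baseline.p95_latency_ms * max_latency_ratio
    return error_ok and latency_ok


if __name__ == "__main__":
    # Candidate is slightly slower but within the 10% latency budget.
    baseline = CanaryMetrics(error_rate=0.020, p95_latency_ms=120.0)
    candidate = CanaryMetrics(error_rate=0.025, p95_latency_ms=130.0)
    print(should_promote(baseline, candidate))  # True
```

In production such a gate would typically read its metrics from Prometheus and run as a pipeline step (e.g., in Argo Workflows) before artifact promotion; the point here is only the shape of the decision.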


