MLOps Engineer
About Us:
At Qbitech, we are pioneers in software development and consulting, leveraging cutting-edge technologies to create innovative solutions for a global client base. Located in Novi Sad, Serbia, a growing tech hub, our team consists of skilled professionals with deep expertise in FinTech and E-Commerce. Since our founding in 2021, we've focused on building a technology-driven future where any concept can become a reality.
Position Overview:
As an MLOps Engineer, you will build and own the core infrastructure that powers our AI-first product development. You will design, implement, and scale the systems that make it possible for our product teams to rapidly develop, evaluate, and safely deploy both LLM-based and traditional ML applications.
You will be responsible for the entire AI lifecycle infrastructure—from model gateways and RAG pipelines to automated evaluation frameworks and cost-optimized serving. You'll work closely with AI application engineers, full-stack engineers, and product teams to ensure every AI system we ship for our FinTech and E-Commerce clients is fast, reliable, secure, and observable.
Requirements
- 5+ years of experience in ML infrastructure, MLOps, applied ML, or a related software engineering role.
- Strong proficiency in Python.
- Hands-on, production-level experience with ML/LLM frameworks (e.g., LangChain, LlamaIndex, Haystack, or similar).
- Solid understanding of retrieval-augmented generation (RAG), embeddings, vector databases, and ML/LLM evaluation methodologies.
- Proven experience deploying and managing models or AI systems in a production cloud environment (e.g., AWS, GCP, or Azure).
- Familiarity with prompt management, LLM observability, and CI/CD automation for AI workflows.
- Experience with observability tools (e.g., Datadog, CloudWatch, Prometheus) and databases (e.g., PostgreSQL, MySQL).
- Strong communication, problem-solving, and collaboration skills.
- Ability to communicate effectively in English.
Nice to have
- Experience with high-performance model serving frameworks (e.g., vLLM, Triton, Ray Serve, KServe).
- Deep understanding of specific LLM evaluation frameworks (e.g., OpenAI Evals, Promptfoo, TruLens).
- Experience with vector databases (e.g., Pinecone, Weaviate, Qdrant).
- Familiarity with specific AWS services (e.g., Lambda, DynamoDB, Aurora).
- Experience with API development using REST or GraphQL.
Responsibilities
- Build and Operate the ML/LLM Platform: Design, develop, and maintain core services for model routing, prompt registry, orchestration, and versioning.
- Enable Fast, Safe Experimentation: Implement and manage automated evaluation pipelines (offline and online) with golden sets, rubrics, and regression detection.
- Automate AI Lifecycle: Develop and maintain robust CI/CD processes specifically for ML models, prompts, and AI workflows, including approval gates and rollback capabilities.
- Collaborate Cross-Functionally: Partner with product and engineering pods to instrument RAG pipelines, integrate retrieval systems, and ensure seamless deployment of AI features.
- Optimize Performance and Cost: Profile latency and token usage, and tune caching strategies. Build comprehensive observability and monitoring for all AI systems (tracking costs, model behavior, and performance).
- Ensure Reliability and Safety: Implement and enforce critical AI guardrails (e.g., PII filtering, toxicity detection, jailbreak detection), which are especially crucial for sensitive FinTech and E-Commerce data.
- Manage Model Integration: Integrate, manage, and optimize third-party LLM APIs (e.g., OpenAI, Anthropic, Mistral) alongside internal fine-tuned models.
What will be your next steps?
Quick non-technical conversation
Our initial conversation is a brief, non-technical discussion to understand your background and career aspirations. We're keen to learn about your communication style and how you approach teamwork and decision-making.
60- to 90-minute technical interview
This in-depth technical assessment, lasting 60 to 90 minutes, is designed to evaluate your specific skills and expertise. We will present you with challenges relevant to our client's requirements.
Client interview
In this stage, you will meet directly with the client for a final technical discussion. This interview will be similar in format to our internal technical assessment, allowing the client to see firsthand how your expertise aligns with their specific project needs and team.
Offer
Congratulations on successfully completing our rigorous evaluation process. We are pleased to extend an offer and recommend you to our clients.
