Senior Software Test Engineer - GenAI Testing
About the Role:
We are seeking a passionate and forward-thinking Senior QA Engineer to lead quality assurance for Generative AI (GenAI) solutions embedded within our Digital Twin platform. This is a high-impact role that goes beyond traditional QA, focusing on the nuanced evaluation, reliability, and guardrails of AI-powered systems in production.
You will be responsible not just for testing, but also for establishing evaluation frameworks, defining AI quality benchmarks, and upskilling other QA engineers in GenAI testing methods. The ideal candidate brings a mix of structured QA discipline, hands-on familiarity with GenAI systems (LLMs, RAG, agents), and a strong sense of ownership.
Key Responsibilities:
• Design and implement end-to-end QA strategies for applications using Node.js, integrated with LLMs, retrieval-augmented generation (RAG), and Agentic AI workflows.
• Establish comprehensive benchmarks and quality metrics for GenAI components including accuracy, coherence, relevance, stability, and safety.
• Develop structured evaluation datasheets for LLM behaviour validation: test prompts, expected responses, classification criteria, and scoring rubrics.
• Perform data quality testing for RAG databases and ensure relevant, high-quality retrieval to minimize hallucinations and improve grounding.
• Conduct A/B testing across model versions, prompt designs, and system configurations to measure and compare output quality.
• Define methodologies for testing non-deterministic behaviours, and simulate such behaviours using Agentic AI testing techniques.
• Collaborate closely with developers, product owners, and AI engineers to test prompt engineering pipelines, function-calling interfaces, and fallback logic.
• Build QA automation where applicable and integrate GenAI evaluations into CI/CD pipelines.
• Lead internal capability development by mentoring QA peers on GenAI testing practices and helping evolve the organization’s AI quality maturity.
Required Skills and Qualifications:
• 6+ years of experience in software quality assurance, with at least 3 years working in or around GenAI or LLM-based systems.
• Deep understanding of GenAI quality dimensions: response grounding, factual correctness, context awareness, and hallucination minimization.
• Experience creating and maintaining LLM evaluation datasets and designing test cases for dynamic prompt behaviour.
• Hands-on experience with tools and techniques for testing retrieval pipelines, embedding quality, and vector similarity results in RAG architectures.
• Familiarity with non-deterministic testing strategies, agent loop evaluation, and multi-step LLM task validation.
• Comfortable working with APIs, logs, test scripts, and tracing tools to validate both system and AI behaviour.
• Strong analytical thinking and a methodical approach to identifying bugs, regressions, and inconsistencies in AI outputs.
• Bachelor’s or master’s degree in engineering.
Preferred Skills:
• Experience with GenAI tools/platforms like OpenAI, LangChain, Semantic Kernel, Hugging Face, Pinecone, or Weaviate.
• Exposure to evaluating LLMs in production settings, including safety nets, guardrails, and red-teaming approaches.
• Familiarity with prompt tuning, few-shot learning, and function/tool calling in LLMs.
• Basic scripting knowledge (Python, JavaScript, or TypeScript) for building test harnesses or validation utilities.
- Department: CPO
- Locations: Bengaluru
- Remote status: Hybrid
- Employment type: Full-time
- Employment level: First/Mid-Level Officials

OUR POWER IS CURIOSITY, CREATION AND INNOVATION
We believe you love to experiment, challenge the established, co-create, develop and cultivate. Together we can explore new answers to today’s challenges and future opportunities, and talk about how industrial digitalisation can be a part of the solution for a better tomorrow. We believe that different perspectives are crucial for developing game-changing technology for a better tomorrow. Join us in taking on this challenge!
Already working at Kongsberg Digital?
Let’s recruit together and find your next colleague.