Teaching a Private SLM About Your Target Application Using Document RAG for QA Testing

Private Small Language Models (SLMs) hosted on-site or in a private cloud are becoming the default choice for enterprise QA teams because of privacy, compliance, and control. But the moment we try to use a private SLM for real QA work—generating test cases, understanding application flows, or validating business rules—we hit a hard truth: the model doesn’t know our target application under test. It doesn’t understand our requirements, our test plans, our architecture, or even the terminology specific to the domain (Finance, Telecom, Life Sciences). As a result, the SLM produces generic, assumption-driven answers that cannot be trusted in a testing environment. This challenge is exactly where RAG for QA Testing becomes valuable.

In this blog, I’ll show how we solved this problem by teaching the SLM about the target application using Document-based Retrieval-Augmented Generation (RAG), and how this approach transforms a private SLM from a generic text generator into a project-aware QA assistant for RAG for QA Testing workflows.

1. Introduction

Private SLMs are widely used in QA teams because they are secure and work inside enterprise environments. But when we try to use a private SLM for real QA tasks—like understanding application flows or generating test cases—we face a common issue: the SLM does not know our target application. It has no idea about our requirements, test cases, or business rules, so it gives only generic answers.

In this blog, I show how we solve this problem by teaching the SLM using Document-based RAG (Retrieval-Augmented Generation). By connecting the SLM to real application-specific documents, the model starts answering based on actual application behaviour. Through real screenshots, I’ll show how Document RAG turns a private SLM into a useful and reliable QA assistant.

2. The Real Problem with Private SLMs in QA

When we use a private SLM in QA projects, we often expect it to behave like a smart team member who understands our application. But in reality, a private SLM ships with a fixed body of general software knowledge, not your application-specific details.

It does not know:

  • How our application works
  • What modules and flows exist
  • What validations the requirements define
  • How QA engineers write test cases for the target application

So when a QA engineer asks questions like:

  • “Explain the onboarding flow of our application.”
  • “Generate test cases for the Add Vendor feature.”
  • “What are the negative scenarios for the SKYBoard module?”

The private SLM gives generic answers based on assumptions, not based on the real application. These answers may look correct at first glance, but they often miss important business rules, edge cases, and validations that matter in testing.

In QA, generic answers are dangerous. They reduce trust in AI, force testers to double-check everything, and limit the real value of using SLMs in testing workflows.

This is the core problem:

Private SLMs are powerful, but they are completely unaware of your target application unless you teach them.

3. Why Document RAG Is Mandatory for QA Testing

To make a private SLM useful for QA, we must teach it about the target application: its concepts, terminology, workflows, and so on. Without this, the model will always give generic answers, no matter how advanced it is.

This is where Document-based Retrieval-Augmented Generation (RAG) becomes mandatory.

Instead of training or fine-tuning the SLM, Document RAG works by:

  • Storing target application documents outside the model
  • Searching those documents when a user asks a question
  • Providing only the relevant content to the SLM at runtime

This means the SLM answers questions based on your documented target application knowledge base, not on assumptions.

For QA teams, this is especially important because:

  • Requirements change frequently
  • Test cases evolve every sprint
  • New features introduce new flows
  • Teams keep updating demo videos and documentation (Or not 😀).

Fine-tuning a model every time something changes is not practical. Document RAG solves this by keeping the knowledge dynamic and always up to date.

In simple terms:

Document RAG does not change the SLM — it teaches the SLM using your actual target application documents.

This approach allows the private SLM to understand:

  • Application flows
  • Business rules
  • Validation logic
  • Real test scenarios

In the next sections, I’ll show how this works in practice using screenshots from my RAG implementation.

4. What I Built – Document RAG System for QA

To solve the problem of private SLMs not understanding target applications, I built a Document RAG system specifically designed for QA software testing.

The idea was simple:
Instead of expecting the SLM to “know” the application, we connect it directly to the documents containing the target application knowledge base and let it learn from them at query time.

High-Level Architecture

The system has four main parts:

  1. Application Documents as Source of Truth
    The system stores all QA-related documents in a single place.
    • Requirement documents
    • Test cases and test plans
    • Architecture notes
    • JSON and structured QA data
    • Demo and walkthrough videos
  2. RAG Engine (Document Processing Layer)
    The RAG engine:
    • Reads documents from the workspace
    • Splits them into meaningful chunks
    • Converts them into vector embeddings
    • Stores them in a vector database
  3. Private SLM (Reasoning Layer)
    The system uses a private SLM only for reasoning.
    It does not store application knowledge permanently.
    It answers questions using the context provided by RAG.
  4. MCP Server (Integration Layer)
    The system exposes the RAG system as an MCP tool, so the SLM can:
    • Query documents.
    • Perform deep analysis
    • Retrieve answers with traceable sources

This design keeps the system:

  • Modular
  • Secure
  • Easy to extend across multiple projects

How QA Engineers Use It

QA engineers interact with the system directly from VS Code using the Continue extension. They can ask real project questions, such as:

  • “Explain the Add Employee flow.”
  • “Generate test cases for this module.”
  • “What validations do the requirements define?”

The answers come only from indexed documents, making the output reliable and QA-friendly.

5. Implementation – Documents Indexed into RAG

The first and most important step in teaching a private SLM is feeding it the right knowledge. In my implementation, this knowledge comes directly from target application documents, not sample data or assumptions.

What the RAG System Indexes

The RAG system continuously scans a dedicated workspace folder that contains all QA-related artifacts, such as:

  • Requirement documents (.pdf, .docx, .txt)
  • Test cases and test plans
  • Architecture and functional notes
  • JSON and structured QA data
  • Demo and walkthrough videos

These documents represent the single source of truth for the application.

How Documents Are Prepared for RAG

When teams add or update documents:

  1. The RAG engine reads each file from the workspace (local file system, Google Drive, OneDrive, etc.).
  2. The RAG engine cleans and normalizes the content (especially PDFs).
  3. The RAG engine splits large documents into meaningful chunks.
  4. The system converts each chunk into vector embeddings.
  5. The system stores the embeddings in a vector database.

This process ensures that:

  • The system does not lose any important knowledge.
  • Large documents remain searchable
  • Retrieval is fast and accurate
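
The preparation steps above can be sketched as a small Python pipeline. This is an illustrative stand-in rather than the real implementation: the hash-based embed function and the in-memory store dict take the place of a real embedding model and a vector database such as Chroma.

```python
# Minimal sketch of the document-preparation flow (steps 1-5 above).
# The embedder and the store are deliberate stand-ins: a real system would
# use a PDF parser, an embedding model, and a vector database like Chroma.
import hashlib
import re

def clean(text: str) -> str:
    """Step 2: normalize whitespace and strip noise."""
    return re.sub(r"\s+", " ", text).strip()

def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Step 3: split a document into overlapping chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(chunk_text: str) -> list[float]:
    """Step 4: stand-in embedding; a real system calls an embedding model."""
    digest = hashlib.sha256(chunk_text.encode()).digest()
    return [b / 255 for b in digest[:8]]

def index_document(store: dict, doc_id: str, raw_text: str) -> None:
    """Steps 1-5 end to end: read, clean, chunk, embed, store."""
    for i, piece in enumerate(chunk(clean(raw_text))):
        store[f"{doc_id}:{i}"] = {"text": piece, "vector": embed(piece)}

store: dict = {}
index_document(store, "user_guide.pdf", "Supervisors approve timesheets. " * 30)
print(len(store), "chunks indexed")
```

Because every chunk keeps its source id (file name plus chunk index), the retrieval step can later report exactly which document an answer came from.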

Why This Matters for QA

Because the RAG engine indexes the documents directly from the workspace:

  • The SLM always works with the latest information from documents
  • Updated test cases are immediately available
  • The system does not require retraining or fine-tuning.

From a QA perspective, this is critical.
The AI assistant answers questions only based on what exists in the target application documents, not on general industry assumptions.

Screenshot: RAG System Workspace

RAG for QA Testing

This screenshot shows the actual workspace structure used by the Document RAG system:

  • target_docs/
    Contains real QA artifacts:
    • Requirement documents (PDF)
    • Test case design files
    • JSON configuration data
    • Excel-based test data
    • Demo images and videos
  • target_docs/videos/
    Stores walkthrough and demo videos that are indexed using:
    • Speech-to-text (video transcripts)
    • OCR on video frames (for UI text)
  • db_engine/
    This is the vector database generated by the RAG engine:
    • chroma.sqlite3 stores embeddings
    • Chunked document knowledge lives here

6. Ask QA Questions Using VS Code (Continue + MCP)

Once the documents are indexed, the next step is how QA engineers actually use the system in their daily work. In my implementation, everything happens inside VS Code, using the Continue extension connected to the RAG system through an MCP server.

QA Workflow Inside VS Code

Instead of switching between tools, documents, and browsers, a QA engineer can simply ask questions directly in VS Code, such as:

  • “How do I add a new employee in the PIM module?”
  • “Explain the validation rules for this feature.”
  • “Generate test cases based on the requirement document.”

These are real QA questions that require application-specific knowledge, not generic AI answers.

What Happens Behind the Scenes

When a question is asked in Continue:

  1. The query is sent to the MCP server
  2. The MCP server invokes the RAG tool
  3. Relevant documents are retrieved from the vector database
  4. The retrieved content is passed to the private SLM
  5. The SLM generates an answer strictly based on those documents

At no point does the SLM guess or rely on public knowledge.
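
The retrieval-and-answer part of that flow can be sketched under simple assumptions: an in-memory chunk store with toy two-dimensional vectors, cosine similarity for ranking, and a prompt that forbids answers outside the retrieved context. Names like build_prompt are mine, not the system's.

```python
# Sketch of steps 3-5 above: retrieve relevant chunks, then hand ONLY that
# context to the model. The store, vectors, and prompt wording are
# illustrative stand-ins, not the real implementation.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(store: dict, query_vec: list[float], k: int = 3) -> list[str]:
    """Step 3: rank indexed chunks by similarity and keep the top k."""
    ranked = sorted(store.values(), key=lambda c: cosine(c["vector"], query_vec), reverse=True)
    return [c["text"] for c in ranked[:k]]

def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Step 4: the SLM sees only retrieved context, with a strict instruction."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer ONLY from the context below. If the answer is not there, "
        "say you don't know.\n\nContext:\n" + context + "\n\nQuestion: " + question
    )

store = {
    "guide:0": {"text": "Supervisors approve timesheets from the Timesheets menu.", "vector": [1.0, 0.0]},
    "guide:1": {"text": "The PIM module stores employee records.", "vector": [0.0, 1.0]},
}
prompt = build_prompt("How are timesheets approved?", retrieve(store, [0.9, 0.1], k=1))
print(prompt)
```

Only the timesheet chunk reaches the model here; the unrelated PIM chunk is filtered out by retrieval, which is exactly why the SLM cannot drift into guessing.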

Why MCP Matters Here

Using MCP provides a clean separation of responsibilities:

Document RAG System

This makes the system:

  • Modular
  • Scalable
  • Easy to extend across projects

For QA teams, this means the AI assistant behaves like an application-aware testing expert, not a generic chatbot.

RAG for QA Testing

This screenshot demonstrates how Model Context Protocol (MCP) is used to connect a private SLM with the Document RAG system during a real QA query.

You can see the list of registered MCP tools, such as:

🔎 rag_query – Standard RAG Query Tool

This is the primary tool used for document-based question answering.

It allows QA engineers to ask questions about the client application.
If debug=True, it returns structured JSON that includes:

  • Original user question
  • Rewritten query (if applied)
  • Whether query rewriting was triggered
  • Retrieved document sources
  • Final generated answer

This tool ensures that responses are grounded in real client documents.
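
As an illustration, a debug=True payload could be shaped like the dictionary below. The rewrite-related field names (question_rewritten, rewrite_enabled, rewrite_applied) match the system's debug output; the remaining keys and all values are invented for the example.

```python
# Illustrative shape of a rag_query debug response. Only the rewrite-related
# field names are taken from the tool's debug output; everything else here
# (key names "question", "sources", "answer", and all values) is assumed.
import json

debug_response = {
    "question": "How does a supervisor approve a timesheet?",
    "question_rewritten": "Supervisor timesheet approval",
    "rewrite_enabled": True,
    "rewrite_applied": True,
    "sources": [{"file": "OrangeHRM_User_Guide.pdf", "page": 113}],
    "answer": "A supervisor opens the pending timesheet and selects Approve.",
}
print(json.dumps(debug_response, indent=2))
```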

🎥 index_video – Index a Single Video

This tool indexes a single demo or walkthrough video into the RAG database.

It processes:

  • Speech-to-text transcription
  • Optional OCR on video frames

Once indexed, video knowledge becomes searchable like any other document.

📂 index_all_videos – Bulk Video Indexing

This tool scans the target_docs/videos directory and indexes all .mp4 files into the RAG database at once.

It is useful when:

  • New KT sessions are added
  • Demo recordings are uploaded
  • Large batches of videos need indexing

🧠 hybrid_deep_query – Advanced RAG + Full Document Context

This tool is designed for complex or high-precision queries.

It works by:

  1. Using RAG to identify the most relevant files
  2. Loading the complete content of those files (CAG – Context-Aware Generation)
  3. Generating a deep, fully context-grounded answer

This is ideal for detailed QA analysis or requirement validation.
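
The two stages can be sketched as follows; the vote-based file ranking and the function names are hypothetical simplifications of whatever selection logic the real tool uses.

```python
# Sketch of the hybrid flow above: RAG hits pick the most relevant files
# (stage 1), then the COMPLETE content of those files becomes the context
# (stage 2). The chunk-id format and vote-based ranking are illustrative.
from collections import Counter

def rank_files(chunk_hits: list[str]) -> list[str]:
    """Stage 1: each retrieved chunk id ('file:index') votes for its file."""
    votes = Counter(hit.split(":")[0] for hit in chunk_hits)
    return [name for name, _ in votes.most_common()]

def hybrid_deep_context(chunk_hits: list[str], full_docs: dict, max_files: int = 2) -> str:
    """Stage 2: load the full text of the top-ranked files as context."""
    files = rank_files(chunk_hits)[:max_files]
    return "\n\n".join(full_docs[f] for f in files)

full_docs = {
    "user_guide.pdf": "Full text of the user guide ...",
    "test_plan.docx": "Full text of the test plan ...",
}
ctx = hybrid_deep_context(["user_guide.pdf:3", "user_guide.pdf:7", "test_plan.docx:1"], full_docs)
print(ctx)
```

The trade-off is deliberate: stage 1 keeps retrieval cheap, while stage 2 gives the model whole-document context for questions where isolated chunks are not enough.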

❤️ health_check – Connectivity Verification

A lightweight tool that verifies whether the MCP server is running and properly connected to the vector database.

This helps ensure:

  • Server availability
  • Database presence
  • Stable MCP communication

Screenshot: Asking a QA Question in VS Code

RAG for QA Testing

This screenshot demonstrates:

  • A real QA question typed inside VS Code: “Retrieve information related to how to add a new employee in the PIM Module using RAG …”
  • Continue invoking the rag_query RAG MCP tool
  • The workflow staying fully inside the IDE

On the right side, when a QA question is asked, Continue clearly shows that it is using the RAG rag_query tool. This is a very important indicator.

This message confirms that:

  • The SLM is not answering from its own knowledge
  • The response is generated by calling the RAG MCP tool
  • Documents are actively retrieved and used to form the answer

In other words, the SLM is behaving like a tool user, not a guessing chatbot.

What This Means for QA Testing

For QA engineers, this brings confidence and transparency:

  • Answers are based on real application documentation
  • No hallucination or assumed workflows
  • Clear visibility into which tool was used
  • Easy to debug and validate AI responses

This is critical in QA, where incorrect assumptions can lead to missed defects and unreliable test coverage.

Key Takeaway from This Screenshot

MCP makes RAG visible, verifiable, and production-ready.

Instead of hiding retrieval logic inside prompts, MCP exposes RAG as a first-class QA tool that the private SLM explicitly uses. This is what turns AI from an experiment into a trusted QA assistant

7. Advanced RAG in Action – Query Rewriting & Source-Aware Retrieval

One of the biggest challenges in QA is that testers ask questions in everyday human language, while documents speak a more formal and structured language.

QA engineers usually ask questions like:

  • “How does a supervisor approve or reject a timesheet?”
  • “What happens after submission?”

But documentation often uses:

  • Formal headings
  • Role-based terms
  • Structured language (Supervisor, Manager, Approval Workflow, etc.)

If we send the user’s raw question directly to vector search, retrieval can be incomplete or noisy.

To solve this, I implemented Query Rewriting as part of my RAG pipeline — a key feature that turns this into an advanced, enterprise-grade RAG system.

What Is Query Rewriting in RAG?

Query rewriting means:

  • Taking a conversational QA question
  • Understanding the intent
  • Converting it into a clean, focused retrieval query
  • Using that rewritten query to fetch documents

In simple words:

Users ask questions like humans.
Documents are written like manuals/SOPs/Workflows.
Query rewriting bridges that gap.

Query Rewritten

How Query Rewriting Works in My RAG System

Before document retrieval happens:

  1. The system looks at:
    • Current question
    • Recent conversation history
  2. It rewrites the question into a single, standalone search query
  3. That rewritten query is used for vector retrieval
  4. Only the most relevant document chunks are passed to the SLM
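
In the real pipeline the SLM itself performs the rewrite; the sketch below substitutes a tiny, purely illustrative rule table so the shape of the step is easy to see and test.

```python
# Stand-in for the query-rewriting step. A real implementation prompts the
# SLM with the question plus recent history; here a rule table maps
# conversational phrasing to the documentation's role-based vocabulary.
ROLE_TERMS = {
    "approve or reject": "approval workflow",
    "boss": "supervisor",
    "timesheet": "timesheet",
}

def rewrite_query(question: str, history=None) -> dict:
    """Steps 1-3 above: inspect the question, emit a standalone search query."""
    lowered = question.lower()
    terms = [mapped for key, mapped in ROLE_TERMS.items() if key in lowered]
    rewritten = " ".join(dict.fromkeys(terms)) or question
    return {
        "question": question,
        "question_rewritten": rewritten,
        "rewrite_applied": rewritten != question,
    }

result = rewrite_query("How does a supervisor approve or reject a timesheet?")
print(result["question_rewritten"])
```

The conversational question comes in; a compact, documentation-flavoured search query goes out to the vector store, which is the whole point of the step.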

This step dramatically improves:

  • Retrieval precision
  • Answer accuracy
  • QA trust in AI outputs

RAG for QA Testing

This screenshot demonstrates an advanced RAG capability that goes beyond basic document retrieval — query rewriting combined with source-level traceability.

Query Rewriting in Action (Left Panel)

On the left side, the RAG system returns a structured debug response that clearly shows how the user’s question was processed before retrieval.

The original user question was:

“How does a supervisor approve or reject an employee’s timesheet?”

Before performing a document search, the system automatically rewrote the query into a more focused retrieval term:

  • question_rewritten: "Supervisor"
  • rewrite_enabled: true
  • rewrite_applied: true

This step is critical because QA engineers usually ask questions in natural language, while documentation is written using formal role-based terminology. Query rewriting bridges this gap and ensures that the retrieval engine searches using the language of the documentation, not the language of conversation.

Document-Backed Retrieval with Exact Page References

The same debug output also shows the retrieved sources:

  • Application document: OrangeHRM User Guide (PDF)
  • Exact page numbers: pages 113 and 114
  • Multiple retrieved chunks confirming consistency

On the right side, the generated answer is explicitly labeled as:

“As documented in the OrangeHRM User Guide – pages 113–114.”

This confirms that:

  • The response is not generated from model assumptions
  • Every step is grounded in real application documentation
  • QA engineers can instantly verify the source

Why This Matters for QA Software Testing

In QA, accuracy and traceability are more important than creativity.

This screenshot proves that:

  • The private SLM does not hallucinate
  • Answers come strictly from approved documents
  • Every response can be audited back to the source PDF

If the information is not found, the system safely responds with:

“I don’t know based on the documents.”

This controlled behaviour is intentional and essential for building trust in AI-assisted QA workflows.

Key Takeaway

Advanced RAG is not just about retrieving documents — it’s about retrieving the right content, for the right question, with full traceability.

Query rewriting ensures precise retrieval, and source-level evidence ensures QA-grade reliability. Together, they transform a private SLM into a trusted, project-aware QA assistant.

8. What Types of Files and Resources Does This RAG System Support?

In real projects, knowledge is never stored in a single format.
Requirements, designs, architectures, user guides, manuals, test cases, configurations, and data are scattered across multiple file types. A useful RAG system must be able to understand all of them, not just PDFs.

This RAG system is designed to index and reason over multiple relevant file formats, all from a single workspace.

Supported File Types in the RAG Workspace

As shown in the screenshot, the target_docs folder acts as the knowledge source for the RAG engine. It supports the following resource types:

📄 Text & Documentation Files

  • .txt – Test case descriptions, notes, and exploratory testing ideas
  • .pdf – Official requirement documents, user guides, specifications
  • .md – QA documentation and internal knowledge pages

These files are:

  • Cleaned
  • Chunked
  • Indexed into the vector database for semantic search

📊 Structured Test Data Files

  • .json – Configuration values, test inputs, environment data
  • .xlsx / .csv – Professional test data sheets, boundary values, scenarios

Structured files are especially important in QA because they represent real test inputs, not just documentation.
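
A format dispatcher for such a workspace might look like the sketch below, restricted to formats the Python standard library can read on its own; PDF, XLSX, and image OCR need external libraries and are only flagged here.

```python
# Sketch of a per-extension loader for the target_docs workspace. Only
# stdlib-readable formats are handled; everything else is flagged so the
# indexer knows an external parser (PDF, XLSX, OCR) is required.
import csv
import json
import tempfile
from pathlib import Path

def load_text(path: Path) -> str:
    suffix = path.suffix.lower()
    if suffix in {".txt", ".md"}:
        return path.read_text(encoding="utf-8")
    if suffix == ".json":
        # Re-serialize so malformed JSON fails loudly at indexing time.
        return json.dumps(json.loads(path.read_text(encoding="utf-8")))
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            return "\n".join(" | ".join(row) for row in csv.reader(f))
    raise ValueError(f"no stdlib loader for {suffix}; pdf/xlsx/images need extra libraries")

tmp = Path(tempfile.mkdtemp())
(tmp / "cases.csv").write_text("id,title\nTC_001,Login works\n")
print(load_text(tmp / "cases.csv"))
```

Flattening a CSV row into readable text like this keeps structured test data searchable alongside prose documents in the same vector index.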

🖼 Images & Visual Assets

  • .png, .jpg (via OCR)
    • Screenshots
    • Error messages
    • UI states

Text inside images is extracted using OCR and indexed, allowing the SLM to answer questions based on visual evidence, not assumptions.

🎥 Videos (Optional but Supported)

  • Demo recordings
  • Product Walkthrough videos
  • KT session recordings

Videos are processed using:

  • Speech-to-text (audio transcription)
  • Optional OCR on video frames

This allows QA teams to query spoken explanations that never existed in written form.
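
Assuming a speech-to-text model has already produced a transcript, turning it into searchable, timestamped chunks might look like the sketch below; the function name and metadata fields are my own.

```python
# Sketch of video-transcript indexing. Actual speech-to-text and frame OCR
# are outside the stdlib; here the transcript is given and only the shape
# of the indexed chunks is shown.
def transcript_to_chunks(video_name: str, segments: list[tuple[float, str]]) -> list[dict]:
    """Each chunk keeps its start time so an answer can cite the exact moment."""
    return [
        {"source": video_name, "start_sec": start, "text": text}
        for start, text in segments
        if text.strip()  # drop silent/empty segments
    ]

segments = [
    (0.0, "In this demo we open the PIM module."),
    (12.5, "Click Add Employee and fill the mandatory fields."),
]
chunks = transcript_to_chunks("pim_walkthrough.mp4", segments)
print(chunks[1]["start_sec"])
```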

Why This Matters for QA Teams

This multi-format support ensures that:

  • No QA knowledge is lost
  • Testers don’t need to rewrite documents for AI
  • The SLM learns from exactly what the team already uses

Instead of changing QA workflows, the RAG system adapts to existing QA artifacts.

Key Takeaway

A QA RAG system is only as good as the data it can understand (garbage in, garbage out).

By supporting documents, structured data, images, and videos, this RAG system becomes a true knowledge layer for QA, not just a document chatbot.

9. Why This Approach Scales Across QA Projects

One of the biggest mistakes teams make with AI in QA is building solutions that work for one project but collapse when reused for another. This RAG-based approach was intentionally designed to scale across multiple QA projects and different applications without rework.

No Application-Specific Hardcoding

The RAG system does not hardcode:

  • Application names
  • Module flows
  • Test scenarios
  • Business rules

Instead, each team teaches the SLM through its own documents.
When a new project starts, the only action required is:

  • Add the application’s QA artifacts to the target_docs folder
  • Rebuild the index

The same RAG engine and MCP tools continue to work without change.

Document-Driven Knowledge, Not Model Memory

Because all knowledge lives in documents:

  • No fine-tuning is required per application
  • No retraining cost
  • No risk of cross-application data leakage

Each application’s knowledge stays isolated at the document level, which is critical for:

  • Enterprise security
  • Compliance
  • Multi-application QA environments

MCP Makes the System Reusable Everywhere

Exposing RAG through MCP tools makes this system:

  • IDE-agnostic
  • SLM-agnostic
  • Workflow-independent

Whether QA teams use:

  • VS Code today
  • Another IDE tomorrow
  • Different private SLMs in the future

The same MCP contract remains valid.

This is what makes the solution future-proof.

Works for Different QA Maturity Levels

This approach scales naturally across teams:

  • Manual QA teams
    → Use it to understand requirements and flows
  • Automation QA teams
    → Generate scenarios, validations, and test logic
  • New joiners
    → Faster onboarding using project-specific answers
  • Senior QA / Leads
    → Analyse coverage, gaps, and test strategies

All without changing the system.

Minimal Maintenance, Maximum Reuse

When requirements change:

  • Update the document
  • Re-run indexing

That’s it.

There is no need to:

  • Rewrite prompts
  • Update AI logic
  • Touch model configurations

This makes the system low-maintenance and high-impact.

Key Takeaway

Scalable AI is not built by making the model smarter —
It’s built by making the knowledge portable.

By combining Document RAG, MCP, and private SLMs, this approach delivers an application-aware/Domain-aware QA assistant that scales effortlessly across projects, teams, and organizations.

Conclusion

Using AI in QA is not about choosing the most powerful SLM or LLM, for that matter. It’s about making the SLM understand the target application or target domain. A private SLM, by itself, does not know requirements, business flows, or test logic, which makes its answers generic and unsafe for real testing work.

This is where Document-based RAG becomes essential for RAG for QA Testing. By grounding the SLM in real application artifacts—BRDs, PRDs, SRS documents, designs, architectures, test cases, data files, and user guides—the AI is able to produce answers that are accurate, verifiable, and relevant to the project. Advanced capabilities like query rewriting and source traceability further ensure that every response is backed by documented evidence, eliminating hallucinations.

Exposing this intelligence through MCP tools makes the system transparent, reusable, and scalable across multiple projects and applications. The architecture stays the same; only the documents change. This keeps maintenance low while maximizing impact.

Final Thought

AI becomes truly useful in QA when it stops guessing and starts learning from real application knowledge.

By combining private SLMs with Document RAG and MCP, we can build AI-powered QA assistants that teams can trust, audit, and scale with confidence.

Zero Code, Zero Headache – How to do Manual Testing with Playwright MCP?

Manual Testing with Playwright MCP – Have you ever felt that a simple manual test should be less manual?

For years, quality assurance relied on pure human effort to explore, click, and record. But what if you could perform structured manual and exploratory testing, generate detailed reports, and even create test cases—all inside your Integrated Development Environment (IDE), using zero code?

I’ll tell you this: there’s a tool that can help us perform manual testing in a much more structured and easy way inside the IDE: Playwright MCP. 

Section 1: End the Manual Grind – Welcome to AI-Augmented QA 

The core idea is to pair a powerful AI assistant (like GitHub Copilot) with a tool that can control a real browser (Playwright MCP). This simple setup is done in only a few minutes. 

The Essential Setup for Manual Testing with Playwright MCP: Detailed Steps

  • For this setup, you will integrate Playwright MCP as a tool that your AI agent can call directly from VS Code. 

1. Prerequisites (The Basics) 

  • VS Code installed in your system. 
  • Node.js (LTS version recommended) installed on your machine. 

2. Installing GitHub Copilot (The AI Client) 

  • Open Extensions: In VS Code, navigate to the Extensions view (Ctrl+Shift+X or Cmd+Shift+X). 
  • Search and Install: Search for “GitHub Copilot” and “GitHub Copilot Chat” and install both extensions. 
Manual testing Copilot
  • Authentication: Follow the prompts to sign in with your GitHub account and activate your Copilot subscription. 
    • GitHub Copilot is an AI-powered code assistant that acts almost like an AI pair programmer

After successful installation and authentication, you will see something like the image below.

Github Copilot

3. Installing the Playwright MCP Server (The Browser Tool) 

Playwright MCP (Model Context Protocol): This is the bridge that provides browser automation capabilities, enabling the AI to interact with the web page. 

  • The most direct way to install the server and configure the agent is via the official GitHub page: 
  • Navigate to the Source: Open your browser and search for the Playwright MCP Server official GitHub page (https://github.com/microsoft/playwright-mcp)
  • The One-Click Install: On the GitHub page, look for the Install Server VSCode button. 
Playwright MCP Setup
  • Launch VS Code: Clicking this button will prompt you to open Visual Studio Code. 
VS Code pop-up
  • Final Step: Inside VS Code, select the “Install server” option from the prompt to automatically add the MCP entry to your settings. 
MCP setup final step
  • To verify successful installation and configuration, follow these steps: 
    • Click on “Configure Tool” icon 
Playwright Configuration
  • After clicking on the “Configure Tool” icon, you will see the tools of Playwright MCP, as shown in the image below.
Playwright tool
Settings Icon
  • After clicking on the “Settings” icon, you will see the “Configuration (JSON)” file of Playwright MCP, where you can start, stop, and restart the server, as shown in the image below:
{
    "servers": { 
        "playwright": { 
            "command": "npx", 
            "args": [ 
                "@playwright/mcp@latest" 
            ], 
            "type": "stdio" 
        } 
    }, 
    "inputs": [] 
} 

1. Start Playwright MCP Server: 

Playwright MCP Server

After the Playwright MCP Server is successfully configured and installed, you will see the output as shown below. 

Playwright MCP Server

2. Stop and Restart Server

Playwright MCP Start Stop Restart Server

This complete setup allows the Playwright MCP Server to act as the bridge, providing browser automation capabilities and enabling the GitHub Copilot Agent to interact with the web page using natural language. 

Section 2: Phase 1: Intelligent Exploration and Reporting 

The first, most crucial step is to let the AI agent, powered by the Playwright MCP, perform the exploratory testing and generate the foundational report. This immediately reduces the tester’s documentation effort. 

Instead of manually performing steps, you simply give the AI Agent your test objective in natural language. 

The Exploration Workflow: 

  1. Exploration Execution: The AI uses discrete Playwright MCP tools (like browser_navigate, browser_fill, and browser_click) to perform each action in a real browser session. 
  2. Report Generation: Immediately following execution, the AI generates an Exploratory Testing Report. This report is generated on the basis of the exploration, summarizing the detailed steps taken, observations, and any issues found. 

Our focus is simple: Using Playwright MCP, we reduce the repetitive tasks of a Manual Tester by automating the recording and execution of manual steps. 

Execution Showcase: Exploration to Report 

Input (The Prompt File for Exploration) 

This prompt directs the AI to execute the manual steps and generate the initial report. 

Prompt for Exploratory Testing

Exploratory Testing: (Use Playwright MCP) 

Navigate to https://www.demoblaze.com/. Use Playwright MCP Compulsory for Exploring the Module <Module Name> and generate the Exploratory Testing Report in a .md file in the Manual Testing/Documentation Directory.

Output (The Generated Exploration Report) 
The AI generates a structured report summarizing the execution. 

Exploratory Testing Report

Live Browser Snapshot from Playwright MCP Execution 

Live Browser

Section 3: Phase 2: Design, Plan, Execution, Defect Tracking 

Once the initial Exploration Report is generated, QA teams move to design specific, reusable assets based on these findings. 

1. Test Case Design (Based on the Exploration Report)

The Exploration Report provides the evidence needed to design formal Test Cases. The report’s observations are used to create the Expected Results column in your CSV or Test Management Tool. 

  • The focus is now on designing reusable test cases, which can be stored in CSV format.
  • These manually designed test cases form the core of your execution plan.
  • We need to provide the Exploratory Report as a reference when designing test cases.
  • Drag and drop the Exploratory Report file as context, as shown in the image below.
Drag File
Dropped File

Input (Test Case Design Prompt)

This prompt instructs the AI to generate formal test cases from the Exploratory Report.

Role: Act as a QA Engineer. 
Based on the Exploratory Report, generate the test cases in the below format of the Test Case Design Template
======================================= 
🧪 TEST CASE DESIGN TEMPLATE For CSV File 
======================================= 
Test Case ID – Unique identifier for the test case (e.g., TC_001) 
Test Case Title / Name – Short descriptive name of what is being tested 
Preconditions / Setup – Any conditions that must be met before test execution 
Test Data – Input values or data required for the test 
Test Steps – Detailed step-by-step instructions on how to perform the test 
Expected Result – What should happen after executing the steps 
Actual Result – What happened (filled after execution) 
Status – Pass / Fail / Blocked (result of the execution) 
Priority – Importance of the test case (High / Medium / Low) 
Severity – Impact level if the test fails (Critical / Major / Minor) 
Test Type – (Optional) e.g., Functional, UI, Negative, Regression, etc. 
Execution Date – (Optional) When the test was executed 
Executed By – (Optional) Name of the tester 
Remarks / Comments – Any additional information, observations, or bugs found 

Output (The Generated Test cases) 

The AI generates structured test cases. 

Test Case Design

2. Test Plan Creation 

  • The created test cases are organized into a formal Test Plan document, detailing the scope, environment, and execution schedule. 

Input (Test Plan Prompt)

This prompt instructs the AI to generate a complete Test Plan from the designed test cases.

Role: Act as a QA Engineer.
- Use clear, professional language. 
- Include examples where relevant. 
- Keep the structure organized for documentation. 
- Format can be plain text or Markdown. 
- Assume the project is a web application with multiple modules. 
generate Test Cases in Form Of <Module Name >.txt in Manual Testing/Documentation Directory  
Instructions for AI: 
- Generate a complete Test Plan for a software project For Our Test Cases 
- Include the following sections: 
  1. Test Plan ID 
  2. Project Name 
  3. Module/Feature Overview 
  4. Test Plan Description 
  5. Test Strategy (Manual, Automation, Tools) 
  6. Test Objectives 
  7. Test Deliverables 
  8. Testing Schedule / Milestones 
  9. Test Environment 
  10. Roles & Responsibilities 
  11. Risk & Mitigation 
  12. Entry and Exit Criteria 
  13. Test Case Design Approach 
  14. Metrics / Reporting 
  15. Approvals 

Output (The Generated Test plan) 

The AI generates a structured test plan from the designed test cases.

Test Plan

3. Test Cases Execution 

This is where the Playwright MCP delivers the most power: executing the formal test cases designed in the previous step. 

  • Instead of manually clicking through the steps defined in the Test Plan, the tester uses the AI agent to execute the written test case (e.g., loaded from the CSV) in the browser. 
  • The Playwright MCP ensures the execution of those test cases is fast, documented, and accurate. 
  • Any failures lead to immediate artifact generation (e.g., defect reports). 

Input (Targeted Execution Prompt) 

This prompt instructs the AI to execute the test cases from your Test Plan in the browser and report the results.

Use Playwright MCP to Navigate “https://www.demoblaze.com/” and Execute Test Cases attached in context and Generate Test Execution Report.

First, Drag and drop the test case file for references as shown in the image below.

Test case file

Live Browser Snapshot from Playwright MCP Execution

Nokia Execution

Output (The Generated Test Execution report) 

The AI generates a structured test execution report for the executed test cases.

Test Execution Report

4. Defect Reporting and Tracking  

If a Test Case execution fails, the tester immediately leverages the AI Agent and Playwright MCP to generate a detailed defect report, which is a key task in manual testing. 

Execution Showcase: Formal Test Case Run (with Defect Reporting) 

We will now execute a Test Case step, intentionally simulating a failure to demonstrate the automated defect reporting capability. 

Input (Targeted Execution Prompt for Failure) 

This prompt asks the AI to execute a check and explicitly requests a defect report and a screenshot if the assertion fails. 

Refer to the test cases provided in the Context and Use Playwright MCP to execute the test, and if there is any defect, then generate a detailed defect Report. Additionally, I would like a screenshot of the defect for evidence.
Playwright MCP to Execute the test

Output (The Generated Defect report and Screenshots as Evidence) 

The AI generates a structured defect report for the failed test case, with screenshots as evidence.

Playwright Defect Report
Playwright MCP output file evidence

Conclusion: Your Role is Evolving, Not Ending 

Manual Testing with Playwright MCP is not about replacing the manual tester; it’s about augmenting their capabilities. It enables a smooth, documented, and low-code way to perform high-quality exploratory testing with automated execution. 

  • Focus on Logic: Spend less time on repetitive clicks and more time on complex scenario design. 
  • Execute Instantly: Use natural language prompts to execute tests in the browser. 
  • Generate Instant Reports: Create structured exploratory test reports from your execution sessions. 
  • Future-Proof Your Skills: Learn to transition seamlessly to an AI-augmented testing workflow. 

It’s time to move beyond the traditional—set up your Playwright MCP today and start testing with the power of an AI-pair tester!