# Building a RAG Application with Spring Boot, Spring AI, MongoDB Atlas Vector Search, and OpenAI

# Key Takeaways

- **Retrieval-Augmented Generation (RAG)** overcomes the limitations of static LLMs by combining text generation with information retrieval from enterprise databases — ensuring **accuracy, context, and transparency**.
- **Spring Boot + Spring AI** streamline AI model integration into enterprise apps, supporting **multiple providers** without invasive code or infrastructure changes.
- **MongoDB Atlas Vector Search** removes the need for specialist vector databases, enabling **semantic search** within existing infrastructure.
- **OpenAI models** support both embeddings and generation, letting teams balance **cost, speed, and accuracy** according to their business needs.
- Demonstrated implementation: a **sentiment-based music recommendation system** via ingestion, embedding, semantic search, and reranking — applicable across industries.

---

## Understanding the RAG Pipeline

**Retrieval-Augmented Generation** is a modern AI architecture pattern that pairs a generative model with an external **up-to-date knowledge source**.

### Pipeline Stages
1. **Ingestion** – Collect and process relevant documents or datasets.  
2. **Embedding** – Convert text/data into vector representations.  
3. **Semantic Search** – Find contextually similar content via vector search.  
4. **Generation & Reranking** – Pass retrieved results to an LLM for final output optimization.
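
As a concrete (and intentionally simplified) illustration of these four stages, the sketch below uses Spring AI's `VectorStore` and `ChatClient` abstractions; the class and prompt wording are hypothetical and not part of the LyricMind code shown later.

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;

public class RagPipelineSketch {

    private final VectorStore vectorStore; // embeds and stores documents (stages 1-2)
    private final ChatClient chatClient;   // generates the grounded answer (stage 4)

    public RagPipelineSketch(VectorStore vectorStore, ChatClient chatClient) {
        this.vectorStore = vectorStore;
        this.chatClient = chatClient;
    }

    // Stages 1-2: ingest raw text; the configured embedding model turns it into vectors.
    public void ingest(List<String> texts) {
        vectorStore.add(texts.stream().map(Document::new).toList());
    }

    // Stages 3-4: retrieve semantically similar documents, then let the LLM answer from them.
    public String answer(String question) {
        List<Document> context = vectorStore.similaritySearch(
                SearchRequest.builder().query(question).topK(5).build());

        String contextText = context.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        return chatClient.prompt()
                .user("Answer using only this context:\n" + contextText + "\n\nQuestion: " + question)
                .call()
                .content();
    }
}
```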

**Benefits for Enterprise**:
- **Transparent** – Retrieved supporting data is shown alongside generated responses.
- **Scalable & Modular** – The retrieval engine can be updated independently of the generative model.
- **Secure** – Controlled governance over sensitive corporate data.

![image](images/img_001.jpg)
*Figure 1: RAG Pipeline*

---

## Spring AI + Enterprise Integration

Spring Boot and Spring AI provide orchestration, integration of multiple LLMs, and maintainable, scalable code, which is vital in enterprises with seasoned Java teams and strict compliance requirements.

**Highlights**:
- The *Inversion of Control (IoC)* pattern is applied to AI providers, so they can be swapped with minimal code changes.
- Supports **OpenAI, Azure OpenAI, Hugging Face**, and more, with only configuration changes.
- Abstracted contracts for **embedding models, chat models, image models, and vector stores**.
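
In practice, application code depends only on these provider-neutral interfaces; the concrete implementation is supplied by whichever starter is on the classpath plus configuration. A minimal, hypothetical service illustrating this:

```java
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.stereotype.Service;

@Service
public class ProviderAgnosticService {

    // The concrete implementations behind these interfaces come from whichever
    // Spring AI starter is on the classpath (OpenAI, Azure OpenAI, Hugging Face, ...),
    // so changing provider is a dependency/configuration change, not a code change.
    private final ChatModel chatModel;
    private final EmbeddingModel embeddingModel;

    public ProviderAgnosticService(ChatModel chatModel, EmbeddingModel embeddingModel) {
        this.chatModel = chatModel;
        this.embeddingModel = embeddingModel;
    }

    public String ask(String question) {
        return chatModel.call(question);      // simple String-in / String-out convenience call
    }

    public float[] embed(String text) {
        return embeddingModel.embed(text);    // embedding vector for the given text
    }
}
```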

![image](images/img_002.jpg)
*Figure 2: Spring AI Ecosystem*

---

## MongoDB Atlas Vector Store

**Native vector search** turns MongoDB Atlas into a unified platform for structured, unstructured, and high-dimensional vector data.

**Under the Hood**:
- Uses the **HNSW** (Hierarchical Navigable Small World) algorithm for approximate nearest neighbor search.
- Supports hybrid queries — combining vector and traditional filters.
- Handles embeddings with dimensions ≤ **4096** (`float32` arrays).
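
As a rough sketch of what a hybrid query can look like through Spring AI's `VectorStore` API, combining semantic similarity with a metadata filter (the `genre` field and the query text are illustrative assumptions):

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;

public class HybridSearchExample {

    public static List<Document> search(VectorStore vectorStore) {
        FilterExpressionBuilder b = new FilterExpressionBuilder();

        // Vector similarity on the query text, narrowed by a traditional filter on "genre" metadata.
        SearchRequest request = SearchRequest.builder()
                .query("melancholic acoustic ballad")
                .topK(5)
                .filterExpression(b.eq("genre", "indie").build())
                .build();

        return vectorStore.similaritySearch(request);
    }
}
```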

![image](images/img_003.jpg)
*Figure 3: MongoDB Vector Store*

**Advantages**:
- Avoids separate vector DB management.
- Supports multimodal search in a single document.

---

## OpenAI Embedding & Chat Models

**Embedding Models**:
- `text-embedding-3-small` — cost-efficient, compact vectors (1536 dims).
- `text-embedding-3-large` — high-accuracy semantic vectors (3072 dims).

**Chat Models**:
- `gpt-4o-mini` — fast, low-cost RAG responses.
- `gpt-4o`, `gpt-4.1` — higher accuracy and reasoning for complex tasks.

---

## LyricMind: Musical RAG Recommendation System

Goal: Build a **Spring Boot + Spring AI + MongoDB Atlas + OpenAI** music recommender that **understands mood** and fetches relevant songs.

**Phases**:
1. **Ingestion & Embedding** — load songs, create embeddings with OpenAI, store in MongoDB.
2. **Query & Retrieval** — user mood turned into embedding, semantic search, reranking via chat model.

![image](images/img_004.jpg)
*Figure 4: LyricMind RAG Architecture*

---

### Tech Stack & Dependencies

```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-mongodb-atlas-store-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
</dependencies>
```


---

## Embedding Flow

**Steps**:
1. Read dataset from CSV.
2. Map into `Song` entities.
3. Store in MongoDB.
4. Generate embeddings via OpenAI (`text-embedding-3-large`).
5. Add to MongoDB Vector Store.
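
A condensed sketch of how steps 2 to 5 could be wired in a Spring service; `SongRepository`, the embedded text layout, and the metadata keys are assumptions for illustration, while `VectorStore.add` is the standard Spring AI call that invokes the configured OpenAI embedding model and writes the vectors to MongoDB.

```java
import java.util.List;
import java.util.Map;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class SongIngestionService {

    private final SongRepository songRepository;   // assumed Spring Data MongoDB repository for Song
    private final VectorStore vectorStore;         // MongoDB Atlas vector store auto-configured by Spring AI

    public SongIngestionService(SongRepository songRepository, VectorStore vectorStore) {
        this.songRepository = songRepository;
        this.vectorStore = vectorStore;
    }

    // Steps 2-5: persist the songs, then embed and index a text representation of each one.
    public void ingest(List<Song> songs) {
        songRepository.saveAll(songs);             // step 3: store raw entities in MongoDB

        List<Document> documents = songs.stream()
                .map(song -> new Document(
                        song.title + " - " + song.artist + "\n" + song.description,       // text to embed
                        Map.<String, Object>of("songId", song.id, "genre", song.genre)))  // metadata for hybrid filters
                .toList();

        // Steps 4-5: Spring AI calls the configured embedding model (text-embedding-3-large)
        // and writes the resulting vectors into the MongoDB Atlas vector collection.
        vectorStore.add(documents);
    }
}
```

The `Song` entity used in this flow is shown below: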

```java
@Getter
@Setter
@NoArgsConstructor
@Document(collection = "songs")
public class Song {

    @Id
    public String id;

    public String title;
    public String artist;
    public String album;
    public String genre;
    public String lyrics;
    public String description;
    public List<String> tags;
    public Integer releaseYear;
}
```


**Embedding class**:

@Document(collection = "song_embedding")

@Data

@NoArgsConstructor

@AllArgsConstructor

public class SongEmbedding {

@Id

private String id;

private String songId;

private String content;

private List embedding;

private Map metadata;

}


---

## Recommendation Engine

**Logic**:
1. **Semantic Search** — find candidate songs matching mood.
2. **Rerank** — use LLM to reorder and explain choices.

**Controller**:

```java
@RestController
@RequestMapping("/api/lyricmind/v1/recommendations")
public class RecommendationController {

    @Autowired
    RecommendationService recommendationService;
}
```


**Service Flow**:
- Get candidates → Rerank → Map to response DTO.
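
A sketch of what that flow might look like, assuming the `SemanticQueryComponent` shown in the next section and a `RerankComponent` like the one sketched under "Reranking with GPT"; `RecommendationResponse` and the mapping details are illustrative.

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.stereotype.Service;

@Service
public class RecommendationService {

    private final SemanticQueryComponent semanticQueryComponent;  // vector search (next section)
    private final RerankComponent rerankComponent;                // LLM reranking (see "Reranking with GPT")

    public RecommendationService(SemanticQueryComponent semanticQueryComponent,
                                 RerankComponent rerankComponent) {
        this.semanticQueryComponent = semanticQueryComponent;
        this.rerankComponent = rerankComponent;
    }

    // Assumed response DTO: the ranked entries the controller returns as JSON.
    public record RecommendationResponse(List<RerankComponent.RankedSong> recommendations) {}

    public RecommendationResponse recommend(String mood, int limit) {
        // 1. Get candidates: semantic search over-fetches songs matching the mood.
        List<Document> candidates = semanticQueryComponent.similaritySearch(mood, limit);

        // 2. Rerank: the chat model reorders candidates and adds a short motivation for each.
        List<RerankComponent.RankedSong> ranked = rerankComponent.rerank(mood, candidates);

        // 3. Map to the response DTO, trimming to the requested limit.
        return new RecommendationResponse(ranked.stream().limit(limit).toList());
    }
}
```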

---

## SemanticQueryComponent

```java
public List<Document> similaritySearch(String mood, int limit) {
    SearchRequest searchRequest = SearchRequest.builder()
            .query(buildSemanticQuery(mood))
            .topK(limit * 2)                 // over-fetch candidates for the reranking step
            .similarityThreshold(0.6)
            .build();
    return vectorStore.similaritySearch(searchRequest);
}
```


**Similarity Threshold**:
Controls minimum semantic match — here `0.6` balances relevance and diversity.

---

## Reranking with GPT

String.format("""

You are a music recommendation ranking assistant.

Rank the following songs [...]

Requested Mood: %s

Songs to rank: %s

Instructions:

  • Return ONLY JSON array
  • Include ALL docs
  • Sort by relevance
  • Score 0.0 to 1.0
  • Max motivation 100 chars
  • Expected format: [...]
  • """, sanitizeInput(mood), documentsText);
**Config** (`application.properties`):

```properties
spring.ai.openai.api-key=<>
spring.ai.openai.chat.options.model=gpt-4o-mini
```

---

## Example Request/Response

**Request**:

```bash
curl --location 'http://localhost:8080/api/lyricmind/v1/recommendations' \
--header 'Content-Type: application/json' \
--data '{
    "mood": "A song that talks about love",
    "limit": 2
}'
```

**Response**:

```json
[
  {"doc_index": 2, "score": 0.94, "motivation": "Lyrics express deep emotional connection"},
  {"doc_index": 1, "score": 0.88, "motivation": "Gentle melody captures a romantic theme"}
]
```

---

## Real-World Use Cases

- **Finance** — Regulatory compliance doc retrieval.
- **Healthcare** — Clinical protocol matching.
- **Legal** — Case law and statute search.
- **Customer Service** — Chatbot knowledge retrieval.
- **Education** — Personalized tutoring Q&A.

---

## Conclusion

Integrating **Spring AI, MongoDB Atlas Vector Search, and OpenAI models**:
- Enables semantic knowledge retrieval.
- Adds LLM-powered re-ranking for rich responses.
- Fits enterprise architectures with minimal disruption.

**Source Code**: [GitHub Repository](https://github.com/matteoroxis/lyricmind)

---
