# Building a RAG Application with Spring Boot, Spring AI, MongoDB Atlas Vector Search, and OpenAI

# Key Takeaways

- **Retrieval-Augmented Generation (RAG)** overcomes the limitations of static LLMs by combining text generation with information retrieval from enterprise databases — ensuring **accuracy, context, and transparency**.
- **Spring Boot + Spring AI** streamline AI model integration into enterprise apps, supporting **multiple providers** without invasive code or infrastructure changes.
- **MongoDB Atlas Vector Search** removes the need for specialist vector databases, enabling **semantic search** within existing infrastructure.
- **OpenAI models** support both embeddings and generation, letting teams balance **cost, speed, and accuracy** according to their business needs.
- Demonstrated implementation: a **sentiment-based music recommendation system** via ingestion, embedding, semantic search, and reranking — applicable across industries.

---

## Understanding the RAG Pipeline

**Retrieval-Augmented Generation** is a modern AI architecture pattern that pairs a generative model with an external **up-to-date knowledge source**.

### Pipeline Stages
1. **Ingestion** – Collect and process relevant documents or datasets.  
2. **Embedding** – Convert text/data into vector representations.  
3. **Semantic Search** – Find contextually similar content via vector search.  
4. **Generation & Reranking** – Pass retrieved results to an LLM for final output optimization.
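
As a concrete (and intentionally simplified) illustration of these four stages, the sketch below uses Spring AI's `VectorStore` and `ChatClient` abstractions; the class and prompt wording are hypothetical and not part of the LyricMind code shown later.

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;

public class RagPipelineSketch {

    private final VectorStore vectorStore; // embeds and stores documents (stages 1-2)
    private final ChatClient chatClient;   // generates the grounded answer (stage 4)

    public RagPipelineSketch(VectorStore vectorStore, ChatClient chatClient) {
        this.vectorStore = vectorStore;
        this.chatClient = chatClient;
    }

    // Stages 1-2: ingest raw text; the configured embedding model turns it into vectors.
    public void ingest(List<String> texts) {
        vectorStore.add(texts.stream().map(Document::new).toList());
    }

    // Stages 3-4: retrieve semantically similar documents, then let the LLM answer from them.
    public String answer(String question) {
        List<Document> context = vectorStore.similaritySearch(
                SearchRequest.builder().query(question).topK(5).build());

        String contextText = context.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        return chatClient.prompt()
                .user("Answer using only this context:\n" + contextText + "\n\nQuestion: " + question)
                .call()
                .content();
    }
}
```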

**Benefits for Enterprise**:
- **Transparent** – Retrieved supporting data is shown alongside generated responses.
- **Scalable & Modular** – The retrieval engine can be updated independently of the generative model.
- **Secure** – Controlled governance over sensitive corporate data.

![image](images/img_001.jpg)
*Figure 1: RAG Pipeline*

---

## Spring AI + Enterprise Integration

Spring Boot and Spring AI provide orchestration, integration of multiple LLMs, and maintainable, scalable code, which is vital in enterprises with seasoned Java teams and strict compliance requirements.

**Highlights**:
- The *Inversion of Control (IoC)* pattern is applied to AI providers, so they can be swapped with minimal code changes.
- Supports **OpenAI, Azure OpenAI, Hugging Face**, and more, with only configuration changes.
- Abstracted contracts for **embedding models, chat models, image models, and vector stores**.
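
In practice, application code depends only on these provider-neutral interfaces; the concrete implementation is supplied by whichever starter is on the classpath plus configuration. A minimal, hypothetical service illustrating this:

```java
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.stereotype.Service;

@Service
public class ProviderAgnosticService {

    // The concrete implementations behind these interfaces come from whichever
    // Spring AI starter is on the classpath (OpenAI, Azure OpenAI, Hugging Face, ...),
    // so changing provider is a dependency/configuration change, not a code change.
    private final ChatModel chatModel;
    private final EmbeddingModel embeddingModel;

    public ProviderAgnosticService(ChatModel chatModel, EmbeddingModel embeddingModel) {
        this.chatModel = chatModel;
        this.embeddingModel = embeddingModel;
    }

    public String ask(String question) {
        return chatModel.call(question);      // simple String-in / String-out convenience call
    }

    public float[] embed(String text) {
        return embeddingModel.embed(text);    // embedding vector for the given text
    }
}
```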

![image](images/img_002.jpg)
*Figure 2: Spring AI Ecosystem*

---

## MongoDB Atlas Vector Store

**Native vector search** turns MongoDB Atlas into a unified platform for structured, unstructured, and high-dimensional vector data.

**Under the Hood**:
- Uses the **HNSW** (Hierarchical Navigable Small World) algorithm for approximate nearest neighbor search.
- Supports hybrid queries — combining vector and traditional filters.
- Handles embeddings with dimensions ≤ **4096** (`float32` arrays).
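
As a rough sketch of what a hybrid query can look like through Spring AI's `VectorStore` API, combining semantic similarity with a metadata filter (the `genre` field and the query text are illustrative assumptions):

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;

public class HybridSearchExample {

    public static List<Document> search(VectorStore vectorStore) {
        FilterExpressionBuilder b = new FilterExpressionBuilder();

        // Vector similarity on the query text, narrowed by a traditional filter on "genre" metadata.
        SearchRequest request = SearchRequest.builder()
                .query("melancholic acoustic ballad")
                .topK(5)
                .filterExpression(b.eq("genre", "indie").build())
                .build();

        return vectorStore.similaritySearch(request);
    }
}
```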

![image](images/img_003.jpg)
*Figure 3: MongoDB Vector Store*

**Advantages**:
- Avoids separate vector DB management.
- Supports multimodal search in a single document.

---

## OpenAI Embedding & Chat Models

**Embedding Models**:
- `text-embedding-3-small` — cost-efficient, compact vectors (1536 dims).
- `text-embedding-3-large` — high-accuracy semantic vectors (3072 dims).

**Chat Models**:
- `gpt-4o-mini` — fast, low-cost RAG responses.
- `gpt-4o`, `gpt-4.1` — higher accuracy and reasoning for complex tasks.

---

## LyricMind: Musical RAG Recommendation System

Goal: Build a **Spring Boot + Spring AI + MongoDB Atlas + OpenAI** music recommender that **understands mood** and fetches relevant songs.

**Phases**:
1. **Ingestion & Embedding** — load songs, create embeddings with OpenAI, store in MongoDB.
2. **Query & Retrieval** — user mood turned into embedding, semantic search, reranking via chat model.

![image](images/img_004.jpg)
*Figure 4: LyricMind RAG Architecture*

---

### Tech Stack & Dependencies

```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-mongodb-atlas-store-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
</dependencies>
```


---

## Embedding Flow

**Steps**:
1. Read dataset from CSV.
2. Map into `Song` entities.
3. Store in MongoDB.
4. Generate embeddings via OpenAI (`text-embedding-3-large`).
5. Add to MongoDB Vector Store.
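
A condensed sketch of how steps 2 to 5 could be wired in a Spring service; `SongRepository`, the embedded text layout, and the metadata keys are assumptions for illustration, while `VectorStore.add` is the standard Spring AI call that invokes the configured OpenAI embedding model and writes the vectors to MongoDB.

```java
import java.util.List;
import java.util.Map;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class SongIngestionService {

    private final SongRepository songRepository;   // assumed Spring Data MongoDB repository for Song
    private final VectorStore vectorStore;         // MongoDB Atlas vector store auto-configured by Spring AI

    public SongIngestionService(SongRepository songRepository, VectorStore vectorStore) {
        this.songRepository = songRepository;
        this.vectorStore = vectorStore;
    }

    // Steps 2-5: persist the songs, then embed and index a text representation of each one.
    public void ingest(List<Song> songs) {
        songRepository.saveAll(songs);             // step 3: store raw entities in MongoDB

        List<Document> documents = songs.stream()
                .map(song -> new Document(
                        song.title + " - " + song.artist + "\n" + song.description,       // text to embed
                        Map.<String, Object>of("songId", song.id, "genre", song.genre)))  // metadata for hybrid filters
                .toList();

        // Steps 4-5: Spring AI calls the configured embedding model (text-embedding-3-large)
        // and writes the resulting vectors into the MongoDB Atlas vector collection.
        vectorStore.add(documents);
    }
}
```

The `Song` entity used in this flow is shown below: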

```java
@Getter
@Setter
@NoArgsConstructor
@Document(collection = "songs")
public class Song {

    @Id
    public String id;

    public String title;
    public String artist;
    public String album;
    public String genre;
    public String lyrics;
    public String description;
    public List<String> tags;
    public Integer releaseYear;
}
```


**Embedding class**:

@Document(collection = "song_embedding")

@Data

@NoArgsConstructor

@AllArgsConstructor

public class SongEmbedding {

@Id

private String id;

private String songId;

private String content;

private List embedding;

private Map metadata;

}


---

## Recommendation Engine

**Logic**:
1. **Semantic Search** — find candidate songs matching mood.
2. **Rerank** — use LLM to reorder and explain choices.

**Controller**:

```java
@RestController
@RequestMapping("/api/lyricmind/v1/recommendations")
public class RecommendationController {

    @Autowired
    RecommendationService recommendationService;
}
```


**Service Flow**:
- Get candidates → Rerank → Map to response DTO.
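
A sketch of what that flow might look like, assuming the `SemanticQueryComponent` shown in the next section and a `RerankComponent` like the one sketched under "Reranking with GPT"; `RecommendationResponse` and the mapping details are illustrative.

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.stereotype.Service;

@Service
public class RecommendationService {

    private final SemanticQueryComponent semanticQueryComponent;  // vector search (next section)
    private final RerankComponent rerankComponent;                // LLM reranking (see "Reranking with GPT")

    public RecommendationService(SemanticQueryComponent semanticQueryComponent,
                                 RerankComponent rerankComponent) {
        this.semanticQueryComponent = semanticQueryComponent;
        this.rerankComponent = rerankComponent;
    }

    // Assumed response DTO: the ranked entries the controller returns as JSON.
    public record RecommendationResponse(List<RerankComponent.RankedSong> recommendations) {}

    public RecommendationResponse recommend(String mood, int limit) {
        // 1. Get candidates: semantic search over-fetches songs matching the mood.
        List<Document> candidates = semanticQueryComponent.similaritySearch(mood, limit);

        // 2. Rerank: the chat model reorders candidates and adds a short motivation for each.
        List<RerankComponent.RankedSong> ranked = rerankComponent.rerank(mood, candidates);

        // 3. Map to the response DTO, trimming to the requested limit.
        return new RecommendationResponse(ranked.stream().limit(limit).toList());
    }
}
```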

---

## SemanticQueryComponent

```java
public List<Document> similaritySearch(String mood, int limit) {
    SearchRequest searchRequest = SearchRequest.builder()
            .query(buildSemanticQuery(mood))
            .topK(limit * 2)                 // over-fetch candidates for the reranking step
            .similarityThreshold(0.6)
            .build();
    return vectorStore.similaritySearch(searchRequest);
}
```


**Similarity Threshold**:
Controls minimum semantic match — here `0.6` balances relevance and diversity.

---

## Reranking with GPT

String.format("""

You are a music recommendation ranking assistant.

Rank the following songs [...]

Requested Mood: %s

Songs to rank: %s

Instructions:

  • Return ONLY JSON array
  • Include ALL docs
  • Sort by relevance
  • Score 0.0 to 1.0
  • Max motivation 100 chars
  • Expected format: [...]
  • """, sanitizeInput(mood), documentsText);
**Config** (`application.properties`):

```properties
spring.ai.openai.api-key=<>
spring.ai.openai.chat.options.model=gpt-4o-mini
```

---

## Example Request/Response

**Request**:

```bash
curl --location 'http://localhost:8080/api/lyricmind/v1/recommendations' \
--header 'Content-Type: application/json' \
--data '{
    "mood": "A song that talks about love",
    "limit": 2
}'
```

**Response**:

```json
[
  {"doc_index": 2, "score": 0.94, "motivation": "Lyrics express deep emotional connection"},
  {"doc_index": 1, "score": 0.88, "motivation": "Gentle melody captures a romantic theme"}
]
```

---

## Real-World Use Cases

- **Finance** — Regulatory compliance doc retrieval.
- **Healthcare** — Clinical protocol matching.
- **Legal** — Case law and statute search.
- **Customer Service** — Chatbot knowledge retrieval.
- **Education** — Personalized tutoring Q&A.

---

## Conclusion

Integrating **Spring AI, MongoDB Atlas Vector Search, and OpenAI models**:
- Enables semantic knowledge retrieval.
- Adds LLM-powered re-ranking for rich responses.
- Fits enterprise architectures with minimal disruption.

**Source Code**: [GitHub Repository](https://github.com/matteoroxis/lyricmind)

---
