# Building a RAG Application with Spring Boot, Spring AI, MongoDB Atlas Vector Search, and OpenAI
## Key Takeaways
- **Retrieval-Augmented Generation (RAG)** overcomes the limitations of static LLMs by combining text generation with information retrieval from enterprise databases — ensuring **accuracy, context, and transparency**.
- **Spring Boot + Spring AI** streamline AI model integration into enterprise apps, supporting **multiple providers** without invasive code or infrastructure changes.
- **MongoDB Atlas Vector Search** removes the need for specialist vector databases, enabling **semantic search** within existing infrastructure.
- **OpenAI models** support both embeddings and generation, letting teams balance **cost, speed, and accuracy** according to their business needs.
- Demonstrated implementation: a **sentiment-based music recommendation system** via ingestion, embedding, semantic search, and reranking — applicable across industries.
---
## Understanding the RAG Pipeline
**Retrieval-Augmented Generation** is a modern AI architecture pattern that pairs a generative model with an external **up-to-date knowledge source**.
### Pipeline Stages
1. **Ingestion** – Collect and process relevant documents or datasets.
2. **Embedding** – Convert text/data into vector representations.
3. **Semantic Search** – Find contextually similar content via vector search.
4. **Generation & Reranking** – Pass retrieved results to an LLM for final output optimization.
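To make the four stages concrete, here is a minimal in-memory sketch in plain Java. It is not the article's implementation: the class name `MiniRagPipeline`, the 64-bucket word-hashing "embedding", and the prompt layout are all assumptions chosen so the example runs without any external service. A real pipeline would call an embedding model and a vector store instead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class MiniRagPipeline {

    /** A stored chunk: the original text plus its vector. */
    record Entry(String text, double[] vector) {}

    static final int DIMS = 64; // toy dimensionality, far smaller than a real model's

    private final List<Entry> store = new ArrayList<>();

    /** Stage 2 stand-in: hash each word into a bucket, then L2-normalize. */
    static double[] embed(String text) {
        double[] v = new double[DIMS];
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                v[Math.floorMod(word.hashCode(), DIMS)] += 1.0;
            }
        }
        double norm = 0;
        for (double x : v) norm += x * x;
        norm = Math.sqrt(norm);
        if (norm > 0) for (int i = 0; i < DIMS; i++) v[i] /= norm;
        return v;
    }

    /** Stages 1 and 2: ingest a document and store its embedding. */
    void ingest(String text) {
        store.add(new Entry(text, embed(text)));
    }

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    /** Stage 3: rank stored entries by cosine similarity (dot product of unit vectors). */
    List<String> search(String query, int topK) {
        double[] q = embed(query);
        return store.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> -dot(q, e.vector())))
                .limit(topK)
                .map(Entry::text)
                .toList();
    }

    /** Stage 4 input: the prompt a chat model would receive (the model call is omitted). */
    String buildPrompt(String question, int topK) {
        return "Context:\n- " + String.join("\n- ", search(question, topK))
                + "\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        MiniRagPipeline rag = new MiniRagPipeline();
        rag.ingest("Refund requests are processed within five business days");
        rag.ingest("The office cafeteria opens at eight");
        System.out.println(rag.buildPrompt("How long do refund requests take?", 1));
    }
}
```

The point of the sketch is the separation of concerns: `embed`, `search`, and `buildPrompt` can each be replaced independently, which is what makes the pattern modular.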
**Benefits for Enterprise**:
- **Transparent** – The retrieved source data can be shown alongside each generated response.
- **Scalable & Modular** – The retrieval engine can be updated independently of the generative model.
- **Secure** – Sensitive corporate data stays under the organization's own governance controls.

*Figure 1: RAG Pipeline*
---
## Spring AI + Enterprise Integration
Spring Boot and Spring AI handle orchestration and integration with multiple LLMs while keeping the codebase maintainable as it grows, which matters in enterprises with experienced Java teams and strict compliance requirements.
**Highlights**:
- The *Inversion of Control (IoC)* pattern lets you swap AI providers with minimal code changes.
- Supports **OpenAI, Azure OpenAI, Hugging Face**, and more; switching is mostly a configuration change.
- Abstracted contracts for **embedding models, chat models, image models, and vector stores**.
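The provider-swapping idea can be illustrated without Spring at all. The sketch below is plain Java with hypothetical names (`ChatProvider`, `OpenAiStub`, `AzureStub`, and the `ai.provider` property are all invented for this example and are not Spring AI types). A small registry stands in for the IoC container, and the calling code depends only on the interface.

```java
import java.util.Map;
import java.util.function.Supplier;

final class ProviderSwapSketch {

    /** Hypothetical contract; Spring AI defines its own model interfaces. */
    interface ChatProvider {
        String complete(String prompt);
    }

    /** Stand-ins for real provider clients; each one just tags the prompt. */
    static final class OpenAiStub implements ChatProvider {
        public String complete(String prompt) { return "[openai] " + prompt; }
    }

    static final class AzureStub implements ChatProvider {
        public String complete(String prompt) { return "[azure] " + prompt; }
    }

    /** Registry keyed by a configuration value, playing the role of the IoC container. */
    static final Map<String, Supplier<ChatProvider>> REGISTRY =
            Map.<String, Supplier<ChatProvider>>of(
                    "openai", OpenAiStub::new,
                    "azure", AzureStub::new);

    static ChatProvider fromConfig(String providerName) {
        Supplier<ChatProvider> factory = REGISTRY.get(providerName);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown provider: " + providerName);
        }
        return factory.get();
    }

    /** Business code depends only on the interface, never on a concrete provider. */
    static String summarize(ChatProvider provider, String text) {
        return provider.complete("Summarize: " + text);
    }

    public static void main(String[] args) {
        // Changing -Dai.provider=azure switches providers without touching summarize().
        String configured = System.getProperty("ai.provider", "openai");
        System.out.println(summarize(fromConfig(configured), "quarterly report"));
    }
}
```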

*Figure 2: Spring AI Ecosystem*
---
## MongoDB Atlas Vector Store
**Native vector search** turns MongoDB Atlas into a unified platform for structured, unstructured, and high-dimensional vector data.
**Under the Hood**:
- Uses **HNSW algorithm** for approximate nearest neighbor search.
- Supports hybrid queries — combining vector and traditional filters.
- Handles embeddings with dimensions ≤ **4096** (`float32` arrays).
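As a reference point for what a hybrid query computes, the following plain-Java sketch filters on a scalar field and then ranks the survivors by cosine similarity. It is an exhaustive (exact) search over an in-memory list, not HNSW and not the Atlas API; an approximate index returns roughly the same ranking while visiting far fewer vectors. The `Song` and `Hit` records are assumptions made for this example.

```java
import java.util.Comparator;
import java.util.List;

final class FilteredVectorSearch {

    record Song(String title, String genre, float[] embedding) {}
    record Hit(String title, double score) {}

    /** Cosine similarity; assumes equal-length, non-zero vectors. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    /** Filter on a traditional field first, then rank by vector similarity. */
    static List<Hit> search(List<Song> songs, float[] query, String genre, int k) {
        return songs.stream()
                .filter(s -> s.genre().equals(genre))
                .map(s -> new Hit(s.title(), cosine(query, s.embedding())))
                .sorted(Comparator.comparingDouble(Hit::score).reversed())
                .limit(k)
                .toList();
    }

    public static void main(String[] args) {
        List<Song> songs = List.of(
                new Song("A", "pop", new float[] {1f, 0f}),
                new Song("B", "pop", new float[] {0.6f, 0.8f}),
                new Song("C", "rock", new float[] {1f, 0f}));
        // "C" matches the query vector exactly but is excluded by the genre filter.
        System.out.println(search(songs, new float[] {1f, 0f}, "pop", 2));
    }
}
```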

*Figure 3: MongoDB Vector Store*
**Advantages**:
- Avoids separate vector DB management.
- Supports multimodal search in a single document.
---
## OpenAI Embedding & Chat Models
**Embedding Models**:
- `text-embedding-3-small`: cost-efficient and compact (1536 dimensions by default).
- `text-embedding-3-large`: higher-accuracy semantic vectors (3072 dimensions by default).
**Chat Models**:
- `gpt-4o-mini`: fast, low-cost RAG responses.
- `gpt-4o`, `gpt-4.1`: higher accuracy and stronger reasoning for complex tasks.
---
## LyricMind: Musical RAG Recommendation System
Goal: Build a **Spring Boot + Spring AI + MongoDB Atlas + OpenAI** music recommender that **understands mood** and fetches relevant songs.
**Phases**:
1. **Ingestion & Embedding** — load songs, create embeddings with OpenAI, store in MongoDB.
2. **Query & Retrieval** — user mood turned into embedding, semantic search, reranking via chat model.

*Figure 4: LyricMind RAG Architecture*
---
### Tech Stack & Dependencies
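```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-mongodb-atlas-store-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
```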
---
## Embedding Flow
**Steps**:
1. Read dataset from CSV.
2. Map into `Song` entities.
3. Store in MongoDB.
4. Generate embeddings via OpenAI (`text-embedding-3-large`).
5. Add to MongoDB Vector Store.
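Steps 1 and 2, plus the text preparation that precedes step 4, might look roughly like the sketch below. It is plain Java with several assumptions: the CSV layout (`title,artist,genre,lyrics` with a header row and no quoted fields), the reduced `SongRow` record, and the format of the text handed to the embedding model are all hypothetical and not taken from the project.

```java
import java.util.ArrayList;
import java.util.List;

final class SongCsvLoader {

    /** Reduced, hypothetical stand-in for the Song entity, limited to the fields used here. */
    record SongRow(String title, String artist, String genre, String lyrics) {}

    /** Parses rows of the assumed form title,artist,genre,lyrics; the first line is a header. */
    static List<SongRow> parse(List<String> lines) {
        List<SongRow> songs = new ArrayList<>();
        if (lines.isEmpty()) return songs;
        for (String line : lines.subList(1, lines.size())) {
            if (line.isBlank()) continue;
            // A limit of 4 keeps any commas inside the last column (lyrics) intact.
            String[] cols = line.split(",", 4);
            if (cols.length < 4) continue; // skip malformed rows
            songs.add(new SongRow(cols[0].trim(), cols[1].trim(), cols[2].trim(), cols[3].trim()));
        }
        return songs;
    }

    /** Builds the text that would be sent to the embedding model for one song. */
    static String toEmbeddingText(SongRow s) {
        return "Title: %s\nArtist: %s\nGenre: %s\nLyrics: %s"
                .formatted(s.title(), s.artist(), s.genre(), s.lyrics());
    }

    public static void main(String[] args) {
        List<String> csv = List.of(
                "title,artist,genre,lyrics",
                "Example Song,Example Artist,pop,Walking home, thinking of you");
        parse(csv).forEach(s -> System.out.println(toEmbeddingText(s)));
    }
}
```

Combining several fields into one text gives the embedding more signal than lyrics alone. A production loader would use a proper CSV parser to handle quoted fields.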
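```java
import java.util.List;

import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

// MongoDB document holding the raw song data loaded from the CSV.
@Getter
@Setter
@NoArgsConstructor
@Document(collection = "songs")
public class Song {
    @Id
    public String id;
    public String title;
    public String artist;
    public String album;
    public String genre;
    public String lyrics;
    public String description;
    public List<String> tags;
    public Integer releaseYear;
}
```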
**Embedding class**:
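```java
import java.util.List;
import java.util.Map;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

// MongoDB document pairing a song's embedded text with its vector.
@Document(collection = "song_embedding")
@Data
@NoArgsConstructor
@AllArgsConstructor
public class SongEmbedding {
    @Id
    private String id;
    private String songId;                 // reference to the source Song
    private String content;                // text that was embedded
    private List<Double> embedding;        // vector values (element type assumed)
    private Map<String, Object> metadata;  // extra fields for filtering (key/value types assumed)
}
```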
---
## Recommendation Engine
**Logic**:
1. **Semantic Search** — find candidate songs matching mood.
2. **Rerank** — use LLM to reorder and explain choices.
**Controller**:
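```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/lyricmind/v1/recommendations")
public class RecommendationController {

    @Autowired
    RecommendationService recommendationService;

    // Request-handling methods are not shown in this excerpt.
}
```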
**Service Flow**:
- Get candidates → Rerank → Map to response DTO.
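The three-step flow can be sketched independently of Spring as a single function that takes the retriever and the reranker as parameters. Everything here is an assumption for illustration: the record names, the use of `BiFunction` in place of the real components, and the convention that `docIndex` is a 1-based position in the candidate list.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.BiFunction;

final class RecommendationFlow {

    record Candidate(String title, String artist) {}

    /** One reranker verdict; docIndex is assumed to be a 1-based position in the candidate list. */
    record Ranked(int docIndex, double score, String motivation) {}

    /** Response DTO returned to the caller. */
    record Recommendation(String title, String artist, double score, String motivation) {}

    static List<Recommendation> recommend(
            String mood,
            int limit,
            BiFunction<String, Integer, List<Candidate>> retriever,
            BiFunction<String, List<Candidate>, List<Ranked>> reranker) {

        List<Candidate> candidates = retriever.apply(mood, limit);   // 1. get candidates
        List<Ranked> ranking = reranker.apply(mood, candidates);     // 2. rerank

        return ranking.stream()                                      // 3. map to response DTO
                // Ignore indices the model may have invented.
                .filter(r -> r.docIndex() >= 1 && r.docIndex() <= candidates.size())
                .sorted(Comparator.comparingDouble(Ranked::score).reversed())
                .limit(limit)
                .map(r -> {
                    Candidate c = candidates.get(r.docIndex() - 1);
                    return new Recommendation(c.title(), c.artist(), r.score(), r.motivation());
                })
                .toList();
    }

    public static void main(String[] args) {
        List<Candidate> pool = List.of(
                new Candidate("Song One", "Artist A"),
                new Candidate("Song Two", "Artist B"));
        List<Recommendation> result = recommend("love", 1,
                (mood, n) -> pool,
                (mood, docs) -> List.of(
                        new Ranked(1, 0.88, "gentle"),
                        new Ranked(2, 0.94, "emotional")));
        System.out.println(result);
    }
}
```

Validating `docIndex` before the lookup matters because the ranking comes from a language model, whose output cannot be assumed to be well-formed.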
---
## SemanticQueryComponent
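```java
// Document here is org.springframework.ai.document.Document.
// Fetches twice as many candidates as requested, keeping only those
// with a similarity score of at least 0.6.
public List<Document> similaritySearch(String mood, int limit) {
    SearchRequest searchRequest = SearchRequest.builder()
            .query(buildSemanticQuery(mood))
            .topK(limit * 2)
            .similarityThreshold(0.6)
            .build();
    return vectorStore.similaritySearch(searchRequest);
}
```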
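**Similarity Threshold**:
Sets the minimum similarity score a document must reach to be returned. Here `0.6` balances relevance against diversity, while `topK(limit * 2)` over-fetches candidates so the reranking step has more songs to choose from.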
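The interaction between the threshold and `topK` can be shown on a list of already-scored candidates. This is a plain-Java illustration rather than the vector store's internals; the `Scored` record and the sample scores are made up.

```java
import java.util.Comparator;
import java.util.List;

final class ThresholdDemo {

    record Scored(String title, double score) {}

    /** Keeps entries scoring at least the threshold, best first, capped at topK. */
    static List<Scored> select(List<Scored> all, double threshold, int topK) {
        return all.stream()
                .filter(s -> s.score() >= threshold)
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(topK)
                .toList();
    }

    public static void main(String[] args) {
        List<Scored> all = List.of(
                new Scored("a", 0.90), new Scored("b", 0.59),
                new Scored("c", 0.70), new Scored("d", 0.60), new Scored("e", 0.30));
        int limit = 2;
        // topK = limit * 2 allows up to 4 results, but the threshold may leave fewer.
        System.out.println(select(all, 0.6, limit * 2));
    }
}
```

A higher threshold can therefore return fewer than `topK` documents, so downstream code should not assume a fixed candidate count.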
---
## Reranking with GPT
```java
String.format("""
        You are a music recommendation ranking assistant.
        Rank the following songs [...]
        Requested Mood: %s
        Songs to rank: %s
        Instructions:
        - Return ONLY JSON array
        - Include ALL docs
        - Sort by relevance
        - Score 0.0 to 1.0
        - Max motivation 100 chars
        - Expected format: [...]
        """, sanitizeInput(mood), documentsText);
```
Config:
```properties
spring.ai.openai.api-key=<>
spring.ai.openai.chat.options.model=gpt-4o-mini
```
---
## Example Request/Response
**Request**:
```bash
curl --location 'http://localhost:8080/api/lyricmind/v1/recommendations' \
  --header 'Content-Type: application/json' \
  --data '{
    "mood": "A song that talks about love",
    "limit": 2
  }'
```
**Response**:
```json
[
  {"doc_index": 2, "score": 0.94, "motivation": "Lyrics express deep emotional connection"},
  {"doc_index": 1, "score": 0.88, "motivation": "Gentle melody captures a romantic theme"}
]
```
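The reranking prompt passes the user's mood through `sanitizeInput` before interpolating it, but the article does not show that helper. One possible version is sketched below. The rules it applies (replace control characters, collapse whitespace, cap the length at 200) are assumptions for this example and may differ from the project's actual method.

```java
final class PromptInput {

    static final int MAX_LENGTH = 200; // arbitrary cap chosen for this sketch

    /** Normalizes free-text user input before it is placed into a prompt. */
    static String sanitizeInput(String raw) {
        if (raw == null) {
            return "";
        }
        String cleaned = raw
                .replaceAll("\\p{Cntrl}", " ") // control characters, including newlines and tabs
                .replaceAll("\\s+", " ")       // collapse runs of whitespace
                .trim();
        return cleaned.length() <= MAX_LENGTH ? cleaned : cleaned.substring(0, MAX_LENGTH);
    }

    public static void main(String[] args) {
        System.out.println(sanitizeInput("  sad\n\nand\tslow "));
    }
}
```

Removing newlines keeps user text from visually breaking out of its slot in the prompt, and the length cap bounds token usage. Neither measure is a complete defense against prompt injection.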
---
## Real-World Use Cases
- **Finance** — Regulatory compliance doc retrieval.
- **Healthcare** — Clinical protocol matching.
- **Legal** — Case law and statute search.
- **Customer Service** — Chatbot knowledge retrieval.
- **Education** — Personalized tutoring Q&A.
---
## Conclusion
Integrating **Spring AI, MongoDB Atlas Vector Search, and OpenAI models**:
- Enables semantic knowledge retrieval.
- Adds LLM-powered re-ranking for rich responses.
- Fits enterprise architectures with minimal disruption.
**Source Code**: [GitHub Repository](https://github.com/matteoroxis/lyricmind)