Are you looking to leverage the power of Ollama and DeepSeek-R1 in your Spring Boot application? This post will walk you through the entire process, from understanding what Ollama is to implementing a seamless integration.
What is Ollama?
Ollama is a powerful tool designed to simplify the deployment and management of large language models (LLMs) locally. It provides an easy-to-use API for interacting with models like DeepSeek-R1, making it an excellent choice for developers who want to integrate AI capabilities into their applications without relying on external cloud services.
With Ollama, you can:
- Run LLMs locally on your machine.
- Switch between different model versions effortlessly.
- Integrate AI capabilities into your applications via a simple API.
Why Integrate Ollama with DeepSeek-R1?
DeepSeek-R1 is a state-of-the-art language model that offers high performance and flexibility. By integrating it with Ollama in your Spring Boot application, you can:
- Build AI-powered features like chatbots, content generators, and more.
- Keep your AI logic local, ensuring data privacy and reducing latency.
- Easily switch between different versions of DeepSeek-R1 based on your application’s needs.
Step 1: Install Ollama
To get started, you’ll need to install Ollama on your system. Run the following command in your terminal:
curl -fsSL https://ollama.com/install.sh | sh
Successful Installation Output:
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
>>> Creating ollama user...
>>> Adding ollama user to groups...
>>> Creating ollama systemd service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service
>>> Nvidia GPU detected
>>> API available at 127.0.0.1:11434
Once installed, Ollama will be ready to use, and the API will be available at http://localhost:11434.
To verify the installation, make sure the server is running and list the installed models:

ollama serve
ollama list

If deepseek-r1:1.5b isn’t listed, pull it:

ollama pull deepseek-r1:1.5b
Test the model with curl:
curl -X POST http://localhost:11434/api/generate \
-d '{"model": "deepseek-r1:1.5b", "prompt": "Hello", "stream": false}'
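If you prefer to stay in Java, here is an equivalent smoke test using the JDK's built-in HttpClient. This snippet is not part of the original post; the endpoint, model name, and payload simply mirror the curl call above.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OllamaSmokeTest {
    public static void main(String[] args) throws Exception {
        String body = """
                {"model": "deepseek-r1:1.5b", "prompt": "Hello", "stream": false}""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // Blocking call; with "stream": false the reply is a single JSON document
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.statusCode()); // expect 200
        System.out.println(response.body());       // JSON with "model", "response", "created_at", "done"
    }
}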
Step 2: Application Configuration
Next, configure your Spring Boot application by updating the application.yml file:
spring:
  application:
    name: demo-deepseek-r1.ollama

# Server configuration
server:
  port: 8080
  error:
    include-message: always

# Ollama configuration
ollama:
  endpoint: http://localhost:11434/api/generate
  model: deepseek-r1:1.5b
  timeout:
    connect: 30000
    read: 60000
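The ollama section binds to a configuration properties class that the service in Step 3 reads its settings from. The original post doesn't show this class, so here is a minimal sketch, assuming setter-based binding and the getter names the service actually calls (getEndpoint(), getModel(), getTimeout().getConnect(), getTimeout().getRead()):

// OllamaProperties.java (sketch; name and structure inferred from usage)
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;

@Component
@ConfigurationProperties(prefix = "ollama")
public class OllamaProperties {

    private String endpoint;
    private String model;
    private Timeout timeout = new Timeout();

    public String getEndpoint() { return endpoint; }
    public void setEndpoint(String endpoint) { this.endpoint = endpoint; }

    public String getModel() { return model; }
    public void setModel(String model) { this.model = model; }

    public Timeout getTimeout() { return timeout; }
    public void setTimeout(Timeout timeout) { this.timeout = timeout; }

    // Nested holder for the timeout.connect / timeout.read values (milliseconds)
    public static class Timeout {
        private long connect;
        private long read;

        public long getConnect() { return connect; }
        public void setConnect(long connect) { this.connect = connect; }

        public long getRead() { return read; }
        public void setRead(long read) { this.read = read; }
    }
}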
Step 3: Core Implementation
Create the following records to handle requests and responses:
// OllamaRequest.java
@JsonInclude(JsonInclude.Include.NON_NULL)
public record OllamaRequest(
String model,
String prompt,
boolean stream
) {}
// OllamaResponse.java
@JsonIgnoreProperties(ignoreUnknown = true)
public record OllamaResponse(
String model,
String response,
String created_at,
boolean done
) {}
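As a quick sanity check (not in the original post), you can round-trip these records with Jackson to confirm they match the JSON exchanged with /api/generate. This assumes Jackson 2.12+ (bundled with Spring Boot 3); the sample response values are invented for illustration:

import com.fasterxml.jackson.databind.ObjectMapper;

public class DtoMappingDemo {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        // Serializes to the same shape as the curl body used earlier
        String requestJson = mapper.writeValueAsString(
                new OllamaRequest("deepseek-r1:1.5b", "Hello", false));
        System.out.println(requestJson);
        // {"model":"deepseek-r1:1.5b","prompt":"Hello","stream":false}

        // Extra fields (e.g. timing stats) are ignored thanks to @JsonIgnoreProperties
        String responseJson = """
                {"model":"deepseek-r1:1.5b","created_at":"2025-01-01T00:00:00Z",
                 "response":"Hello! How can I help?","done":true,"total_duration":12345}""";
        OllamaResponse parsed = mapper.readValue(responseJson, OllamaResponse.class);
        System.out.println(parsed.response());
    }
}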
Service Layer

The service wraps a RestTemplate backed by Apache HttpClient 5 (the httpclient5 dependency), so the connect and read timeouts from application.yml are actually applied to the calls:
@Service
public class OllamaService {
private final RestTemplate restTemplate;
private final OllamaProperties properties;
public OllamaService(OllamaProperties properties) {
this.properties = properties;
RequestConfig config = RequestConfig.custom()
.setConnectTimeout(Timeout.ofMilliseconds(properties.getTimeout().getConnect()))
.setResponseTimeout(Timeout.ofMilliseconds(properties.getTimeout().getRead()))
.build();
CloseableHttpClient httpClient = HttpClients.custom()
.setDefaultRequestConfig(config)
.build();
HttpComponentsClientHttpRequestFactory requestFactory = new HttpComponentsClientHttpRequestFactory(httpClient);
this.restTemplate = new RestTemplate(requestFactory);
}
public String generateResponse(String prompt) {
try {
OllamaRequest request = new OllamaRequest(properties.getModel(), prompt, false);
HttpHeaders headers = new HttpHeaders();
headers.setContentType(MediaType.APPLICATION_JSON);
ResponseEntity<OllamaResponse> response = restTemplate.exchange(
properties.getEndpoint(),
HttpMethod.POST,
new HttpEntity<>(request, headers),
OllamaResponse.class
);
if (response.getStatusCode().is2xxSuccessful() && response.getBody() != null) {
return response.getBody().response() != null
? response.getBody().response()
: "Received empty response from model";
}
return "Ollama API returned status: " + response.getStatusCode();
} catch (RestClientException e) {
return "Error communicating with Ollama: " + e.getMessage();
}
}
}
REST Controller
Create a REST controller to expose the chat endpoint:
@RestController
@RequestMapping("/api/chat")
public class ChatController {
private final OllamaService ollamaService;
public ChatController(OllamaService ollamaService) {
this.ollamaService = ollamaService;
}
@PostMapping
public ResponseEntity<String> chat(@RequestBody String prompt) {
if (prompt == null || prompt.isBlank()) {
return ResponseEntity.badRequest().body("Prompt cannot be empty");
}
String response = ollamaService.generateResponse(prompt);
return ResponseEntity.ok(response);
}
}
Model Version Compatibility
DeepSeek-R1 is published in several sizes in the Ollama library (tags such as 1.5b, 7b, 8b, 14b, 32b, and 70b); larger variants need correspondingly more RAM/VRAM, so pick the one that fits your hardware. Check official model availability in the Ollama model library before choosing a tag.
Testing the Integration
To test the integration, use the following curl command (or Postman):
curl -X POST -H "Content-Type: text/plain" -d "Explain AI in simple terms" http://localhost:8080/api/chat
Output: the endpoint returns the model’s generated text as a plain string.
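For an automated check, a controller slice test along these lines should work. This is a sketch rather than code from the original project; it assumes spring-boot-starter-test (JUnit 5, Mockito, MockMvc) is on the classpath:

import static org.mockito.Mockito.when;
import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.post;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.content;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.WebMvcTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.http.MediaType;
import org.springframework.test.web.servlet.MockMvc;

@WebMvcTest(ChatController.class)
class ChatControllerTest {

    @Autowired
    private MockMvc mockMvc;

    // The Ollama call is mocked so the test doesn't need a running model
    @MockBean
    private OllamaService ollamaService;

    @Test
    void returnsModelResponse() throws Exception {
        when(ollamaService.generateResponse("Explain AI in simple terms"))
                .thenReturn("AI is software that learns patterns from data.");

        mockMvc.perform(post("/api/chat")
                        .contentType(MediaType.TEXT_PLAIN)
                        .content("Explain AI in simple terms"))
                .andExpect(status().isOk())
                .andExpect(content().string("AI is software that learns patterns from data."));
    }

    @Test
    void rejectsBlankPrompt() throws Exception {
        mockMvc.perform(post("/api/chat")
                        .contentType(MediaType.TEXT_PLAIN)
                        .content("   "))
                .andExpect(status().isBadRequest());
    }
}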
🪟 Bonus: Using Ollama in WSL on Windows
If you're on Windows using WSL, follow these steps to expose the Ollama service to Windows:
🔒 WSL Side: Open the Port
# Allow the Ollama port through the WSL firewall
sudo ufw enable
sudo ufw allow 11434

# Stop the systemd-managed instance and confirm nothing else holds the port
sudo systemctl stop ollama
sudo lsof -i :11434

# Restart Ollama bound to all interfaces so Windows can reach it
export OLLAMA_HOST=0.0.0.0
ollama serve

# Verify it is listening on 0.0.0.0:11434
sudo ss -tulnp | grep 11434
🪟 Windows Side: Port Forwarding (run with admin permissions)
$wsl_ip = (wsl hostname -I).Split()[0]

netsh interface portproxy add v4tov4 `
  listenport=11434 listenaddress=0.0.0.0 `
  connectport=11434 connectaddress=$wsl_ip

New-NetFirewallRule -DisplayName "Ollama-WSL" `
  -Direction Inbound -Protocol TCP -LocalPort 11434 -Action Allow
To make this persistent on the WSL side, bind Ollama to all interfaces on every shell start:

echo 'export OLLAMA_HOST=0.0.0.0' >> ~/.bashrc
echo 'pkill ollama; ollama serve > /tmp/ollama.log 2>&1 &' >> ~/.bashrc

Optionally, register a Windows scheduled task (from an elevated PowerShell) so Ollama starts automatically at boot:

$action = New-ScheduledTaskAction -Execute "wsl" -Argument "-e bash -c 'ollama serve'"
$trigger = New-ScheduledTaskTrigger -AtStartup
Register-ScheduledTask -TaskName "Ollama-WSL" -Action $action -Trigger $trigger -RunLevel Highest
Source Code
The complete source code for this example is available on GitHub.
🙌 Final Thoughts
Running LLMs locally has never been easier. With Ollama, DeepSeek-R1, and Spring Boot, you can build blazing-fast AI-powered apps while keeping full control over your data.