Monday, January 27, 2025

Integrating Ollama with DeepSeek-R1 in Spring Boot

Want to run DeepSeek-R1 locally through Ollama and call it from your Spring Boot application? This post walks through the whole process, from what Ollama is to a working integration.




What is Ollama?

Ollama is a powerful tool designed to simplify the deployment and management of large language models (LLMs) locally. It provides an easy-to-use API for interacting with models like DeepSeek-R1, making it an excellent choice for developers who want to integrate AI capabilities into their applications without relying on external cloud services.


With Ollama, you can:

  • Run LLMs locally on your machine.
  • Switch between different model versions effortlessly.
  • Integrate AI capabilities into your applications via a simple API.


Why Integrate Ollama with DeepSeek-R1?

DeepSeek-R1 is an open-weight reasoning model that Ollama can run locally in several sizes. By integrating it with Ollama in your Spring Boot application, you can:

  • Build AI-powered features like chatbots, content generators, and more.
  • Keep prompts and responses on your own machine, which helps with data privacy and avoids round trips to a cloud API.
  • Easily switch between different versions of DeepSeek-R1 based on your application’s needs.


Step 1: Install Ollama

To get started, you’ll need to install Ollama on your system. Run the following command in your terminal:

curl -fsSL https://ollama.com/install.sh | sh

Successful Installation Output:


>>> Cleaning up old version
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
>>> Creating ollama user...
>>> Adding ollama user to groups...
>>> Creating ollama systemd service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service
>>> Nvidia GPU detected
>>> API available at 127.0.0.1:11434

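Once installed, Ollama is ready to use, and the API is available at http://localhost:11434. Before calling the API, download the model used in this post by running ollama pull deepseek-r1:1.5b in your terminal.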
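To confirm from Java that the API is reachable, a quick check with the JDK's built-in HttpClient can help. This is a minimal sketch: the class name OllamaHealthCheck is hypothetical and not part of the original project, and it assumes the default address shown in the install output. A running Ollama server usually answers a GET on its root URL with a short plain-text status message.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for illustration; not part of the original project.
class OllamaHealthCheck {

    // Builds a GET request for the Ollama base URL with a short timeout.
    static HttpRequest buildRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        try {
            HttpResponse<String> response = client.send(
                    buildRequest("http://localhost:11434"),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            System.out.println("Ollama is not reachable: " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Restore the interrupt flag
        }
    }
}
```

If the check fails, make sure the ollama service is running before moving on to the Spring Boot configuration.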


Step 2: Application Configuration

Next, configure your Spring Boot application by updating the application.yml file:

spring:
  application:
    name: demo-deepseek-r1.ollama

# Server configuration
server:
  port: 8080
  error:
    include-message: always

# Ollama configuration
ollama:
  endpoint: http://localhost:11434/api/generate
  model: deepseek-r1:1.5b
  timeout:
    connect: 30000
    read: 60000
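
This configuration defines the Ollama endpoint, model, and timeout settings as custom ollama.* properties. Note that the service shown in Step 3 hard-codes the endpoint and model, so these properties only take effect once they are bound into your code, for example with @Value or @ConfigurationProperties.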
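To make the shape of these settings concrete, here is a plain-Java sketch of a typed holder for them. Everything in it is an assumption for illustration: the record name OllamaSettings, the fromMillis factory, and the reading of the timeout values as milliseconds. In a Spring Boot project you would more likely bind the same fields with @ConfigurationProperties.

```java
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical settings holder mirroring the ollama.* keys from application.yml.
record OllamaSettings(String endpoint, String model, Duration connectTimeout, Duration readTimeout) {

    // Reject obviously broken configuration early.
    OllamaSettings {
        if (endpoint == null || endpoint.isBlank()) {
            throw new IllegalArgumentException("endpoint must not be blank");
        }
        if (model == null || model.isBlank()) {
            throw new IllegalArgumentException("model must not be blank");
        }
    }

    // Assumes the YAML timeout values are expressed in milliseconds.
    static OllamaSettings fromMillis(String endpoint, String model, long connectMillis, long readMillis) {
        return new OllamaSettings(endpoint, model,
                Duration.ofMillis(connectMillis), Duration.ofMillis(readMillis));
    }

    // Applies the connect timeout to a JDK HttpClient.
    HttpClient newHttpClient() {
        return HttpClient.newBuilder().connectTimeout(connectTimeout).build();
    }

    public static void main(String[] args) {
        OllamaSettings settings = fromMillis(
                "http://localhost:11434/api/generate", "deepseek-r1:1.5b", 30000, 60000);
        System.out.println(settings);
    }
}
```

Keeping the values in one typed object makes it harder for the configured model and the model actually sent to Ollama to drift apart.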


Step 3: Core Implementation


Create the following records to handle requests and responses:


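// OllamaRequest.java
import com.fasterxml.jackson.annotation.JsonInclude;

// Request body sent to Ollama; null fields are left out of the JSON
@JsonInclude(JsonInclude.Include.NON_NULL)
public record OllamaRequest(
    String model,
    String prompt,
    boolean stream
) {}

// OllamaResponse.java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;

// Response body returned by Ollama; fields not listed here are ignored
@JsonIgnoreProperties(ignoreUnknown = true)
public record OllamaResponse(
    String model,
    String response,
    String created_at,
    boolean done
) {}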
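For orientation, this is roughly the JSON body that the OllamaRequest record serializes to. The sketch below builds it by hand in plain Java so it runs without Jackson. The OllamaJson class and its methods are hypothetical helpers for illustration only; in the application itself Jackson does this work.

```java
// Hypothetical helper that builds the request JSON by hand, for illustration only.
class OllamaJson {

    // Escapes the characters that must not appear raw inside a JSON string.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.toString();
    }

    // Mirrors the three fields of the OllamaRequest record.
    static String generateBody(String model, String prompt, boolean stream) {
        return "{\"model\":\"" + escape(model) + "\","
                + "\"prompt\":\"" + escape(prompt) + "\","
                + "\"stream\":" + stream + "}";
    }

    public static void main(String[] args) {
        System.out.println(generateBody("deepseek-r1:1.5b", "Explain AI in simple terms", false));
    }
}
```

Setting stream to false asks Ollama for a single JSON object instead of a stream of partial results, which is what lets the service map the reply onto one OllamaResponse.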

Service Layer


Implement the OllamaService to interact with the Ollama API:

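import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpMethod;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClientException;
import org.springframework.web.client.RestTemplate;

@Service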
public class OllamaService {

    private static final String OLLAMA_API_URL = "http://localhost:11434/api/generate";
    private final RestTemplate restTemplate;

    public OllamaService(RestTemplateBuilder restTemplateBuilder) {
        this.restTemplate = restTemplateBuilder
                .build();
    }

    public String generateResponse(String prompt) {
        try {
            OllamaRequest request = new OllamaRequest("deepseek-r1:1.5b", prompt, false);
            HttpHeaders headers = new HttpHeaders();
            headers.setContentType(MediaType.APPLICATION_JSON);

            ResponseEntity<OllamaResponse> response = restTemplate.exchange(
                    OLLAMA_API_URL,
                    HttpMethod.POST,
                    new HttpEntity<>(request, headers),
                    OllamaResponse.class
            );

            if (response.getStatusCode().is2xxSuccessful() && response.getBody() != null) {
                return response.getBody().response() != null
                        ? response.getBody().response()
                        : "Received empty response from model";
            }
            return "Ollama API returned status: " + response.getStatusCode();
        } catch (RestClientException e) {
            return "Error communicating with Ollama: " + e.getMessage();
        }
    }
}
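One practical detail about the string that generateResponse returns: DeepSeek-R1 is a reasoning model, and depending on the Ollama version and settings its output may begin with a <think>...</think> block that contains the model's intermediate reasoning. If you only want the final answer, a small post-processing step can remove it. The sketch below is an assumption-based helper, not part of the original project, and the name ThinkTagStripper is made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical post-processing helper; not part of the original project.
class ThinkTagStripper {

    // (?s) lets '.' match line breaks; '.*?' stops at the first closing tag.
    private static final Pattern THINK_BLOCK = Pattern.compile("(?s)<think>.*?</think>");

    // Removes any <think>...</think> blocks and trims surrounding whitespace.
    static String stripThinking(String modelOutput) {
        if (modelOutput == null) {
            return "";
        }
        return THINK_BLOCK.matcher(modelOutput).replaceAll("").strip();
    }

    public static void main(String[] args) {
        String raw = "<think>\nThe user wants a short answer.\n</think>\n\nAI is software that learns from data.";
        System.out.println(stripThinking(raw));
    }
}
```

Whether to strip the block is a design choice: keeping it can be useful for debugging, while removing it usually gives a cleaner chat reply.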

REST Controller

Create a REST controller to expose the chat endpoint:

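import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController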
@RequestMapping("/api/chat")
public class ChatController {

    private final OllamaService ollamaService;

    public ChatController(OllamaService ollamaService) {
        this.ollamaService = ollamaService;
    }

    @PostMapping
    public ResponseEntity<String> chat(@RequestBody String prompt) {
        if (prompt == null || prompt.isBlank()) {
            return ResponseEntity.badRequest().body("Prompt cannot be empty");
        }
        String response = ollamaService.generateResponse(prompt);
        return ResponseEntity.ok(response);
    }
}


Model Version Compatibility

Here’s a quick reference for DeepSeek-R1 model versions and their requirements:



*Check official model availability at the Ollama Model Library.


Testing the Integration

To test the integration, use the following curl command, or send the same request from Postman:

curl -X POST -H "Content-Type: text/plain" -d "Explain AI in simple terms" http://localhost:8080/api/chat

Output









Source Code

Here on GitHub.




Sunday, January 26, 2025

Spring Retry: Handling Transient Failures Gracefully in Java 21

In modern applications, transient failures (e.g., network timeouts, database connection issues, or external API unavailability) are inevitable. To build resilient systems, we need mechanisms to retry failed operations gracefully.

With Java 21 and Spring Boot 3, we can leverage Spring Retry to implement robust retry logic. In this post, I'll show you how to integrate Spring Retry into your application, complete with examples using virtual threads and asynchronous processing.




Why Use Spring Retry?

Spring Retry provides a declarative way to retry operations that may fail due to transient issues. Key features include:

  • Retry Logic: Automatically retry failed operations with configurable attempts and backoff strategies.
  • Fallback Mechanism: Define recovery logic when all retries fail.
  • Integration with Spring: Seamlessly integrates with Spring Boot and other Spring components.


Key Concepts:

@Retryable:

Marks a method as retryable. You can specify the exceptions to retry, the maximum number of attempts, and the backoff strategy.

@Recover:

Defines a fallback method to execute when all retries fail. You can have multiple @Recover methods to handle different exceptions.
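To see what these two annotations amount to, here is a hand-written retry loop in plain Java. It is a simplified sketch of the behavior, not how Spring Retry is implemented internally (Spring Retry applies it through a proxy around your bean). The class ManualRetry and its method names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

// Hypothetical helper for illustration; Spring Retry does this work for you.
class ManualRetry {

    // Calls the operation up to maxAttempts times, sleeping between attempts
    // with a growing delay, and falls back to 'recover' if every attempt fails.
    static <T> T callWithRetry(Callable<T> operation, int maxAttempts,
                               long initialDelayMs, double multiplier,
                               Function<Exception, T> recover) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMs;
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                lastFailure = e;
            }
            if (attempt < maxAttempts) {
                sleep(delay);
                delay = (long) (delay * multiplier);
            }
        }
        return recover.apply(lastFailure);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Restore the interrupt flag
            throw new IllegalStateException("Interrupted during backoff", e);
        }
    }

    public static void main(String[] args) {
        String result = callWithRetry(
                () -> { throw new IllegalStateException("simulated transient failure"); },
                4, 10, 2.0,
                e -> "fallback after: " + e.getMessage());
        System.out.println(result);
    }
}
```

The operation plays the role of the @Retryable method and the recover function plays the role of the @Recover method. The annotations let you express the same thing declaratively.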

Why Multiple @Recover Methods?

Different exceptions may require different recovery logic. For example:

  • A RuntimeException might require logging.
  • An IOException might require returning a default response.

By defining multiple @Recover methods, you can handle each exception type appropriately.
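Conceptually, choosing among several @Recover methods is a dispatch on the exception type. The plain-Java sketch below imitates that idea with a pattern-matching switch (Java 21). It is only an analogy under that assumption: the RecoveryDispatch class is hypothetical, and Spring Retry performs its own matching against the parameter types of your @Recover methods.

```java
import java.io.IOException;

// Hypothetical illustration of type-based recovery; not Spring Retry's own code.
class RecoveryDispatch {

    // Picks a recovery action based on the type of the final failure.
    static String recover(Exception failure) {
        return switch (failure) {
            case IOException io -> "default response";
            case RuntimeException re -> "logged: " + re.getMessage();
            default -> "unhandled: " + failure.getClass().getSimpleName();
        };
    }

    public static void main(String[] args) {
        System.out.println(recover(new IOException("connection reset")));
        System.out.println(recover(new IllegalStateException("bad state")));
    }
}
```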

Example 1: Retry with Virtual Threads

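This example demonstrates how to retry an HTTP call that runs on virtual threads. To keep the demo self-contained, the method does not make a real request and always throws, so every attempt fails.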

import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.concurrent.ExecutionException;

@Service // This makes it a Spring-managed bean
public class VirtualThreadExample {

    @Retryable(
            retryFor = {RuntimeException.class, IOException.class, InterruptedException.class, ExecutionException.class, Exception.class},
            maxAttempts = 4,                                   // Total 4 attempts (1 initial + 3 retries)
            backoff = @Backoff(delay = 1000, multiplier = 2) // Exponential backoff: 1s, 2s, 4s
    )
    public void getResponse(String urlRest) throws IOException {
        System.out.println("Attempting to call: " + urlRest);
        throw new RuntimeException("Negative Test cases for VirtualThreadExample");
    }

    @Recover
    public void recover(RuntimeException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(IOException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(InterruptedException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(ExecutionException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(Exception e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }
}

Expected Output:


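When the getResponse method is called, it is attempted up to 4 times (1 initial attempt + 3 retries), with exponential backoff delays of 1s, 2s, and 4s between attempts. If every attempt fails, the @Recover method that best matches the thrown exception is called. The output below comes from calling the method concurrently for three URLs, so each URL appears four times: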

Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Processed 3 posts in 7060 millis
Program Completed !!
All retries failed for URL: https://jsonplaceholder.typicode.com/posts/3
Error details: Negative Test cases for VirtualThreadExample
All retries failed for URL: https://jsonplaceholder.typicode.com/posts/1
Error details: Negative Test cases for VirtualThreadExample
All retries failed for URL: https://jsonplaceholder.typicode.com/posts/2
Error details: Negative Test cases for VirtualThreadExample
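The roughly 7 seconds in the output above is consistent with the backoff settings: with 4 attempts there are 3 waits, and because the three URLs are processed concurrently their waits overlap. A small plain-Java calculation makes the arithmetic explicit. The BackoffSchedule class is a hypothetical helper, and the sketch ignores the upper limit that a real backoff policy may place on the delay.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that reproduces the backoff arithmetic for illustration.
class BackoffSchedule {

    // Returns the wait before each retry: one entry per retry, not per attempt.
    static List<Long> delaysMillis(int maxAttempts, long initialDelayMs, double multiplier) {
        List<Long> delays = new ArrayList<>();
        double delay = initialDelayMs;
        for (int retry = 1; retry < maxAttempts; retry++) {
            delays.add((long) delay);
            delay *= multiplier;
        }
        return delays;
    }

    // Sums the waits to get the minimum time spent in backoff.
    static long totalMillis(List<Long> delays) {
        return delays.stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        List<Long> delays = delaysMillis(4, 1000, 2.0);
        System.out.println(delays + " total=" + totalMillis(delays));
    }
}
```

For maxAttempts = 4, delay = 1000, and multiplier = 2, the waits are 1000, 2000, and 4000 ms, which adds up to 7000 ms.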


Example 2: Retry with Asynchronous Processing


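This example demonstrates how to retry a simulated database operation. The method itself is synchronous. It runs asynchronously because the runner shown later calls it through CompletableFuture.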

import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

@Service
public class AsyncExample {

    @Retryable(
            retryFor = {RuntimeException.class}, // Retry on runtime exceptions
            maxAttempts = 4,                    // Total 4 attempts (1 initial + 3 retries)
            backoff = @Backoff(delay = 1000, multiplier = 2) // Exponential backoff: 1s, 2s, 4s
    )
    public void saveUser(String user) {
        System.out.println("Saving user: " + user);
        try {
            Thread.sleep(1000); // Simulate database latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Thread interrupted while saving user", e);
        }
        // Simulate a transient failure
        throw new RuntimeException("Failed to save user due to a transient error");
    }

    @Recover
    public void recover(RuntimeException e, String user) {
        System.err.println("All retries failed for user: " + user);
        System.err.println("Error details: " + e.getMessage());
        // Fallback logic (e.g., log the error, notify, or take corrective action)
    }
}

Expected Output:


When the saveUser method is called, it is attempted up to 4 times (1 initial attempt + 3 retries) with exponential backoff. If every attempt fails, the @Recover method is called.

Saving user: JohnDoe
Saving user: JohnDoe
Saving user: JohnDoe
Saving user: JohnDoe
All retries failed for user: JohnDoe
Error details: Failed to save user due to a transient error

Running the Application


To execute these examples, define ApplicationRunner beans in your Spring Boot application:

import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.retry.annotation.EnableRetry;

import java.util.List;
import java.util.concurrent.*;

@SpringBootApplication
@EnableRetry
public class DemoSpringRetryApplication {

	public static void main(String[] args) {
		SpringApplication.run(DemoSpringRetryApplication.class, args);
	}

	@Bean
	ApplicationRunner asyncExampleRunner(AsyncExample example) {
		return args -> {
			String user = "JohnDoe";

			// Run the task asynchronously
			CompletableFuture.runAsync(() -> {
						System.out.println("Starting async task for user: " + user);
						example.saveUser(user); // Use the injected bean
						System.out.println("Async task completed for user: " + user);
					}, Executors.newVirtualThreadPerTaskExecutor())
					.exceptionally(ex -> {
						System.err.println("Failed to save user: " + user);
						ex.printStackTrace();
						return null;
					});

			System.out.println("Main thread continues executing...");
			try {
				Thread.sleep(2000); // Simulate main thread work
			} catch (InterruptedException e) {
				Thread.currentThread().interrupt(); // Restore the interrupt flag instead of swallowing it
			}
			System.out.println("Main thread finished.");
		};
	}

	@Bean
	ApplicationRunner virtualThreadExampleRunner(VirtualThreadExample example) {
		return args -> {
			try (ExecutorService myExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
				// List of posts to process
				List<Integer> posts = List.of(1, 2, 3);
				long start = System.nanoTime();

				// Submit a task for each post
				List<Future<Object>> futures = posts.stream()
						.map(post -> myExecutor.submit(() -> {
							example.getResponse("https://jsonplaceholder.typicode.com/posts/" + post);
							return null; // Returning a value makes this lambda a Callable, which may throw checked exceptions
						}))
						.toList();

				// Wait for all tasks to complete
				for (Future<Object> future : futures) {
					future.get(); // Ensures task completion
				}

				long duration = (System.nanoTime() - start) / 1_000_000;
				System.out.printf("Processed %d posts in %d millis%n", posts.size(), duration);
				System.out.println("Program Completed !!");
			} catch (InterruptedException | ExecutionException e) {
				System.err.println("error " + e.getMessage());
			}
		};
	}
}
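Stripped of Spring, the fan-out pattern used by virtualThreadExampleRunner looks like the sketch below: one virtual thread per task, then a wait for every result. It requires Java 21. The class VirtualThreadFanOut and the simulated call are hypothetical stand-ins for the real HTTP request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-alone version of the fan-out pattern; requires Java 21.
class VirtualThreadFanOut {

    // Runs one simulated call per post id on its own virtual thread.
    static List<String> fetchAll(List<Integer> postIds) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (int id : postIds) {
                futures.add(executor.submit(() -> simulateCall(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(join(future));
            }
            return results;
        }
    }

    // Stands in for a blocking HTTP call.
    private static String simulateCall(int id) throws InterruptedException {
        Thread.sleep(200);
        return "post-" + id;
    }

    // Waits for one task and converts checked exceptions to unchecked ones.
    private static String join(Future<String> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Restore the interrupt flag
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> results = fetchAll(List.of(1, 2, 3));
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(results + " in " + millis + " ms");
    }
}
```

Because each task blocks on its own virtual thread, the three simulated calls overlap instead of running one after another.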

Dependencies


Add the following dependencies to your pom.xml:

<dependencies>
    <!-- Spring Retry -->
    <dependency>
        <groupId>org.springframework.retry</groupId>
        <artifactId>spring-retry</artifactId>
    </dependency>
    <!-- Spring AOP (required for @EnableRetry) -->
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-aspects</artifactId>
    </dependency>
</dependencies>

Conclusion


By combining Spring Retry with Java 21's virtual threads and asynchronous processing, you can build systems that handle transient failures gracefully. Whether you are making HTTP calls or working with databases, retry and recovery logic helps your application stay reliable when a dependency is briefly unavailable.

Key Takeaways:

  • Use @Retryable to define retry logic and @Recover for fallback behavior.
  • Multiple @Recover methods allow you to handle different exceptions appropriately.
  • Virtual threads and asynchronous processing let many blocking calls, including their backoff waits, run concurrently without tying up platform threads.






