TechNotes Central: January 2025

Monday, January 27, 2025

🚀 How to Integrate Ollama with DeepSeek-R1 in Spring Boot

Are you looking to leverage the power of Ollama and DeepSeek-R1 in your Spring Boot application? This post will walk you through the entire process, from understanding what Ollama is to implementing a seamless integration.

What is Ollama?

Ollama is a powerful tool designed to simplify the deployment and management of large language models (LLMs) locally. It provides an easy-to-use API for interacting with models like DeepSeek-R1, making it an excellent choice for developers who want to integrate AI capabilities into their applications without relying on external cloud services.

With Ollama, you can:

Run LLMs locally on your machine.
Switch between different model versions effortlessly.
Integrate AI capabilities into your applications via a simple API.

Why Integrate Ollama with DeepSeek-R1?

DeepSeek-R1 is an open-weight reasoning model that Ollama offers in several sizes, including the small 1.5b variant used in this post. By integrating it with Ollama in your Spring Boot application, you can:

Build AI-powered features like chatbots, content generators, and more.
Keep your AI logic local, ensuring data privacy and reducing latency.
Easily switch between different versions of DeepSeek-R1 based on your application’s needs.

Step 1: Install Ollama

To get started, you’ll need to install Ollama on your system. Run the following command in your terminal:

curl -fsSL https://ollama.com/install.sh | sh

Successful Installation Output:


>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
>>> Creating ollama user...
>>> Adding ollama user to groups...
>>> Creating ollama systemd service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service
>>> Nvidia GPU detected
>>> API available at 127.0.0.1:11434

Once installed, Ollama will be ready to use, and the API will be available at http://localhost:11434.

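To verify it's working, start the server (skip this if the systemd service created by the installer is already running) and list the installed models:

ollama serve

ollama list

If deepseek-r1:1.5b isn’t listed, pull it:

ollama pull deepseek-r1:1.5b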

Test the model with curl:

curl -X POST http://localhost:11434/api/generate \
  -d '{"model": "deepseek-r1:1.5b", "prompt": "Hello", "stream": false}'
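The same smoke test can be run from Java before any Spring code exists. The sketch below uses only the JDK's built-in HttpClient and sends the same body as the curl command. The class name OllamaSmokeTest and its helper methods are hypothetical, the timeouts are arbitrary choices, and the hand-rolled escaping is deliberately minimal; a real application would build the body with a JSON library.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class OllamaSmokeTest {

    // Builds the same JSON body as the curl command above.
    static String buildBody(String model, String prompt) {
        return "{\"model\": \"" + escape(model) + "\", \"prompt\": \""
                + escape(prompt) + "\", \"stream\": false}";
    }

    // Minimal escaping (backslash, double quote, newline) for illustration only.
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(30))
                .build();

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        buildBody("deepseek-r1:1.5b", "Hello")))
                .build();

        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Status: " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically means the Ollama server is not running on this port
            System.err.println("Could not reach Ollama: " + e.getMessage());
        }
    }
}
```

A 200 status with a JSON body containing a "response" field indicates the model is loaded and answering.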

Step 2: Application Configuration

Next, configure your Spring Boot application by updating the application.yml file:

spring:
  application:
    name: demo-deepseek-r1.ollama

# Server configuration
server:
  port: 8080
  error:
    include-message: always

# Ollama configuration
ollama:
  endpoint: http://localhost:11434/api/generate
  model: deepseek-r1:1.5b
  timeout:
    connect: 30000
    read: 60000

This configuration sets up the Ollama endpoint, model, and timeout settings for your application.
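The service in Step 3 reads these settings from an OllamaProperties object, which the post does not show. The sketch below is one way such a holder could look. The class layout and setter names are assumptions, chosen so that the getters the service calls (getEndpoint, getModel, getTimeout().getConnect(), getTimeout().getRead()) line up with the keys above. It is written as a plain class so it compiles on its own; in the application it would also need Spring Boot's @ConfigurationProperties(prefix = "ollama") and a registration such as @ConfigurationPropertiesScan.

```java
// Hypothetical properties holder for the "ollama" block in application.yml.
// In the Spring Boot app, annotate it with
// @ConfigurationProperties(prefix = "ollama") so the values are bound at startup.
public class OllamaProperties {

    private String endpoint;
    private String model;
    private final Timeout timeout = new Timeout();

    public String getEndpoint() { return endpoint; }
    public void setEndpoint(String endpoint) { this.endpoint = endpoint; }

    public String getModel() { return model; }
    public void setModel(String model) { this.model = model; }

    public Timeout getTimeout() { return timeout; }

    // Maps the nested ollama.timeout.* keys; both values are in milliseconds.
    public static class Timeout {
        private long connect;
        private long read;

        public long getConnect() { return connect; }
        public void setConnect(long connect) { this.connect = connect; }

        public long getRead() { return read; }
        public void setRead(long read) { this.read = read; }
    }
}
```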

Step 3: Core Implementation

Create the following records to handle requests and responses:

// OllamaRequest.java
@JsonInclude(JsonInclude.Include.NON_NULL)
public record OllamaRequest(
    String model,
    String prompt,
    boolean stream
) {}

// OllamaResponse.java
@JsonIgnoreProperties(ignoreUnknown = true)
public record OllamaResponse(
    String model,
    String response,
    @JsonProperty("created_at") String createdAt, // maps Ollama's snake_case field
    boolean done
) {}

Service Layer

Implement the OllamaService to interact with the Ollama API:

@Service
public class OllamaService {

    private final RestTemplate restTemplate;
    private final OllamaProperties properties;

    public OllamaService(OllamaProperties properties) {
        this.properties = properties;

        RequestConfig config = RequestConfig.custom()
                .setConnectTimeout(Timeout.ofMilliseconds(properties.getTimeout().getConnect()))
                .setResponseTimeout(Timeout.ofMilliseconds(properties.getTimeout().getRead()))
                .build();

        CloseableHttpClient httpClient = HttpClients.custom()
                .setDefaultRequestConfig(config)
                .build();

        HttpComponentsClientHttpRequestFactory requestFactory = new HttpComponentsClientHttpRequestFactory(httpClient);
        this.restTemplate = new RestTemplate(requestFactory);
    }

    public String generateResponse(String prompt) {
        try {
            OllamaRequest request = new OllamaRequest(properties.getModel(), prompt, false);
            HttpHeaders headers = new HttpHeaders();
            headers.setContentType(MediaType.APPLICATION_JSON);

            ResponseEntity<OllamaResponse> response = restTemplate.exchange(
                    properties.getEndpoint(),
                    HttpMethod.POST,
                    new HttpEntity<>(request, headers),
                    OllamaResponse.class
            );

            if (response.getStatusCode().is2xxSuccessful() && response.getBody() != null) {
                return response.getBody().response() != null
                        ? response.getBody().response()
                        : "Received empty response from model";
            }
            return "Ollama API returned status: " + response.getStatusCode();
        } catch (RestClientException e) {
            return "Error communicating with Ollama: " + e.getMessage();
        }
    }
}

REST Controller

Create a REST controller to expose the chat endpoint:

@RestController
@RequestMapping("/api/chat")
public class ChatController {

    private final OllamaService ollamaService;

    public ChatController(OllamaService ollamaService) {
        this.ollamaService = ollamaService;
    }

    @PostMapping
    public ResponseEntity<String> chat(@RequestBody String prompt) {
        if (prompt == null || prompt.isBlank()) {
            return ResponseEntity.badRequest().body("Prompt cannot be empty");
        }
        String response = ollamaService.generateResponse(prompt);
        return ResponseEntity.ok(response);
    }
}

Model Version Compatibility

Here’s a quick reference for DeepSeek-R1 model versions and their requirements:

*Check official model availability in the Ollama Model Library.

Testing the Integration

To test the integration, use the following curl command or Postman:

curl -X POST -H "Content-Type: text/plain" -d "Explain AI in simple terms" http://localhost:8080/api/chat

Output

🪟 Bonus: Using Ollama in WSL on Windows

If you're on Windows using WSL, follow these steps to expose the Ollama service to Windows:

🔒 WSL Side: Open the Port

sudo ufw enable
sudo ufw allow 11434
sudo systemctl stop ollama
sudo lsof -i :11434
export OLLAMA_HOST=0.0.0.0
ollama serve

Verify Ollama is listening:

sudo ss -tulnp | grep 11434

🪟 Windows Side: Port Forwarding with admin permission

$wsl_ip = (wsl hostname -I).Split()[0]

netsh interface portproxy add v4tov4 `
  listenport=11434 listenaddress=0.0.0.0 `
  connectport=11434 connectaddress=$wsl_ip

New-NetFirewallRule -DisplayName "Ollama-WSL" `
  -Direction Inbound -Protocol TCP -LocalPort 11434 -Action Allow

🔁 Optional: Persistence Across Reboots

WSL Side

echo 'export OLLAMA_HOST=0.0.0.0' >> ~/.bashrc
echo 'pkill ollama; ollama serve > /tmp/ollama.log 2>&1 &' >> ~/.bashrc

Windows Side

$action = New-ScheduledTaskAction -Execute "wsl" -Argument "-e bash -c 'ollama serve'"
$trigger = New-ScheduledTaskTrigger -AtStartup
Register-ScheduledTask -TaskName "Ollama-WSL" -Action $action -Trigger $trigger -RunLevel Highest

Source Code

Here on GitHub.

🙌 Final Thoughts

Running LLMs locally has never been easier. With Ollama, DeepSeek-R1, and Spring Boot, you can build blazing-fast AI-powered apps while keeping full control over your data.

Sunday, January 26, 2025

Spring Retry: Handling Transient Failures Gracefully in Java 21

In modern applications, transient failures (e.g., network timeouts, database connection issues, or external API unavailability) are inevitable. To build resilient systems, we need mechanisms to retry failed operations gracefully.

With Java 21 and Spring Boot 3, we can leverage Spring Retry to implement robust retry logic. In this post, I'll show you how to integrate Spring Retry into your application, complete with examples using virtual threads and asynchronous processing.

Why Use Spring Retry?

Spring Retry provides a declarative way to retry operations that may fail due to transient issues. Key features include:

Retry Logic: Automatically retry failed operations with configurable attempts and backoff strategies.
Fallback Mechanism: Define recovery logic when all retries fail.
Integration with Spring: Seamlessly integrates with Spring Boot and other Spring components.

Key Concepts:

@Retryable:

Marks a method as retryable. You can specify the exceptions to retry, the maximum number of attempts, and the backoff strategy.

@Recover:

Defines a fallback method to execute when all retries fail. You can have multiple @Recover methods to handle different exceptions.

Why Multiple @Recover Methods?

Different exceptions may require different recovery logic. For example:

A RuntimeException might require logging.
An IOException might require returning a default response.

By defining multiple @Recover methods, you can handle each exception type appropriately.
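To make the two concepts concrete, here is a plain-Java sketch of what the annotations amount to conceptually: a bounded loop with a growing delay, followed by a fallback chosen by exception type. It is not Spring Retry's implementation, and the names RetrySketch, retry, and recover are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.function.Function;

public class RetrySketch {

    // Calls the task up to maxAttempts times, sleeping between failed attempts
    // and multiplying the delay each time. Once every attempt has failed, the
    // last exception is handed to the recover function (the @Recover analogue).
    static <T> T retry(Callable<T> task, int maxAttempts, long delayMillis,
                       double multiplier, Function<Exception, T> recover) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = delayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delay);
                    delay = (long) (delay * multiplier);
                }
            }
        }
        return recover.apply(last);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during backoff", ie);
        }
    }

    // Picks recovery logic by exception type, like several @Recover methods.
    static String recover(Exception e) {
        if (e instanceof IOException) {
            return "default response";       // IOException: return a default
        }
        return "logged: " + e.getMessage();  // anything else: just log it
    }

    public static void main(String[] args) {
        String result = retry(() -> { throw new IOException("connection reset"); },
                4, 10, 2.0, RetrySketch::recover);
        System.out.println(result); // prints "default response"
    }
}
```

With Spring Retry the loop lives in a proxy around the bean, which is why the annotated method has to be called through the injected bean rather than via this from inside the same class.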

Example 1: Retry with Virtual Threads

This example demonstrates how to retry an HTTP call using virtual threads.

import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.concurrent.ExecutionException;

@Service // This makes it a Spring-managed bean
public class VirtualThreadExample {

    @Retryable(
            // Exception.class already covers every other type listed here
            retryFor = {RuntimeException.class, IOException.class, InterruptedException.class, ExecutionException.class, Exception.class},
            maxAttempts = 4,                                  // Total 4 attempts (1 initial + 3 retries)
            backoff = @Backoff(delay = 1000, multiplier = 2)  // Exponential backoff: 1s, 2s, 4s
    )
    public void getResponse(String urlRest) throws IOException {
        System.out.println("Attempting to call: " + urlRest);
        throw new RuntimeException("Negative Test cases for VirtualThreadExample");
    }

    @Recover
    public void recover(RuntimeException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(IOException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(InterruptedException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(ExecutionException e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }

    @Recover
    public void recover(Exception e, String urlRest) {
        System.err.println("All retries failed for URL: " + urlRest);
        System.err.println("Error details: " + e.getMessage());
    }
}

Expected Output:

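When getResponse is called, it is attempted up to 4 times (1 initial attempt + 3 retries) with exponential backoff. If every attempt fails, the @Recover method matching the exception type is called. The log below comes from the runner in "Running the Application", which calls getResponse for three URLs in parallel, so the attempts interleave: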

Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Attempting to call: https://jsonplaceholder.typicode.com/posts/1
Attempting to call: https://jsonplaceholder.typicode.com/posts/2
Attempting to call: https://jsonplaceholder.typicode.com/posts/3
Processed 3 posts in 7060 millis
Program Completed !!
All retries failed for URL: https://jsonplaceholder.typicode.com/posts/3
Error details: Negative Test cases for VirtualThreadExample
All retries failed for URL: https://jsonplaceholder.typicode.com/posts/1
Error details: Negative Test cases for VirtualThreadExample
All retries failed for URL: https://jsonplaceholder.typicode.com/posts/2
Error details: Negative Test cases for VirtualThreadExample
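The "7060 millis" in this log follows directly from the backoff settings: each task waits 1 s, 2 s, and 4 s between its four attempts, and the three tasks run in parallel, so the whole batch takes roughly 7 s. A small sketch of that arithmetic is below. The class and method names are hypothetical, and the 30-second cap is an assumption about Spring Retry's default maximum interval for exponential backoff; it does not come into play with these values.

```java
import java.util.ArrayList;
import java.util.List;

public class BackoffSchedule {

    // Returns the waits (in ms) between attempts: maxAttempts - 1 entries,
    // starting at delay and multiplying each time, capped at maxDelay.
    static List<Long> waits(long delay, double multiplier, int maxAttempts, long maxDelay) {
        List<Long> waits = new ArrayList<>();
        double current = delay;
        for (int i = 1; i < maxAttempts; i++) {
            waits.add(Math.min((long) current, maxDelay));
            current *= multiplier;
        }
        return waits;
    }

    public static void main(String[] args) {
        // Same settings as @Backoff(delay = 1000, multiplier = 2), maxAttempts = 4
        List<Long> waits = waits(1000, 2.0, 4, 30_000);
        long total = waits.stream().mapToLong(Long::longValue).sum();
        System.out.println(waits + " -> " + total + " ms"); // [1000, 2000, 4000] -> 7000 ms
    }
}
```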

Example 2: Retry with Asynchronous Processing

This example demonstrates how to retry a database operation asynchronously.

import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

@Service
public class AsyncExample {

    @Retryable(
            retryFor = {RuntimeException.class}, // Retry on runtime exceptions
            maxAttempts = 4,                    // Total 4 attempts (1 initial + 3 retries)
            backoff = @Backoff(delay = 1000, multiplier = 2) // Exponential backoff: 1s, 2s, 4s
    )
    public void saveUser(String user) {
        System.out.println("Saving user: " + user);
        try {
            Thread.sleep(1000); // Simulate database latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Thread interrupted while saving user", e);
        }
        // Simulate a transient failure
        throw new RuntimeException("Failed to save user due to a transient error");
    }

    @Recover
    public void recover(RuntimeException e, String user) {
        System.err.println("All retries failed for user: " + user);
        System.err.println("Error details: " + e.getMessage());
        // Fallback logic (e.g., log the error, notify, or take corrective action)
    }
}

Expected Output:

When saveUser is called, it is attempted up to 4 times (1 initial attempt + 3 retries) with exponential backoff. If every attempt fails, the @Recover method is called.

Saving user: JohnDoe
Saving user: JohnDoe
Saving user: JohnDoe
Saving user: JohnDoe
All retries failed for user: JohnDoe
Error details: Failed to save user due to a transient error

Running the Application

To execute these examples, define ApplicationRunner beans in your Spring Boot application:

import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.retry.annotation.EnableRetry;

import java.util.List;
import java.util.concurrent.*;

@SpringBootApplication
@EnableRetry
public class DemoSpringRetryApplication {

	public static void main(String[] args) {
		SpringApplication.run(DemoSpringRetryApplication.class, args);
	}

	@Bean
	ApplicationRunner asyncExampleRunner(AsyncExample example) {
		return args -> {
			String user = "JohnDoe";

			// Run the task asynchronously
			CompletableFuture.runAsync(() -> {
						System.out.println("Starting async task for user: " + user);
						example.saveUser(user); // Use the injected bean
						System.out.println("Async task completed for user: " + user);
					}, Executors.newVirtualThreadPerTaskExecutor())
					.exceptionally(ex -> {
						System.err.println("Failed to save user: " + user);
						ex.printStackTrace();
						return null;
					});

			System.out.println("Main thread continues executing...");
			try {
				Thread.sleep(2000); // Simulate main thread work
			} catch (InterruptedException e) {
				e.printStackTrace();
			}
			System.out.println("Main thread finished.");
		};
	}

	@Bean
	ApplicationRunner virtualThreadExampleRunner(VirtualThreadExample example) {
		return args -> {
			try (ExecutorService myExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
				// List of posts to process
				List<Integer> posts = List.of(1, 2, 3);
				long start = System.nanoTime();

				// Submit a task for each post
				List<Future<Object>> futures = posts.stream()
						.map(post -> myExecutor.submit(() -> {
							example.getResponse("https://jsonplaceholder.typicode.com/posts/" + post);
							return null; // Returning a value makes the lambda a Callable, which may throw the checked IOException
						}))
						.toList();

				// Wait for all tasks to complete
				for (Future<Object> future : futures) {
					future.get(); // Ensures task completion
				}

				long duration = (System.nanoTime() - start) / 1_000_000;
				System.out.printf("Processed %d posts in %d millis%n", posts.size(), duration);
				System.out.println("Program Completed !!");
			} catch (InterruptedException | ExecutionException e) {
				System.err.println("error " + e.getMessage());
			}
		};
	}
}

Dependencies

Add the following dependencies to your pom.xml:

<dependencies>
    <!-- Spring Retry -->
    <dependency>
        <groupId>org.springframework.retry</groupId>
        <artifactId>spring-retry</artifactId>
    </dependency>
    <!-- Spring AOP (required for @EnableRetry) -->
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-aspects</artifactId>
    </dependency>
</dependencies>

Conclusion

By leveraging Spring Retry alongside Java 21's virtual threads and asynchronous processing, you can create resilient systems that gracefully handle transient failures. Whether making HTTP calls or working with databases, retry mechanisms ensure your application stays reliable and robust under pressure.

Key Takeaways:

Use @Retryable to define retry logic and @Recover for fallback behavior.
Multiple @Recover methods allow you to handle different exceptions appropriately.
Virtual threads keep backoff waits cheap: a retrying task that sleeps between attempts parks its virtual thread instead of blocking a platform thread.


🚀 Streaming PostgreSQL Changes to BigQuery using Cloud Run Jobs + Cloud Scheduler 🔄