← 返回首页
babylon/hat at code-reflection · openjdk/babylon · GitHub
Skip to content

Navigation Menu

Toggle navigation
Sign in
Appearance settings
Search or jump to...

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Appearance settings
Resetting focus

Latest commit

 

History

History
 code-reflection
Top

README.md

Heterogeneous Accelerator Toolkit (HAT)

HAT is a toolkit that allows developers to express data-parallel applications in Java, optimize, offload and execute them on hardware accelerators.

  • Heterogeneous: a variety of devices and their corresponding programming languages.
  • Accelerator: GPUs, FPGA, CPUs, etc.
  • Toolkit: a set of libraries for Java developers.

HAT uses the code reflection API from the Project Babylon.

The toolkit offers:

  • An API for Kernel Programming on Accelerators from Java.
  • An API for Combining multiple kernels into a compute-graph.
  • An API for Java object mapping to hardware accelerators using Panama FFM.
  • An extensible backend system for multiple accelerators:
    • OpenCL
    • CUDA
    • Java

Prerequisites

  • HAT currently requires Babylon JDK, which contains the code reflection APIs.
  • A base JDK >= 25. We currently use OpenJDK 26 for development.
  • A GPU SDK (one or more of the SDKs below) to be able to run on GPUs:
    • An OpenCL implementation (e.g., Intel, Apple Silicon, CUDA SDK)
      • OpenCL >= 1.2
    • CUDA SDK >= 12.9
  • cmake >= 3.22.1
  • gcc >= 12.0, or clang >= 17.0

Compatible systems

We actively develop and run tests on the following systems:

  • Apple Silicon M1-M4
  • Linux Fedora >= 42
  • Oracle Linux 10
  • Ubuntu >= 22.04

Quick Start

1. Build Babylon JDK

git clone https://github.com/openjdk/babylon cd babylon bash configure --with-boot-jdk=${JAVA_HOME} make clean make images

2. Update JAVA_HOME and PATH

export JAVA_HOME=<BABYLON-DIR>/build/macosx-aarch64-server-release/images/jdk export PATH=$JAVA_HOME/bin:$PATH

3. Build HAT

sdk install jextract #if needed cd hat java @.bld

Done!

Run Examples

For instance, matrix-multiply:

java @.run ffi-opencl matmul --size=1024

Some examples have a GUI implementation:

java @.run ffi-opencl mandel

Full list of examples:

Run Unit-Tests

OpenCL backend:

java @.test-suite ffi-opencl

CUDA backed:

java @.test-suite ffi-cuda

Full Example Explained

The following example compute the square value of an input vector. The example is self-contained and it can be directly run with the java command.

Place the following code in the hat directory.

import hat.*; import hat.Accelerator.Compute; import hat.backend.*; import hat.buffer.*; import optkl.ifacemapper.MappableIface.*; import jdk.incubator.code.Reflect; import java.lang.invoke.MethodHandles; public class ExampleHAT { // Kernel Code: This is the function to be offloaded to the accelerator (e.g., // a GPU). The kernel will be executed by many GPU threads, in this case, // as many threads as elements in `array`. // The `kc` object can be used to obtain the thread identifier and map // the data element to process. // HAT kernels follow the SIMT programming model (Single Instruction Multiple Thread) // mode. // Kernel code is reflectable. Thus, the HAT runtime and HAT compiler can build // and optimize the code model. Once the code model is optimized, HAT generates // OpenCL/CUDA C99 code. @Reflect public static void squareKernel(@RO KernelContext kc, @RW S32Array array) { // HAT kernels support a reduced set of Java. // Kernels express the work to be done per thread (GPU/accelerator thread). if (kc.gix < array.length()) { int value = array.array(kc.gix); array.array(kc.gix, (value * value)); } } // The following method represents the compute layer, in which we specify // the number of threads to be deployed on the accelerator. The number of threads // is specified in an ND-Range. An ND-Range could be 1D, 2D and 3D. // In this example, we launch 1D-range with the number of threads equal to // the input array size. @Reflect public static void square(@RO ComputeContext cc, @RW S32Array array) { var ndRange = NDRange.of1D(array.length()); // Dispatch the kernel. The HAT runtime will offload the kernels // reached from this point and run the generated GPU kernels on the // target accelerator. // Furthermore, HAT automatically transfers data to the accelerator. // This is a blocking call, and when it returns control to the main // Java thread, results (outputs) are available to be consumed. cc.dispatchKernel(ndRange, kc -> squareKernel(kc, array)); } static void main(String[] args) { final int size = 4096; // Create a new accelerator object var accelerator = new Accelerator(MethodHandles.lookup(), Backend.FIRST); // Instantiate an array on the target accelerator. // Data is stored off-heap using the Panama FFM API. var array = S32Array.create(accelerator, size); // Data initialization for (int i = 0; i < array.length(); i++) { array.array(i, i); } // Offload and dispatch of the compute-graph on the target accelerator. // This is a blocking call. Once this call finalizes, the results (outputs) // will be available to consume by the current Java thread. accelerator.compute((@Reflect Compute) cc -> ExampleHAT.square(cc, array)); // Test result boolean isCorrect = true; for (int i = 0; i < size; i++) { if (array.array(i) != i * i) { isCorrect = false; } } if (isCorrect) { IO.println("Result is correct"); } else { IO.println("Result is NOT correct"); } } }

Run this example in the babylon/hat directory. If you run from another directory, update the --class-path parameter accordingly. Use the java version built with the Babylon JDK.

java --enable-preview \ --add-modules=jdk.incubator.code \ --enable-native-access=ALL-UNNAMED \ --class-path build/hat-optkl-1.0.jar:build/hat-core-1.0.jar:build/hat-backend-ffi-shared-1.0.jar:build/hat-backend-ffi-opencl-1.0.jar \ -Djava.library.path=/Users/juanfumero/repos/babylon/hat/build \ ExampleHAT

If you run with HAT=INFO you can see which accelerator was used:

$ HAT=INFO java --enable-preview ... ExampleHAT.java [INFO] Config Bits = 8000 [INFO] Platform :"Apple" [INFO] Version :"OpenCL 1.2 (Jan 16 2026 07:22:26)" [INFO] Name :"Apple" [INFO] Device Type : GPU 4 [INFO] OpenCLBackend::OpenCLQueue::dispatch [INFO] numDimensions: 1 [INFO] GLOBAL [4096,1,1] [INFO] LOCAL [ nullptr ] // The driver will setup a default value Result is correct

Documentation

Visit the docs folder.

Contributing

Contributions are welcome. Please see the OpenJDK Developers' Guide.

Development Workflow

  1. Fork the repository
  2. Create a feature branch: git checkout -b <branch>
  3. Commit with clear messages
  4. Run formatting and tests:
    1. For OpenCL: java @.est-suite ffi-opencl
    2. For CUDA: java @.test-suite ffi-cuda
  5. Submit a pull request

Contacts/Questions

You can interact, provide feedback and ask questions using the babylon-dev mailing list.

Footer

© 2026 GitHub, Inc.