Getting Started
This quick tutorial will get you through setting up a JavaCL project.
You can either create a plain Maven project by hand and modify its pom.xml as shown below, or take the fully automated route and generate the project from a Maven archetype, using Netbeans, IntelliJ IDEA or the command line:
With Netbeans:
- Create a new project [https://javacl.googlecode.com/svn/wiki/images/NetbeansArchetype-NewProject.png]
- Choose Maven / Project from Archetype: [https://javacl.googlecode.com/svn/wiki/images/NetbeansArchetype-ChooseProject.png]
- In the Maven Archetype screen, click on Add, fill in the details, then click on Ok: [https://javacl.googlecode.com/svn/wiki/images/NetbeansArchetype-AchetypeDetails.png]
  - Group Id = com.nativelibs4java
  - Archetype Id = javacl-simple-tutorial
  - Version = 1.0.0-RC1
  - Repository = https://nativelibs4java.sourceforge.net/maven
- Select the custom archetype you've just created and click on Next, then Finish: [https://javacl.googlecode.com/svn/wiki/images/NetbeansArchetype-CustomArchetype.png]
With IntelliJ IDEA, the archetype works with the open source Community Edition as well as with the Ultimate Edition. If you haven't tried it yet, IntelliJ IDEA is a great IDE to work with...
It's very easy to create the sources for this tutorial with IDEA:
- Make sure to install Maven and set the M2_HOME environment variable properly
- Create a new project [https://javacl.googlecode.com/svn/wiki/images/IdeaArchetype-FileNewProject.png]
- Select Create a project from scratch: [https://javacl.googlecode.com/svn/wiki/images/IdeaArchetype-NewProject.png]
- Choose a name for your tutorial project and select Maven Module for the type of the project, then click on Next: [https://javacl.googlecode.com/svn/wiki/images/IdeaArchetype-NewMavenProject.png]
- Tick Create from archetype and click on Add archetype, then fill in the details: [https://javacl.googlecode.com/svn/wiki/images/IdeaArchetype-AddArchetype.png]
  - Group Id = com.nativelibs4java
  - Archetype Id = javacl-simple-tutorial
  - Version = 1.0.0-RC1
  - Repository = https://nativelibs4java.sourceforge.net/maven
- Click Ok and Finish: you're done
From the command line, first make sure you've properly installed Maven.
Then simply type the following command in a shell:
mvn archetype:generate -DarchetypeGroupId=com.nativelibs4java -DarchetypeArtifactId=javacl-simple-tutorial -DarchetypeVersion=1.0.0-RC1 -DremoteRepositories=https://nativelibs4java.sourceforge.net/maven
This will generate a directory with the following layout:
JavaCLTutorial
|__ pom.xml
|__ src/
     |__ test/
     |__ main/
          |__ java/
          |    |__ tutorial/
          |         |__ JavaCLTutorial1.java
          |         |__ JavaCLTutorial2.java
          |         |__ JavaCLTutorial3.java
          |__ opencl/
               |__ tutorial/
                    |__ TutorialKernels.cl
The three versions of the JavaCLTutorial class correspond to the progression of this tutorial, as you'll see below.
The tutorial project is the OpenCL equivalent of the traditional Hello-world example: we're just going to perform parallel element-wise additions on two float vectors, storing the results in a third float vector.
It contains two parts :
- an OpenCL source code that will be compiled at run-time and run on an OpenCL device (CPU or GPU)
- a Java host program that sets up the JavaCL context, reads and compiles the OpenCL source code and calls it with appropriately initialized arguments.
The pom.xml file contains a single cross-platform JavaCL dependency:
  <repositories>
    <repository>
      <id>nativelibs4java</id>
      <name>nativelibs4java Maven2 Repository</name>
      <url>https://nativelibs4java.sourceforge.net/maven</url>
    </repository>
  </repositories>
  
  <dependencies>
	<dependency>
	  <groupId>com.nativelibs4java</groupId>
	  <artifactId>javacl</artifactId>
	  <version>1.0.0-RC1</version>
	</dependency>
  </dependencies>
It also contains the configuration needed by the JavaCL Generator plugin, which will make your kernel programming experience a lot (type-)safer and more enjoyable (see below).
Here's our first kernel in src/main/opencl/tutorial/TutorialKernels.cl :
__kernel void add_floats(__global const float* a, __global const float* b, __global float* out, int n) 
{
    int i = get_global_id(0);
    if (i >= n)
        return;
    out[i] = a[i] + b[i];
}
It's very simple: it takes the global id of this execution (the kernel is executed many times in parallel, with only the global id differing between two executions) and adds the values found in a and b at index i, storing the result in out at index i.
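To make the execution model concrete, here is a minimal plain-Java sketch that emulates what the device does with add_floats. It is not JavaCL code: the class and method names are hypothetical, and the loop runs sequentially whereas a real device runs the work-items in parallel. It assumes a launch whose global size is larger than n (for instance a size rounded up by the host), which is the situation the `i >= n` guard protects against.

```java
public class AddFloatsEmulation {
    // Mirrors the body of the add_floats kernel for one work-item with global id i.
    static void addFloatsWorkItem(int i, float[] a, float[] b, float[] out, int n) {
        if (i >= n)
            return; // same guard as in the kernel: extra work-items do nothing
        out[i] = a[i] + b[i];
    }

    // Sequentially emulates a 1D launch with the given global size.
    static float[] emulateLaunch(float[] a, float[] b, int n, int globalSize) {
        float[] out = new float[n];
        for (int i = 0; i < globalSize; i++)
            addFloatsWorkItem(i, a, b, out, n);
        return out;
    }

    public static void main(String[] args) {
        int n = 10;
        float[] a = new float[n], b = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = (float) Math.cos(i);
            b[i] = (float) Math.sin(i);
        }
        // Global size 16 is deliberately larger than n: ids 10 to 15 hit the guard.
        float[] out = emulateLaunch(a, b, n, 16);
        for (int i = 0; i < n; i++)
            System.out.println("out[" + i + "] = " + out[i]);
    }
}
```

Without the guard, the emulated work-items 10 to 15 would index past the end of the arrays; on a real device the equivalent out-of-bounds access is undefined behaviour rather than a clean exception.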
Here's the contents of src/main/java/tutorial/JavaCLTutorial1.java :
package tutorial;
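import com.nativelibs4java.opencl.*;
import com.nativelibs4java.opencl.CLMem.Usage; // Usage is nested in CLMem, so the wildcard import above does not cover it
import com.nativelibs4java.opencl.util.*;
import com.nativelibs4java.util.*;
import org.bridj.Pointer;
import java.io.IOException;
import static org.bridj.Pointer.*;
import static java.lang.Math.*;
public class JavaCLTutorial1 {
    public static void main(String[] args) throws IOException { // reading the kernel source may fail with an IOException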
        CLContext context = JavaCL.createBestContext();
        CLQueue queue = context.createDefaultQueue();
        int n = 1024;
        Pointer<Float>
            aPtr = allocateFloats(n),
            bPtr = allocateFloats(n);
        for (int i = 0; i < n; i++) {
            aPtr.set(i, (float)cos(i));
            bPtr.set(i, (float)sin(i));
        }
        // Create OpenCL input buffers (using the native memory pointers aPtr and bPtr) :
        CLBuffer<Float> 
            a = context.createFloatBuffer(Usage.Input, aPtr),
            b = context.createFloatBuffer(Usage.Input, bPtr);
        // Create an OpenCL output buffer :
        CLBuffer<Float> out = context.createFloatBuffer(Usage.Output, n);
        // Read the program sources and compile them :
        String src = IOUtils.readText(JavaCLTutorial1.class.getResource("TutorialKernels.cl"));
        CLProgram program = context.createProgram(src);
        // Get and call the kernel :
        CLKernel addFloatsKernel = program.createKernel("add_floats");
        addFloatsKernel.setArgs(a, b, out, n);
        CLEvent addEvt = addFloatsKernel.enqueueNDRange(queue, new int[] { n });
        
        Pointer<Float> outPtr = out.read(queue, addEvt); // blocks until add_floats finished
        // Print the first 10 output values :
        for (int i = 0; i < 10 && i < n; i++)
            System.out.println("out[" + i + "] = " + outPtr.get(i));
        
    }
}
If you're using Maven from the command line, type this in a shell in the project's folder (otherwise, click on Run in your IDE of choice ;-)):
mvn compile exec:java -Dexec.mainClass=tutorial.JavaCLTutorial1
The end of the output will look like this (please allow some time the first time Maven runs: it will download many files, but won't do it again ;-)):
out[0] = 1.0
out[1] = 1.3817732
out[2] = 0.49315056
out[3] = -0.8488725
out[4] = -1.4104462
out[5] = -0.6752621
out[6] = 0.6807548
out[7] = 1.4108889
out[8] = 0.84385824
out[9] = -0.49901175
Saw that "javacl-generator" plugin configuration in the pom.xml file?
JavaCL Generator is a tool that parses any .cl file present in src/main/opencl and creates a wrapper class that only accepts the correct argument types and numbers, instead of the all-forgiving CLKernel.setArgs(Object...) that might make your program crash at runtime if you used an incorrect argument type (or missed an argument).
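The difference between an untyped `setArgs(Object...)` call and a typed wrapper method can be illustrated with a small plain-Java sketch. Everything here is hypothetical (`UntypedKernel`, `TypedAddFloats` and their checks are stand-ins written for this example, not the classes JavaCL or the generator actually produce, and plain float arrays stand in for OpenCL buffers). It only shows why a typed signature moves argument mistakes from run time to compile time.

```java
public class TypedWrapperSketch {
    // Hypothetical stand-in for an untyped kernel: arguments are only checked at run time.
    static class UntypedKernel {
        private final Class<?>[] expected = {
            float[].class, float[].class, float[].class, Integer.class
        };
        private Object[] args;

        void setArgs(Object... args) {
            if (args.length != expected.length)
                throw new IllegalArgumentException(
                    "expected " + expected.length + " arguments, got " + args.length);
            for (int i = 0; i < args.length; i++)
                if (!expected[i].isInstance(args[i]))
                    throw new IllegalArgumentException(
                        "argument " + i + " should be a " + expected[i].getSimpleName());
            this.args = args;
        }
    }

    // Hypothetical typed wrapper: the method signature enforces argument count and types.
    static class TypedAddFloats {
        private final UntypedKernel kernel = new UntypedKernel();

        void addFloats(float[] a, float[] b, float[] out, int n) {
            kernel.setArgs(a, b, out, n); // cannot be called with wrong arguments
        }
    }

    public static void main(String[] args) {
        float[] a = new float[4], b = new float[4], out = new float[4];
        new TypedAddFloats().addFloats(a, b, out, 4); // checked by the compiler
        try {
            new UntypedKernel().setArgs(a, b, out); // compiles, but one argument is missing
        } catch (IllegalArgumentException e) {
            System.out.println("Run-time failure: " + e.getMessage());
        }
    }
}
```

With the typed wrapper, forgetting `n` or swapping it with a buffer is rejected by the Java compiler, so the mistake never reaches the device.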
As the Maven pom.xml file you've got from the archetype already contains the correct configuration for the generator, you can simply modify the host code as follows (see JavaCLTutorial2.java):
        /* 
        This code is no longer needed :
        // Read the program sources and compile them :
        String src = IOUtils.readText(JavaCLTutorial.class.getResource("TutorialKernels.cl"));
        CLProgram program = context.createProgram(src);
        
        // Get and call the kernel :
        CLKernel addFloatsKernel = program.createKernel("add_floats");
        addFloatsKernel.setArgs(a, b, out, n);
        CLEvent evt = addFloatsKernel.enqueueNDRange(queue, new int[] { n });
        */
        // Instantiate the auto-generated program wrapper and call the kernel :
        TutorialKernels kernels = new TutorialKernels(context);
        CLEvent addEvt = kernels.add_floats(queue, a, b, out, n, new int[] { n }, null);
        Pointer<Float> outPtr = out.read(queue, addEvt); // blocks until add_floats finished
Simple, isn't it?
The program wrappers are regenerated automatically at each compilation, so they stay in sync with your OpenCL kernels!
In the code above, we've initialized the a and b buffers with pointers to memory that we've filled by hand in Java... Why not fill it straight in OpenCL (i.e. on the GPU, if the context uses a GPU device) ?
Let's consider this other kernel in src/main/opencl/tutorial/TutorialKernels.cl:
__kernel void fill_in_values(__global float* a, __global float* b, int n) 
{
    int i = get_global_id(0);
    if (i >= n)
        return;
    a[i] = cos((float)i);
    b[i] = sin((float)i);
}
With the following changes in the Java code (see JavaCLTutorial3.java):
        // Create OpenCL input and output buffers :
        CLBuffer<Float> 
            a = context.createFloatBuffer(Usage.InputOutput, n), // a and b are now read AND written to
            b = context.createFloatBuffer(Usage.InputOutput, n),
            out = context.createFloatBuffer(Usage.Output, n);
        // Instantiate the auto-generated program wrapper and chain calls to the two kernels :
        TutorialKernels kernels = new TutorialKernels(context);
        CLEvent fillEvt = kernels.fill_in_values(queue, a, b, n, new int[] { n }, null);
        CLEvent addEvt = kernels.add_floats(queue, a, b, out, n, new int[] { n }, null, fillEvt);
        Pointer<Float> outPtr = out.read(queue, addEvt); // blocks until add_floats finished
See how we've chained dependent operations through events (operations are executed asynchronously from the Java program):
- the add_floats kernel execution must wait for the fill_in_values execution to finish
- the out.read operation must wait for the add_floats execution to finish.
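For readers more familiar with Java's own asynchronous tools, the same dependency chain can be sketched with `CompletableFuture`. This is only an analogy running on CPU threads, not JavaCL code: the class name is hypothetical, and the futures named `fillEvt` and `addEvt` merely play the role of the OpenCL events above.

```java
import java.util.concurrent.CompletableFuture;

public class EventChainingSketch {
    static float[] chain(int n) {
        float[] a = new float[n], b = new float[n], out = new float[n];

        // Stage 1 runs asynchronously, like enqueueing fill_in_values.
        CompletableFuture<Void> fillEvt = CompletableFuture.runAsync(() -> {
            for (int i = 0; i < n; i++) {
                a[i] = (float) Math.cos(i);
                b[i] = (float) Math.sin(i);
            }
        });

        // Stage 2 starts only once stage 1 has completed,
        // like passing fillEvt as a dependency of add_floats.
        CompletableFuture<Void> addEvt = fillEvt.thenRun(() -> {
            for (int i = 0; i < n; i++)
                out[i] = a[i] + b[i];
        });

        // Blocks until stage 2 has finished, like out.read(queue, addEvt).
        addEvt.join();
        return out;
    }

    public static void main(String[] args) {
        float[] out = chain(1024);
        for (int i = 0; i < 10; i++)
            System.out.println("out[" + i + "] = " + out[i]);
    }
}
```

In both cases the calling thread only blocks at the very last step; everything before it merely declares what depends on what.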
That's it ! In this short tutorial, you've seen how to :
- set up a simple JavaCL project (from the command line, with Netbeans or with IntelliJ IDEA)
- allocate OpenCL buffers
- call and chain executions of simple OpenCL kernels
- use the JavaCL Generator to get free compile-time checks of OpenCL kernel arguments and less boilerplate code
Want to see more? Have a look at some samples and demos.
Need more info on how to write kernels? Make sure to read the official OpenCL 1.0 Reference Pages.
Something unclear, or got questions? Join the NativeLibs4Java Community's mailing list!