| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 1 | page.title=Renderscript Computation |
| 2 | parent.title=Computation |
| 3 | parent.link=index.html |
| 4 | |
| 5 | @jd:body |
| 6 | |
| 7 | <div id="qv-wrapper"> |
| 8 | <div id="qv"> |
| 9 | <h2>In this document</h2> |
| 10 | |
| 11 | <ol> |
| 12 | <li><a href="#overview">Renderscript System Overview</a></li> |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 13 | <li><a href="#filterscript">Filterscript</a></li> |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 14 | <li> |
| 15 | <a href="#creating-renderscript">Creating a Computation Renderscript</a> |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 16 | <ol> |
| 17 | <li><a href="#creating-rs-file">Creating the Renderscript file</a></li> |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 18 | <li><a href="#calling">Calling the Renderscript code</a></li> |
| 19 | </ol> |
| 20 | </li> |
| 21 | </ol> |
| 22 | |
| 23 | <h2>Related Samples</h2> |
| 24 | |
| 25 | <ol> |
| 26 | <li><a href="{@docRoot}resources/samples/RenderScript/HelloCompute/index.html">Hello |
| 27 | Compute</a></li> |
| 28 | </ol> |
| 29 | </div> |
| 30 | </div> |
| 31 | |
| 32 | <p>Renderscript offers a high performance computation API at the native |
| 33 | level that you write in C (C99 standard). Renderscript gives your apps the ability to run |
| 34 | operations with automatic parallelization across all available processor cores. |
| 35 | It also supports different types of processors such as the CPU, GPU or DSP. Renderscript |
| 36 | is useful for apps that do image processing, mathematical modeling, or any operations |
| 37 | that require lots of mathematical computation.</p> |
| 38 | |
| 39 | <p>In addition, you have access to all of these features without having to write code to |
| 40 | support different architectures or a different number of processing cores. You also |
| 41 | do not need to recompile your application for different processor types, because Renderscript |
| 42 | code is compiled on the device at runtime.</p> |
| 43 | |
| 44 | <p class="note"><strong>Deprecation Notice</strong>: Earlier versions of Renderscript included |
| 45 | an experimental graphics engine component. This component |
| 46 | is now deprecated as of Android 4.1 (most of the APIs in <code>rs_graphics.rsh</code> |
| 47 | and the corresponding APIs in {@link android.renderscript}). |
| 48 | If you have apps that render graphics with Renderscript, we highly |
| 49 | recommend you convert your code to another Android graphics rendering option.</p> |
| 50 | |
| 51 | <h2 id="overview">Renderscript System Overview</h2> |
| 52 | <p>The Renderscript runtime operates at the native level and still needs to communicate |
| 53 | with the Android VM, so the way a Renderscript application is set up is different from a pure VM |
| 54 | application. An application that uses Renderscript is still a traditional Android application that |
| 55 | runs in the VM, but you write Renderscript code for the parts of your program that require |
| 56 | it. No matter what you use it for, Renderscript remains platform |
| 57 | independent, so you do not have to target multiple architectures (for example, |
| 58 | ARM v5, ARM v7, x86).</p> |
| 59 | |
| 60 | <p>The Renderscript system adopts a control and slave architecture where the low-level Renderscript runtime |
| 61 | code is controlled by the higher level Android system that runs in a virtual machine (VM). The |
| 62 | Android VM still retains all control of memory management and binds memory that it allocates to |
| 63 | the Renderscript runtime, so the Renderscript code can access it. The Android framework makes |
| 64 | asynchronous calls to Renderscript, and the calls are placed in a message queue and processed |
| 65 | as soon as possible. Figure 1 shows how the Renderscript system is structured.</p> |
| 66 | |
| 67 | <img id="figure1" src="{@docRoot}images/rs_overview.png" /> |
| 68 | <p class="img-caption"><strong>Figure 1.</strong> Renderscript system overview</p> |
| 69 | |
| 70 | <p>When using Renderscript, there are three layers of APIs that enable communication between the |
| 71 | Renderscript runtime and Android framework code:</p> |
| 72 | |
| 73 | <ul> |
| 74 | <li>The Renderscript runtime APIs allow you to do the computation |
| 75 | that is required by your application.</li> |
| 76 | |
| 77 | <li>The reflected layer APIs are a set of classes that are reflected from your Renderscript |
| 78 | runtime code. This layer is a wrapper around the Renderscript code that allows the Android |
| 79 | framework to interact with the Renderscript runtime. The Android build tools automatically generate the |
| 80 | classes for this layer during the build process. These classes eliminate the need to write the JNI glue |
| 81 | code that the NDK requires.</li> |
| 82 | |
| 83 | <li>The Android framework layer calls the reflected layer to access the Renderscript |
| 84 | runtime.</li> |
| 85 | </ul> |
| 86 | |
| 87 | <p>Because of the way Renderscript is structured, the main advantages are:</p> |
| 88 | <ul> |
| 89 | <li>Portability: Renderscript is designed to run on many types of devices with different |
| 90 | processor (CPU, GPU, and DSP for instance) architectures. It supports all of these architectures without |
| 91 | having to target each device, because the code is compiled and cached on the device |
| 92 | at runtime.</li> |
| 93 | |
| 94 | <li>Performance: Renderscript provides a high performance computation API with seamless parallelization |
| 95 | across all of the cores available on the device.</li> |
| 96 | |
| 97 | <li>Usability: Renderscript simplifies development when possible, such as eliminating JNI glue code.</li> |
| 98 | </ul> |
| 99 | |
| 100 | <p>The main disadvantages are:</p> |
| 101 | |
| 102 | <ul> |
| 103 | <li>Development complexity: Renderscript introduces a new set of APIs that you have to learn.</li> |
| 104 | |
| 105 | <li>Debugging visibility: Renderscript can potentially execute (planned feature for later releases) |
| 106 | on processors other than the main CPU (such as the GPU), so if this occurs, debugging becomes more difficult. |
| 107 | </li> |
| 108 | </ul> |
| 109 | |
| 110 | <p>For a more detailed explanation of how all of these layers work together, see |
| 111 | <a href="{@docRoot}guide/topics/renderscript/advanced.html">Advanced Renderscript</a>.</p> |
| 112 | |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 113 | <h2 id="filterscript">Filterscript</h2> |
| 114 | |
| 115 | <p>Introduced in Android 4.2 (API Level 17), Filterscript defines a subset of Renderscript |
| 116 | that focuses on image processing operations, such as those |
| 117 | that you would typically write with an OpenGL ES fragment shader. You still write your scripts |
| 118 | using the standard Renderscript runtime APIs, but within stricter |
| 119 | constraints that ensure wider compatibility and improved optimization across |
| 120 | CPUs, GPUs, and DSPs. At compile time, the precompiler evaluates Filterscript files and |
| 121 | applies a more stringent set of warnings and errors than |
| 122 | it does for standard Renderscript files. The following list describes the major constraints |
| 123 | of Filterscript when compared to Renderscript:</p> |
| 124 | |
| 125 | <ul> |
| 126 | <li>Inputs and return values of root functions cannot contain pointers. The default root function |
| 127 | signature contains pointers, so you must use the <code>__attribute__((kernel))</code> attribute to declare a custom |
| 128 | root function when using Filterscript.</li> |
| 129 | <li>Built-in types cannot exceed 32 bits.</li> |
| 130 | <li>Filterscript must always use relaxed floating point precision by using the |
| 131 | <code>rs_fp_relaxed</code> pragma.</li> |
| 132 | <li>Filterscript files must end with an <code>.fs</code> extension, instead of an <code>.rs</code> extension.</li> |
| 133 | </ul> |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 134 | |
| 135 | <h2 id="creating-renderscript">Creating a Renderscript</h2> |
| 136 | |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 137 | <p>Renderscript scales to the number of |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 138 | processing cores available on the device. This is enabled through a function named |
| 139 | <code>rsForEach()</code> (or the <code>forEach_root()</code> method at the Android framework level) |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 140 | that automatically partitions work across available processing cores on the device.</p> |
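<p>To make the idea of partitioning work across cores concrete, the following plain C sketch
splits a range of elements into near-equal chunks, one per worker. It is only an illustration:
this document does not describe how the Renderscript runtime actually schedules work, and the
names <code>chunk_t</code> and <code>chunk_for_worker()</code> are hypothetical, not part of any
Renderscript API.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [begin, end) of element indices assigned to one worker.
 * Hypothetical type for illustration only. */
typedef struct {
    size_t begin;
    size_t end;
} chunk_t;

/* Split `count` elements as evenly as possible across `workers` workers and
 * return the range for worker `index`. The first (count % workers) workers
 * each receive one extra element, so no two chunks differ by more than one. */
chunk_t chunk_for_worker(size_t count, size_t workers, size_t index) {
    assert(workers > 0 && index < workers);
    size_t base = count / workers;   /* elements every worker gets */
    size_t extra = count % workers;  /* leftover elements to hand out */
    chunk_t c;
    c.begin = index * base + (index < extra ? index : extra);
    c.end = c.begin + base + (index < extra ? 1 : 0);
    return c;
}
```

<p>With 10 elements and 4 workers, this sketch assigns the ranges [0, 3), [3, 6), [6, 8), and
[8, 10). A real runtime may choose chunk sizes very differently, for example to balance load
dynamically.</p>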
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 141 | |
| 142 | <p>Implementing a Renderscript involves creating a <code>.rs</code> file that contains |
| 143 | your Renderscript code and calling it at the Android framework level with the |
| 144 | <code>forEach_root()</code> method or at the Renderscript runtime level with the |
| 145 | <code>rsForEach()</code> function. The following diagram describes how a typical |
| 146 | Renderscript is set up:</p><img src="{@docRoot}images/rs_compute.png"> |
| 147 | |
| 148 | <p class="img-caption"><strong>Figure 2.</strong> Renderscript overview</p> |
| 149 | |
| 150 | <p>The following sections describe how to create a simple Renderscript and use it in an |
| 151 | Android application. This example uses the <a href= |
| 152 | "{@docRoot}resources/samples/RenderScript/HelloCompute/index.html">HelloCompute Renderscript |
| 153 | sample</a> that is provided in the SDK as a guide (some code has been modified from its original |
| 154 | form for simplicity).</p> |
| 155 | |
| 156 | <h3 id="creating-rs-file">Creating the Renderscript file</h3> |
| 157 | |
| 158 | <p>Your Renderscript code resides in <code>.rs</code> and <code>.rsh</code> files in the |
| 159 | <code>&lt;project_root&gt;/src/</code> directory. This code contains the computation logic |
| 160 | and declares all necessary variables and pointers. |
| 161 | Every <code>.rs</code> file generally contains the following items:</p> |
| 162 | |
| 163 | <ul> |
| 164 | <li>A pragma declaration (<code>#pragma rs java_package_name(<em>package.name</em>)</code>) |
| 165 | that declares the package name of the <code>.java</code> reflection of this Renderscript.</li> |
| 166 | |
| 167 | <li>A pragma declaration (<code>#pragma version(1)</code>) that declares the version of |
| 168 | Renderscript that you are using (1 is the only value for now).</li> |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 169 | |
| 170 | <li><p>A root function (or kernel) that is the main entry point to your Renderscript. |
| 171 | The default <code>root()</code> function must return |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 172 | <code>void</code> and accept the following arguments:</p> |
| 173 | |
| 174 | <ul> |
| 175 | <li>Pointers to memory allocations that are used for the input and output of the |
| 176 | Renderscript. Both of these pointers are required for Android 3.2 (API level 13) platform |
| 177 | versions or older. Android 4.0 (API level 14) and later requires one or both of these |
| 178 | allocations.</li> |
| 179 | </ul> |
| 180 | |
| 181 | <p>The following arguments are optional, but both must be supplied if you choose to use |
| 182 | them:</p> |
| 183 | |
| 184 | <ul> |
| 185 | <li>A pointer for user-defined data that the Renderscript might need to carry out |
| 186 | computations in addition to the necessary allocations. This can be a pointer to a simple |
| 187 | primitive or a more complex struct.</li> |
| 188 | |
| 189 | <li>The size of the user-defined data.</li> |
| 190 | </ul> |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 191 | |
| 192 | <p>Starting in Android 4.1 (API Level 16), you can choose to define your own root function arguments |
| 193 | without adhering to the default root function signature described previously. In addition, |
| 194 | you can declare multiple root functions in the same Renderscript. To do this, use the <code>__attribute__((kernel))</code> |
| 195 | attribute to define a custom root function. For example, here's a root function |
| 196 | that returns a <code>uchar4</code> and accepts two <code>uint32_t</code> types: </p> |
| 197 | |
| 198 | <pre> |
| 199 | uchar4 __attribute__((kernel)) root(uint32_t x, uint32_t y) { |
| 200 | ... |
| 201 | } |
| 202 | </pre> |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 203 | </li> |
| 204 | |
| 205 | <li>An optional <code>init()</code> function. This allows you to do any initialization |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 206 | before the root function runs, such as initializing variables. This |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 207 | function runs once and is called automatically when the Renderscript starts, before anything |
| 208 | else in your Renderscript.</li> |
| 209 | |
| 210 | <li>Any variables, pointers, and structures that you wish to use in your Renderscript code (these can |
| 211 | be declared in <code>.rsh</code> files if desired).</li> |
| 212 | </ul> |
| 213 | |
| 214 | <p>The following code shows how the <a href= |
| 215 | "{@docRoot}resources/samples/RenderScript/HelloCompute/src/com/example/android/rs/hellocompute/mono.html"> |
| 216 | mono.rs</a> file is implemented:</p> |
| 217 | <pre> |
| 218 | #pragma version(1) |
| 219 | #pragma rs java_package_name(com.example.android.rs.hellocompute) |
| 220 | |
| 221 | //multipliers to convert an RGB color to black and white |
| 222 | const static float3 gMonoMult = {0.299f, 0.587f, 0.114f}; |
| 223 | |
| 224 | void root(const uchar4 *v_in, uchar4 *v_out) { |
| 225 | //unpack a color to a float4 |
| 226 | float4 f4 = rsUnpackColor8888(*v_in); |
| 227 | //take the dot product of the color and the multiplier |
| 228 | float3 mono = dot(f4.rgb, gMonoMult); |
| 229 | //repack the float to a color |
| 230 | *v_out = rsPackColorTo8888(mono); |
| 231 | } |
| 232 | </pre> |
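<p>For readers who want to see the per-pixel arithmetic without the Renderscript runtime, here is
a plain C sketch of the same unpack, dot product, and repack steps. It is not the sample's code:
<code>pixel_t</code> and <code>mono_pixel()</code> are hypothetical names, and the rounding and
the fully opaque alpha value are assumptions of this sketch rather than documented behavior of
<code>rsPackColorTo8888()</code>.</p>

```c
#include <stdint.h>

/* One RGBA pixel with 8 bits per channel (hypothetical stand-in for uchar4). */
typedef struct {
    uint8_t r, g, b, a;
} pixel_t;

/* Same weights as gMonoMult in mono.rs. */
static const float MONO_MULT[3] = {0.299f, 0.587f, 0.114f};

/* Convert one pixel to gray using a weighted sum of its color channels. */
pixel_t mono_pixel(pixel_t in) {
    /* Unpack each channel to a float in the range [0, 1]. */
    float r = in.r / 255.0f;
    float g = in.g / 255.0f;
    float b = in.b / 255.0f;

    /* Dot product of the color and the multipliers. */
    float mono = r * MONO_MULT[0] + g * MONO_MULT[1] + b * MONO_MULT[2];

    /* Clamp, then scale back to [0, 255] with round-to-nearest
     * (the rounding mode is an assumption of this sketch). */
    if (mono < 0.0f) mono = 0.0f;
    if (mono > 1.0f) mono = 1.0f;
    uint8_t v = (uint8_t)(mono * 255.0f + 0.5f);

    /* Write the gray value to all color channels; alpha is set to opaque. */
    pixel_t out = {v, v, v, 255};
    return out;
}
```

<p>In the Renderscript version, the runtime calls <code>root()</code> once per element of the
allocation, so no explicit loop over pixels appears in the script.</p>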
| 233 | |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 234 | <h4>Setting floating point precision</h4> |
| 235 | <p>You can define the floating point precision required by your compute algorithms. This is useful if you |
| 236 | require less precision than the IEEE 754-2008 standard (used by default). You can define |
| 237 | the floating-point precision level of your script with the following pragmas:</p> |
| 238 | |
| 239 | <ul> |
| 240 | <li><code>#pragma rs_fp_full</code> (default if nothing is specified): For apps that |
| 241 | require floating point precision as outlined by the IEEE 754-2008 standard. |
| 242 | </li> |
| 243 | <li><code>#pragma rs_fp_relaxed</code>: For apps that don’t require |
| 244 | strict IEEE 754-2008 compliance and can tolerate less precision. This mode enables |
| 245 | flush-to-zero for denorms and round-towards-zero. |
| 246 | </li> |
| 247 | <li><code>#pragma rs_fp_imprecise</code>: For apps that don’t have stringent precision requirements. This mode enables |
| 248 | everything in <code>rs_fp_relaxed</code> along with the following: |
| 249 | <ul> |
| 250 | <li>Operations resulting in -0.0 can return +0.0 instead.</li> |
| 251 | <li>Operations on INF and NAN are undefined.</li> |
| 252 | </ul> |
| 253 | </li> |
| 254 | </ul> |
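<p>To illustrate what flush-to-zero for denorms means in practice, the plain C sketch below
emulates it in software. This is only a model of the effect: in a real script the behavior comes
from the precision pragma and the underlying hardware, and <code>flush_denorm()</code> is a
hypothetical helper, not a Renderscript function. Round-towards-zero is not modeled here.</p>

```c
#include <float.h>

/* Emulate flush-to-zero: any nonzero value whose magnitude is smaller than
 * the smallest normal float (FLT_MIN) is a denormal and is replaced by a
 * zero of the same sign. All other values pass through unchanged. */
float flush_denorm(float x) {
    float magnitude = x < 0.0f ? -x : x;
    if (magnitude != 0.0f && magnitude < FLT_MIN) {
        return x < 0.0f ? -0.0f : 0.0f;
    }
    return x;
}
```

<p>Under this model, a value such as <code>FLT_MIN / 2.0f</code> becomes zero, while
<code>FLT_MIN</code> itself and larger values are unaffected. Algorithms that depend on very
small intermediate values are the ones most likely to notice the difference.</p>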
| 255 | |
| 256 | <h4>Script intrinsics</h4> |
| 257 | <p>Renderscript adds support for a set of script intrinsics, which are pre-implemented |
| 258 | filtering primitives that reduce the amount of |
| 259 | code that you need to write. They also are implemented to ensure that your app gets the |
| 260 | maximum performance gain possible.</p> |
| 261 | |
| 262 | <p> |
| 263 | Intrinsics are available for the following:</p> |
| 264 | <ul> |
| 265 | <li>{@link android.renderscript.ScriptIntrinsicBlend Blends}</li> |
| 266 | <li>{@link android.renderscript.ScriptIntrinsicBlur Blur}</li> |
| 267 | <li>{@link android.renderscript.ScriptIntrinsicColorMatrix Color matrix}</li> |
| 268 | <li>{@link android.renderscript.ScriptIntrinsicConvolve3x3 3x3 convolve}</li> |
| 269 | <li>{@link android.renderscript.ScriptIntrinsicConvolve5x5 5x5 convolve}</li> |
| 270 | <li>{@link android.renderscript.ScriptIntrinsicLUT Per-channel lookup table}</li> |
| 271 | <li>{@link android.renderscript.ScriptIntrinsicYuvToRGB Converting an Android YUV buffer to RGB}</li> |
| 272 | </ul> |
| 273 | |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 274 | <h3 id="calling">Calling the Renderscript code</h3> |
| 275 | |
| 276 | <p>You can call the Renderscript from your Android framework code by |
| 277 | instantiating the generated <code>ScriptC_<em>script_name</em></code> |
| 278 | class. This class contains a method, <code>forEach_root()</code>, that lets you invoke |
| 279 | <code>rsForEach</code>. You give it the same parameters that you would if you were invoking it |
| 280 | at the Renderscript runtime level. This technique allows your Android application to offload |
| 281 | intensive mathematical calculations to Renderscript. See the <a href= |
| 282 | "{@docRoot}resources/samples/RenderScript/HelloCompute/index.html">HelloCompute</a> sample to see |
| 283 | how a simple Android application can utilize Renderscript.</p> |
| 284 | |
| 285 | <p>To call Renderscript at the Android framework level:</p> |
| 286 | |
| 287 | <ol> |
| 288 | <li>Allocate memory that is needed by the Renderscript in your Android framework code. |
| 289 | You need an input and output {@link android.renderscript.Allocation} for Android 3.2 (API level |
| 290 | 13) platform versions and older. Android 4.0 (API level 14) and later require |
| 291 | one or both {@link android.renderscript.Allocation}s.</li> |
| 292 | |
| 293 | <li>Create an instance of the <code>ScriptC_<em>script_name</em></code> class.</li> |
| 294 | |
| 295 | <li>Call <code>forEach_root()</code>, passing in the allocations |
| 296 | and any optional user-defined data. The output allocation will contain the output |
| 297 | of the Renderscript.</li> |
| 298 | </ol> |
| 299 | |
| 300 | <p>The following example, taken from the <a href= |
| 301 | "{@docRoot}resources/samples/RenderScript/HelloCompute/index.html">HelloCompute</a> sample, processes |
| 302 | a bitmap and outputs a black and white version of it. The |
| 303 | <code>createScript()</code> method carries out the steps described previously. This method calls the |
| 304 | Renderscript, <code>mono.rs</code>, passing in memory allocations that store the bitmap to be processed |
| 305 | as well as the eventual output bitmap. It then displays the processed bitmap onto the screen:</p> |
| 306 | <pre> |
| 307 | package com.example.android.rs.hellocompute; |
| 308 | |
| 309 | import android.app.Activity; |
| 310 | import android.os.Bundle; |
| 311 | import android.graphics.BitmapFactory; |
| 312 | import android.graphics.Bitmap; |
| 313 | import android.renderscript.RenderScript; |
| 314 | import android.renderscript.Allocation; |
| 315 | import android.widget.ImageView; |
| 316 | |
| 317 | public class HelloCompute extends Activity { |
| 318 | private Bitmap mBitmapIn; |
| 319 | private Bitmap mBitmapOut; |
| 320 | |
| 321 | private RenderScript mRS; |
| 322 | private Allocation mInAllocation; |
| 323 | private Allocation mOutAllocation; |
| 324 | private ScriptC_mono mScript; |
| 325 | |
| 326 | @Override |
| 327 | protected void onCreate(Bundle savedInstanceState) { |
| 328 | super.onCreate(savedInstanceState); |
| 329 | setContentView(R.layout.main); |
| 330 | |
| 331 | mBitmapIn = loadBitmap(R.drawable.data); |
| 332 | mBitmapOut = Bitmap.createBitmap(mBitmapIn.getWidth(), mBitmapIn.getHeight(), |
| 333 | mBitmapIn.getConfig()); |
| 334 | |
| 335 | ImageView in = (ImageView) findViewById(R.id.displayin); |
| 336 | in.setImageBitmap(mBitmapIn); |
| 337 | |
| 338 | ImageView out = (ImageView) findViewById(R.id.displayout); |
| 339 | out.setImageBitmap(mBitmapOut); |
| 340 | |
| 341 | createScript(); |
| 342 | } |
| 343 | private void createScript() { |
| 344 | mRS = RenderScript.create(this); |
| 345 | mInAllocation = Allocation.createFromBitmap(mRS, mBitmapIn, |
| 346 | Allocation.MipmapControl.MIPMAP_NONE, |
| 347 | Allocation.USAGE_SCRIPT); |
| 348 | mOutAllocation = Allocation.createTyped(mRS, mInAllocation.getType()); |
| 349 | mScript = new ScriptC_mono(mRS, getResources(), R.raw.mono); |
| 350 | mScript.forEach_root(mInAllocation, mOutAllocation); |
| 351 | mOutAllocation.copyTo(mBitmapOut); |
| 352 | } |
| 353 | |
| 354 | private Bitmap loadBitmap(int resource) { |
| 355 | final BitmapFactory.Options options = new BitmapFactory.Options(); |
| 356 | options.inPreferredConfig = Bitmap.Config.ARGB_8888; |
| 357 | return BitmapFactory.decodeResource(getResources(), resource, options); |
| 358 | } |
| 359 | } |
| 360 | </pre> |
| 361 | |
| 362 | <p>To call Renderscript from another Renderscript file:</p> |
| 363 | <ol> |
| 364 | <li>Allocate memory that is needed by the Renderscript in your Android framework code. |
| 365 | You need an input and output {@link android.renderscript.Allocation} for Android 3.2 (API level |
| 366 | 13) platform versions and older. Android 4.0 (API level 14) and later require |
| 367 | one or both {@link android.renderscript.Allocation}s.</li> |
| 368 | |
| 369 | <li>Call <code>rsForEach()</code>, passing in the allocations and any optional user-defined data. |
| 370 | The output allocation will contain the output of the Renderscript.</li> |
| 371 | </ol> |
| 372 | |
| 373 | <pre> |
| 374 | rs_script script; |
| 375 | rs_allocation in_allocation; |
| 376 | rs_allocation out_allocation; |
| 377 | UserData_t data; |
| 378 | ... |
| 379 | rsForEach(script, in_allocation, out_allocation, &data, sizeof(data)); |
| 380 | </pre> |
| 381 | |
| 382 | <p>In this example, assume that the script and memory allocations have already been |
| 383 | allocated and bound at the Android framework level and that <code>UserData_t</code> is a struct |
| 384 | declared previously. Passing a pointer to a struct and the size of the struct to <code>rsForEach</code> |
| 385 | is optional, but useful if your Renderscript requires additional information other than |
| 386 | the necessary memory allocations.</p> |
| 387 | |
| Robert Ly | 864090e | 2012-06-17 18:22:17 -0700 | [diff] [blame] | 388 | |
| Robert Ly | 3419e07 | 2012-11-12 19:42:52 -0800 | [diff] [blame] | 389 | <h4>Script groups</h4> |
| 390 | |
| 391 | <p>You can group Renderscript scripts together and execute them all with a single call as though |
| 392 | they were part of a single script. This allows Renderscript to optimize execution of the scripts |
| 393 | in ways that it could not do if the scripts were executed individually.</p> |
| 394 | |
| 395 | <p>To build a script group, use the {@link android.renderscript.ScriptGroup.Builder} class to create a {@link android.renderscript.ScriptGroup} |
| 396 | defining the operations. At execution time, Renderscript optimizes the run order and the connections between these |
| 397 | operations for best performance.</p> |
| 398 | |
| 399 | <p class="note"><strong>Important:</strong> The script group must be a directed acyclic graph for this feature to work.</p> |
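<p>If you want to check a planned set of script connections before building the group, one
option is a standard cycle check such as Kahn's algorithm. The plain C sketch below is an
illustration under stated assumptions: scripts are represented as integer node indices, each
connection as a directed edge, and <code>edge_t</code>, <code>MAX_NODES</code>, and
<code>is_acyclic()</code> are hypothetical names. It does not reflect how
{@link android.renderscript.ScriptGroup.Builder} validates a group internally.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 32 /* arbitrary upper bound chosen for this sketch */

/* A directed connection from one script (node) to another. */
typedef struct {
    int from;
    int to;
} edge_t;

/* Return true when the graph has no directed cycle (Kahn's algorithm):
 * repeatedly remove nodes that have no incoming edges. If every node can be
 * removed this way, the graph is a directed acyclic graph. */
bool is_acyclic(int node_count, const edge_t *edges, size_t edge_count) {
    assert(node_count >= 0 && node_count <= MAX_NODES);
    int indegree[MAX_NODES] = {0};
    bool removed[MAX_NODES] = {false};

    /* Count incoming edges for every node. */
    for (size_t i = 0; i < edge_count; i++) {
        indegree[edges[i].to]++;
    }

    int removed_count = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (int n = 0; n < node_count; n++) {
            if (!removed[n] && indegree[n] == 0) {
                removed[n] = true;
                removed_count++;
                progress = true;
                /* Removing n also removes its outgoing edges. */
                for (size_t i = 0; i < edge_count; i++) {
                    if (edges[i].from == n) {
                        indegree[edges[i].to]--;
                    }
                }
            }
        }
    }
    /* Nodes left over are part of, or downstream of, a cycle. */
    return removed_count == node_count;
}
```

<p>A chain of three scripts (0 to 1, 1 to 2) passes this check, while adding a connection from
2 back to 0 makes it fail.</p>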