Bunnies on the GPU! by Thibault Imbert

We recently announced AIR 3.2 with Stage3D support for mobile. Some of you have asked to learn more about GPU programming and best practices, and some of you had questions about the BunnyMark test shown in the AIR 3.2 video.

The BunnyMark demo we showed uses a tiny framework developed internally at Adobe (named GPUSprite). It makes use of support classes called GPUSprite/GPUSpriteLayer, which optimize rendering by allowing a large number of sprites to be drawn in a single draw call (batching).

This means all those sprites must be able to sample their image from the same sprite sheet. Performance is really nice. You could extend it to multiple sprite sheets, but you will have to organize your content into layers, with one layer per sprite sheet.
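To make the batching idea concrete, here is a minimal sketch in Python. It is not the actual GPUSprite code (which is ActionScript); the names Sprite and build_batch are made up for illustration. It assumes each sprite is a quad of four vertices, each carrying a position and a UV coordinate into the one shared sprite sheet. All quads are packed into a single vertex list and a single index list, which is what makes a single draw call possible.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sprite:
    # Hypothetical sprite record: center position, size, and the
    # (u0, v0, u1, v1) rectangle it samples from the shared sprite sheet.
    x: float
    y: float
    width: float
    height: float
    uv: Tuple[float, float, float, float]


def build_batch(sprites: List[Sprite]) -> Tuple[List[float], List[int]]:
    """Pack every sprite into one vertex list and one index list."""
    vertices: List[float] = []  # layout per vertex: x, y, u, v
    indices: List[int] = []     # two triangles (six indices) per quad
    for i, s in enumerate(sprites):
        hw, hh = s.width / 2, s.height / 2
        u0, v0, u1, v1 = s.uv
        vertices += [
            s.x - hw, s.y - hh, u0, v0,  # top-left
            s.x + hw, s.y - hh, u1, v0,  # top-right
            s.x + hw, s.y + hh, u1, v1,  # bottom-right
            s.x - hw, s.y + hh, u0, v1,  # bottom-left
        ]
        base = i * 4  # index of this quad's first vertex
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return vertices, indices
```

With buffers like these, the renderer would upload the vertex data, bind the sprite sheet texture once, and issue one draw call covering every quad. Because a draw call binds a single texture in this scheme, every sprite in the batch has to live on the same sheet.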

All the objects on a higher layer will be rendered on top of objects on lower layers. For a scrolling game with background, foreground, characters, effect layers and so on, it is a nice authoring solution. Of course, this has far more restrictions than Starling; it actually has very limited features.
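The layer ordering can be sketched the same way. This is a hedged illustration in Python, not the GPUSpriteLayer implementation: the Layer class, the render function, and the sheet file names are all hypothetical. It assumes layers are stored back to front and that each non-empty layer costs exactly one draw call with its own sprite sheet bound.

```python
from typing import List


class Layer:
    """Hypothetical layer: one sprite sheet plus the sprites that use it."""

    def __init__(self, sheet_name: str) -> None:
        self.sheet_name = sheet_name
        self.sprites: List[str] = []


def render(layers: List[Layer]) -> List[str]:
    """Return the draw calls issued, back to front, one per non-empty layer."""
    calls: List[str] = []
    for layer in layers:  # later layers are drawn on top of earlier ones
        if layer.sprites:
            calls.append(
                f"bind {layer.sheet_name}; draw {len(layer.sprites)} sprites"
            )
    return calls
```

In this sketch the number of draw calls grows with the number of layers, not with the number of sprites, which is why splitting a game into a handful of layers stays cheap.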

So do not consider this a silver bullet. Consider it an example for learning how to program GPUs efficiently, or how to tweak frameworks to achieve higher framerates.

You can download GPUSprite on the GraphicsCorelib GitHub repo.
Download the BunnyMark code (using GPUSprite).

Special thanks to Iain Lobb for the original BunnyMark.
Special thanks to Philippe Elsass for the modified BunnyMark test.

Live demo of GPUSprite below:

Comments (28)

  1. focus wrote::

    Hey, awesome performance, thanks for the opportunity to see its sources and learn more about using Stage3D efficiently! BTW, what does the key “CTResolveURLAgainstRootSWF” do when set to true (in the application XML)? I can’t find its description.

    Tuesday, March 20, 2012 at 11:11 am #
  2. 12 wrote::

    Have you tried using two fixed vertex streams (one filled with redundant quad data plus another with constant indexes) and do instancing by uploading transform matrices for each quad, instead of reuploading the whole vertex buffer every frame ?

    Tuesday, March 20, 2012 at 11:21 am #
  3. Davor wrote::

    Hi Thibault, that’s a great example, but I noticed that the frame rate drops very low (10 frames) when I hold the middle mouse button down to scroll. Any idea how to avoid that?

    Tuesday, March 20, 2012 at 11:29 am #
  4. Philippe wrote::

    Awesome performance, I’m glad AIR 3.2 finally puts Flash in a good position in the mobile landscape.

    I know the license wasn’t restrictive but it isn’t very classy to not mention from which source this demo was actually adapted; line-by-line as far as the background animation is concerned.

    Tuesday, March 20, 2012 at 11:52 am #
  5. Haha, reached the limit, it won’t add any more bunnies.

    16,250 @ 60fps.

    Well done sirs. Well done.

    Tuesday, March 20, 2012 at 2:27 pm #
  6. Paolo wrote::

    Hi Thibault! This is completely insane! Tell me, with FP11.2 / AIR 3.2 / Starling, is the performance of blendModes any better? If not, what would be the best route? PixelBender?

    Thanks, and nice work!

    Tuesday, March 20, 2012 at 4:47 pm #
  7. Thibault Imbert wrote::

    Hi Philippe,

    Sorry about that! I completely forgot about the modified version of the test! This is fixed now, I included credits in the blog post and the source files.

    Thibault

    Tuesday, March 20, 2012 at 4:56 pm #
  8. It’s great to see AIR catching up to NME’s performance.

    When you couldn’t use Stage3D in AIR, tests like this performed at least 9 times slower than NME.

    http://philippe.elsass.me/2011/11/nme-ready-for-the-show/

    NME still seems to be performing a fair bit faster on mobile (like 3000 bunnies on the iPad 1 at 60 FPS, compared to 2500) and it’s nice to be able to blend NME’s drawTiles command with the standard display list, unlike Stage3D.

    However, this is all great for the browser, because we’re all using Flash Player, there :)

    Tuesday, March 20, 2012 at 7:09 pm #
  9. Chris wrote::

    Hi, thanks for these tests. I wondered what results people were achieving on mobile? I am getting about 1200 bunnies @ 60 fps on Galaxy Nexus. Cheers
    Chris

    Wednesday, March 21, 2012 at 3:16 am #
  10. On Nexus One, Stage3D is significantly faster than NME: 2500 sprites vs 1500.

    On Galaxy Nexus, NME has a slight edge, 5500 to 5000.

    Pretty damned close!

    @Joshua – You can still mix display list content on top of Stage3D. It has performance penalties, but depending on the use case those might not apply (dialogs, main menu, leaderboard, etc.)

    Wednesday, March 21, 2012 at 5:42 pm #
  11. Actually, it’s worth pointing out that they are mixing the display list in this demo.

    The FPS Counter is just a regular old TextField.

    Wednesday, March 21, 2012 at 5:46 pm #
  12. Renan Muniz wrote::

    i need more bunnies :D

    Thursday, March 22, 2012 at 1:12 pm #
  13. hayesmaker wrote::

    If this is running on the GPU, why does it kill my CPU?

    Friday, March 23, 2012 at 3:30 pm #
  14. RetroModular wrote::

    @hayesmaker – Stage3D falls back to software rendering if your graphics card drivers aren’t modern enough, and that uses the CPU not the GPU.

    More information here: http://goo.gl/dS6m0

    Friday, March 23, 2012 at 5:53 pm #
  15. Thibault Imbert wrote::

    Hi hayesmaker,

    Yes, RetroModular is right. If Stage3D cannot leverage your graphics card (drivers too old, blacklisted, or GPU acceleration checkbox unchecked in the settings UI), then the CPU is used to emulate the GPU, which is very expensive for a CPU :)

    In 11.2 coming really soon, we relaxed the drivers gating to 2008 instead of 2009, and we will be even more aggressive in the next releases. Soon, most people should not experience software fallback anymore.

    Thibault

    Saturday, March 24, 2012 at 8:19 am #
  16. Franck wrote::

    Ok, so if you love bunnies, i found the perfect introduction for that sequence :
    http://www.etsy.com/listing/94711866/nature-will-show-you-the-way-art-print

    :)

    Saturday, March 24, 2012 at 11:49 am #
  17. Elliot Mebane wrote::

    Great sample. I love the clever little GPUSprite framework. I modified the sample to be pure Stage3D (displays the FPS in a Stage3D layer and no DisplayList overlay). I was surprised to see almost no performance improvement.

    More info and source here:
    http://www.roguish.com/blog/?p=495

    Monday, March 26, 2012 at 9:52 pm #
  18. Caleb wrote::

    This is great!!

    I just wondered, out of curiosity, if you ran this on an iPad 1, what kind of bunny mark would you be looking at?

    I am not familiar with mobile dev at this time so have long wondered.

    Wednesday, March 28, 2012 at 10:23 am #
  19. tomsamson wrote::

    Oh Adobe, will you ever learn before it is too late?

    Always these made up performance demos which are just very unrealistic for 99% of actual usage in a real indepth project..

    And then 2012 and still No hardware acceleration for anything besides video other than stuff made with the new api, so everything one could make inside the flash ide gets no hardware acceleration.
    Using the flash ide is for many designers the only reason to use flash instead of any other code only solution available (and better than flash for code only solutions)
    (the only proper reason to use flash is visually creating stuff in the flash ide).

    Will Adobe ever add proper hardware acceleration for everything flash no matter if made in IDE or using AS1 or 2 or 3?

    Like, you know, Scaleform is doing on consoles for so many years?

    If not, then don’t talk nonsense in the vein of taking anything flash (and gaming) serious.

    Seriously..

    Wednesday, March 28, 2012 at 11:05 am #
  20. NAZ wrote::

    Thanks Thibault!,

    Is this demo meant to work out of the box with iOS? I’m publishing it to my iPad2 and with 700 bunnies I get 10FPS.

    I’m using the merged SDK of AIR3.2 and Flex 4.6. and FDT4.5

    Can anyone shed some light on this? Maybe Joshua, since you got it running with good performance.

    Big thanks!.
    NAZ

    Thursday, March 29, 2012 at 10:23 am #
  21. NAZ wrote::

    Fixed:

    Nevermind, i was compiling as debug from FDT.. now i got the same results as Joshua.

    Thursday, March 29, 2012 at 12:33 pm #
  22. paha wrote::

    @tomsamson

    This issue has been explained by guys at Adobe countless times. GPUs just don’t work the same way as the traditional Flash display list, which has features that would be impossible to do in the same way on a GPU. It would create more issues with old content than add any real speed improvements.

    In short: to take advantage of GPU rendering, just use frameworks running on Stage3D. They’re not so difficult, but of course knowledge of GPU programming will help a lot.

    Friday, March 30, 2012 at 2:50 pm #
  23. jack wrote::

    Yesterday I updated my flash player to version 11.2 from 11.1 and noticed that I can no longer get up to the max of 16,250 bunnies. The framerate starts to dip at around 8000, which is still very impressive!

    The other issue is that when I compile your source using FB4.6 with AIR3.2 and target FP11.2, I cannot get anywhere near what I’m seeing here. I imported the FB project and had to resolve the conflict of having two different playerglobal.swc. I also had to set wmode=direct in the html template. After those changes I was able to publish and the performance starts to drop at around 2750 bunnies. I’m running both swfs in the same browser (Safari) using the same Flash Player (11.2), so why are the results so drastically different?

    Thanks,
    Jack

    Friday, March 30, 2012 at 4:27 pm #
  24. knarF wrote::

    This is just so weird… If I right-click, the animation runs A LOT smoother, going from 3 to 21FPS @ 500 bunnies!!

    Can someone explain that?

    hayesmaker: Same here, I read somewhere that Adobe had some problems with their code so they removed GPU acceleration in some cases in some version of Flash. I think it was in 10.0 but removed in 11 — Not sure which, but I’ve been considering to downgrade my Flash version as this seems to affect my setup — Or I may just wait and hope Adobe figures it out ;)

    Sunday, April 1, 2012 at 10:34 pm #
  25. NAZ wrote::

    Great that you are back! :D

    Performance is great, one little thing… I haven’t managed to find how to change the green color of the background…

    Does anyone know???

    Thanks!.
    NAZ.

    Tuesday, April 24, 2012 at 3:22 pm #
  26. Maliak wrote::

    @tomsamson

    way to be a dickhead when triple AAA game engines are already supporting flash. Keep at it, little wannabe html-is-the-never-reached-future shill. zing.

    Tuesday, May 15, 2012 at 10:36 pm #
  27. James Almeida wrote::

    Whoa!! I hit the bunny limit 16250 bunnies! On a Windows 15 2.8Ghz with (32bit) 3Gig of Ram and an ATI 4550 with 512M of ram!! 60FPS!! This is ridiculous (in an AWESOME WAY)!!!

    Tuesday, June 5, 2012 at 4:34 pm #
  28. James Bachalo wrote::

    Could someone clarify whether this example uses the display list or targets Stage3D? I assume the latter since in looking at the source I see “wmode” : “direct” and the source code seems to indicate this as well https://github.com/graphicscore/graphicscorelib/blob/master/src/com/adobe/example/GPUSprite/GPUSpriteRenderLayer.as What’s needed for comparison is an identical demo coded using the display list, bitmap caching and using render mode =GPU to see if there is any performance difference. Targeting Stage3D either directly as in this example or thru use of a framework like Starling equals greater code complexity and production time. We need more real world examples of when it makes sense to do so!

    Wednesday, February 13, 2013 at 2:40 pm #

Trackbacks/Pingbacks (2)

  1. Chris Churn | Website and application development on Friday, March 23, 2012 at 6:12 pm

    [...] is from http://www.bytearray.org/?p=4074. All credits for this benchmark are on this page. [...]

     
  2. [...] Currently, Backstage2D’s code base is mostly a playground for proof of concept of some API ideas. Some stuff in this post may not match the git repo (for example, I’m still using “layer” instead of “surface”). There’s a bunch left to do, but it is working enough to run a modified version of MoleHill_BunnMark that some folks from Adobe put together (I actually lifted most of my GPU code from that example code, heh). The BunnyMark example was adapted from Iain Lobb’s BunnyMark, with some additions from Phillipe Elsass. You can view the Backstage2D version of BunnyMark here (and check out the original BunnyMark MoleHill here). [...]