When developing Flash content, you can profile the VM or your own code with the profiler bundled in Flash Builder or through the Flash Player configuration file (mm.cfg). These give you low-level details about the VM but do not give you an overview of what is happening behind the scenes, in the rendering steps for instance. In the team, we are thinking about a way to give you more details about the rendering engine.

What is really being called? And how long did it take?

This would allow us to provide you with a rendering profiler, perfect for fine-tuning. It could generate a heat map of the rendering operations to show which internal operations are costing too many CPU cycles. It would also be a perfect way to detect side effects that were not supposed to be there but were eventually introduced during the development of the application: a filter, some nested bitmap caching, a blend mode, or even a transparent bitmap used where it should not be.
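To make the heat map idea concrete, here is a minimal sketch in Python of how sampled rendering operations could be turned into a ranked text heat map. It is only an illustration: the operation names and sample counts are invented, and `heat_map` is a hypothetical helper, not something Flash Player or any profiler provides.

```python
from collections import Counter

# Hypothetical data: one entry per profiler sample, naming the rendering
# operation that was running when the sample was taken.
SAMPLES = (
    ["filter"] * 45
    + ["blend_mode"] * 25
    + ["bitmap_cache"] * 20
    + ["vector_raster"] * 10
)


def heat_map(samples, width=40):
    """Return (operation, share, bar) rows sorted from hottest to coolest."""
    counts = Counter(samples)
    total = sum(counts.values())
    rows = []
    for name, count in counts.most_common():
        share = count / total              # fraction of all samples
        bar = "#" * round(share * width)   # longer bar = hotter operation
        rows.append((name, share, bar))
    return rows


for name, share, bar in heat_map(SAMPLES):
    print(f"{name:<14} {share:6.1%} {bar}")
```

With a view like this, an operation that was never meant to be there, such as a stray filter, would stand out at the top of the list.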

To get this information you can use Shark on Mac OS or VTune on Windows and Linux. The idea is simple: you run it, start collecting samples, and start your SWF. Once you are done, you stop collecting and it gives you an output. Without a Flash Player build that includes debug symbols you will not see anything relevant, just symbol addresses and nothing human-readable, as in the following image:

No Debug Symbols

You can already see that some processes are costing more time than others. With the debug symbols (included in a special version of the debugging player), you can see all the internals. So let's say I use a BitmapData object to paint pixels in my application. The following output clearly shows that almost 18% of the processing time was taken by BitmapData.setPixel32, which internally calls the AddDirtyRect API from SurfaceImage and the internal core BitmapDataObject class and its setPixel32 API:

BitmapData.setPixel
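To give an intuition for why a per-pixel write can show up in a profile together with a call like AddDirtyRect, here is a generic sketch of dirty-rectangle bookkeeping in Python. This is an assumption-laden illustration of the general technique, not Flash Player's actual implementation: `DirtyRegion` and its `add` method are hypothetical names.

```python
class DirtyRegion:
    """Tracks the bounding box of everything changed since the last redraw.

    Hypothetical illustration of dirty-rectangle tracking; it does not
    reproduce Flash Player internals.
    """

    def __init__(self):
        self.rect = None   # (left, top, right, bottom), right/bottom exclusive
        self.updates = 0   # number of bookkeeping calls made

    def add(self, x, y, w=1, h=1):
        """Grow the dirty bounding box to include the given rectangle."""
        self.updates += 1
        if self.rect is None:
            self.rect = (x, y, x + w, y + h)
        else:
            left, top, right, bottom = self.rect
            self.rect = (
                min(left, x),
                min(top, y),
                max(right, x + w),
                max(bottom, y + h),
            )


# Painting a 100-pixel row one pixel at a time: 100 bookkeeping calls.
per_pixel = DirtyRegion()
for x in range(100):
    per_pixel.add(x, 10)

# Reporting the same row as one rectangle: a single bookkeeping call.
batched = DirtyRegion()
batched.add(0, 10, w=100, h=1)

print(per_pixel.rect, per_pixel.updates)
print(batched.rect, batched.updates)
```

Both approaches end with the same dirty area, but the per-pixel loop pays the bookkeeping cost on every write. In ActionScript, BitmapData.lock() and unlock() exist to defer updates while many pixels change; whether they remove this particular cost would have to be confirmed with a profile like the one above.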

By the way, you can see that the AVM spends some time leveraging SSE2 instructions through the AVMCore class. In the same way, let's say you have to audit a SWF to see whether some bad practices are used in it. In the following SWF, we see PixelBlitThread, which is the API used for bitmap caching, so there is a good chance you would want to look at the code of this SWF to check that bitmap caching is used properly and that memory usage is OK:

Bitmap Caching
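This kind of audit could be partly automated. The sketch below, in Python, scans a list of (symbol, share) pairs for names on a watch list and returns a hint about what to review. Everything in it is an assumption made for illustration: the watch list, the `audit` helper, the 5% threshold, and the shares in `PROFILE` are invented, and a real Shark or VTune export would need its own parsing.

```python
# Hypothetical watch list: substring of a symbol name -> what to review.
WATCH_LIST = {
    "PixelBlitThread": "bitmap caching: check cacheAsBitmap usage and memory",
    "setPixel32": "per-pixel BitmapData writes: consider batching",
}


def audit(profile, min_share=0.05):
    """Return (symbol, share, hint) for watched symbols above min_share.

    profile is a list of (symbol, share) pairs, with share between 0 and 1.
    """
    findings = []
    for symbol, share in profile:
        if share < min_share:
            continue  # too small to be worth a review
        for needle, hint in WATCH_LIST.items():
            if needle in symbol:
                findings.append((symbol, share, hint))
    return findings


# Invented numbers, for illustration only.
PROFILE = [
    ("PixelBlitThread", 0.22),
    ("BitmapDataObject::setPixel32", 0.18),
    ("SomeOtherSymbol", 0.03),
]

for symbol, share, hint in audit(PROFILE):
    print(f"{share:5.0%}  {symbol}: {hint}")
```

A flagged symbol is not proof of a problem, only a pointer to the part of the SWF's code worth reading first.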

A pure jewel for advanced optimizations. We are thinking about ways to make this available (not in this raw technical format) to every Flash developer. Do not hesitate to share ideas.