When Flash Player 10 was released, you probably heard about the new Vector class. If you are not already using it in your FP10 projects, you definitely should. Some guys at DotEmu confessed to me that their whole emulation engine (ported from Java to AS3), which can emulate any kind of 8-bit and 16-bit console, was not fast enough to be used in production in Flash Player 9.
When Flash Player 10 was released, they switched from Arrays to Vectors and found that their engine ran much faster, thanks to the Vector class and methods like BitmapData.setVector().
A few days later I realised that the JPEG encoding class in the corelib package used a lot of Arrays and could benefit from the Vector class, and I was pretty sure some tricky optimizations could be done as well. So I modified its code to make it faster, and I was happy to see that the new version was around 2.5 times faster (on my machine). With the later updates, it is now more than 4 times faster.
Here is a little demo (encoding a 2880*2880 BitmapData) showing the difference between the old version and the new one. If you could post your results in the comments, that would be interesting:
So what did I do ?
- I used bitwise operators as much as possible.
- I replaced all Arrays with fixed length Vectors.
- I used pre-incrementation rather than post-incrementation (thanks Joa for this one).
- I cast almost all my Vector index accesses to int.
- Other minor stuff you always do to optimize your code
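To make the list above more concrete, here is a rough sketch of what these tricks look like in AS3. It is illustrative only and not taken from the optimized encoder: the class name `VectorTricks` and the members `table`, `fillTable`, `lookup` and `packRGB` are made up for this example.

```actionscript
package
{
    // Illustrative sketch only; none of these names come from the encoder.
    public class VectorTricks
    {
        // Fixed-length Vector: 64 entries, length locked (second argument = fixed).
        private var table:Vector.<int> = new Vector.<int>(64, true);

        public function fillTable():void
        {
            var len:int = table.length;
            // Pre-increment (++i) in the loop header instead of i++.
            for (var i:int = 0; i < len; ++i)
            {
                // Bitwise shift instead of multiplying by 8.
                table[i] = i << 3;
            }
        }

        public function lookup(row:int, col:int):int
        {
            // Explicit int() cast around the index arithmetic.
            return table[int((row << 3) + col)];
        }

        public function packRGB(r:int, g:int, b:int):uint
        {
            // Shifts and ORs to combine three 8-bit channels into one value.
            return (r << 16) | (g << 8) | b;
        }
    }
}
```

How much each trick gains depends on the player version and platform, so measure before and after in your own project.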
In theory, Pixel Bender could make the RGB to YUV conversion faster; unfortunately, as you may know, loops are forbidden in Pixel Bender kernels running in the Flash Player. I tried pre-processing the RGB to YUV conversion with a simple ColorMatrixFilter and with Pixel Bender, but neither brought performance improvements. I am sure the encoder can still be optimized, for instance by porting it to haXe. If you are using an asynchronous version of the original encoder, you can integrate the optimizations from this one.
Download it here.
Update : 05/24/09 : New version with tiny additions, which should make it run a little bit faster.
Update : 05/26/09 : New version with tiny additions, a little bit faster. (Thanks Joa and Kyle for your tips)
Update : 05/27/09 : New version with some unnecessary code removed.
Update : 05/30/09 : New version with tiny additions, a little bit faster again. (Thanks tst)
Comments (156)
I’m on an iMac with a 3.06 GHz Core 2 Duo.
Original corelib version: 14370 ms
Optimized version: 6457 ms
Not quite 2.5x for me but still easily over 2x.
original : 12593ms
optimized : 5073ms
Well, that’s pretty good.
original corelib version : 21911ms
optimized version : 7053ms
great!
XP – dual core 1 GB ram:
corelib = 26277ms
optimized = 9674ms
Me notebook doesn’t freeze with the optimized version.
Excellent work bro!
Hey Thibault, nice work!
You are right just about 2.x faster, my times were:
original corelib: 14145ms
optimized: 5915ms
I’m working on a couple of projects that leverage the corelib JPG and PNG encoders, so this is a real treat! Thanks!
Will have to take a look at the PNG encoder and see how it might benefit from Vectors.
Keep up the great work!
Rob
NICE! Posting results:
Original: 14701 ms
Optimized: 5746 ms
Impressive improvement. I’ll have to look into these fancy Vector things
Actually – it looks like the PNG encoder would be a bit trickier to optimize as it doesn’t appear to rely as heavily on lookup tables of data contained in arrays like the JPG encoder does – but still worth a look.
quad core (via my AIR Browser) 16413ms vs 5850ms, indeed!
Mac mini core2duo 1.83GHz
19220ms / 6626ms
3 times faster.
14244ms vs 4941ms…pretty nice. Does it make any difference that the source image is 90% solid white?
Got a quad here as well…too bad flash can’t use it. Though I heard PB can use multiple cores, so if we could harness that, I imagine even greater speedups possible.
Core 2 Duo 2.6 Ghz:
Original: 16669ms
Optimized: 6517ms
Much improved.
Original Version: 16313ms
Optimized Version: 5640ms
That’s in Firefox 3.0.10 in Mac OS X 10.5.7 on a 2GHz Intel Core 2 Duo MacBook with 2GB of PC3-8500 DDR3.
Any particular reason you didn’t make the result text selectable? That would be a nice minor improvement. So would adding title elements to any icons without captions (especially ambiguous two-image icons such as icon-author.png).
18598
8057
Macbook
Core 2 Duo E6650 @ 2.33 w/ 4Gb RAM on Vista 32:
18861 ms vs. 7457 ms !
original : 27158ms
optimized : 11976ms
awesome!!! tested on iMac version 10.5.6, Safari 4 got result 18093ms and 6084ms : 3 times faster!! Do you have a plan for PNGEncoder too?
Oh, I forgot to ask – Do you feel like updating the PNGEncoder class as well? That would be awesome…
17071ms vs 5287ms on my trusty ol dual core macbook pro. Very cool!
original : 21824ms
optimized : 9870ms
MacMini Core2Duo 2.0G
original: 22358
optimised: 7907
very nice.
original: 17986
optimized: 7326
quad core.
by the way: * 0.01 instead of / 100 could boost your code a little more.
14303ms vs 4861ms
I still prefer async encoding
But this one is a really good addition
thanks
Hi Thibault,
On my PPC G5 :
original : safari beta crash
optimized : 20265
thanks
JP
MBP 2.4GHZ CORE 2 DUO
original : 18243ms
optimized : 7980ms
original: 29912
optimized: 10347
WOW!
awesome indeed…
original : 18036ms
optimized : 7270ms
dualcore+winxp+fp10debug+ff3
Good job!
original : 12889ms
optimized : 4292ms
FF3 & Player 10.0.2.54
22711 vs 9945 ms
on my MacBook 2GHz Intel Core 2 Duo, 2GB Ram
Impressive work! thanks for sharing.
Original: 15968ms
Optimized: 6339ms
Nice work!
Original 19751 ms
optimized 8121 ms
2.4 times faster
20387ms down to 8637ms pretty sweet!
AMD Athlon 64×2 Dual Core 2.19Ghz
2G ram – 20523 : 6615.
good job!!!
13641 vs 4327
Very nice optimization. I didn’t know vectors are that much faster.. Could you do this trick for the PNG encoder too?
Vista,
Intel Core Duo 2.26ghz,
4G RAM,
15526ms
5186ms
Got to give this a try in ImageSizer now…
Great Job! About the YUV conversion – I wonder why you do not use a simple ColorMatrix filter for that. YUV is the same as YCbCr and the matrix for that is:
Y’ = 16 + (65.481 * R’ + 128.553 * G’ + 24.966 * B’)
Cb = 128 + (-37.797 * R’ - 74.203 * G’ + 112.0 * B’)
Cr = 128 + (112.0 * R’ - 93.786 * G’ - 18.214 * B’)
http://en.wikipedia.org/wiki/YCbCr
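For reference, here is how the matrix above might be expressed as a ColorMatrixFilter. This is a sketch only. The class and method names are made up. The coefficients are the ones quoted above divided by 255, on the assumption that the formula expects R’, G’ and B’ in the 0 to 1 range while ColorMatrixFilter works on 0 to 255 channel values. Y, Cb and Cr end up in the red, green and blue channels respectively. The encoder's own RGB to YUV step may use different coefficients, so this is not a drop-in replacement.

```actionscript
package
{
    import flash.display.BitmapData;
    import flash.filters.ColorMatrixFilter;
    import flash.geom.Point;

    // Hypothetical helper, not part of corelib or the optimized encoder.
    public class YCbCrFilterSketch
    {
        public static function toYCbCr(source:BitmapData):BitmapData
        {
            // Scale factor: the quoted coefficients assume inputs in 0..1.
            var k:Number = 1 / 255;

            // 4 x 5 matrix: each row is [R, G, B, A, offset].
            var matrix:Array = [
                 65.481 * k, 128.553 * k,  24.966 * k, 0,  16, // Y  -> red channel
                -37.797 * k, -74.203 * k, 112.0   * k, 0, 128, // Cb -> green channel
                112.0   * k, -93.786 * k, -18.214 * k, 0, 128, // Cr -> blue channel
                  0,           0,           0,         1,   0  // alpha unchanged
            ];

            var result:BitmapData = new BitmapData(source.width, source.height, source.transparent, 0);
            result.applyFilter(source, source.rect, new Point(0, 0), new ColorMatrixFilter(matrix));
            return result;
        }
    }
}
```

The filtered channels are clamped to whole values between 0 and 255, so some precision is lost compared with doing the conversion in floating point inside the encoder.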
Hi ruffy,
Nice one, I added it
Thanks
Hi Mario,
Yes! A simple ColorMatrixFilter, good idea, I will make some tests today and let you know
Thanks
Great!
But I cannot do anything while it is running.
Async encoding may be better:
http://blog.inspirit.ru/?p=201
Much faster…
original – 12432
optimised – 4006
2.83GHz Core2 Quad
13876ms -> 4764ms on my Macbook Pro 15 incher.
Good work and thanks for sharing!
MacBook Pro Intel Core 2 Duo 2.4 Ghz 2Gb Ram on a Firefox 3.0.1
15133ms
4647ms
3.2 times faster!!
original: 21105ms
optimized: 8826ms
MBP 2.4GHZ 2GB RAM
corelib version: 21732 ms
optimized: 9547 ms
I’m on an Intel Q9550 (Quad core) 2.83GHz with 4GB RAM
corelib: 12200ms
optimized: 4248ms
Thanks for all your feedback, it seems like some of you have it running more than 3 times faster, nice!
Thibault
I compared the JPEG files generated by the two encoders, and the faster one’s output is 1 byte shorter than corelib’s (near the end of the file).
Great, I was annoyed with the speed of the old JPEGEncoder, so this is a lifesaver! Thanks!
Sorry, but there is flaw in the faster JPEGEncoder.
Comparing the source file, here is the flaw coming in:
[code]
fillbits.len = ++bytepos;
fillbits.val = (1<<(bytepos+1))-1;
[/code]
it should be:
[code]
fillbits.len = ++bytepos;
fillbits.val = (1<<(bytepos))-1;
[/code]
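To spell out the arithmetic behind this fix: `(1 << n) - 1` is a mask with the lowest n bits set. After `++bytepos`, `fillbits.len` and `bytepos` hold the same value, so shifting by `bytepos + 1` produces a mask one bit longer than `len`. A tiny sketch (the helper `lowBitsMask` and the sample value are made up for illustration):

```actionscript
// Hypothetical helper: returns a mask with the lowest n bits set.
function lowBitsMask(n:int):int
{
    return (1 << n) - 1;
}

var bytepos:int = 2;                      // sample value
var len:int = ++bytepos;                  // len == 3 and bytepos == 3

var good:int = lowBitsMask(bytepos);      // 7  (binary 111):  3 bits, matches len
var bad:int  = lowBitsMask(bytepos + 1);  // 15 (binary 1111): 4 bits, one too many
```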
Thank you for the encoder!
I’m on Windows XP, Pentium 4, 3GHz, 3GB of RAM.
Corelib version: 25897ms
Optimized version: 8665ms
Original corelib version: 18693 ms
Optimized version: 8448 ms
Thanks a lot Kyle, byte flaw fixed
Thibault
default: 14944
optimized: 4820
3.1 faster
nice
33107 vs 12102.
Kinda good
11062ms
3730ms
nice!
XP
6400 @ 2.13GHz
1.97GB Ram
Intel(R) Core 2
Very pedestrian work computer.
14487ms vs 8068ms
1.8 times faster
Great work.
48093ms vs. 17735ms
Xeon 3GHz
FP 10.0.22.87 DEBG
With lot of apps launched in same time.
AMD Athlon 64 X2 Dual 2.71Ghz, 2GB RAM
22448 ms
8091 ms
22448/8091 ≈ 2.77
I’m on an iMac OSX 10.5.7 with a 3.06 GHz Core 2 Duo, safari 4
Original corelib version: 14507 ms
Optimized version: 6222 ms
2.33x for me
Nice work
Mac OS X 10.5.7 (Safari 4)
2.6GHz Intel Core 2 Duo
Original: 16944 ms
Optimized: 7880 ms
Very cool!
this example is excellent, I have been recently working on an alchemy libjpeg port. It runs fast and produces smaller jpeg files and is asynchronous ….
I have published quick comparison at :
http://segfaultlabs.com/blog/post/asynchronous-jpeg-encoding
Source code is also available
I didn’t understand any of the code… but in any case it seems to work well
thanks
“We’re not worthy!!!… we’re so small”
hmmmm your post is interesting, but there’s a much faster way to do it, which is to use Alchemy to compile the IJG JPEG Codec (or libjpeg) into an AS3 library and use its encoding feature. On my machine, your improved algorithm took 10 seconds to encode the JPEG, but with Alchemy I can finish encoding a similar sized 8MP image in less than 4 seconds due to the way Alchemy leverages the gcc compiler’s highly optimized code generation. Furthermore, libjpeg has other nice and handy features such as preserving EXIF metadata tags and image decoding.
Hi malczak,
Very nice use of Alchemy! ShaderJob makes things easier for an asynchronous behavior
Hi aaron,
Yes, as malczak posted, you can make it even faster with Alchemy thanks to libjpeg for instance, but the idea behind this demo was more to show how the Vector class and simple code optimizations in AS3 could make things run faster. libjpeg is very nice, I am pretty sure it can also be optimized
best,
Thibault
Awesome work Thibault.
I just made an asynchronous version as well.
http://www.auricom.com/devote/an-asynchronous-vector-optimized-jpeg-encoder
Amazing Thibault !
13888ms vs 5198ms
I can’t even find any new trick to accelerate it a little.
I hope we will have this as an official optimized version of JPEGEncoder in Flex SDK.
Do you think you could rework it to make it work both synchronous and asynchronous by just calling the right method ?
Could someone elabourate on how pre-incrementation is faster than post-incrementation?
array: 22968ms
vector: 7967ms
Hi Mark,
The PNG encoder could also be optimized but, as Robert said, the PNG algorithm does not rely as heavily on lookup tables as the JPEG one. So any optimization would not bring that much of a performance improvement.
Thibault
@Thibault, @Mark
In my personal opinion, compression and related tasks can be done with Alchemy ports, particularly when working in an AIR environment. The produced SWC libs are quite large and need more tests, but the results are much better. What do you think about it?
Orig: 23669
Opt: 7151
3.3x faster!
Mac OS X (leopard), Firefox.
Hi malczak,
Alchemy is a great project which allows us to check how AS3 and the compiler can be optimized. I wish we could have such optimizations in forthcoming AS3 compilers. For the moment I see Alchemy as a wonderful field of play for that.
Talking about AIR, yes, Alchemy SWC are a bit large, and yes this would suit better with AIR, but honestly I think that for everyday uses with “normal” images to compress, a simple asynchronous version of the version I posted would be fine. I think Alchemy could be interesting for heavier process like runtime sound or video compression. Well, so many things to experiment !
16712ms vs 5501ms
- Dual core Opteron 185
- 32bit Vista.
Nice work!
Hi Mario,
I finally pre-processed the RGB to YUV conversion with Pixel Bender, but it does not really bring any performance improvement. I am still investigating; I will let you know.
AWESOME!
Now someone just needs to make an asynchronous version…
2.8x faster (Lenovo 3000 N200)
Is there a plan of PNGEncoder?
Intel Core 2 Duo 2.53ghz x 2
3.25ghz RAM
Encoding time (original): 16268ms
Encoding time (optimized): 6281ms
Impressive improvement.
Congrats!
I’ve modified the com.adobe.images.JPGEncoder according to IJG’s jpeglib,
using ColorMatrixFilter to do RGB2YUV, and added two integer fDCT methods. The result is a 20% to 35% gain in performance!
Hope it could be faster with Vector!
source:
http://www.ideaboard.cn/svn/AS3Utils/trunk/com/adobe/images/JPGEncoder2.as
Hi Kyle,
Nice! But the link you posted needs authentication (user & password)
Thibault
Sorry, that svn repo is not public.
source: http://elics.cn/kyle/projects/as3/jpgencoder2/com/adobe/images/jpgencoder2.as
demo: http://elics.cn/kyle/projects/as3/jpgencoder2/jpegencoder2test.swf
demo source: http://elics.cn/kyle/projects/as3/jpgencoder2/jpegencodertest.as
Hi Kyle,
Very interesting approach with the two integer fDCT methods. An interesting thing, during my tests, I noticed that encoding time was faster without any RGB to YUV pre-processing through ColorMatrixFilter and even with Pixel Bender.
Do you allow me to integrate those two fDCT methods and see if it runs even faster with Vectors ?
Thanks!
Thibault
hi, I’ve just done some more tests with Kyle’s class. Here are my results for a 1024×1024 bitmapData:
original JPGEncoder : 6715ms
bytearray.org encoder : 4262ms
Kyle’s encoder : 6358ms
Alchemy async encoder : 1711ms
Alchemy sync encoder : 326ms
In both Alchemy runs, the files produced were about 30% smaller.
Hi malczak,
Thanks for the bench, that’s interesting. I did some minor optimizations in the last version I just uploaded an hour ago.
Can you try it and tell me if you see some improvements?
Thanks!
Thibault
Thibault:

Sure.
The code was from IJG’s jpeglib, no Legal Issues I guess
I’m still working on flash player 9 platform, no plan for fp10 yet. So it would be great if you make use of the code.
malczak:
Sounds not so cheering to me
I’m kind of stuck onto fp9, Alchemy is not my choice.
hi,
here is the result with the new version of your class (single run)
bytearray.org encoder : 3417ms
no change in size difference
Hi malczak,
Thanks, little improvement here that’s cool. Damn, too hard to fight against Alchemy
I think I can reach the Alchemy async speed, but the sync speed is far off
Thibault
malczak:
About the file size difference, it could be result of optimal Huffman coding tables, which costs more calculation. Non-Alchemy encoders would not take that burden.
@Thibault
it can run faster, but then it may cause flash player blocking
async Alchemy version can still be tuned
@Kyle
this is yet another advantage of Alchemy port
Core 2 Duo 2.4, 3GB, Vista
14595ms
3184ms
Great job, guys!
15500 vs 3700 on a 2,93GHz iMac running a Debug FlashPlayer 10,0,22,87 … that’s over 4 times faster now!
“chapeau !”
I think you can stop optimizing the JPEG encoder and move on to the PNG encoder!
(!) seriously, optimize the PNG encoder next, please ^^
Hi Patrick,
Wow nice results, yes it seems that some people are having a 4 times faster encoding now.
I am still optimizing some parts, I think I can bring it around 2000ms.
I will definitely take a look at the PNG encoder.
Thibault
hi, I’m still on it and testing
bytearray.org encoder : 3352ms
alchemy async : 1855ms
alchemy sync : 264ms
Core2 Quad Q9450 (2.66Ghz):
16390ms
4580ms
Hi Thibault,
I was wondering while reading your encoder:
1. There’s a lot of “int()” in the code.
But if you decompile the SWF you always see a “convert_i” instruction before an assignment to an int; “int()” inserts a “callproperty public::int, 1” before “convert_i”, which seems redundant in my opinion.
2. You used “const” instead of direct constants.
“const” is compiled to a local variable (write protection must happen at compile time), so replacing literal constants with “const” replaces “pushbyte” instructions with “getlocal” instructions. Is that faster?
3. Why the buffer-and-copy-while-init things for Vector lists?
Hi Kyle,
1. I thought it would be redundant also. In fact I removed the explicit int() casting when only accessing Vector indices without any mathematical operations done. If I remove the explicit casting when doing mathematical operations encoding is much slower.
2. To be honest, I used const here cause it made sense to use them, but it does not really bring huge performance improvements, some tiny ms won only.
3. When using the Vector global function to convert an Array to Vector you cannot specify that the new vector just created has a fixed length. That’s the reason why I created buffers to fill fixed length Vectors, but what I can do is just set the fixed property of the newly created vectors to true. That’s what I just added
Thanks for noticing this silly thing
Thibault
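A minimal sketch of the approach described in point 3, assuming a lookup table that starts life as an Array literal. The variable names and sample values are made up for illustration and are not the encoder's actual tables.

```actionscript
// Sample values only; not the encoder's actual table.
var source:Array = [16, 11, 10, 16, 24, 40, 51, 61];

// The Vector.<int>() global function converts the Array,
// but the resulting Vector is not fixed-length.
var table:Vector.<int> = Vector.<int>(source);

// Lock the length afterwards instead of copying into a second, fixed Vector.
table.fixed = true;

// From here on, table.push(1) or table.length = 16 would throw a RangeError.
```

This avoids the intermediate buffer and the copy loop while still ending up with a fixed-length Vector.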
Thanks, Thibault.
The int() worked great!
And believe it or not, BitString should be optimized too!
http://elics.cn/svn/kylesas3utils/trunk/JPGEncoder2/
Hi,
I have optimized your version from 05/27/09 – it’s about 25% faster. Here is the code: http://pastebin.com/f388c8084 and here is the diff: http://pastebin.com/f52526eff
Hi tst,
Good optimization !, especially the fDCTQuant method
In my tests, it’s about 5-10% faster which is already pretty nice !
I have updated the online version with your additions.
Thibault
MBP 2.6GHZ CORE 2 DUO
4GB ram
player 10.0.22.87
original : 13161ms
optimized : 2810ms
cooool !!!
Well, who need the Vector?
http://elics.cn/kyle/projects/as3/JPGEncoder/
Hi Kyle,
Awesome optimization with linked lists !. It allows you to have same performance improvements also in FP9 (as I read before you are stuck to FP9 for your developments right now, you must be freakin’ happy) great job
Thibault
Yeah~! Freakin’ happy!
Although still about 5% slower than the Vector version, it works greatly on fp9. I love linked list!
Thanks to this great post!
Quad Core (2 x 2 GHz Dual-Core Intel Xeon) Mac
23838ms
5258ms
4.5x improvement! What do I win??
with a MacPro 2×3 GHz Dual-Core
old 15531 ms
new 3473 ms
could it be possible for the png-encoder ?
Thanks for all
Julien
12458 ms VS 2842 ms…
That’s what you call optimization. Congratulations.
old 31531 ms
new 5473 ms
original: 13354 ms
optimized: 2901 ms
4.6x faster!
…on a Dell XPS 1530 Core 2 Duo 2.5 GHz 4GB RAM
THX!
it’s great, thank you
hope you would like to implement anti-freezing solution
Hi, I have updated my post about libjpeg and Alchemy.
http://segfaultlabs.com/blog/post/asynchronous-jpeg-encoding/
here are my results for 1024×1024 bitmapData. bitmapData was created using Perlin noise.
original JPGEncoder : 8316ms
Kyle’s encoder : 8982ms
bytearray.org encoder : 5699ms
Alchemy async encoder : 1772ms
Alchemy sync encoder : 347ms
Source code for alchemy project and example are available.
XP, P4 Dual 3.00GHz 2GB RAM
original : 30289
optimized : 7239
Very fast !!!
Kyle,
Do you allow me to use your JPEGEncoder implementation for FP9 ? I would be happy integrating it in AlivePDF which is still targeting FP9 where I cannot use Vector.
let me know
Thibault
@Kyle,
In your code, in the method initQuantTables() – there is no reference to aasf array, which used to be in the original code. I think because of this your code is producing much bigger JPG files than it should.
19.3 vs 4.8 I’m downloading this one right now =)
XP, P4 DualCore 3.0 ghz 3 gb ram
original : 31329
optimized : 7256
Wow, very nice Thibault!
I’m on a 2.4 GHz Intel Core 2 Duo.
Original version: 19270ms
Optimized version 4377ms
That’s excellent!
I’m not near as tech savvy as anyone who has commented so far. I’d like to know how to install it. I have to replace it with something don’t I. Sorry if I sound like an idiot. Any help would be appreciated. Thanks so much, Andrea
So, the Alchemy C encoder is faster than encoder from this page
http://segfaultlabs.com/blog/post/asynchronous-jpeg-encoding/
Thank you all, guys!
Hi AlexG,
Yes if you need the fastest way to encode it, Alchemy all the way
Thibault
Sorry, I’ve been busy for a while.
Thibault,
The source code is open, feel free to use.
tst,
Yes, aasf was for float encoding, which I replaced with a int encoding method.
File size must be the side effect, or trade off for efficiency ;(
Cool Kyle
Thanks.
Core Q8400, 4G RAM, Vista x64, FPlayer 10.0.22.87
original 15649
optimized 3824
cool
corelib: ~43s
optimized: ~8s
Ubuntu 32-bit 9.04 on Thinkpad T43 (Centrino, Pentium M 2.0Ghz)
I had issues with Alchemy: it is compiled into an SWC and I couldn’t work out how to use it with Flash CS3. Has anyone used it?
Thanks
Hi,
I have an Intel i7 920 processor overclocked to 3.6 GHz, with which I get:
Original : 10127 ms
optimized : 2226 ms
I had stopped noticing its performance, but this time I am frankly amazed!
Is there any async version for this class?
Hi james,
Yep :
http://segfaultlabs.com/blog/post/asynchronous-jpeg-encoding
Thibault
Hi Thibault, thanks!
But i was looking for the pure actionscript encoder shown on this page, as i’m targeting different operating systems..
thanks by the way
2x 2.8 quad-core
10Gb ram
corelib version:12443 ms
optimized version:2564ms
thanks for this awesome class
How do you install the Alchemy encoder? And how to use it?
Kyle
Your SVN looks out of service. Can you provide a link to your sample and source code?
original : 29297ms
optimized : 6911ms
#Machine config#
processor : pentium 4 multi pro
RAM : 3GB
Need to know more fast performance tricks in 3D manipulation with AS3 in using pv3D
thanks
nice,posting results:
original : 11271 ms
optimized : 2977 ms
notebook Sony VAIO FW31ZJ
running on 2 monitors x 1920×1080
Flash CS4 and Photoshop CS4 running in background, Vista x64…
Mac Pro w/9GB RAM
2 x 3 GHz Quad-core (8 core)
11761ms
2442ms
4.8x faster !!
22268
4315
MacBook Air 1.8GHz w/SSD
Original: 27118ms
Optimised: 4187ms
Roughly 6.5x improvement! INCREDIBLE!
Awesome!!! thanks!
You are right just about 2.x and more faster, my times were:
original corelib: 17002ms
optimized: 3642ms
hi thibault, can u write a post that shows us examples of what u did that sped things up?
On my machine:
7207 ms
and
2465 ms
My machine:
Vista Ultimate SP2 64-bit
i7 940 processor at 2.93 GHz
12 GB of RAM
There you go! Cheers
Excellent work. There is some possible room for improvement. Vectors have a forEach() method that is at least 120 percent faster than using a for loop. You could use BitmapData.getVector() to copy the bitmapdata macroblocks into a Vector and then the RGB to YUV could be done using forEach(). Greater improvement can be achieved by putting an entire image into a Vector (rather than on a per macroblock basis) – but would require more refactoring.
Carl
I got an avg 15% increase in performance just by pulling out the microblock loops in encode and placing them in their own function. I’m really confused as to why that’d be faster, but I’m not going to argue with 15% performance gain
As a follow up on the my post above, more gains if you:
- move DU[ZigZag[]] to fDCTQuant() instead of having it return something.
- small gains by shifting in RGB2YUV instead of masking. In RGB2YUV, it's better to use one i<64 loop instead of two i<8 loops. As well, moving RGB2YUV into the microblock loop instead of having its own function is a nice boost.
- And if you're willing to sacrifice the space, pulling the entire image into a Vector at the beginning (along with using the i<64 loop instead of the two i<8 loops), instead of 64 pixels in a set or getPixel every loop iteration, provides another ~10% boost.
All told, the file you have for download here is ~30% slower than what it can be. And OMG those int casts are ridiculous! I mean, such a huge difference for so small a thing…
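As a rough illustration of the "pull the entire image into a Vector" idea above, here is a sketch that reads every pixel with one BitmapData.getVector() call and walks them in a single flat loop. It is not Jordan's actual code: the class and method names are made up, and only the luma channel is computed, with the standard 0.299 / 0.587 / 0.114 weights assumed.

```actionscript
package
{
    import flash.display.BitmapData;

    // Hypothetical helper, not part of any of the encoders discussed here.
    public class PixelVectorSketch
    {
        public static function computeLuma(image:BitmapData):Vector.<Number>
        {
            // One call copies every pixel (as ARGB uint values) into a Vector,
            // instead of calling getPixel() inside the loop.
            var pixels:Vector.<uint> = image.getVector(image.rect);
            var len:int = pixels.length;
            var luma:Vector.<Number> = new Vector.<Number>(len, true);

            // Single flat loop over all pixels.
            for (var i:int = 0; i < len; ++i)
            {
                var p:uint = pixels[i];
                var r:int = (p >> 16) & 0xFF;
                var g:int = (p >> 8) & 0xFF;
                var b:int = p & 0xFF;
                luma[i] = 0.299 * r + 0.587 * g + 0.114 * b;
            }
            return luma;
        }
    }
}
```

The trade-off is memory: the Vector holds one uint per pixel for the whole image, on top of the BitmapData itself.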
19261
5316
x 2.27
Hi Jordan,
Thanks for the tips, nice optimization gain also possible here. I will try to add those asap !
best,
Thibault
Well done
Original 20826 ms
Optimized 7700 ms
Great job
32949ms vs. 5945ms – very nice!
18,73s
5,59s
3 times faster. Impressive (and very efficient).
I have just put up a tutorial/guide on how to use the Alchemy JPEG encoder in Flash. It’s also an example on how to use a progressbar to monitor the encoding. Check out http://last.instinct.se/graphics-and-effects/using-the-fast-asynchronous-alchemy-jpeg-encoder-in-flash/640
Hi Klas,
Thanks for the link !
best,
Thibault
How can I use this AS file in Flash?
Can you please give me an fla file?
thanks a lot!
Hi, many thanks for this. It’s very useful.
Jordan W, can you please post your tweaked version somewhere?
original: 9455ms
optimized version: 2797ms
nice work.
http://va.lent.in/blog/2010/06/23/100x-times-faster-md5-and-more/
Ok, so in practice with the changes posted earlier, I’m getting around a 30% max increase in performance even without extracting the BitmapData to a Vector. BUT, the performance gain depends greatly on the version of the player, the platform, and the operating system – 10% being the minimum gain I’ve found. I’ve straightened up the code a bit, and posted it here – http://www.quixological.com/jordan/JPGEncode.as
A few notes, there is an “async” encode: encodeAsync(img:BitmapData, cback:Function=null, intensity:int=1) where intensity is how “async” you want it, where 1 is the fastest, taking about 15% over encode, and goes up however much you like, but 20 probably being the highest reasonable amount, adding 60% time over encode.
A benchmark test at http://www.quixological.com/jordan/jpgTest.swf (1.55MB) corelib version takes a while
Orig: 10211
Opt: 2029
5.05 X faster !!!!! Amazing