Version history for x264 full
Changes for rev. 744 - rev. 745
- pic macros now keep track of which register holds the GOT, so variable access doesn't have to care
Changes for rev. 735 - rev. 736
- intra_rd_refine in B-frames
Changes for rev. 721 - rev. 735
- print average of macroblock QPs instead of frame's nominal QP
Changes for rev. 720 - rev. 721
- change the meaning of --ref: it now selects DPB size (including B-frames), rather than L0 size (which B-frames are added to)
Changes for rev. 719 - rev. 720
- add / fix support for FreeBSD, based on a patch by Igor Mozolevsky % igor A hybrid-lab P co P uk %
Changes for rev. 718 - rev. 719
- shut up some valgrind warnings
Changes for rev. 715 - rev. 717
- round esa range to a multiple of 4
- convert absolute difference of sums from mmx to sse2
- convert mv bits cost and ads threshold from C to sse2
- convert bytemask-to-list from C to scalar asm
- 1.6x faster me=esa (x86_64) or 1.3x faster (x86_32). (times consider only motion estimation. overall encode speedup may vary.)
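As a rough illustration of why an "absolute difference of sums" (ADS) helps exhaustive search: the difference of two block sums never exceeds the SAD of those blocks, so a cheap ADS test can reject candidates before the full SAD is computed. The sketch below is simplified (one sum per block, scalar C, no mv cost) and its function names are hypothetical, not taken from x264.

```c
#include <stdlib.h>

/* Hypothetical helpers, not x264 code. */

/* Sum of all n pixels in a block. */
static int block_sum(const unsigned char *p, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Sum of absolute differences: the exact matching cost. */
static int block_sad(const unsigned char *a, const unsigned char *b, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += abs(a[i] - b[i]);
    return s;
}

/* Absolute difference of sums: a lower bound on the SAD (triangle inequality). */
static int block_ads(const unsigned char *a, const unsigned char *b, int n)
{
    return abs(block_sum(a, n) - block_sum(b, n));
}
```

A candidate whose ADS already exceeds the best cost found so far cannot win, so its SAD never needs to be evaluated.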
Changes for rev. 714 - rev. 715
- use the _WIN32 define instead of the __WIN32__ or WIN32 defines.
- MSDN reference: http://msdn2.microsoft.com/en-us/library/b0084kay(VS.80).aspx
- Patch by BugMaster %BugMaster A narod P ru%
- Original thread:
- date: Dec 27, 2007 3:18 AM
- subject: [x264-devel] VS2008 compilation error (need of replacement __WIN32__ with _WIN32)
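A minimal sketch of the portable test: _WIN32 is predefined by the Microsoft compiler (and by MinGW) for Windows targets, while __WIN32__ and WIN32 depend on the toolchain or build flags. The helper name is hypothetical and not part of x264.

```c
/* Hypothetical helper, not from the x264 source tree. */
static int built_for_windows(void)
{
#ifdef _WIN32
    return 1;   /* predefined by MSVC and MinGW when targeting Windows */
#else
    return 0;   /* any other target */
#endif
}
```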
Changes for rev. 713 - rev. 714
- tweak x264_pixel_sad_x4_16x16_sse2 horizontal sum. 168 -> 166 cycles on core2.
Changes for rev. 711 - rev. 712
- also test arch-specific x264_zigzag_* implementations in checkasm.c
Changes for rev. 709 - rev. 711
- Add AltiVec implementation of predict_16x16_p()
- over 4x faster than C version
- Add AltiVec implementation of
- x264_zigzag_scan_4x4_frame_altivec()
- x264_zigzag_scan_4x4ac_frame_altivec()
- x264_zigzag_scan_4x4_field_altivec()
- x264_zigzag_scan_4x4ac_field_altivec()
- each around 1.3 to 1.8x faster than C version
- Patch by Noboru Asai % noboru P asai A gmail P com%
Changes for rev. 708 - rev. 709
- revert the x86_32 part of r708. elf shared libraries aren't important enough to be worth the extra lines of code to check for nasm.
Changes for rev. 704 - rev. 708
- faster removal of duplicate mv predictors
- reduce the data type used in some tables. 16KB smaller exe.
- check whether ld supports -Bsymbolic before using it
- mark asm functions as hidden
Changes for rev. 702 - rev. 704
- avoid a division in umh.
- patch by Dark Shikari.
- avoid a division in x264_mb_predict_mv_ref16x16.
- patch by Dark Shikari.
Changes for rev. 701 - rev. 702
- fix a memleak in h->mb.mvr
Changes for rev. 700 - rev. 701
- fix compilation as a shared library on x86_64 (regression in r696)
Changes for rev. 699 - rev. 700
- add support for x86_64 on Darwin9.0 (Mac OS X 10.5, aka Leopard)
- Patch by Antoine Gerschenfeld %gerschen A clipper P ens P fr%
Changes for rev. 697 - rev. 699
- Add AltiVec implementation of x264_pixel_ssd_8x8, 3x faster than C version
- Overall speed-up: 0.7% with --bframes 3 --ref 5 -m 7 --b-rdo
- Patch by Noboru Asai %noboru P asai A gmail P com%
- cover some more options in fprofile. (esa, bime, cqm, nr, no-dct-decimate, trellis2)
- previously, esa was slower with fprofile than without, since gcc thought it wasn't important. now esa benefits like anything else.
Changes for rev. 696 - rev. 697
- limit mvs to [-512,511.75] instead of [-512,512]
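For illustration, assuming motion vectors are stored in quarter-pixel units, the new range [-512,511.75] corresponds to the integer interval [-2048,2047]. The clamp below is a sketch with hypothetical names, not the code from this revision.

```c
/* Hypothetical constants: the pixel limits expressed in quarter-pel units. */
enum {
    MV_MIN_QPEL = -512 * 4,     /* -512 pixels    -> -2048 */
    MV_MAX_QPEL = 512 * 4 - 1   /* 511.75 pixels  ->  2047 */
};

/* Clamp one motion vector component (quarter-pel units) to the allowed range. */
static int clip_mv_qpel(int mv)
{
    if (mv < MV_MIN_QPEL)
        return MV_MIN_QPEL;
    if (mv > MV_MAX_QPEL)
        return MV_MAX_QPEL;
    return mv;
}
```

With the old upper bound of 512 pixels the maximum would have been 2048, one step outside the interval shown here.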
Changes for rev. 690 - rev. 692
- Add AltiVec implementation of x264_hpel_filter. Provides a 10-11% overall speed-up with default encoding options
- Patch by Noboru Asai %noboru P asai A gmail P com%
- add AltiVec implementation of ssim_4x4x2_core, about 4x faster than C version.
- Overall: 0.1-0.2% faster with default encoding settings
- Patch by Noboru Asai %noboru P asai A gmail P com%
Changes for rev. 689 - rev. 690
- cosmetics in dsp function selection
Changes for rev. 687 - rev. 688
- cosmetics: use symbolic constants for frame padding radius
Changes for rev. 685 - rev. 686
- cosmetics: use separate variables for frame width and stride
Changes for rev. 683 - rev. 685
- Add AltiVec implementation of add4x4_idct, add8x8_idct, add16x16_idct, 3.2x faster on average
- 1.05x faster overall with default encoding options
- add AltiVec implementation of dequant_4x4 and dequant_8x8, 2.8x faster than C, 1.01x faster than previous revision with default encoding options
- Patch by Noboru Asai % noboru DD asai AA gmail DD com %
Changes for rev. 682 - rev. 683
- Add AltiVec implementation of quant_2x2_dc
- fix AltiVec implementation of quant_(4x4|8x8)(|_dc) wrt current C implementation
- Patch by Noboru Asai % noboru DD asai AA gmail DD com %
Changes for rev. 681 - rev. 682
- fix a possible nondeterminism with me=umh + threads.
Changes for rev. 678 - rev. 680
- don't overwrite pthread* namespace, because system headers might define those functions even if we don't want them
- port sad_*_x3_sse2 to x86_64
Changes for rev. 676 - rev. 678
- fix an arithmetic overflow in trellis at high qp.
- faster 4x4 sad
Changes for rev. 675 - rev. 676
- implement multithreaded me=esa
Changes for rev. 674 - rev. 675
- fix some integer overflows. now vbv size can exceed 2 Gbit.
Changes for rev. 673 - rev. 674
- allow --vbv-init to take absolute values (in kbit), in addition to the previous fractions of vbv-bufsize.
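One way such a dual-meaning option can be interpreted is sketched below. The rule that values above 1 are absolute kbit is an assumption made for this example, and the helper name is hypothetical.

```c
/* Hypothetical helper: turn a --vbv-init value into an initial buffer
 * fill fraction in [0,1]. Assumption: values > 1 are absolute kbit,
 * values <= 1 are already fractions of vbv-bufsize. */
static double vbv_init_fraction(double vbv_init, double vbv_bufsize_kbit)
{
    double f = vbv_init > 1.0 ? vbv_init / vbv_bufsize_kbit : vbv_init;
    if (f < 0.0)
        f = 0.0;
    if (f > 1.0)
        f = 1.0;    /* the buffer cannot start more than full */
    return f;
}
```

Under this assumption, 0.9 with a 1000 kbit buffer stays 0.9, and 500 with the same buffer becomes 0.5.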
Changes for rev. 672 - rev. 673
- remove a bashism
Changes for rev. 671 - rev. 672
- reorder headers so that largefile support is defined before the first copy of stdio
Changes for rev. 670 - rev. 671
- regression in r669: broke saving of configure args if make has to re-run configure
Changes for rev. 669 - rev. 670
- regression in r669: --enable-shared should imply --enable-pic on some archs.
Changes for rev. 667 - rev. 669
- Update config.guess.
- Add a --host flag to allow overriding config.guess; this is particularly useful with a 64-bit kernel running a 32-bit userland to build 32-bit apps.
- Normalize any host triplet into a quadruplet via config.sub.
- Move option parsing before any use of architecture information.
Changes for rev. 666 - rev. 667
- mingw doesn't have strtok_r
Changes for rev. 663 - rev. 664
- cosmetics
Changes for rev. 662 - rev. 663
- limit vertical motion vectors to +/-512, since some decoders actually depend on that limit.
Changes for rev. 661 - rev. 662
- Add vertical and horizontal luma deblocking accelerated with AltiVec, based on Graham Booker's code written for FFmpeg with slight modifications to re-use x264's macros
Changes for rev. 660 - rev. 661
- cosmetics in cpu detection
Changes for rev. 658 - rev. 659
- exempt 1080p from the non-mod16 warning.
Changes for rev. 657 - rev. 658
- allow compiling without yasm/nasm on x86 and x86-64 platforms
Changes for rev. 655 - rev. 656
- replace alloca with malloc everywhere. per manpage, use of alloca is discouraged. this may have a minor effect on the speed of ssim and esa, but that appears too small to measure.
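The general shape of such a replacement is sketched below with a made-up function (not x264 code). The practical differences are that malloc can report failure, while a too-large alloca simply overflows the stack, and that the heap buffer has to be freed explicitly.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: sum a scratch copy of src.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int sum_via_scratch(const int *src, size_t n, long *out)
{
    long sum = 0;
    if (n > 0) {
        /* With alloca this would have been: int *tmp = alloca(n * sizeof(*tmp)); */
        int *tmp = malloc(n * sizeof(*tmp));
        if (!tmp)
            return -1;              /* malloc reports failure; alloca cannot */
        memcpy(tmp, src, n * sizeof(*tmp));
        for (size_t i = 0; i < n; i++)
            sum += tmp[i];
        free(tmp);                  /* heap memory must be released explicitly */
    }
    *out = sum;
    return 0;
}
```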
Changes for rev. 654 - rev. 655
- require a ratecontrol method to be specified, it no longer defaults to cqp=26.
Changes for rev. 653 - rev. 654
- fix nnz computation in cavlc+8x8dct+deblock. (regression in r607)
Changes for rev. 651 - rev. 652
- c89 compile fix
Changes for rev. 650 - rev. 651
- cabac: use bytestream instead of bitstream.
- 35% faster cabac, 20% faster overall lossless, ~1% faster overall at normal bitrates.
Changes for rev. 649 - rev. 650
- remove the restriction on number of threads as a function of resolution (it was wrong anyway in the presence of B-frames), and raise the max number of threads in general (though more will have to be done before it can really scale to lots of cores).
Changes for rev. 648 - rev. 649
- tweak ssse3 quant
Changes for rev. 645 - rev. 648
- work around gcc's inability to align variables on the stack.
- this crash was introduced in r642, but only because previous versions didn't use sse2 on the stack.
- faster cabac rdo. up to 10% faster at q0, but negligible at normal bitrates.
- change some tables from int to int8_t. 13KB smaller executable.
Changes for rev. 644 - rev. 645
- 32bit version of ssse3 satd.
- switch default assembler to yasm. it will still fall back to nasm if you don't have yasm.
Changes for rev. 639 - rev. 644
- simplify trellis
- fix an arithmetic overflow in trellis with QP >= 42
- 2x faster quant. 2% overall.
- side effects:
- not bit-identical to the previous algorithm.
- while the new algorithm covers a wider range of cqms than the previous one did, I couldn't find a good way to fall back to a general version for the extreme cqms. so now it refuses to encode extreme cqms instead of just being slower.
- lays a framework for custom deadzone matrices, though I didn't add an api.
- when encoding with a cqm, probe_skip now also uses the cqm, instead of the flat matrix
- cosmetics in asm macros
Changes for rev. 638 - rev. 639
- use only c-style comments in public header (patch by Vincent Torres)
Changes for rev. 636 - rev. 638
- Compile fix
- in hpel search, merge two 16x16 mc calls into one 16x17. 15% faster hpel, .3% overall.
Changes for rev. 635 - rev. 636
- remove private stuff from public headers. no more need for -D__X264__
Changes for rev. 634 - rev. 635
- adjust bitstream buffer sizes for very large frames
Changes for rev. 628 - rev. 634
- conflate HAVE_MMXEXT with HAVE_SSE2, since they were never used distinctly.
- Made -DNEED_ALTIVEC unnecessary, thanks to Guillaume Poirier.
- check x264_cpu_detect() before calling AltiVec functions.
- ssse3 detection. x86_64 ssse3 satd and quant.
- requires yasm >= 0.6.0
- Use -maltivec when building dependencies, or <altivec.h> cannot be used.
- Do not declare vectors in non-AltiVec files.
- common/cpu.c: runtime AltiVec autodetection on Linux.
- configure, Makefile: do not build the whole project with -maltivec because it generates AltiVec code in weird places.
Changes for rev. 627 - rev. 628
- fix a small memleak.
- patch by Limin Wang.
Changes for rev. 626 - rev. 627
- compile fix for GCC-3.3 on OSX, based on a patch by Patrice Bensoussan % patrice P bensoussan A free P fr%
- Note: regression tests still do not pass with GCC-3.3, but they never did as far as I can remember.
Changes for rev. 623 - rev. 624
- add ability to generate doxygen documentation; make dox
Changes for rev. 622 - rev. 623
- oops, scenecut detection failed to activate when using threads and not using B-frames
Changes for rev. 621 - rev. 622
- extras/getopt.c was BSD licensed. replace with an LGPL version (from glibc).
Changes for rev. 620 - rev. 621
- Fix build issues on Linux. Only gcc-4.x is supported, as on OSX.
- Cleans up a few inconsistencies in the code too.
Changes for rev. 619 - rev. 620
- tweak block_residual_write_cavlc.
- up to 1% faster lossless, no difference at normal bitrates.
Changes for rev. 618 - rev. 619
- don't assume int is exactly 4 bytes
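A small sketch of the idea, using a made-up function: take sizes from the type with sizeof, and use a fixed-width type from stdint.h where exactly 32 bits are required, instead of hard-coding 4.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not x264 code: zero an array of 32-bit values.
 * The byte count comes from sizeof, not from a literal 4. */
static void clear_values(int32_t *values, size_t count)
{
    memset(values, 0, count * sizeof(*values));
}
```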
Changes for rev. 617 - rev. 618
- make array_non_zero() compatible with -fstrict-aliasing
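One common way to test an array for nonzero entries in wide chunks without breaking strict aliasing is to copy the bytes through memcpy rather than casting the pointer. The function below is a hypothetical variant for illustration: the name, the int16_t element type and the fixed length of 16 are assumptions, and it is not claimed to be the fix used in this revision.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant: returns 1 if any of 16 coefficients is nonzero.
 * memcpy into uint64_t avoids reading int16_t storage through a
 * uint64_t pointer, which -fstrict-aliasing does not permit. */
static int array_non_zero_16(const int16_t *v)
{
    uint64_t chunk[4];
    memcpy(chunk, v, sizeof(chunk));    /* 16 * 2 bytes = 32 bytes */
    return (chunk[0] | chunk[1] | chunk[2] | chunk[3]) != 0;
}
```

Compilers typically turn the fixed-size memcpy into plain loads, so the aliasing-safe form need not cost extra.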
Changes for rev. 616 - rev. 617
- Honor CFLAGS and LDFLAGS set by the user
Changes for rev. 615 - rev. 616
- Check whether 'echo -n' works, otherwise try printf (fixes build on current OS X 10.5)
Changes for rev. 614 - rev. 615
- Check version of nasm on OS X / Intel
Changes for rev. 613 - rev. 614
- wrong reference frames were used with refs>=14 + pyramid (regression in r607)
Changes for rev. 612 - rev. 613
- enable thread synchronization primitives on linux too
Changes for rev. 611 - rev. 612
- fix a crash with x264_encoder_headers() + threads
Changes for rev. 610 - rev. 611
- don't skip autodetection on configure --enable-pthread
Changes for rev. 605 - rev. 606
- Do not assume anything about sizeof(cpu_set_t).
Changes for rev. 603 - rev. 604
- Add Altivec implementations of add8x8_idct8, add16x16_idct8, sa8d_8x8 and sa8d_16x16
- Note: doesn't take advantage of some possible aligned memory accesses, so there's still room for improvement
Changes for rev. 600 - rev. 601
- Merges Guillaume Poirier's AltiVec changes:
- * Adds optimized quant and sub*dct8 routines
- * Faster sub*dct routines
- ~8% overall speed-up with default settings