r2183
ZitatAlles anzeigenr2183
Sliced-threads: do hpel and deblock after returning
Lowers encoding latency around 14% in sliced threads mode with preset superfast.
Additionally, even if there is no waiting time between frames, this improves parallelism, because hpel+deblock are done during the (singlethreaded) lookahead.
For ease of debugging, dump-yuv forces all of the threads to wait and finish instead of setting b_full_recon.r2182
Add full-recon API option
Fully reconstruct frames even without dump-yuv.r2181
x86inc: switch to amdnops
Recent AMD CPUs' instruction decoders choke horribly on extremely long nops (i.e. with 4 prefixes).
Won't affect much, since we don't use ALIGN much.r2180
BMI1 decimate functions
Intel was nice enough to make tzcnt equal to "rep bsf", which is backwards-compatible.
This means we don't actually have to add new functions to make it work.r2179
Minor asm changesr2178
Add row-reencoding support to VBV for improved accuracy
Extremely accurate, possibly 100% so (I can't get it to fail even with difficult VBVs).
Does not yet support rows split on slice boundaries (occurs often with slice-max-size/mbs).
Still inaccurate with sliced threads, but better than before.r2177
Abstract bitstream backup/restore functions
Required for row re-encoding.r2176
Add an small per-MB cost penalty for lowres
Helps avoid VBV predictors going nuts with very low-cost MBs.
One particular case this fixes is zero-cost MBs: adaptive quantization decreases the QP a lot, but (before this patch), no cost penalty gets factored in for this, because anything times zero is zero.r2175
Remove explicit run calculation from coeff_level_run
Not necessary with the CAVLC lookup table for zero run codes.r2174
Export PSNR/SSIM in x264 APIr2173
x86inc: support yasm -f win64
Not necessary for x264, as -m amd64 already does the right thing, but used by external users of x86inc.r2172
Fix incorrect zero-extension assumptions in x86_64 asm
Some x264 asm assumed that the high 32 bits of registers containing "int" values would be zero.
This is almost always the case, and it seems to work with gcc, but it is *not* guaranteed by the ABI.
As a result, it breaks with some other compilers, like Clang, that take advantage of this in optimizations.
Accordingly, fix all x86 code by using intptr_t instead of int or using movsxd where neccessary.
Also add checkasm hack to detect when assembly functions incorrectly assumes that 32-bit integers are zero-extended to 64-bit.r2171
Fix possible alignment crash when linking from MSVC
x264_cavlc_init needs to be stack-aligned now.r2170
Fix rare overflow in 10-bit intra_satd_x3_16x16 asmr2169
ICL: fix out of tree building and resource file usage on Windowsr2168
Add error handling for out-of-tree build
r2167
Fix RGB colorspace input
BGR/BGRA input was correct.r2166
Fix interlaced + extremal slice-max-size
Broke if the first macroblock in the slice exceeded the set slice-max-size.r2165
Fix regression in r2141
Broke register preservation in x264_cpu_cpuid and x264_cpu_xgetbv.
Did not cause any problems.