Search code examples
INC instruction vs ADD 1: Does it matter?...


performance, assembly, x86, increment, micro-optimization

x86_64 assembly - loop conditions and out of order execution (macro-fusion with JCC)...


loops, assembly, x86-64, cpu-architecture, micro-optimization

What's the most concise way to reverse a string using x86 or x86_64 assembly?...


assembly, x86-64, micro-optimization, code-size

Latency of assembly memory ops in modern CPUs...


performance, assembly, x86, cpu-architecture, micro-optimization

How to optimize enum size?...


rust, enums, padding, memory-alignment, micro-optimization

Fast method to copy memory with translation - ARGB to BGR...


c, x86, rgb, sse, micro-optimization

Is there a faster algorithm for max(ctz(x), ctz(y))?...


c++, algorithm, rust, bit-manipulation, micro-optimization

PHP: Check if variable is type of string AND is not empty string?...


php, performance, micro-optimization

Should I use Java's String.format() if performance is important?...


java, string, performance, string-formatting, micro-optimization

What C/C++ compiler can use push pop instructions for creating local variables, instead of just incr...


c++, assembly, x86, compiler-optimization, micro-optimization

Is it possible to tell the branch predictor how likely it is to follow the branch?...


c, gcc, x86, compiler-optimization, micro-optimization

Explain how minimum CPU time was computed for a difference of squares...


math, time, cpu, micro-optimization

Bit packing of groups of n repeated bits in a 32-bit word, compact to 1 bit per group...


c, bit-manipulation, micro-optimization, bit-packing

Why is `JArray.ToObject<List<T>>` faster than `JArray.ToObject<T[]>`...


c#, .net, json.net, micro-optimization

Set an XMM register to a repeating byte pattern (broadcast a constant byte)...


assembly, sse, micro-optimization, sse2

68000 Assembly – one-pass swap-and-sum of two word vectors (can it be done better?)...


assembly, optimization, micro-optimization, motorola, 68000

68000 Assembly – Is branchless code faster for counting signed compare conditions?...


assembly, optimization, micro-optimization, 68000, branchless

68000 Assembly – Reverse Array A into B via Stack Parameters...


assembly, optimization, micro-optimization, motorola, 68000

68000 Assembly – Build a String from Characters *not* Present in Another & Return Its Length (st...


assembly, optimization, micro-optimization, motorola, 68000

Dependency chain analysis...


performance, assembly, x86, cpu-architecture, micro-optimization

Is performance reduced when executing loops whose uop count is not a multiple of processor width?...


performance, assembly, x86, cpu-architecture, micro-optimization

Efficient AVX2 implementation of a 17x17-bit squaring operation with result truncation...


algorithm, assembly, bit-manipulation, micro-optimization, avx2

Cost of exception handlers in Python...


python, performance, exception, micro-optimization

Latency bounds and throughput bounds for processors for operations that must occur in sequence...


performance, cpu-architecture, micro-optimization

Divide by 10 using bit shifts?...


math, bit, micro-optimization, low-level, integer-division

3D Morton code computation utilizing carry-less multiplication...


algorithm, assembly, bit-manipulation, riscv, micro-optimization

How to get lg2 of a number that is 2^k...


performance, bit-manipulation, micro-optimization, logarithm, bitcount

Why are bitwise operators slower than multiplication/division/modulo?...


python, optimization, bitwise-operators, micro-optimization

How can I guarantee that a variable will never be zero without using a conditional statement in C?...


c, compiler-optimization, micro-optimization, branchless

Performance penalty: denormalized numbers versus branch mis-predictions...


c++, x86, floating-point, micro-optimization, branch-prediction
