ScottContini@reddit
Title is misleading. Yes you can beat it using parallelism built into SIMD architectures, assuming the values fit into words that can be processed in parallel. But the asymptotically best comparison-based search without parallelism is still Θ(log n).
Kwantuum@reddit
Even with parallelism
Slime0@reddit
The choice of function names like `vld1q_u16` and `_mm_loadu_si128` for SIMD instructions has got to be one of the biggest hurdles to their general adoption. The amount of mental energy you have to devote just to understanding what the code is doing is ridiculous. It's completely unreadable.

angelicosphosphoros@reddit
Just make aliases: `static constexpr auto LoadSseUnaligned = _mm_loadu_si128;`
ack_error@reddit
This doesn't work because the calls are intrinsics and may also be macros. ARM ACLE specifically says that whether the intrinsics are defined as macros or whether the address of the intrinsics can be taken is unspecified, and some compilers define them as macros with arguments.
angelicosphosphoros@reddit
Well, in that case, it's possible to just use defines.
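For what it's worth, a thin forwarding function sidesteps the macro problem entirely: it compiles whether the intrinsic is a real function or a macro, unlike taking the intrinsic's address. A sketch (the wrapper name is made up):

```cpp
#if defined(__SSE2__)
#include <emmintrin.h>

// `static constexpr auto alias = _mm_loadu_si128;` breaks if the
// intrinsic is defined as a macro; a forwarding function always works,
// and taking `const void *` also avoids the __m128i* pointer cast.
static inline __m128i LoadSseUnaligned(const void *src) {
    return _mm_loadu_si128(static_cast<const __m128i *>(src));
}
#endif
```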
Slime0@reddit
That's not gonna help when I'm looking at code someone else wrote, or reading something online for information about how to use the functions.
svick@reddit
If you use .NET, then the first one is `AdvSimd.LoadVector128` and the second one is `Sse2.LoadVector128`. And you can also use `Vector128.Load` if you want to have cross-platform code.

floodyberry@reddit
simd is generally adopted, it's used everywhere. the reason it's not in general purpose code is not the function names, it's that
a) it's not portable
b) not all cpus support all extensions, so code must either be compiled for a specific architecture, or dynamic dispatching must be used
c) subtle differences, even between extensions on the same architecture, can require completely different implementations
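Point (b) in practice: a minimal runtime-dispatch sketch, picking an implementation through a function pointer. The function names are made up, and `__builtin_cpu_supports` is a GCC/Clang x86 builtin; real dispatchers check for wider extensions (AVX2, AVX-512) the same way.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Portable scalar fallback.
static int64_t sum_scalar(const int32_t *v, size_t n) {
    int64_t s = 0;
    for (size_t i = 0; i < n; i++) s += v[i];
    return s;
}

#if defined(__SSE2__)
// 4-wide SSE2 version of the same loop.
static int64_t sum_sse2(const int32_t *v, size_t n) {
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_epi32(acc,
            _mm_loadu_si128(reinterpret_cast<const __m128i *>(v + i)));
    int32_t lane[4];
    _mm_storeu_si128(reinterpret_cast<__m128i *>(lane), acc);
    int64_t s = (int64_t)lane[0] + lane[1] + lane[2] + lane[3];
    for (; i < n; i++) s += v[i];  // scalar tail
    return s;
}
#endif

using sum_fn = int64_t (*)(const int32_t *, size_t);

// Chosen once at startup based on what the CPU actually supports.
static sum_fn pick_sum() {
#if defined(__SSE2__)
    if (__builtin_cpu_supports("sse2")) return sum_sse2;
#endif
    return sum_scalar;
}
```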
ack_error@reddit
As others have said, most names follow a pattern once you're used to it. The ARM NEON intrinsic names are better than Intel's. It's partly a consequence of trying to squish readable names into a global namespace for intrinsics that must be available from both C and C++ without overloading. And sometimes the names are obscure because the operations are obscure:
https://developer.arm.com/architectures/instruction-sets/intrinsics/vqrdmlah_s16
Wouldn't really help to have an intrinsic like `signed_saturating_rounding_doubling_multiply_accumulate_high_half()`. But there are other annoyances with common SIMD intrinsics, like:

- Having the IDE give unhelpful (src1, src2, src3) as the arguments for a madd intrinsic because that's how the intrinsic was defined in the compiler header, only to cross-reference it to the documentation which lists them as a, b, and c, which then has to be translated to Vd, Vn, and Vm registers in order to match them to V[n], V[m], and V[d] and then operand1/2/3 in the pseudocode (in a different order), in order to figure out which ones are the multiply terms and which one is the addend.
- Accidentally triggering an illegal instruction because on Intel, despite regularly named intrinsics, 16-bit, 32-bit, and 64-bit logical shifts left and right are SSE2, and 16-bit and 32-bit arithmetic shifts right are SSE2, but 64-bit arithmetic shifts right are AVX-512. Or similarly on ARM, finding out that a particular multiply intrinsic is ARMv8 while the multiply-add intrinsic is ARMv8.1, and nowhere on the intrinsic page for the latter does it tell you that the madd version requires ARMv8.1 / FEAT_RDM unless you also cross-check the assembly instruction.
- Having to type `vreinterpretq_f32_u32(vandq_u32(vreinterpretq_u32_f32(vals), sign_mask))` because ARM decided to use the unwieldy "reinterpret" instead of "cast".
- Having to do nasty pointer casts because Intel did silly things like have the 64-bit load intrinsic `_mm_loadl_epi64` take a `__m128i` pointer even though it only loads 64 bits.

cosmic-parsley@reddit
Seriously feels like a naming constraint was “doesn’t overflow a default excel column”.
You do get a feel for things after a while, but they keep adding more with even more obscure names. A few extra vowels and underscores would make a world of difference.
cdb_11@reddit
You could probably come up with a better naming convention, like maybe `u16x8_load` or something. But let's not make the names longer, because the majority of these are just basic operations, like basic arithmetic or memory loads/stores.

cdb_11@reddit
The names actually do make sense, so you can get used to it.
saf_e@reddit
It's just about the constant: you replace log2(x) with log4(x) and get a better absolute number, but asymptotically it's the same.
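For concreteness, the constant-factor saving is just a change of log base (a sketch, the helper name is made up):

```cpp
#include <cmath>

// Steps to search n elements with branching factor b: log_b(n).
// log4(n) = log2(n) / log2(4) = log2(n) / 2 — half the steps,
// same asymptotic class.
static double steps(double n, double b) {
    return std::log2(n) / std::log2(b);
}
```

For n = 10^6 that's roughly 19.9 steps 2-way versus roughly 10 steps 4-way.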
pepejovi@reddit
This is one of those articles that I can tell is high quality, but I need half the text to be hyperlinks to explanations of the words it's using.. It's going to take me the better part of a day just to halfway understand the terminology used in the explanation :D
melt4082@reddit
Can someone clear up a question for me, please? I'm thinking of enrolling my son at Kodland. The price is 3,900, paid in 12 installments of 331.00, for a game development and animation course. I'm a little unsure because, besides it being online, I saw a lot of complaints on Reclame Aqui, although with a high trust score. Does anyone here know of a child who finished the course there, was satisfied, and really learned?
ctafsiras@reddit
Sure, you can beat binary search with SIMD, but can you beat the existential dread of deciphering Intel intrinsic names six months later?
mr_birkenblatt@reddit
Why not use a btree with node size 16? Then you can load a single node in SIMD (with cache locality!) and do all comparisons at once to figure which node to load next
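The per-node step could look roughly like this (a sketch, using 8 sorted 16-bit keys per 128-bit register for simplicity, with a scalar fallback; `node_rank` is a made-up name): count how many of the node's keys are below the search key, and that count is the index of the child to descend into.

```cpp
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// For a node of 8 sorted 16-bit keys: the number of keys < key
// is the index of the child subtree to descend into next.
static int node_rank(const int16_t keys[8], int16_t key) {
#if defined(__SSE2__)
    __m128i k  = _mm_set1_epi16(key);
    __m128i v  = _mm_loadu_si128(reinterpret_cast<const __m128i *>(keys));
    __m128i lt = _mm_cmplt_epi16(v, k);  // 0xFFFF in each lane where keys[i] < key
    int mask = _mm_movemask_epi8(lt);    // 2 mask bits per 16-bit lane
    return __builtin_popcount(mask) / 2; // GCC/Clang builtin
#else
    int r = 0;
    for (int i = 0; i < 8; i++) r += keys[i] < key;
    return r;
#endif
}
```

All eight comparisons happen in one instruction, and the node sits in a single cache line.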
trailingunderscore_@reddit
Because he already has the values in a sorted array.
monocasa@reddit
So, a hybrid radix tree?
Unfair-Sleep-3022@reddit
A good read in these trying times. What a rare sight to behold.