Easton Man's Channel

News that @EastonMan reads
+ ramblings
+ admiring the experts
+ occasional cats
+ songs Easton listens to

01:32 · Mar 16, 2025 · Sun

Daniel Lemire's blog
Speeding up C++ code with template lambdas

Let us consider a simple C++ function which divides all values in a range of integers:

#include <span>

// Divides every element of the range in place by the run-time divisor d.
void divide(std::span<int> i, int d) {
    for (auto& value : i) {
        value /= d;
    }
}

If the divisor d is known at compile-time, this function can be much faster. E.g., if d is 2, the compiler might optimize away the division and use a shift and a few cheap instructions instead. The same holds for compile-time constants in general: the compiler can often generate better code when it knows the value.
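To illustrate the kind of rewrite a compiler may apply when d is 2, here is a minimal sketch of a signed division by two done with an add and a shift. The name `div2_shift` is hypothetical, and the sketch assumes a 32-bit two's-complement integer with an arithmetic right shift (guaranteed since C++20). What a compiler actually emits depends on the target and the optimization level.

```cpp
#include <cstdint>

// Hypothetical helper: computes x / 2 without a division instruction.
// Integer division rounds toward zero, while an arithmetic right shift
// rounds toward negative infinity, so negative inputs get a +1 bias first.
constexpr std::int32_t div2_shift(std::int32_t x) {
    // Logical shift of the bit pattern: 1 if x is negative, 0 otherwise.
    const std::int32_t bias =
        static_cast<std::int32_t>(static_cast<std::uint32_t>(x) >> 31);
    // The addition cannot overflow: the bias is only added to negative x.
    return (x + bias) >> 1;
}
```

For example, `div2_shift(-7)` yields -3, matching `-7 / 2`, whereas a plain `-7 >> 1` would yield -4.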

In C++, a template function is defined using the template keyword followed by one or more parameters (usually type parameters) enclosed in angle brackets < >. A template parameter acts as a placeholder that is replaced with an actual type or value when the template is instantiated.

In C++, you can turn the division parameter into a template parameter:

template <int d>
void divide(std::span<int> i) {
    for (auto& value : i) {
        value /= d;
    }
}

The template function is not itself a function, but rather a recipe to generate functions: we provide the integer d and a function is created. This allows the compiler to work with a compile-time constant, producing faster code.

If you expect the divisor to be between 2 and 6, you can call the template function from a general-purpose function like so:

void divide_fast(std::span<int> i, int d) {
    if(d == 2) {
        return divide<2>(i);
    }
    if(d == 3) {
        return divide<3>(i);
    }
    if(d == 4) {
        return divide<4>(i);
    }
    if(d == 5) {
        return divide<5>(i);
    }
    if(d == 6) {
        return divide<6>(i);
    }

    for (auto& value : i) {
        value /= d;
    }
}

You could do it with a switch/case if you prefer but it does not simplify the code significantly.

Unfortunately we have to expose a template function, which creates noise in our code base. We would prefer to keep all the logic inside one function. We can do so with lambda functions.
In C++, a lambda function (or lambda expression) is an anonymous, inline function that you can define on-the-fly, typically for short-term use. Starting with C++20, you have template lambda expressions.
We can almost do it like so:

void divide_fast(std::span<int> i, int d) {
    auto f = [&i]<int divisor>() {
      for (auto& value : i) {
        value /= divisor;
      }
    };
    if(d == 2) {
        return f<2>();
    }
    if(d == 3) {
        return f<3>();
    }
    if(d == 4) {
        return f<4>();
    }
    if(d == 5) {
        return f<5>();
    }
    if(d == 6) {
        return f<6>();
    }

    for (auto& value : i) {
        value /= d;
    }
}

Unfortunately, it does not quite work. With template lambda expressions, you cannot pass template arguments directly at the call site, and you need something ugly (‘template operator()<params>’):

void divide_fast(std::span<int> i, int d) {
    auto f = [&i]<int divisor>() {
      for (auto& value : i) {
        value /= divisor;
      }
    };
    if(d == 2) {
        return f.template operator()<2>();
    }
    if(d == 3) {
        return f.template operator()<3>();
    }
    if(d == 4) {
        return f.template operator()<4>();
    }
    if(d == 5) {
        return f.template operator()<5>();
    }
    if(d == 6) {
        return f.template operator()<6>();
    }

    for (auto& value : i) {
        value /= d;
    }
}

In practice, it might still be a good choice. It keeps all the messy optimization hidden inside your function.

source

07:45 · Mar 15, 2025 · Sat

Chips and Cheese
Raytracing on Intel’s Arc B580
#ChipAndCheese

Telegraph | source
(author: Chester Lam)

Telegraph

Raytracing on Intel’s Arc B580

Intel’s discrete GPU strategy has emphasized add-on features ever since Alchemist launched. Right from the start, Intel invested heavily in dedicated matrix multiplication units, raytracing accelerators, and hardware video codecs. Battlemage continues that…

ChipAndCheese

17:56 · Mar 13, 2025 · Thu

https://tree.aza.moe/

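I wrote a web page that visualizes the whole tree. To my surprise, I managed to draw the lines that show the hierarchy using only CSS. How wonderful qwq

Thanks, everyone, for planting the tree together 🌲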

15:21 · Mar 13, 2025 · Thu

杰哥的{运维，编程，调板子}小笔记
Intel Gracemont microarchitecture review

source

23:23 · Mar 12, 2025 · Wed

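Today is Arbor Day, so I want to try planting a tgcn channel tree 🌳 together with everyone qwq

This is Easton Man's Channel, a branch 🌿 of @sohadays, on level 4 of the channel tree~

(If you also have a public channel and want to become a leaf of this channel, send /leaf easton_channel {your channel name} to @tgtreebot! > <)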

t.me

TGCN Channel Tree!

This is the channel @easton_channel - Easton Man's Channel, on level 4 of the channel tree, a leaf of @sohadays (🔗)~

14:28 · Mar 11, 2025 · Tue

#今天看了啥 (what I read today) https://github.com/chili-chips-ba/wireguard-fpga

GitHub