News @EastonMan reads
+ ramblings
+ admiring the big names
+ occasional cats
+ songs Easton listens to
Arch Linux: Recent news updates
Critical rsync security release 3.4.0

We'd like to raise awareness about the rsync security release version 3.4.0-1 as described in our advisory ASA-202501-1.

An attacker only requires anonymous read access to a vulnerable rsync server, such as a public mirror, to execute arbitrary code on the machine the server is running on. Additionally, attackers can take control of an affected server and read/write arbitrary files of any connected client. Sensitive data can be extracted, such as OpenPGP and SSH keys, and malicious code can be executed by overwriting files such as ~/.bashrc or ~/.popt.

We highly advise anyone who runs an rsync daemon or client prior to version 3.4.0-1 to upgrade and reboot their systems immediately. As Arch Linux mirrors are mostly synchronized using rsync, we highly advise any mirror administrator to act immediately, even though the hosted package files themselves are cryptographically signed.

All infrastructure servers and mirrors maintained by Arch Linux have already been updated.

source
(author: Robin Candau)
Daniel Lemire's blog
The ivory tower’s drift: how academia’s preference for theory over empiricism fuels scientific stagnation

Almost all of academic science has moved away from actual (empirical) science. It is higher status to work on theories and models. I believe this is closely related to the well-documented scientific stagnation, as theory is often ultimately sterile.

This tendency is quite natural in academia if there is no outside pressure… and it is the main reason why academia should be ruthlessly judged by practitioners and users. As soon as academia can isolate itself in a bubble, it is bound to degrade.

It is worth trying to understand some of the factors driving this degradation… Theoretical work can sometimes be seen as more complex. This complexity can be mistakenly equated with higher intelligence or prestige. Empirical work, while also complex, often deals with tangible, observable data, which might seem more straightforward to the uninitiated.

Empirical work is more likely to lead to nuanced or inconclusive results, while theory often appears more direct and definitive. Theoretical research also tends to require fewer resources than large-scale empirical studies, which may need extensive funding for equipment, data collection, and personnel. Thus you get to do more research with less by relying on models and theory.

Theoretical work is often seen as requiring a high level of creativity to devise new frameworks or models. While empirical work also requires creativity in design, execution, and interpretation, the creativity in data collection or experimental design might be less recognized or appreciated.

The educational system often glorifies theoretical knowledge over practical skills until one reaches higher education or specialized training. For example, we eagerly make calculus compulsory even though it has modest relevance in most practical fields. This educational bias can carry over into professional work.

Society must demand actual results. We must reject work that is said ‘to improve our understanding’ or ‘to lay a foundation for further work’. We must demand cheaper rockets, cures for cancer, software that is efficient. As long as academic researchers are left to their own devices, they will continue to fill the minds of the young with unnecessary models. They must be held accountable.

source
Daniel Lemire's blog
JavaScript hashing speed comparison: MD5 versus SHA-256
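
Only the link is carried here. As a rough sketch of the kind of comparison the title suggests (my own illustration, not code from the post), one could time Node's built-in crypto hashes over a fixed buffer:
import { createHash, randomBytes } from "node:crypto";

// Hash the same 1 MiB buffer repeatedly with each algorithm and report throughput.
const data = randomBytes(1 << 20);
const iterations = 200;

for (const algorithm of ["md5", "sha256"]) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    createHash(algorithm).update(data).digest();
  }
  const seconds = Number(process.hrtime.bigint() - start) / 1e9;
  console.log(`${algorithm}: ${(iterations / seconds).toFixed(1)} MiB/s`);
}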

source
Matt Keeter
Fidget

Blazing fast implicit surface evaluation

source
(author: Matt Keeter)
Daniel Lemire's blog
Counting the digits of 64-bit integers
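
Again only the link is carried here. As a plain baseline for the problem in the title (my own sketch, not necessarily the technique the post describes), decimal digits of an unsigned 64-bit value can be counted by comparing against powers of ten:
// Count the decimal digits of a non-negative 64-bit value; 0 is treated as one digit.
function countDigits(x: bigint): number {
  let digits = 1;
  let threshold = 10n;
  while (threshold <= x) {  // at most 19 iterations for a 64-bit input
    digits++;
    threshold *= 10n;
  }
  return digits;
}

console.log(countDigits(0n));                    // 1
console.log(countDigits(18446744073709551615n)); // 20 (2**64 - 1)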

source
Daniel Lemire's blog
Artificial Intelligence as the Expert’s Lever: Elevating Human Expertise in the Age of AI

The more likely outcome of the rise of generative artificial intelligence is higher value for the best experts… where ‘expert’ means ‘someone with experience solving real problems’.
“While one may worry that AI will simply render expertise redundant and experts superfluous, history and economic logic suggest otherwise. AI is a tool, like a calculator or a chainsaw, and tools generally aren’t substitutes for expertise but rather levers for its application.
By shortening the distance from intention to result, tools enable workers with proper training and judgment to accomplish tasks that were previously time-consuming, failure-prone or infeasible. Conversely, tools are useless at best — and hazardous at worst — to those lacking relevant training and experience. A pneumatic nail gun is an indispensable time-saver for a roofer and a looming impalement hazard for a home hobbyist.
For workers with foundational training and experience, AI can help to leverage expertise so they can do higher-value work. AI will certainly also automate existing work, rendering certain existing areas of expertise irrelevant. It will further instantiate new human capabilities, new goods and services that create demand for expertise we have yet to foresee.” (Autor, 2024)


source
Daniel Lemire's blog
How does your URL parser handle Unicode?

Most strings today in software are Unicode strings. This means that you can include mathematical symbols, emojis and so forth. There are many different versions of the letter ‘M’, for example: the Roman letter M (U+004D) is semantically different from the Roman numeral Ⅿ (U+216F), even though they often have the same visual representation. John Cook has an interesting post on Unicode steganography: you can possibly use this ambiguity to hide messages in plain view. E.g., if you need to warn someone that you are in danger, you could send a text with the Roman numeral M. Normal people reading the text would not notice the difference.
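
To make the ambiguity concrete (an illustration of mine, not from the post), the two characters are distinct code points even though they render alike:
const latinM = "\u004D"; // 'M', LATIN CAPITAL LETTER M
const romanM = "\u216F"; // 'Ⅿ', ROMAN NUMERAL ONE THOUSAND

console.log(latinM === romanM);                 // false
console.log(latinM.charCodeAt(0).toString(16)); // "4d"
console.log(romanM.charCodeAt(0).toString(16)); // "216f"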

What about URLs like Microsoft.com? If you replace the Roman letter with the Roman numeral, is it still the same domain?

It is. URL parsers, if they are to be compliant with the WHATWG URL specification, are required to normalize URLs, which involves, among other things, replacing look-alike letters with the corresponding Roman letters.
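
As a rough illustration of the mapping involved (this shows only Unicode NFKC normalization plus lowercasing, an approximation of the domain mapping the specification requires, not a full URL parser):
const spoofed = "microsoft.co\u216F"; // ends with the Roman numeral Ⅿ
const mapped = spoofed.normalize("NFKC").toLowerCase();

console.log(mapped);                     // "microsoft.com"
console.log(mapped === "microsoft.com"); // true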

But do they? Do the URL parsers actually do this hard work? Let us check.

Java. I could not get the standard Java library to return the host to me: it simply returns null.
import java.net.URI;

String url = "https://microsoft.coⅯ";
URI uri = new URI(url);      // the constructor declares the checked URISyntaxException
String host = uri.getHost(); // null for this URL

C#. The .NET library seems to just return the domain as-is, with the Roman numeral.
using System;

string url = "https://microsoft.coⅯ";
Uri uri = new Uri(url);
string host = uri.Host; // "microsoft.coⅯ": the Roman numeral is kept

PHP. The standard PHP interpreter just returns the domain as-is, with the Roman numeral.
$url = "https://microsoft.coⅯ";
$parsed_url = parse_url($url);
if ($parsed_url === false) {
 echo "URL could not be parsed.";
} else {
 $host = $parsed_url['host'];
}


Go. Go also does not do normalization.
import (
        "fmt"
        "net/url"
)

urlString := "https://microsoft.coⅯ"
parsedURL, err := url.Parse(urlString)
if err != nil {
        fmt.Println("URL could not be parsed:", err)
        return
}
host := parsedURL.Host // "microsoft.coⅯ": the Roman numeral is kept

Python. You guessed it: no normalization. It happily returns the Roman numeral.
url = "https://microsoft.coⅯ"
parsed_url = urllib.parse.urlparse(url)
host = parsed_url.netloc

JavaScript. JavaScript does it correctly. It will convert https://microsoft.coⅯ to https://microsoft.com.
const url = "https://microsoft.coⅯ";
const urlObj = new URL(url);
const host = urlObj.hostname;

C++. C++ does not have a standard URL parser, but if you use the ada URL parser, you will get correct results. If you are using the Node.js runtime environment, the underlying parser is the C++ ada URL parsing library.
#include "ada.h"

auto url = ada::parse("https://microsoft.coⅯ");
if (!url) { /* failure */ }
std::string_view host = url->get_host(); // "microsoft.com": ada normalizes per the URL spec


source