I would be more interested in whether they are ever going to cancel Han unification. Their "Factors for Exclusion" list could be summarized as "we made some mistakes in the past but are sticking to them" :D
IVD works, theoretically and practically (recent versions of OpenType have explicit support for it). It's not their fault that Japanese vendors have not been very quick to adopt it.
If a Japanese person and a Taiwanese person type things with their keyboards and end up with the same bytes for different logical characters, then no, things do not work practically for any practical definition of "practically".
Your argument is absurd because people don't see code---they see glyphs, and using the same code for slightly different glyphs is a non-issue when they are not interchanged. (And when they are interchanged, both would see glyphs "correct" to them anyway.) Japanese users are sensitive to Han unification only because they recognize more glyph variations (Z-variants) than Unicode originally could distinguish, and IVS is exactly the tool for ensuring exact glyphs, assuming cooperative vendors. Not to mention that Han unification was already quite weakened by the source separation principle in the first place.
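To make the IVS point concrete, here is a minimal Python sketch of how a variation sequence looks at the code point level. It assumes U+845B (葛) as the base character and VS17/VS18 as the selectors; whether a given pair is actually registered in the IVD, and whether a font honors it, is outside what this snippet can check. The helper names are made up for illustration.

```python
import unicodedata

BASE = "\u845b"        # 葛, a single unified ideograph
VS17 = "\U000E0100"    # VARIATION SELECTOR-17
VS18 = "\U000E0101"    # VARIATION SELECTOR-18


def with_selector(base: str, selector: str) -> str:
    """Build a variation sequence: base ideograph followed by a selector."""
    return base + selector


def strip_selectors(text: str) -> str:
    """Drop selectors in U+E0100..U+E01EF, falling back to the unified form."""
    return "".join(ch for ch in text if not 0xE0100 <= ord(ch) <= 0xE01EF)


variant_a = with_selector(BASE, VS17)
variant_b = with_selector(BASE, VS18)

# Same base code point, different sequences, so the bytes differ too.
print(variant_a == variant_b)
print([f"U+{ord(ch):04X}" for ch in variant_a])
print(unicodedata.name(VS17))
print(len(BASE.encode("utf-8")), len(variant_a.encode("utf-8")))
```

Without the selector, both writers produce the same bytes and the font picks the glyph; with it, the intended glyph travels with the text, and software that doesn't understand selectors can strip them and still get the unified character.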
Chinese AI labs are reducing the amount of Japanese images and text in their AI models: they leave a much smaller amount for text models that have to be literate in Japanese, and explicitly remove it from the dataset for image models so that they only support Simplified Chinese and English, so as to avoid GIGO.
I mean, making or helping to make sovereign AI models is nowhere near the responsibilities of Unicode, but Han Unification and sort of a default-enforced IVD support are literally adding a small but non-zero amount of fuel to cultural division and xenophobia perpetuate in East Asia. I doubt blaming users would work here.
While I agree that Han Unification is not optimal (and fixing it is a welcome development), it is already too late to reverse it. Even counter-proposals like TRON haven't worked at all so far. IVD is the best compromise we can have in this situation.
> cultural division and xenophobia perpetuate in East Asia
By the way, I have recently seen multiple claims from Japanese Twitter users that Korea would have been better keeping Chinese characters (Hanja) in use. If this is the cultural division and xenophobia we are talking about, I will gladly take it---why on earth do they have any say in Korea's choice of scripts? The "sinosphere" is an illusion; the fact that CJKV countries have or had shared the same set of characters is just a fun fact and not a cultural mandate or anything else like that.
> IVD is the best compromise we can have in this situation.
Maybe, but no one is running an ivdfy-filter over every single Japanese document, and the issue keeps going. Maybe one way to make it happen is to make the Simplified forms singularly canonical for the CJK Unified Ideographs, so as to classify everything in that form as Chinese, and define Japanese script as always being flagged with IVDs, though I don't know what the storage and processing implications of that might be. But my point is that maintaining the position that users can optionally choose to not display text in a wrong language, and that Unification issues are merely user errors, doesn't make any sense to me.
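For a rough feel of the storage side of such an "ivdfy-filter", here is a hedged Python sketch. Everything in it is hypothetical: the function names, the blanket use of VS17 as the selector, and the code point ranges used to approximate "unified ideograph". A real filter would have to look each character up in the IVD, since a selector is only meaningful for registered sequences.

```python
DEFAULT_SELECTOR = "\U000E0100"  # VS17, a placeholder choice for illustration


def is_unified_ideograph(ch: str) -> bool:
    """Approximate check: URO, Extension A, and the supplementary planes."""
    cp = ord(ch)
    return (0x3400 <= cp <= 0x4DBF
            or 0x4E00 <= cp <= 0x9FFF
            or 0x20000 <= cp <= 0x3FFFF)


def is_ivs_selector(ch: str) -> bool:
    return 0xE0100 <= ord(ch) <= 0xE01EF


def ivdfy(text: str, selector: str = DEFAULT_SELECTOR) -> str:
    """Append a selector to every ideograph that does not already carry one."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if is_unified_ideograph(ch):
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if not (nxt and is_ivs_selector(nxt)):
                out.append(selector)
    return "".join(out)


def utf8_overhead(text: str) -> int:
    """Extra UTF-8 bytes the filter adds to the given text."""
    return len(ivdfy(text).encode("utf-8")) - len(text.encode("utf-8"))


sample = "東京の空"  # three ideographs and one kana
print(len(sample.encode("utf-8")), utf8_overhead(sample))
```

Under these assumptions each flagged ideograph grows from 3 to 7 bytes in UTF-8 (the selector is a 4-byte supplementary-plane character), so kanji-heavy text would roughly double in size before compression.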
> Korea would have been better keeping Chinese characters (Hanja) in use.
I can't speak for all, but I, for one, do regularly encounter machine translation failures in Korean content due to homophones, even with LLM-based ones, in ways that don't happen with Japanese. It manifests as either homonym errors[1] or the MTL resorting to phonetic transcripts that I have no idea about[2]. Both happen in formal writing like newspaper Web articles in addition to casual social media posts. Since it appears that there's no way this issue could happen with "our" system, it sometimes feels like reverting to that could fix it.
1: (like "plain/plane", had the source been English and this was somehow happening)
2: (like "That arm might be fukuzatukossetsushiteru" had the source been Japanese)
(update: looks like there was someone/some group ragebaiting Korean and Japanese Twitter users over Korea's transition to the Hangul phonetic script, for Twitter impression incentive money. Those tweets had not reached me at the time of writing the above comment, and my opinion that bringing back Kanji/Hanzi could solve some translation/communication issues is not based on whatever they used as fuel, though I fear it might have been actually close to it)
I need a table emoji because then I could combine it with a horse emoji. This would be "Pferd Tisch" (Horse Table) in German, which sounds similar to "Fertig", which translates to "done". Yes, I want it only for that dumb joke.
If the seahorse emoji is introduced, we will have to train new foundation models. The costs connected to the introduction of the seahorse emoji will be in the billions.
You're absolutely right—the seahorse emoji was added in Unicode version 19.0.0 after OpenAI purchased the Unicode Consortium and converted it to a for-profit corporation.
The double space after a period, with the added effort to avoid browser whitespace condensing, is an interesting style choice. Is it meant to mimic old academic publications?
While present in some of their previous articles sparingly, this is the first one to use it consistently.
THANK YOU. I was confused at the normalization example given, and had to think through it. (id, name, age) is already at 5NF, and the only one it doesn’t satisfy is 6NF.
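To spell out what the 6NF step would look like for (id, name, age), here is a small Python sketch with made-up rows. It splits the relation into one projection per non-key attribute and checks that joining them back on the key is lossless. The function names and data are illustrative only, and it assumes `id` is the sole key.

```python
rows = [
    {"id": 1, "name": "Ada", "age": 36},
    {"id": 2, "name": "Grace", "age": 45},
]


def decompose(rows, key):
    """Split into one {key: value} table per non-key attribute (6NF-style)."""
    attrs = [a for a in rows[0] if a != key]
    return {a: {r[key]: r[a] for r in rows} for a in attrs}


def recompose(tables, key):
    """Natural join of the per-attribute tables back on the key."""
    keys = sorted(set().union(*(t.keys() for t in tables.values())))
    return [{key: k, **{a: t[k] for a, t in tables.items()}} for k in keys]


tables = decompose(rows, "id")
print(sorted(tables))                    # one table per non-key attribute
print(recompose(tables, "id") == rows)   # the join gives the original back
```

The original relation has two non-key attributes hanging off one key, which is why it can still be split further; each of the resulting (id, name) and (id, age) tables cannot.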
Wait. Are you claiming that there's some sort of link between "Gates's relationship with financier Jeffrey Epstein started in 2011" and the Gates Foundation which launched in 2000 by merging with the Gates Sr. Foundation from all the way back in 1994?
> What should this function be named? I didn't care. Where should this config live? I didn't care. My brain was full. Not from writing code - from judging code.
Does it matter anymore? Most good engineering principles exist to ensure code is easy for humans to read and maintain. When we are no longer the target audience for that, many such decisions are no longer relevant.
* Cracking face
* Left/Right thumb sign
* Monarch butterfly
* Pickle
* Lighthouse
* Meteor
* Eraser
* Net with handle