﷽ <- the single codepoint the author is talking about.
It's pretty great fun pasting it into various text entry fields to see how they behave.
In standard-ish single-line-ish Apple text fields on my Mac (iMessage text entry field, Chrome Omnibox), it renders like this, which... I'm not sure is correct? https://cleanshot.com/share/0GkNJGQ7
On the other hand it renders akin to Chrome in TextEdit.
In your iMessage screenshot, this character is being rendered in Noto Nastaliq Urdu[1], which is a font that uses the nastaliq flavor of the Arabic script (as compared to the more widely used naskh flavor, which you're most probably seeing in Chrome's rendering).
What's curious to me is that Apple only uses Noto Nastaliq Urdu if Urdu is enabled in preferred languages and is higher than any other Arabic-script language. [2] Is that so on your machine?
In some discussion about Arabic rendering on another website[0], it was pointed out that the Basmala is its own codepoint in part because it is (or was?) a legal requirement on Pakistani documents and comes from an Urdu character-set. It's possible that, as a character effectively originating from and used by Urdu speakers, Apple defaults it to Nastaliq regardless of your font settings.
Interestingly, pasting it into a new VS Code tab renders it like the browser does (the wide version), but the tab title and suggested filename are in the Nastaliq style.
Thank you for sharing! I wonder what distinguishes e.g. Omnibox/iMessage from TextEdit/Chrome textareas (especially since iMessage's entry box can be made multiline) to cause the divergent rendering!
it really is the same four words written in a different calligraphy and arranged in a different way (more horizontally). arabic calligraphy can take liberties with the orientation of text and even the arrangement of letters. the name of the game is to make deciphering it a puzzle, but easy enough for the reader to have fun and not get bored.
This is my favourite single character to demonstrate that you cannot lay text out without knowing the font, which people sometimes try to claim is possible in terminals: in some fonts, it’s 10em wide and less than 1em tall, but in others, it’s under 3em wide and perhaps 2em tall.
(If people aren’t convinced by that, my next area is complex text layout, starting with my name in the Telugu script, <https://temp.chrismorgan.info/క్రిస్.svg>, also augmenting that with how the r can be drawn to the left or underneath or even a little to the right of the k, which I really should add to that SVG file.)
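To make the layout point concrete, here is a minimal Python sketch of what software can learn about this character without consulting a font. The helper name `describe` is made up for illustration. The assumption (labeled in the code too) is that a terminal-style width table is built from properties like these, none of which say how wide the glyph ends up being drawn.

```python
import unicodedata

BASMALA = "\uFDFD"  # the single codepoint discussed in this thread


def describe(ch: str) -> dict:
    """Collect font-independent facts Unicode records for one character.

    Assumption: a terminal-style width table is derived from properties
    like these. None of them encode the rendered width of the glyph.
    """
    return {
        "codepoints": len(ch),
        "utf8_bytes": len(ch.encode("utf-8")),
        "name": unicodedata.name(ch),
        "east_asian_width": unicodedata.east_asian_width(ch),
    }


if __name__ == "__main__":
    for key, value in describe(BASMALA).items():
        print(f"{key}: {value}")
```

Whatever column count a program derives from this data, the drawn width is still decided by whichever font ends up rendering the glyph.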
There seems to be a multiple-of-19 "code" in the Quran. Many of these observations require the Arabic alphabet to be interpreted as numbers; in Mohammed's days the Abjad system was used for this (similar to, but simpler than, Roman numerals, which also reuse letters as numbers).
So, using the Abjad system to give number values to the Arabic letters, there are many counts that add up to a multiple of 19. A critic (and I try to be one) should note that for every 19 tries ("would this add up to a multiple of 19?") you are expected to find one that does add up to a multiple of 19 by chance alone!
In order to show how many cases add up, I created a unit test suite to demonstrate the claims.
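For readers who have not met Abjad numerals, this is roughly what one such check could look like in Python. It is an illustrative sketch, not the test suite mentioned above: the names `abjad_value`, `letter_count` and `is_multiple_of_19` are hypothetical, the letter values follow the traditional eastern (Mashriqi) abjad order, and characters outside the 28 base letters (spaces, vowel marks, hamza variants) are simply skipped, which is a simplifying assumption.

```python
# The 28 base letters in traditional (Mashriqi) abjad order, written as
# escapes so the exact codepoints are unambiguous.
ABJAD_ORDER = (
    "\u0627\u0628\u062c\u062f\u0647\u0648\u0632\u062d\u0637"  # values 1..9
    "\u064a\u0643\u0644\u0645\u0646\u0633\u0639\u0641\u0635"  # values 10..90
    "\u0642\u0631\u0634\u062a\u062b\u062e\u0630\u0636\u0638\u063a"  # 100..1000
)
ABJAD_VALUES = [*range(1, 10), *range(10, 100, 10), *range(100, 1001, 100)]
ABJAD = dict(zip(ABJAD_ORDER, ABJAD_VALUES))

# The Basmala spelled out as four unvocalized words (not the U+FDFD ligature).
BASMALA_TEXT = (
    "\u0628\u0633\u0645 "
    "\u0627\u0644\u0644\u0647 "
    "\u0627\u0644\u0631\u062d\u0645\u0646 "
    "\u0627\u0644\u0631\u062d\u064a\u0645"
)


def letter_count(text: str) -> int:
    """Count the characters that are one of the 28 base letters."""
    return sum(1 for ch in text if ch in ABJAD)


def abjad_value(text: str) -> int:
    """Sum the abjad values of the base letters; everything else is skipped."""
    return sum(ABJAD.get(ch, 0) for ch in text)


def is_multiple_of_19(n: int) -> bool:
    return n % 19 == 0


if __name__ == "__main__":
    count = letter_count(BASMALA_TEXT)
    value = abjad_value(BASMALA_TEXT)
    print(count, is_multiple_of_19(count))  # 19 True
    print(value, is_multiple_of_19(value))  # 786 False
```

In this spelling the letter count is a multiple of 19 while the abjad sum is not, which shows why it matters how many different counts were tried before a "hit" was reported.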
I started my software career maintaining and forking Unicode parsers. Arabic, Hindi, Chinese and Thai among many other complex languages. It was great fun and it helped me get a deep understanding of how complex writing was and appreciate the beauty of being able to reduce this complexity down to data structures and functions.
Arabic is so dense, and the Quran came to perfect it.
Reading the following verse from the Quran makes me proud that I'm a native Arabic speaker:
"إِنَّا أَنزَلْنَاهُ قُرْآنًا عَرَبِيًّا لَّعَلَّكُمْ تَعْقِلُونَ"
"Indeed, We have sent it down as an Arabic Qur'an that you might understand."
Quran 12:2
In case anyone is interested in a secular perspective, I would phrase this as: "the Prophet's tribe's own linguistic heritage dominated following his success as a religious leader".
At the time of Prophet Muhammad, Arabic was broken into various dialects, with a notable split between Western (Hijazi) and Eastern (Najdi) Arabic. Muhammad belonged to the Quraysh tribe in Mecca, whose dialect was Western.
The Islamic empire rapidly grew to include different Arabic dialects and non-Arabs after Muhammad's death. A linguistic unification of the Quran's text and pronunciation was needed. The third "Caliph" (religious leader), Uthman ibn Affan, developed a single definitive version based on the Quraysh dialect and rigorously destroyed all variants.
Subsequently, non-Arab scholars - who had less native intuition of the Quraysh dialect - codified Arabic grammar based on the Quran and pre-Quranic poetry. This included the system of lines and dots above and below letters to indicate the shape of short vowels.
Yup. I practice Western calligraphy but haven't been able to make many inroads into Arabic. Partly because of the lack of high-quality instruction material (which is available as books for Western calligraphy) and partly because of the complexity. The fact that it's right to left and I'm right-handed presents some challenges too (my hand will come over letters I've just written and there's a risk of smudging them). The nibs are usually cut the other way and I'm still struggling to draw basic letters.
Islam prohibits representational art and so, except for a few pockets, all the skills of Muslim artists went into two things - Calligraphy and geometric tessellations (what's called "arabesque" and which you see on mosques, rugs etc.). The calligraphy itself has several hands (which is what we call "fonts"). The most popular one is called https://en.wikipedia.org/wiki/Naskh_(script) which is the one used for the copies of the Koran from Saudi Arabia. It's very legible and doesn't lend itself to too much flourishing. The Basmala glyph mentioned in the article looks like Naskh with the S of Basmala (س) elongated. There are others too. Thuluth (which is used in the copies of the Koran for ornamental work like the titles of the chapters), Nastaliq (which people often call Urdu or Persian because of how those languages are usually written in this hand), Kufic (which is an angular hand that overlaps with tessellations in ornamental work), Mughlai (which is a denser hand that's common in the Indian Subcontinent) and several others. There are even local variants with which you can identify geography. This style is specific to the Malabar coast in Kerala and, as far as I know, it's seen only there. https://en.wikipedia.org/wiki/Arabi_Malayalam_script#/media/...
Cool article, but the one it acknowledges and riffs off of, at https://lr0.org/blog/p/arabic/ , is much more informative, and tells all about how we got decent Arabic script rendering in our browsers and OSes, though it is still notably imperfect.
Which makes me think: come on, in the age of Claude, the gap between "we know what to do" and "here is the working code" is narrower than ever.
Who will be the one to pick up the job? Has to be an Arabic speaker I guess!
We have a framed calligraphy at home, and it was fascinating to teach my wife how it is read. The one in the article (wikimedia picture) is actually read from the bottom up, yet it is somehow legible.
It’s interesting to see an article deciphering the sentence when I know Arabic handwriting and calligraphy in different fonts! The “bisme” itself, btw, is also a combination of two different words. And “kashida” is a Persian word, not an Arabic one; the Arabic one is probably “maddah”.