An authoritative source on this is Inventing the Alphabet by Johanna Drucker. She covers not only the modern evidence but also attempts to classify alphabets throughout history, with particular focus on the Middle Ages. The first half is a bit dry -- how much do we really care what various scholars in the 16th Century made up about the history of the alphabet? (a lot was made up) -- but the second half looks at the modern archaeological contribution to the study of alphabetic origins and is very interesting.
There are also lots of scans of really interesting Medieval manuscripts cataloging alphabets in the book.
From the book's "Chapter 7. Modern Archaeology -- Putting the Evidence of the Alphabet in Place":
"We can now describe the origin of the alphabet chronologically and geographically with some degree of reliability. The basic outlines are these: The alphabet was formed in the context of cultural exchanges between Semitic-speaking people from the Levant and communities in Egypt after or around 1800 BCE. The earliest evidence is dated to Wadi el-Hol, a site in Egypt just west of the Nile, north of Luxor. Later inscriptions in the Sinai and throughout the Fertile Crescent show the gradual distribution and evolution of alphabetic writing, with most evidence dating from the fourteenth century BCE and after."
Regarding "how much do we really care what various scholars in the 16th Century made up about the history of the alphabet?" we should, if we're interested in knowing the context in which various claims, which pop up even today, appeared for the first time. The context often explains the motivation which resulted in specific "made up" narrations, some still (often unfortunately) influencing our lives.
A fascinating subject, and there's an extension of it about which I've been curious for years: why English doesn't use diacritics or accents on alphabetical characters as many European languages (French, German, etc.) do. For instance, in French the letter 'e' can take several forms: without accent, grave è, acute/aigu é, circumflex ê. This makes the letter much more flexible and greatly assists with pronunciation.
So why doesn't English use them? It's clear the language needs them with words such as through, thorough and thought which are confusing enough for native speakers let alone those learning English as a second language. And how about inconsistencies such as the verb to lead and the metal Pb lead, or other strange 'gh' words such as cough?
Similarly, proper nouns such as Wycombe, Warwick, etc. defy logic when it comes to pronunciation and would greatly benefit from diacritical marks.
I've never considered myself a particularly good speller and I reckon I would have benefited from diacritics had English used them.
English spelling becomes a lot more rational [1] when you understand that you are spelling words largely according to Middle English. Sound changes that occur after that are largely not reflected in English spelling.
As for why English doesn't use diacritics, well, my hypothesis is that Middle English just didn't need them. Per the Wikipedia article on Middle English phonology, there are 5 short vowels + unstressed vowel + 7 long vowels + diphthonging with /j/ or /w/. With just 5 short vowels, it's possible to write every one with the 5 basic vowels of Latin script, and the long vowels and diphthongs can be written with paired vowels, using different vowels for the extra long vowel sounds (cf. ee versus ea)--no need for diacritics, even if it is a little clumsy.
The Great Vowel Shift came along and fucked up all the long vowels, and separate changes caused "a" to break into 2-3 sounds (crowding into the "o" sound, as well), leaving us with clearly multiple sounds for the same short "a" and "long" vowel spellings that bear little to no resemblance to their corresponding pronunciations. Some of the sounds (in particular, ee and ea) merged into one sound as part of the Great Vowel Shift, too.
[1] The other thing that ruins English spelling is a proclivity to borrow foreign words with foreign pronunciation and foreign spellings (even if it's not written in Latin script!), so to some extent, you have to play a guessing game as to etymology to work out spelling. But ignoring those cases, you can actually pretty reliably work out the pronunciation of most English words by effectively applying the (mostly regular) phonological changes from Middle English.
Pyjamas, guru, khaki, avatar, bandana, jodhpurs, and shampoo are but a few of the many words drawn from Hindi and Sanskrit.
Urushiol, kudzu, futon, and karate from Japanese.
Orangutan, bamboo, cassowary, paddy, ramie, rattan, gong, and camphor from Indonesian (and related languages).
Ammonia, banana, bongo, cola, dengue, ebony, and gnu are of African origin (numerous languages).
Algebra, algorithm, alchemy, and alcohol all come from Arabic ('al' is the definite article in Arabic). So too do numerous terms for textiles: chiffon, gabardine, satin, taffeta, and wadding. Tahini, tuna, tamarind, talc, tangerine, and talisman as well.
All these use a non-Latin script, in some cases no script at all.
Fair point as I'd chosen to use Hindi (culture) rather than India (geographic region), though of course India itself derives from Hindi (the word).
A fuller set of candidate languages, from Wikipedia: Hindi, Urdu, Kannada, Malayalam, Sanskrit, Tamil, Telugu, Bengali, Assamese, and Marathi.
I'm sure someone will pitch in with specific sub-dialects as well shortly.
I'd reported languages based on Wikipedia's citing. That's publicly-editable, and if you have sourced references the links I'd provided can be updated.
I was intending to post a comment about foreign words imported into English in my long reply to jcranmer's point about same but the post ended up too long to do so. As I mentioned, I'm thinking of posting an addendum. Some points are likely to be controversial but I intend to explain why.
"The other thing that ruins English spelling is a proclivity to borrow foreign words with foreign pronunciation and foreign spellings (even if it's not written in Latin script!),"
You're right, I can't agree more. A pet peeve of mine is the lackadaisical way English imports words and how little attention native speakers give to their correct pronunciation. For people who aren't good spellers and who don't memorize words at first glance, spelling them can be daunting given that their pronunciation often differs wildly from their spelling. I recall as a kid trying to check the spelling of précis in the dictionary and going half-mad with frustration: every attempt at looking up variations on 'praysis' failed.
Similarly, we often import words into English more out of fashion than of necessity. A quintessential example is tsunami: whilst marginally shorter than its English equivalent, it provides no descriptive meaning as do the English words, and almost no Anglophone speaker pronounces the word correctly or even bothers to do so. Instead, we get mishmashes that sound like 'sooonami', which, no doubt, not only sound terrible to Japanese speakers but also pay no respect to the Japanese language. It seems very few know or bother to note that the transliterated spelling is actually very close to the original pronunciation. Given that the arrangement of its first three letters is a rather uncommon occurrence in English, one has to wonder why it doesn't automatically ring bells and alert English speakers that they need to pay careful attention to its pronunciation. Tsu (ツ in katakana, つ in hiragana) ought to be pronounced as it's written, like the sound of a hissing snake, but despite its English spelling being a good facsimile for the Japanese, one never seems to hear any pronunciation that's even close. No doubt, it takes more effort for English speakers to pronounce the unfamiliar sound but very few try; even well-spoken BBC announcers slur the word.
The question remains why we English speakers bothered to steal the word and import it into English in the first instance, especially given that we've made such a hash of using it.
Unfortunately, the bastardization of tsunami's pronunciation is yet another instance of the arrogance Anglophone speakers have in respect of English and their disregard for other languages (given, pro rata, the numbers who know English as a second language versus English speakers who've learned foreign languages). Amongst English speakers it's a common assumption that the world only revolves around English and that everyone should speak the language to communicate with them.
Whilst English is essentially the current lingua franca, many around the globe are aware of the old racist adage "if the natives don't understand then just shout a little louder", and rightly they resent such arrogance. We ought to be very cognisant of their opinion but we're often blind to the fact.
dredmorbius provides us with a good list of imported words but, unlike tsunami, there were no English equivalents so incorporating them into the language was just common sense. Even here, we again see the irreverence English speakers have for some of these words. It makes sense to import the Urdu/Persian word khaki into English as it's easier than saying dust-coloured or yellowish-brown but it really grates on one's sensibility to hear it being pronounced kaaake. Pronouncing jodhpurs is, I reckon, more subtle, and ignoring the 'h' or not trying to pronounce jodh correctly is perhaps acceptable, but I'd like to hear from someone familiar with its correct pronunciation. (BTW the word conjures up bad memories, I had to wear those ugly things as a kid and hated them).
Whilst dredmorbius's list uses non-Latin scripts, the time-honoured tradition of importing words from languages that use Latin scripts continues to make sense if suitable English words don't exist or if existing phrases are long; for instance, blitzkrieg and zeitgeist are good examples of modern imports. Whether English ought to honour the capitalized noun format for imported words that are capitalized by default on grounds of authenticity is moot (but I'd reckon it'd only have a snowball's chance). Personally, I'd agree with Mark Twain on this, who thought it would be a sensible idea if English adopted the practice as a general rule for all nouns as it would help avoid confusion. But then, I'm only pipe-dreaming.
If you think I'm overly pedantic about these issues then you're likely correct, but again I'd stress I've come to these opinions because I've found languages difficult. Errors in one's texts, whether they be spelling mistakes or grammatical errors, draw readers' attention away from the text and lessen the impact of one's message. Excellent fluency gives one freedom of expression; without it, it pays to be cautious.
I'll finish by saying again I'm of the opinion that the default keepers of English—those who have true fluency and excellent mastery of the language—often ignore these issues simply because they are fluent, as fluency gives them the ability to bend and manipulate the language at will (it's like riding a bike, once one knows how one never thinks about how it's done ever again, it's thus understandable why the mechanics of English receive so little attention these days).
Unfortunately, this situation leaves the rest of us to wallow around in the complexities of English without decent guidance, and I reckon this is bad. English has already fragmented noticeably in my lifetime and it's becoming more disorderly over time. Many factors have led to its increased entropy: globalization, internationalization, the Internet and the fact that teachers have ceased to enforce its rules in a pedantic fashion as they once did, just to mention a few. Thus, I'm now of the opinion that without some ongoing method of keeping its structure coherent, it seems likely that in a century or two, or perhaps even sooner, English will split into almost unrecognizable dialects and cease to be a dominant world language.
It'd be nice if I were to be wrong.
BTW, you'll note the mixture of US and British spellings here. Normally I'd harmonize to one or other before posting but this time I've left the text as written.
Words like 'manga' and 'anime', which come from Japanese (written in kanji and kana), or Greek words like onomatopoeia (Greek uses a different alphabet from Latin), are ones that come to mind.
I realize my question was asking for examples of loanwords, but my intention was to ask for loanwords that also adapt the "foreign pronunciation and foreign spelling".
I am well aware that English borrowed plenty of words originating from non-Latin scripts; however, based on your phrasing I thought you were saying something more than that is going on.
When a word is borrowed from a non-Latin script into English what else could happen other than adapting/borrowing the foreign pronunciation+spelling into English as well?
Excellent comment. Whilst I'm no scholar of languages I'm aware of some of those matters to which you refer so my earlier comment is somewhat rhetorical.
I became curious about diacritics decades ago when learning French at school. At best, I was only an average student of languages including English, so I found French quite a challenge—my strengths were in the science subjects and if I hadn't been pressured by my parents it's likely I wouldn't have studied it (my mother has French relatives thus I have French cousins and my father, a mechanical engineer, whose native language was English studied both French and Latin at school and later acquired some German—hence the pressure). In hindsight it was a damn good thing I was pressured as it made learning other languages later much easier (a modicum of German and Latin—learning German was out of necessity as I'd lived in Austria for some years).
My French pronunciation was pretty abysmal and the grimaces on my teacher's face during oral French were testament to that. However, it improved after diacritics came to the rescue: I then found it easier to pronounce words such as être, café and Rue Taherè with some degree of accuracy (Rue Taherè is in Paris where my French relatives once lived, thus it was essential I pronounced the grave correctly).
Sorry, but that long intro was necessary to justify my point that those who are not particularly good at languages and/or spelling can benefit from the use of diacritics. At the beginning I didn't know how to pronounce those French words accurately but diacritics greatly simplified the process. What was particularly good about them was that I didn't have to remember the various pronunciations of 'e' across dozens of different words. That luxury isn't afforded to learners of English! French was hard enough with its gendered nouns and such without that additional burden. It's why I pity those who have to learn English as a second language: the lack of diacritics is just one awkward aspect of this 'tortured' language that makes learning it so difficult.
One thing I've learned is that those who have a natural aptitude for languages and spelling and who develop fluency easily often fail to appreciate how really difficult they are for many people and I include native speakers here.
Given my struggles with French I thought I was pretty dumb when it came to learning languages but my view changed a little some decades ago after I attended a lecture at the Australian Museum in Sydney given by Noam Chomsky. The notable linguist made an unexpected (and what I thought was a rather extraordinary) offhanded remark that I'll never forget which went something to the effect:
"I am not good at acquiring languages, a new language can waft down the corridors of MIT and it can completely pass me by only for me to later find that my colleagues acquired it with considerable ease—much to my chagrin."
What Chomsky taught me was something that I was already subliminally aware of, which is that people can acquire language skills in quite different ways and those who acquire them with ease do so in ways different to those of us who have difficulties (it's why they often don't understand why some of us find learning languages difficult).
I learn languages like mathematical equations: I atomise them bit by bit, which means I want exact literal translations of foreign words and phrases before I'll parse them into idiomatic/vernacular English. Those who are good at languages don't do this; they learn to go directly from the foreign language to their native vernacular sans atomization. Atomization is a mode of thinking and it comes from a reductionist mind. Whilst experience has taught me that reducing languages to atomized bits is not conducive to learning them, it's nevertheless very hard to kick the habit.
"The Great Vowel Shift came along and fucked up all the long vowels, and separate changes caused "a" to break into 2-3 sounds..."
This, I reckon, is still very relevant to present-day English. Whilst only part of the problem, it accounts for the peculiarities we now have with, say, 'gh' words such as cough and though which I've mentioned. Though they've morphed from their earlier guttural sounds, their spelling hasn't yet caught up. Combine this with the many other influences on English (Anglo-Saxon, the Norman invasion, etc.) and it's little wonder the language is in a complicated mess.
I don't have any specific proposal about how diacritics could sort this out, I'm only suggesting that it would make sense if we English speakers spelled [spelt] words the way we pronounced them, say, somewhat akin to German.
Of course, attempting to force change on a language is doomed to failure; it seems the only way is for it to evolve. The 'z' and 's' spelling differences between US and British English are already controversial enough without trying to change, say, you to just u, or through to thru. And that's just spelling: imagine the difficulties of trying to force through grammatical changes such as changing the second person you to youse or to the archaic thou, even though it would make sense. Incidentally, my father's old German textbook translated the German second person du as thou. I've not seen this done in any modern textbook but it's an almighty powerful way of embedding the notion that one should never commit the faux pas of using du casually, and that Sie must always be used unless it's very clear the situation calls for du.
What I find annoying is that few Anglophone speakers care about these matters. Unlike French, which has the Académie française (which, although essentially ineffective, at least acts as a reminder), English has never had any authoritative oversight (and with its dubious history it's doubtful that any oversight would have ever been possible). Without it, English has been left to self-styled reformers like Noah Webster to make improvements; however, without any formal structure in place there's little to propel ongoing effort forward after they're dead. Murray, OED et al have done an excellent job at documenting word meanings and their spellings but they've had little impact on actual word usage. Moreover, it also seems that dictionaries have little impact on word usage in other languages: words only get added to dictionaries after they've taken hold in a language, not before (I recall a once Dutch girlfriend of mine making this exact point about that remarkable Dutch tome, the Woordenboek der Nederlandsche Taal).
In essence, whilst reformers such as Webster have done good work and have done so with good reason, in some ways they've made matters worse. For instance, the fragmented spelling caused by differences between British and American English is already a nuisance and counterproductive in today's global world. Yes, it's perhaps only a small matter in the grand scheme of things but it's indicative of how difficult it is to make changes to a language, especially so English. Whilst, say, German has had minor reforms to its spelling and punctuation in recent decades (the German orthography reform of 1996), such reform is hardly applicable to English and can't be used as a test case for a number of reasons. For starters, German has changed very little over hundreds of years, and with a little effort modern-day German speakers can read texts going on a thousand years old but obviously that's nigh on impossible with English without special training.
Moreover, the tortuous history of English has left us with an almost incomprehensible English grammar. By that I mean it doesn't have 'clean' rules, it's much messier and less consistent than say Latin, or French, or German, and I'd suggest that there's no one who comprehensively understands all of its rules. I own a copy of The Cambridge Grammar of the English Language which, going on 1,900 pages of fine type, is testament to the fact. I've often thought this book should always be hidden from prospective learners of English for fear of giving them apoplexy: https://en.m.wikipedia.org/wiki/The_Cambridge_Grammar_of_the....
If you read that Wiki then it'll be obvious that, even with such a well-researched book about English grammar written by experts in the field, there is still noticeable disagreement amongst them (see the section headed Reception). These disagreements back up assertions about inconsistencies in the language.
I was intending to comment on your point about words imported into English as it's an important aspect of the problem but the post is already too long. I'll think about it and perhaps I'll post it as an addendum.
This hypothesis seems to ignore the vast number of English words that come from French, even despite the vowel shift
Divorcee (we normally say divorsay not divorsee so it should be divorcée) and especially fiance (fiancé) are good examples (employee has changed pronunciation but technically should be employée).
Many of those French words in English, though, come from Norman French, which is very different to the modern language. French has also undergone a major vowel shift as well as dropping the pronunciation of some final consonants, so that some of the “English” pronunciation of certain “French” words is arguably more correct.
Your examples are not consistent dialectally: in my dialect and many related dialects (Northeastern Irish, Scottish and Northern English) “divorsay” is anglicised and pronounced “divorsee”.
That said, wealth, and proclivity to pretentiousness, not just dialect are also factors in the stylings of one’s speech, including which pronunciation to use.
I don't have a definite answer on why some writing systems use diacritics while others don't. I'm not sure there is a single reason for English.
But I think a bit of historical context would help. Many European languages derive from Latin, and most others were influenced by it and borrowed its alphabet. Latin didn't have diacritics, despite having pronunciation accents. E.g. "mania" could mean "mănia" (madness) or "mānia" (a kind of spirit). With time, many short vowels disappeared or changed, e.g. "orphănus" became "orphenin" then "orphelin" in French. I suppose that, for languages that are mostly an evolution of Latin, introducing diacritics was a way to mark words that were different from Latin words. It was even more useful because Latin was the written standard (a moving standard across centuries).
By the way, French and English are languages where you can't be sure of the pronunciation of a word you've never seen. Some studies have shown that other languages are much more efficient to read and write. For instance, Spanish readers barely slow down when reading a text that contains rare words, while English readers stumble. I remember stumbling when I first encountered "antienne" in my native language, or "recipe" and "gaol" in English.
> French and English are languages where you can't be sure of the pronunciation of a word you've never seen.
English yes of course, but French pronunciation is very regular and only a few loanwords don't follow the rules. "Antienne" is regular and pronounced the same way all other -tienne words are as far as I know (like, say, Étienne). The most difficult rules to master though are those that take into account the origin of a word (Latin, Greek or a Germanic language), especially for how to pronounce "ch", but these are essentially loanwords.
The reverse is not true though, there are many different potential ways to write a word that produce the same pronunciation.
I always found it weird how English seems to basically consist of only exceptions and yet it is so much easier to learn and master than French. At least from my experience as a non-native speaker of either. And I say that as a native speaker of a very very regular language.
I always felt like in French you never know how to write something. And seeing it written you sort of leave out the last three or four or maybe none of the letters when speaking but you never really know from just seeing it written. Depends on if you are speaking or singing too. An Edith Piaf will sing French like it's German and pronounce every letter. But you're not supposed to.
And don't get me started on Monsieur. I mean come on, it basically is "my sir" i.e. "mon sieur". But you say "miss" (speak this as an English word) "yeux" (speak this as the French word for "eyes") and if you tell this to a French person they look at you like you are trying to kill them.
And in the end it does all sort of make sense and they all are intertwined.
Like "have" and "habe" and "a" (English, German and French). If you take a French person trying to pronounce either the English or German word, you are basically left with just the "a" if you try to write the word down from someone saying it. "h" is muet (silent). And then you leave out the last few letters. French people have a hard time saying "have" or "habe" properly.
"And I say that as a native speaker of a very very regular language."
And what language is that?
"I always felt like in French you never know how to write something."
Similarly, I've had friends and colleagues who are native German speakers and they've also had one hell of a difficulty in learning French, seemingly more so than native English speakers despite them being reasonably fluent in English. Maybe understanding English is easier for them because of its Germanic roots which are largely absent in French.
Perhaps native English speakers find learning French easier because English is a sort of bastard language that's also part French.
I'd be interested to hear from others who've had similar experiences.
Wasn't it William the Conqueror who made French the language of the English court? That's how English has quite some French influence too or so I've heard. Of course neither side will want to speak much of it. Nor will the English want to speak much of the quite German blood influence of the Royals and how the English behaved at the start of WWII. It's just all so mixed up through history.
It's weird how German and French are actually close in that they have quite extensive and regular grammar and genders, like Latin and if you translate French word for word to German, it sometimes sounds like someone is just speaking weirdly structured middle ages German or something and is actually understandable. Of course only in writing, since the pronunciation is "very French" :) And then there's Dutch which takes lots of English words and pronounces them a bit more German or takes German words and pronounces them quite English and then that's called Dutch (not to take anything away of course, there are completely Dutch words that aren't like that).
"Wasn't it William the Conqueror who made French the language of the English court? That's how English has quite some French influence too or so I've heard."
During much of this time French was the official language of the English Royal Court and government business. Thousands of French words entered the earlier English (Anglo-Saxon-based) language during this time.
"Nor will the English want to speak much of the quite German blood influence of the Royals and how the English behaved at the start of WWII. It's just all so mixed up through"
Don't get me started on that. Those incestuous English/European monarchies caused untold harm to the world. The problems go back before WWII to much earlier than even WWI. Out of embarrassment and under threat of loss of the monarchy the British Royal family changed its name from the German Saxe-Coburg and Gotha to Windsor in 1917. https://www.theguardian.com/uk-news/from-the-archive-blog/20...
> The reverse is not true though, there are many different potential ways to write a word that produce the same pronunciation.
For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
French tries really hard to embed everything required for pronunciation in the spelling, but because of dialects and phonetic drift that leads to some pronunciations having multiple viable spellings.
German seems more concerned with the other direction: for each pronunciation (in high German) there is one obvious way to spell it (barring some baggage around c/k/ck and s/ss/ß, but even that has rules that can be applied to mostly clear it up). The other direction also mostly works, the spelling-pronunciation mapping is fairly obvious even if you don't know the word, but it is more ambiguous than French (no way to differentiate e è é or ê; and ë is mostly inferred).
English somehow seems to neither try to maintain a consistent spelling-pronunciation mapping nor a pronunciation-spelling mapping, nor a tradeoff between the two. At least unless you know the origin of the word and are aware of pertinent linguistic history of the last ~200 years, or develop a good heuristic understanding for those. Probably because of the amount of coordination required to maintain a decent mapping in either direction. Both France and Germany have influential central control on spelling.
It's also that French and German have somewhat purer linguistic roots. English started as a West Germanic language, got admixtures of Old Norse, followed by massive injections of French and Latin (roughly a quarter of the modern vocabulary each), and uncontrolled pronunciation shifts.
Written Old English was quite regular and phonetically sound. It also had a fair number of diacritics (after latinisation).
> For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
I understand Korean is rather good in that respect, but also that it has a relatively simple phonology and the Korean script is constructed: it was designed from scratch for Korean.
Interestingly, and not dissimilar to TFA's assertions, for a very long time it was only used by the common class and disdained by the aristocracy; 19th century nationalism and separation from China led to its revival.
> For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
Ukrainian is pretty close. It helps that it doesn't have a lot of reduced vowels that sound indistinct, and that double consonants are very unusual. The only ambiguous part is stress: technically stress marks exist, but they are used only in dictionaries.
In fact, most Slavic languages are _reasonably_ close to 1-1 mapping. Some are closer, some are further. For example, Russian has a lot of reduced vowels that you have to remember how to spell.
> For any "real" language it's probably impossible to have a 1-1 mapping in both directions.
As far as I know, Finnish is very very close to 100% phonetic: one letter, one phoneme, with some slight exceptions and some wiggle room when it comes to dialectal variation and loanwords, especially those with nonnative phonemes like /b/ or /g/. The velar nasal ŋ is the biggest exception, featuring in "nk" and "ng" (like in many other languages) and the rules aren't entirely straightforward. Another exception is that in spoken language a glottal stop or gemination can appear at some morpheme boundaries but isn't reflected in written text.
Except that written Finnish represents, well, written Finnish, which is an artificial standard frozen centuries ago that no one actually speaks outside a few formal contexts. The country has diglossia like, for example, Greece up to the 1970s. When Finns speak, they use puhekieli, spoken Finnish, and there are differing ways to represent spoken Finnish in writing, and different readings for that written representation depending on the person's native dialect.
That's quite correct, of course. Still, you can write puhekieli (as most of us do daily in informal textual communication such as IMs) and it's still very close to phonetic, although there are individual differences in how "spoken" and dialectal their written puhekieli is. Some common puhekieli traits like the widening of the diphthongs "uo" and "yö" to /ua/ and /yä/ are more rarely reflected in informal writing unless one wants to specifically underline or exaggerate the pronunciation (e.g. for comedic effect, or in written fiction featuring persons speaking a dialect).
"French tries really hard to embed everything required for pronunciation in the spelling, but because of dialects and phonetic drift that leads to some pronunciations having multiple viable spellings."
Agreed, but this often leaves native English speakers perplexed for want of understanding. For example, an English speaker would likely contend that there's little or no difference in the pronunciation of Bordeaux and, say, 'Bordo'.
Said speaker would also contend that the four-letter eaux is overkill for what could be satisfactorily replaced with just a single 'o'.
'o' has two (or more) possible pronunciations: o and ɔ (closed and open), while "au" has only one: o
"Bordeaux" is pronounced the same as "Bordo" would be, and "eaux" is the same as "ô" or "au".
The rules for "o" are a bit complicated but it's mostly always open before two consonants, like in "dehors" and the first occurrence in Bordeaux.
It depends a lot on local accent, but "cosse/Causses" have different "o" sounds in the standard pronunciation, while pose/pause have the same closed sound.
I pronounce "Bordeaux" like in "eaux" but I guess I haven't heard it pronounced very often. I grew up in Greece and I speak French from an early age and haven't lived long in France so there's nuances I miss probably.
Bordeaux should be pronounced like eaux so that part is correct :) I just mean that Bordo would be the same, it's all closed o sounds.
In the north and the south east (Toulouse, not Bordeaux) people tend to only use the open o sound though (especially obvious since Toulouse's nickname is la ville rose which should be a closed o) but that's not standard.
Interesting that you're from Greece, I have met quite a few Greeks who speak rather good French and it's always fun to notice how much of French comes from Greek.
> "Antienne" is regular and pronounced the same way all other -tienne words are as far as I know (like, say, Étienne).
There is nothing regular here, since most dictionaries have no other common noun with the same ending. For instance the famous Littré only has "antienne" and "laurentienne", the latter with a "s" pronunciation.
As for the rare words ending in "-tien" (and "-tienne" for feminine adjectives), roughly half of them are pronounced "t" and the others "s". E.g. "chrétien", "tien" (and derivatives like "entretien", "maintien"), against "béotien", "capétien", "égyptien".
So, in French, in the words "chrétienne" and "égyptienne", the "t" is pronounced "t" only with the former word, even if both words have their origin in Greek through Latin.
The rules of pronunciation in French are... hard to describe. Maybe it's because French is a native language for me. I mean, how do you explain oeuf vs. oeufs? No, French has lots of particularities to it, just not as many as English. Spelling in French is definitely a lot more regular than in English, but the rules of spelling in French have lots of exceptions anyways (e.g., all words that start with 'af' have a double 'f' except [long list of words like afrique]).
Just like 'hour' and 'our' in English I assume. They just happened to sound exactly alike for different(?) reasons.
EDIT: Probably similar reasons, actually? The "h" here participates in a vowel sound (effectively silencing it) and the 's' suffix is often (but not always!) silent in French?
Spelling is a tension between going purely for sounding things out (e.g. IPA) and being able to find commonality with other pronunciations (dialects spoken by neighboring peoples), etc. It's going to get ad hoc because people.
I went down the rabbit hole of trying to figure out what "match" (as in the same thing) and "match" (what you start fires with) had to do with each other.
It turns out they are different words with different origins that are spelled and sound the same. Whee.
A fun fact: "match" (the fire one) shares a root with mucus; it has something to do with oil lamps. And candle snot is a thing (it is the bit of the wick that needs trimmed).
They are pronounced the same, however in casual speech the "ou" sound tends to get dropped in favour of the "r". Think of the phrase "to our house," spoken quickly, with the end of each word blending into the beginning of the next.
There may be some truth to the Irish influence. On the US generally (more people of Irish descent than Ireland itself), and California specifically as a lot of transcontinental railroad workers settled here.
Ha! From my perspective as a native English speaker I say that at least you know what they are. I'd suggest that well in excess of 90% of native English speakers wouldn't have a clue about what they are, such is the paucity of English in such matters.
Hence my original point about the lack of diacritics and such in English.
Ligatures are hard to use on a U.S. keyboard layout (EDIT: Even on French keyboard layouts, near as I can tell). I don't know how to type them anyways. I use a Unicode character picker if I ever really need them, and it's so rare. No one will fail to understand if I write oeufs, and I'm not in French elementary school where super-pedantic teachers might get angry if I don't get that right.
"The New Yorker" magazine uses umlauts to mark the different pronunciations of doubled vowels. They write "cöoperate" to distinguish the way you pronounce its double o from the sound of the double o in "chicken coop".
It should always be the second vowel in English I think.
There are 3 accents that see use in native English:
* the diaeresis (äëïöüÿ) or trema (in French influence), indicates that a vowel should start its own syllable (that is, turning 1 syllable into 2, thus the name) rather than form a diphthong. Nowadays usually a hyphen is used instead, or (especially in American English) the mark is removed entirely for words that have become common. Most commonly this is 'oö' or 'eë', but others exist e.g. "naïve". This is not to be confused with the German umlaut, which does something completely different and is not productive in English.
* the acute (áéíóúý) indicates an alternate pronunciation (or occasionally stress) of the vowel compared to the usual rules (especially if there's a homonym); there is no consistency in what that alternate is (since many English vowels have 3 or more pronunciations). Sometimes this can create a syllable where there was none, but normally it leaves the syllable count unchanged. The usual example is "résumé"; most other examples are for words derived from French where French had it, but it gets used for other languages that didn't have one too ("saké" and "Pokémon" from Japanese, "maté" via Spanish). I can't think of examples in "native" words (barring trademarks, but I don't think any of those have genericized), but if you go back far enough, aren't they all loanwords? Certainly many of the words that this applies to aren't very foreign anymore.
* the grave (àèìòùỳ) is used in poetry to force an extra syllable that would not be used in prose. By far this is most commonly for a past participle like "blessèd", but the rule is productive and can be used on any vowel. "Learnèd" is a weird example where the pronunciation changes even in prose (for the adjective sense), but the accent is usually only in poetry still.
I think a lot of the English peculiarities with spelling come from it adopting other vocabulary without enforcing a change to English orthography onto it, so we end up with a crazy mix of Germanic, French, Latin, Greek and other language spelling conventions. Also English speakers were perhaps more willing to coin Latin/Greek terms for new objects than other languages - contrast ‘television’ from Greek and Latin in English, with Fernseher (literally far-looker) in German.
With respect to dialectical marks, I’m not sure why English in general doesn’t use them (although some might argue we do if we spell words café or naïve), but they don’t seem to be ‘fashionable’ in related languages - Dutch doesn’t seem to use them either, and in German they feel somewhat optional - ä, ö and ü can be replaced with ae, oe and ue respectively (e.g. where the accented forms aren’t available) and ß can be replaced with ss. Indeed correct alphabetical sorting of German demands this.
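To make the replacement rule concrete, here is a minimal Python sketch. The names `expand_umlauts` and `phonebook_key` are made up for this example, and the sort key only approximates the phone-book style of German sorting (umlauts treated as ae/oe/ue); it is not a full collation implementation.

```python
# Hypothetical helper names; the mapping itself is the one described above.
UMLAUT_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def expand_umlauts(text: str) -> str:
    """Replace ä/ö/ü/ß with their ASCII digraph spellings."""
    return text.translate(UMLAUT_MAP)


def phonebook_key(word: str) -> str:
    """Sort key that treats an umlaut like its two-letter expansion."""
    return expand_umlauts(word).casefold()


words = ["Gast", "Garten", "Gärtner", "Gaertner"]
print(expand_umlauts("Straße"))
print(sorted(words, key=phonebook_key))
```

With this key, "Gärtner" and "Gaertner" compare as equal and land next to each other instead of "Gärtner" being pushed past the plain-ASCII words.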
There are actually a good number of words that historically used a diaeresis in English, especially up until about a century ago: noël, reëlect, reënter, Chloë, Zoë, noöne, preëmpt, daïs, coöperate, coöpt, coördinate, zoölogy...
The idea was to ensure that the second vowel would be seen as its own syllable.
Dutch is not as stable as other languages over time. We've had a few spelling reforms and trying to read 17th century Dutch is actually not that easy even for native speakers. Old Dutch is more similar to German than modern Dutch.
I live in Germany and I always joke that Dutch is basically simplified German. The two languages are obviously related but we got a bit pragmatic about some of their more convoluted grammar choices and they gradually disappeared from the language. For example, we have male and female words but we only use two instead of three articles: de for male/female, het for the neutral form instead of die/der/das. Which is similar to the English the and it. Several articles still used in German (e.g. des and den) completely disappeared from the language but used to be common. A nice example where you still see this are place names like Den Haag or 's-Gravenhage, which is short for des Gravenhage and refers to the same city (The Hague).
As for accents, those would be one of the things that got removed from the language or have become optional. E.g. you don't see a lot of ë used in modern Dutch anymore but that is a recent change (in the nineties). The use of computers has accelerated this because we use keyboards without support for this (English layout basically). In high school, I used things like WordPerfect and memorized some of the key codes for accents so I could type this letter. Oddly, an e with an umlaut is not a thing in German or in French.
Also as you mention we use letter combinations for things that have e.g. an umlaut in German. For example ö is written "eu" in Dutch. The ü becomes "uu" and we use oe for the German u. French accents used to be common but have not survived language reforms. We even have some 3 and 4 letter combinations that tend to confuse the hell out of foreigners. E.g. the Dutch translation of lion would be leeuw. Löwe in German would be close in pronunciation.
I lived in Sweden for a while. The Scandinavians did a similar modernization and simplification of their grammar last century. This actually makes it relatively easy to learn. It's a pretty regular language with a fairly straightforward grammar. Also, the grammars of Norwegian, Danish, and Swedish are basically the same. The Swedish use a few different letters e.g. ö in Swedish vs. ø in Norwegian and Danish. But otherwise the languages are very similar in written form. Icelandic is a bit weirder but basically a Scandinavian language as well.
"...in German they feel somewhat optional - ä, ö and ü can be replaced with ae, oe and ue respectively"
At least in German you've got those options. For instance, Gärtner becomes Gaertner without any loss of information. That cannot be said for English when the only (usual) option is to change the word to Gartner.
> contrast ‘television’ from Greek and Latin in English
I recall reading somewhere that back in the day, there were complaints that this new word coinage "television" would never catch on specifically because it mixed Greek and Latin.
I wonder how personalized are autocompletions on Android and iPhone. How strongly could we infer, if at all, that someone had used the term dialectical before? Are autocompletions personalized enough that they might suggest dialectical over diacritic because someone had used (or received!) words like Hegel or capitalism? (Or maybe we could merely only infer someone's phone is using a European English word probability database as in European discourse dialectics might be even more common than diacritics. ;)
IIRC Benjamin Franklin tried to make use of diacritics in his (U.S.) newspaper, and it didn't fly.
English could use a spelling simplification, but it's not going to happen -- it's too late and there's too much inertia in the current way of doing things.
Also, FYI, the circumflex in ê in French isn't there to change how e is pronounced but to denote that after the e there used to be an s that has been dropped, and that's how it always is for the circumflex accent in French. Sure, être is not pronounced the same as it would be were it written as etre, but I think that's accidental and not really the essence of the circumflex accent in French.
> English could use a spelling simplification, but it's not going to happen
Sure it is, continuously and gradually.
> Also, FYI, the circumflex in ê in French isn't there to change how e is pronounced but to denote that after the e there used to be an s that has been dropped, and that's how it always is for the circumflex accent in French. Sure, être is not pronounced the same as it would be were it written as etre, but I think that's accidental and not really the essence of the circumflex accent in French.
IIRC, and it's been a long time since I studied French or made use of anything but the most basic bits, for each vowel there is a consistent (in the general case, but there may be exceptions) pronunciation change associated with the now-elided “s”, so the circumflex serves both historical and phonetic purposes.
There has been no success to any attempts to regularize the spelling of English in... what, over a century now? Is the New Yorker still the only major publication insisting on using umlauts? :)
I'm not saying that English won't evolve, mind you, only that English spelling will evolve [a lot] more slowly than the rest of the language.
> IIRC, and it's been a long time since I studied French or made use of anything but the most basic bits, for each vowel there is a consistent (in the general case, but there may be exceptions) pronunciation change associated with the now-elided “s”, so the circumflex serves both historical and phonetic purposes.
Wikipedia says it alters the sound of a, e, and o. But I would pronounce château and chateau substantially the same. I would pronounce fantôme slightly differently from fantome. Être and etre, on the other hand, would have substantially different pronunciations.
| Whereas the umlaut represents a sound shift, the diaeresis indicates a specific vowel letter that is not pronounced as part of a digraph or diphthong.
I.e., graphically there's no difference. Thus I don't really care to call it one word or the other. Sure, 'diaeresis' is more correct than 'umlaut', but most English speakers -I suspect!- are more likely to recognize the latter than the former.
As I said, there is a very distinct difference in writing. The umlaut is written more like ő with two short lines (handwriting only!) whereas the other is written ö (two dots).
It might just be a stylistic thing that distinguishes not diaeresis from umlaut so much as German handwriting from others'. That is, it might be that Germans don't distinguish between the two themselves, and then also the French, the Spanish, etc. If that's the case then I'd say there's zero difference between the two. A test of this would be to ask a German person (or several, including linguists as well as non-linguists) to handwrite a German word that uses umlauts and also a non-German word that uses diaeresis, and if they write it the same then I'd conclude that there is no difference.
And if it's handwriting only, then is there a difference as far as Unicode goes? No, there is not. In Unicode diaeresis and umlaut are the same (U+0308 Combining Diaeresis).
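This is easy to check with Python's standard `unicodedata` module. A small sketch (the helper name `describe` is just for illustration): decompose the precomposed ö and list what it is made of.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """List (code point, Unicode name) for each code point in the NFD form."""
    decomposed = unicodedata.normalize("NFD", text)
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in decomposed]


# The same precomposed letter serves German "schön" and English "coöperate".
o_with_dots = "\u00f6"
print(unicodedata.name(o_with_dots))
print(describe(o_with_dots))
```

The precomposed character is named LATIN SMALL LETTER O WITH DIAERESIS, and its decomposition is a plain o followed by U+0308 COMBINING DIAERESIS, regardless of whether the word it sits in is German or English.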
Why are you putting words in my mouth? And when did you say you taught this?
The evidence I'm finding is that there is no semantic difference between umlauts and diaeresis, and that the only real difference is in handwriting style, and I'm then wondering whether that style difference has to do with the language one is writing in or in one's upbringing, which is a very fair question to ask.
The style difference seems smaller and less noticeable than the handwriting difference in Latin ligatures. If the difference lies with what language one is writing, then it is closer to a real difference, but if it lies in where one learned, then it's a negligible difference.
My personal pet peeve with the English language is the placement of symbols at the end of a sentence which change the inflection of the sentence? Oh that was a question, I guess you didn't know until you got to the very end of the sentence! Yes I was shouting that whole time, I guess you didn't know.
I find it oh so irritating when reading books and not knowing the inflection of the character's speech until the END of the sentence. I much prefer how other languages put a symbol at the start and end of the sentence. Grumble grumble.
> For instance, in French the letter 'e' can take four forms—without accent, grave è, acute/aigu é, circumflex ê. This makes the letter much more flexible and greatly assists with pronunciation.
The circumflex is not related to pronunciation; it tells you that in the historical form of the word an S followed the vowel. The other three are just three different French vowels. It's an odd conflation of concerns.
> Similarly, proper nouns such as Wycombe, Warwick, etc. defy logic when it comes to pronunciation and would greatly benefit from diacritical marks.
Same problem; diacritical marks aren't what you want there. What's happening is that the spelling has changed more slowly than the pronunciation. You don't want to disambiguate the pronunciation of Warwick (between what and what?), what you want is to realize that the spelling isn't trying to tell you how it's pronounced. It's telling you how it used to be pronounced. (Though, in that case, it's not at all difficult to predict the modern pronunciation.)
> A grave accent over ⟨e⟩ indicates /ɛ/ in positions where a plain ⟨e⟩ would be pronounced /ə/ (schwa).
> A circumflex over ⟨a, e, o⟩ indicates /ɑ, ɛ, o/, respectively, but the distinction between ⟨a⟩ /a/ vs. ⟨â⟩ /ɑ/ is being lost in Parisian French, merging them as [a]. In Belgian French, ⟨ê⟩ is pronounced [ɛː]. Most often, it indicates the historical deletion of an adjacent letter (usually ⟨s⟩ or a vowel): château < castel, fête < feste, sûr < seur, dîner < disner (in medieval manuscripts many letters were often written as diacritical marks, e.g. the circumflex for ⟨/s/⟩ and the tilde for ⟨/n/⟩).
What I can take from that (lacking detailed knowledge of French) is that ê always refers to the same vowel as è (with a length distinction in Belgian French), and sometimes, in contexts which defy simple description†, e refers to the same vowel. (The vowel in question is similar to that in the English word DRESS.) è is used to avoid the suggestion that a plain e might be read in the default way, as /ə/, and ê is used to remind you that the word used to contain an s.
It does not appear that a circumflex will generically clobber other accent information; the verb "listen" was formerly escouter and now, pronounced with /e/ and not /ɛ/, is spelled écouter according to the pronunciation rather than êcouter according to the etymology.
If you have a particular example in mind that the treatment of modern French doesn't seem to work for, Wikipedia also has a lot of material on sound changes between Latin and French: https://en.wikipedia.org/wiki/Phonological_history_of_French
† The page has a table of how to predict the pronunciation of unaccented "e". These are the listed exceptions to the default reading:
(a) "e" appears before multiple consonants, or before a double consonant, or before a silent consonant that is followed by t.
(b) "e" appears before a silent consonant that is not t and is not followed by t.
(c) "e" appears at the end of a word, except for some (but not all) one-syllable words. In some contexts, those non-exceptional one-syllable words will also be exceptions.
(d) "e" as it appears in the words et, femme (first position), solennel (first position), or any word ending in the suffix -emment (first position).
A context that doesn't match any of these exceptions, including the unspecified exceptions, calls for a grave accent over the e when the vowel is /ɛ/. Otherwise, you're good to just use e.
All this leaves me unenthusiastic about the concept of "make it easier to pronounce unfamiliar words by making your spelling more like French".
> It does not appear that a circumflex will generically clobber other accent information; the verb "listen" was formerly escouter and now, pronounced with /e/ and not /ɛ/, is spelled écouter according to the pronunciation rather than êcouter according to the etymology.
This is what I was referring to. As a French learner (I just recently began the journey), you don't know whether to pronounce something as é, è, or e when you see a ê. And there's not generally a rule for figuring it out, at least not as far as I've encountered. For a native speaker this isn't really an issue, but as a learner it's quite annoying.
I recognize that those learning my native language, English, have a far worse time of it however.
Don't quote me on this, but I vaguely recall an explanation years ago, that the one-two punch of Norman French and the Great Vowel Shift arriving were enough for spelling and pronunciation to permanently decouple in English speakers' minds.
If you think of English spelling as highly conservative (it is) and only casually connected to pronunciation (an overstatement, but true) the spelling isn't so bad. The conservatism preserves meaning which is lost in languages like German where spelling is periodically "reformed", severing spelling from a word's historical semantics.
You could just as well consider words like "debt", "lead" or "Wycombe" to be kanjis, and nobody complains about their disconnect from pronunciation.
> You could just as well consider words like "debt", "lead" or "Wycombe" to be kanjis, and nobody complains about their disconnect from pronunciation.
Back in the day, I spent a couple of years teaching in a high school, which exposed me to a fairly large group of Australians of Japanese descent who were learning Japanese as a second language. They did complain about it, and quite fiercely, especially the ones learning it due to family pressure, and not innate interest.
If most people don't complain, I guess it's because a.) they have no exposure to kanji in the first place, b.) they grew up with it, or c.) because their environment regards it as culturally insensitive to complain about foreign writing systems. If none of these apply, you do seem to get plenty of complaints.
None of these factors hold for English, so it makes sense that people complain about the spelling.
But people don't read letter by letter to recognize a word. The vast majority of readers are fluent and are seeing a word for the 1000th time. Optimizing for foreign learners instead of native adults seems wrong.
Words vary in pronunciation across time and across regions and dialects. I would assume those words "used to" (which doesn't sound like it has a 'd') sound like they were written. So what would you do, rewrite the dictionary every couple centuries for zero gain?
Maybe one of the reasons the pronunciation varies across time and regions is exactly because the pronunciation rules aren't really standardized, so people can get creative.
I wonder if the languages with strict pronunciation rules tend to change less. In my native language we tend to get new words of course, but if I read a text from 100 years ago I will be able to pronounce every word correctly, even if the word is now archaic and fallen out of use. I might get the accent wrong, and indeed accents do vary across regions, and sometimes even between neighboring towns.
That reason doesn't make sense, because pronunciation has always varied across time and most people were illiterate in the past. The written word is not the normative form of a language. Words aren't made of letters, and letters aren't made of sounds.
It's hard to know for absolutely certain, but a lot of it might be because diacritics would just make English even harder to read/write. By the invention of printing, English was a very confusing mess of Germanic and Romance languages and there was no absolute agreement on pronunciation. Plus, we were in the middle of the Great Vowel Shift, so slapping diacritics on letters would have been a fool's errand since they wouldn't sound like that diacritic says they do. English's lax pronunciation, linguistic changes during the Middle Ages, and strong Romance-language influence would make it even more confusing to create solid rules on diacritic usage, especially as many people would still be illiterate, and English was an "easier" language than the very complex Latin.
I think I can speak for many many people who casually read Wikipedia articles by asking: how does one go about practically learning the IPA, so as to be able to read (not even write) something like that? Every time I come across a Wikipedia article that uses IPA to explain pronunciation, it's wholly inscrutable gibberish, completely useless if there isn't a listen-to-someone-saying-it button.
In my case, we learnt it in high school. Our teachers insisted on it being very useful to look up the pronunciation of unknown words, and we even had exams where we had to transcribe from IPA to the standard spelling.
This was in Spain though, where the national language has a very consistent spelling, and English is taught as a second language. I don't know how it is in other countries.
Just like you learn anything else. Flash cards, pure repeated exposure, reading about the symbols long enough that you have a framework to fit them into, etc.
"ˈθʌrə" was interesting to me; I'd render "thorough" in my idiolect (Southern-inflected General American) as "θʌrow". Is your pronunciation "standard" in British English (or whatever the prestige variant where you're from is)?
ˈθʌrə is standard in British English. Likewise "borough" is ˈbʌrə in British English and ˈbʌrow in American English. "Edinburgh" is ˈɛdɪnbʌrə / ˈɛdɪmbrə in British English, and Wiktionary says this is the American pronunciation too, but I've definitely heard Americans say ˈɛdɪnbʌrow.
That doesn't really follow. We manage to understand people speaking English in their own accents well enough, so if people were to just write a phonetic transcript of what they spoke, we should be able to understand that as well. A standardized spelling system probably does make the language a bit easier to read overall, but it isn't a necessity.
Apparently this was one of the major barriers to achieving widespread adoption of computers in China. The sheer complexity of the Chinese written language meant that it was much harder to display Chinese characters on an old 80s monitor. (I assume that Japanese, Korean, etc had the same problem, but the article I remember reading was about Chinese.)
German is frustrating for its underuse of diacritics - e.g. tetragraph "tsch" representing a single sound, similar for "sch". In Czech, these are č and š respectively. Polish has similar issues, resulting in the infamous "spilled letters" orthography.
Economy of printing press blocks, and diacritics wouldn't be enough to cover the wide variety of pronunciations, which is inconsistent in English, as highlighted by Wycombe and Warwick. Diacritics wouldn't save those names.
My theory is that one needs a good reason to cross the ocean. Life has to be better on the other side. It is a selection process. You get people whose hands and feet tend to follow their thoughts as opposed to writing them down.
European alphabets descend from the short-lived alphabet invented in Ugarit. It was spread by the Phoenicians to Mycenaean Greece and Carthage and then later to the Latins who would found Rome. It should be noted the Minoans already had an older writing system at that time now referred to as Linear A, but it was syllabic not alphabetic.
Correct, the Phoenicians gave it to the Greeks who used vowel letters for the first time... though the Romans mostly inherited theirs from the Etruscans (who got it from the Phoenicians too). That led to some oddities when the Romans conquered the Greeks and started merging the diverged alphabets.
Nevertheless... some letters like aleph/alpha/A survive with nearly identical orthography. The Phoenicians wrote it sideways compared to Latin - IIRC it was the horns of a bull and represented the first sound of the Phoenician word for cattle.
I find it quite interesting that a writing system having that higher level of abstraction was invented so early. Most early systems were pictograph/ideograph or syllabic AFAIK. Taking the leap to breaking words into syllables then syllables into sounds reduces the orthography to a compact set of symbols with relatively simple rules that is still capable of expressing an entire language.
Alphabetic scripts are arguably not better than syllabic or ideographic scripts. They take less time to learn and are better able to capture the sounds of foreign words, but they are less efficient to mentally process. An already literate scribe working on one language only gets no advantage from an alphabet.
My own theory (as an amateur interested in both linguistics and the history of this time period) is that the late Bronze Age collapse resulted in the decline of Akkadian and the accompanying cuneiform writing system as a diplomatic language, meaning that (1) records started being recorded in the local language only, and (2) what trade and diplomacy did happen had to be multilingual. In this context, the alphabet does win out. And this is precisely when we saw the alphabet spread across the ancient Near East and Mediterranean.
Fair enough. I guess it depends on what the goal for writing systems is: a system that is easy to teach helps increase literacy but that hardly mattered to many empires throughout history who would rather the rabble not be able to read.
The fact that alphabets are easier to use for technology like the printing press, telegraph, typewriters, and computers is purely a historical accident. I'm sure you could (if people didn't already) come up with an encoding scheme based on stroke order/position to transmit traditional Chinese over the telegraph so it's not like that was an insurmountable problem.
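People did in fact do this: Chinese telegraphy historically used a codebook that assigned each character a short numeric code. The toy below is not that codebook and not a stroke-based scheme; as a stand-in it simply uses each character's Unicode code point as its number, just to show that round-tripping characters through digits is straightforward. The function names are made up for the example.

```python
def encode(text: str) -> str:
    """Turn each character into a zero-padded decimal group (stand-in codebook:
    the Unicode code point), separated by spaces for transmission."""
    return " ".join(f"{ord(ch):05d}" for ch in text)


def decode(signal: str) -> str:
    """Reverse encode(): read each decimal group back into its character."""
    return "".join(chr(int(group)) for group in signal.split())


message = "\u96fb\u5831"  # "telegram" in traditional characters
signal = encode(message)
print(signal)
print(decode(signal) == message)
```

Five digits cover the common CJK block; rarer characters with larger code points just produce a longer group, which `decode` still handles because groups are split on spaces.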
FWIW I'm a fairly fast reader and I recognize word shapes for regularly used words. Pretty sure that's a common technique regardless of the writing system in use.
The article makes a difference between "learned scribes" and "creative people at the margins" forgetting a third possibility: learned scribes working not for the religious or secular bureaucracies, but for the merchants.
A parallelism: programmers have different styles working for an established corporation and for a startup. The bureaucracy tends to stand for the old practices, resisting change. The dynamic environment favours starting from scratch and simplicity.
>>Remarkably, two recent discoveries from around 1500 BCE do show scribes using the alphabet. But these exceptions prove the rule, because these scribes used alphabetic writing just as sloppily or playfully as its other users did. In an obscure ostrakon from Thebes and a handful of looted cuneiform tablets we find surprising confirmation that even professional writers used it unprofessionally.
It's interesting how people in this space study twins, and I wonder sometimes if that isn't more on the nose than we give it credit for.
Twins likely created the first languages - who does the first person genetically capable of speech talk to, except someone genetically and environmentally identical? Likely had the first verbal families, and then verbal tribes. Written secret codes might have started the same way, and ended up either being tribal or trade secrets. Success leads to imitation. Partial success leads to theft, or acquisition.
If you're interested in fascinating deep dives into the history of a few odd letters, the jan Misali channel on YouTube has a video on the letter W (which, along the way, covers F and Y) and another one on the letter C.
Adding to other resources shared here, archaeologist Denise Schmandt-Besserat has written about the evolution of writing (not strictly the alphabet), and much is available online:
https://sites.utexas.edu/dsb/tokens/the-evolution-of-writing...
The roots of writing seem to be in counting/tallying marks, i.e. accounting. Another great book, "Against the Grain" by James Scott, describes how both tallying and writing developed hand-in-hand with the state.
Another semi-related thing I've thought about is how the invention of words comes about. I know most words today have an origin that can be traced back through text over hundreds of years, but what about the time way way back, Indo-European era, was there anything to trace back to? Was it just one or a few people who realized there was no grunt sound that meant "cold" or "elbow" so decided one day that that was the grunt sound they were gonna use and it spread naturally?
There's a theory that the original words arose via a mechanism essentially analogous to onomatopoeia. But unlike literal onomatopoeia as in "cuckoo", it's a lot harder (and required a genius according to some philosophers like Otto Weininger [1]) to come up with just the sound that when uttered in the presence of other cavemen will correctly evoke such abstract ideas as bigness, smallness, heaviness, lightness, etc.
The idea that there's something about (say) the sound "big" that makes it especially suitable for expressing the idea of bigness seems pretty plausible to me. I tried this experiment on my friend. I spoke out loud the Chinese words for big and small in a random order and asked him to guess which means big and which means small. He immediately guessed correctly. I'm pretty sure that if you can find a Chinese person who's never learnt English and try a similar experiment on them with (the sounds of the English words) "big" and "small" the result will be similar.
[1] "Cannot the whole of human history (naturally in the sense of the history of the mind and not, for example, the history of wars) best be understood through the appearance of a genius, the inspirations emanating from him, and the imitation of what a genius has done by more pithecoid creatures? Take house-building, agriculture, and above all language! Every word was first created by one individual, by an individual above the average, and the same is still the case today (with the sole exception of the names for new technical inventions, which must be ignored in this context). How else should it have been created? The primal words were “onomatopoeic” and they incorporated without the will of the speaker, through the sheer intensity of the specific excitement, something similar to the cause of the excitement, while all the other words were originally tropes, as it were, second-order onomatopoeias, metaphors, similes: all prose was once poetry. Thus most geniuses have remained unknown."
With your Chinese experiment it would only be accurate if it were double blind, i.e. the speaker themself did not know the meaning of the words. There is a lot more communication going on with speech than just the words themselves.
Regarding word generation, I grew up in poor urban areas where slang was a huge part of language, and new words were invented and spread almost daily. The sound of a word definitely played a part, but it was mostly about context - when it was used, the tone of the speech, the pitch, what older words it might have been derived from, etc. Like math operations and concepts, there might have only needed to be a few words at the start to build a foundation, a seed from which language growth evolved.
... it is clear that a theory of the alphabet as a casual and playful mode of knowledge explained all of our evidence when I first tackled this back in 2004, and still (encouragingly) explains all of the new evidence discovered in the 20 years since. What we lack is a theory of play as a mode of creativity and knowledge production in ancient writing, which I suggest as a new frontier for research on the early history of writing.
That's the most interesting part of the essay, to me (in an otherwise interesting essay).
It's an interesting theory for innovation, reborn. How much creativity comes from play? How many get their start in games - within games, doing things within the gaming world, either their play-fort or an online world or their imagination about their book or their role-playing game.
Remember when SV companies would encourage play, with rooms designed for it? (Do they still?)
The Greeks. They're the first known group to explicitly represent vowels, unlike the older Egyptian-derived systems which only represented consonants and thus were abjads rather than alphabets.
The Greeks did make the most significant improvement to the writing system since some Semitic people simplified the Egyptian writing system by eliminating all the multi-consonant signs. Still, any discussion of the Greek alphabet cannot omit the fact that they did not invent the alphabet; they only improved the Phoenician alphabet, by reusing signs for consonants not used in Greek to write the Greek vowels.
This was a huge advance, but it cannot be called "inventing the alphabet".
It depends what you mean by alphabet. In the narrow sense (consonants AND vowels) Greek was the first language to have one - Phoenician had an abjad (consonants only), probably because most words had 3 consonant roots, with vowels varying with their grammatical role.
To elaborate on this a bit more: When discussing the history of writing systems, the term "alphabet" can be used in two different senses. In a broader sense the term refers to the set of symbols, typically in a specific order, that are based on representing phonetic elements with signs (in contrast to logographic systems). In a narrower sense, the term refers to the Greek alphabet and its derivatives, to distinguish them from their syllabic or consonant-only predecessors.
It should be noted in this context that Semitic languages can be written with comparatively little ambiguity using a consonant-only set of symbols, as the correct vowels can be inferred with relatively good accuracy from the consonants and the context. For Indo-European languages, such as Greek, this is not the case. However, even for Semitic languages a consonant-only writing system was far from ideal. Thus, various strategies have been developed to reduce ambiguity, such as the use of plene scriptum[1] or punctation[2] in Hebrew.
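To make the ambiguity point concrete for an Indo-European language, here is a rough Python sketch that writes a handful of English words "abjad style" by dropping the vowel letters and then groups the words that collapse to the same consonant skeleton. The word list is my own small sample, and treating the letters a, e, i, o, u as "the vowels" is a simplification of English spelling, not a claim about phonology:

```python
from collections import defaultdict

VOWELS = set("aeiou")  # simplification: vowel letters, not vowel sounds


def consonant_skeleton(word):
    """Drop the vowel letters, keeping the consonants in order."""
    return "".join(ch for ch in word.lower() if ch not in VOWELS)


def collisions(words):
    """Group words by skeleton and keep only the ambiguous groups."""
    groups = defaultdict(list)
    for word in words:
        groups[consonant_skeleton(word)].append(word)
    return {skel: ws for skel, ws in groups.items() if len(ws) > 1}


# Hypothetical sample; any short English words would do.
sample = ["bad", "bed", "bid", "bud", "cat", "cot", "cut", "dog"]
print(collisions(sample))
# {'bd': ['bad', 'bed', 'bid', 'bud'], 'ct': ['cat', 'cot', 'cut']}
```

In a root-and-pattern language the context usually tells you which vowels to fill in; in English "bd" on its own tells you very little.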
The Japanese. They're the first known group to explicitly represent Emoji, unlike the older Latin derived systems which could only represent emotion through character combinations and thus were lame rather than complete.
The Rosetta Stone is from c. 200 BCE, issued by the Ptolemies, who ruled Egypt after Alexander had conquered it. The first examples of writing in an ancient ancestor of our alphabet, in West Semitic, are from c. 1500 BCE, when its users were taking Egyptian hieroglyphs as a model. People had been writing in descendants of that alphabet for more than a thousand years by the time the Rosetta Stone was carved: the Greek script on it derived from Phoenician, which evolved from West Semitic, while the hieroglyphs on the same stone were the model for West Semitic writing a thousand years before.
OP’s author is a prof. of religious studies and has a book on Hebrews, so possibly his point of view requires the preeminence of the Hebrew alphabet.
I also found some of the reasoning questionable. The reason Latin teaching was the job of Greek slaves was precisely because they were Greeks and the Roman nouveau riche were adorning the education of their children. Who teaches the children of the elite today? Millionaires or smart poor people? The second questionable idea of his is that “sex” and stuff like that are not of interest to the “elite”. This confused thinking, disregarding content for medium, was also a rather weak argument. Maybe the elite were using writing as a private, very exclusive chat app and sending textual selfies.
Hebrew must be first if you are a religious person who believes in God speaking Hebrew letters and creating the world. It just doesn’t work if it turns out the Egyptians created the alphabet.
Well, a whole load of them did come from ancient Egypt and Babylon.
Given the Moses of Exodus came from Egypt, it’s highly unlikely he created a new script when wandering the desert. I’d imagine he had more critical issues.
Also, he was brought up as an Egyptian royal and would have been taught how to write.
It is plausible that the folks from Babylon brought their script too.
oh definitely - that must explain why I saw a guy with a white shawl and a black box filled with written prayer, tied to his forehead today.. counting grains, no doubt!
to be very clear - the ways of sacred writing are very old, and not the same as counting grains.
If you're trying to make a point about sacred writings being the first texts, you may want to consider Linear B and cuneiform, which give us some of the oldest texts of the Mediterranean and which are almost exclusively inventory lists. While we have things like the Epic of Gilgamesh preserved in baked tablets, this is the exception to the rule. For the vast majority of these most ancient texts, tabulation was the main use of writing: how many animals were sacrificed, how many sheaves of wheat were in storage, how much fruit a plot of land could produce, etc.
As for sacred writings: Many religions were hesitant to commit their wisdom to writing - one reason why so much of Greco-Roman religion is unknown to us. The Oral Torah was supposedly passed on for centuries until the destruction of the temple and fragmentation of the Jews necessitated the writing down of this knowledge. Heck, Homeric poetry (the hymns as well as the epics) was not written down until centuries of oral development had gone on; not because writing had not been invented, but because it was not used for literary material.
Something important to remember is that the transient documents in the Ancient Mediterranean and Mesopotamian days were scratched into clay (a medium which allows it to be easily erased and adjusted if necessary, and is plentifully available in quantity). One consequence is that if you have these things in a storage building that catches on fire, the clay is baked into pottery and essentially permanently preserved for archaeologists to uncover. Texts written on organic parchment or papyrus are far less durable, as they tend to decompose unless properly stored.
This means we probably have an exaggerated abundance of economic documents due to survivorship bias of the things they wrote economic data on.
Edit: we've had to warn you about this specifically once before, as well as several other past warnings about breaking the site guidelines. Would you please review https://news.ycombinator.com/newsguidelines.html and stick to the rules from now on? We have to ban accounts that won't, and I don't want to ban you.
what ? this is not flamewar? I am being completely misunderstood here.. I am not guilty .. dang - honestly, I meant to be fully supportive of prayer and I am deeply wronged in this sequence.. I regularly support religious topics if you read my writing
note: I will re-read the guidelines in an abundance of caution, but I repeat.. I am being misunderstood deeply .. this is not at all meant as some kind of problem thing to say
Unfortunately, the comment is still a flamewar starter even if you didn't intend it that way, because it didn't make your intent clear enough. I wasn't the only person who took it the wrong way: https://news.ycombinator.com/item?id=37709050. If we had left it in its original position, there would likely have been others.
The burden is on the commenter to disambiguate intent in such cases (I was just writing about this elsewhere - perhaps it will help explain: https://news.ycombinator.com/item?id=37709303). And as the site guidelines say, "Comments should get more thoughtful and substantive, not less, as a topic gets more divisive."
ok - I deeply apologize and since we are detached, I will also add that I study the Bible myself, and my first wife was in fact Jewish. Please, really sorry to be in this awkward moment
That seems completely impossible, regardless of the ages of any surviving inscriptions.
The reason is that the initial North-West Semitic alphabet had 27 consonants, whose order is known from the Ugaritic alphabet derived from it.
The Phoenicians merged 5 pairs of consonants (KHA with HOTA, SHIN with THANNA, DHAL with ZETA, ZU with SADE and AIN with GHAIN) and kept only one letter from each pair, the result being a simplified alphabet with only 22 consonants.
There is no doubt that all the later North-West Semitic alphabets were derived from the Phoenician alphabet and not from any earlier Semitic alphabet, because all of them started with only the restricted set of 22 letters, even when their languages had more than 22 consonants, so the Phoenician letters were too few to write all the sounds of those languages.
Because of this mismatch between the Phoenician alphabet and the sound inventory of the languages, Hebrew, Aramaic and Arabic were initially forced to use a single letter for multiple sounds. This was corrected later by inventing various diacritic signs to distinguish the multiple readings of a letter, as in the Hebrew SHIN and SIN (which are distinguished by adding a dot to the letter, in different positions).
If the Hebrew alphabet had been older than the Phoenician, it would have included more than 22 letters, e.g. by having distinct letters for SHIN and SIN (whose pronunciations were different from the modern pronunciations, which have merged SIN with SAMEKH).
The article discussed here shows examples from older versions of the Semitic alphabet, many hundreds of years before the appearance of the Ugaritic, Phoenician or Hebrew alphabets.
The reconstructed Proto-Semitic language had 29 consonants, so it is likely that the oldest Semitic alphabet also had 29 letters.
However, this cannot be known for sure, because the very few preserved inscriptions do not contain all the signs of the alphabet. Ugaritic proves that there were at least 27 letters.
At some point in time, the Semitic alphabet split into two variants, a North-West variant and a South-West variant, the latter being used for writing various South-Arabic languages.
While the Northern and the Southern variants diverged in their graphic forms, the most significant difference is that they have completely different orders of the letters in the alphabet. The reason for the two orders is unknown. Perhaps they used some mnemonic technique, like reciting a poem for remembering all the letters, and the North and the South chose different poems.
The North-West Semitic alphabet is the one having the order alpha-beta-gamma ..., which was inherited by many later Semitic alphabets and by the Greek, Latin and Cyrillic alphabets, in all their many variants, including the English alphabet.
The oldest Semitic alphabet for which all the letters are known, together with their alphabetic order, is the Ugaritic alphabet. In Ugaritic, two pairs of Proto-Semitic consonants have merged, so it has only 27 consonants of the original 29. Moreover, Ugaritic does not provide any information about the graphic forms of the older Semitic alphabets, because in it all the letter glyphs have been replaced with forms that can be written on cuneiform tablets.
Even so, the Ugaritic alphabet remains the most complete source of information about the Semitic alphabets that preceded the Phoenician alphabet.
You can see the 27 letters of the Ugaritic alphabet in Unicode, from "U+10380;UGARITIC LETTER ALPA" to "U+1039A;UGARITIC LETTER TO" (besides these 27 letters inherited from the older North-West Semitic alphabet, Ugaritic created 3 additional special-purpose letters, appended at the end of the alphabet).
All this information can be found in the literature about the older Semitic languages from the second millennium BC, including Ugaritic, and about Proto-Semitic and comparative Afro-Asiatic linguistics.
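If you want to check the letter counts yourself, here is a small Python sketch using the standard `unicodedata` module. It lists the letters in the Unicode range quoted above and then applies the five Phoenician mergers described earlier in the thread. Two things are my own assumptions: the three extra letters are taken to be the code points directly after U+1039A (as in the Unicode chart), and dropping the first-named letter of each merged pair is arbitrary, since the thread does not say which member of each pair survived. Only the counts are meant to be meaningful.

```python
import unicodedata

FIRST, LAST = 0x10380, 0x1039A  # ALPA .. TO, the range quoted above


def short_name(code_point):
    """'UGARITIC LETTER ALPA' -> 'ALPA'."""
    return unicodedata.name(chr(code_point)).split()[-1]


# The 27 letters inherited from the older North-West Semitic alphabet.
inherited = [short_name(cp) for cp in range(FIRST, LAST + 1)]

# Assumption: the 3 appended letters are the next three code points.
extra = [short_name(cp) for cp in range(LAST + 1, LAST + 4)]

# The five mergers listed in the thread, by Unicode letter name.
MERGED_PAIRS = [
    ("KHA", "HOTA"),
    ("SHIN", "THANNA"),
    ("DHAL", "ZETA"),
    ("ZU", "SADE"),
    ("AIN", "GHAIN"),
]

# Arbitrary choice for counting only: drop the first name of each pair.
dropped = {first for first, _ in MERGED_PAIRS}
reduced = [name for name in inherited if name not in dropped]

print(" ".join(inherited))
print(len(inherited), len(extra), len(reduced))  # 27 3 22
```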
There is abundant data demonstrating that Hebrew, Aramaic and Arabic had more than 22 consonants at the time they adopted the Phoenician alphabet, whose 22 consonants were inadequate for them. Arabic has retained 28 consonants until today, so, like Hebrew, it has multiplied the original 22 letters by combining them with diacritic signs.
If any of these languages had adopted the older alphabet that was the source of the Ugaritic alphabet, instead of adopting the simplified Phoenician alphabet, they would have had distinct letters for their consonants from the beginning, with no need to invent new diacritic signs later.
Hebrew SIN was a lateral fricative, which is a sound that did not exist in Phoenician. When the Hebrews adopted the Phoenician alphabet, they did not have any letter for writing SIN, so they were forced to write it with the letter SHIN, which was somewhat close in pronunciation. At that time SAMEKH was pronounced in a different way, so it would have been a worse choice.
If the Hebrews had invented an alphabet of their own, or if they had adopted another Semitic alphabet variant, and not the Phoenician alphabet, they would not have needed to use a single letter for multiple sounds. This was clearly not a satisfactory solution, because later they invented the SHIN and SIN dots, to disambiguate the letter with multiple readings.