On this page, I describe how I transcribe Arabic text in Latin letters (ASCII).
This work is an improvement. Others existed; I modified theirs to fit what I need, or want. It has a variety of uses. e.g: I am thinking of a little applet that displays Arabic text while the text is entered with a Latin keyboard.
It is for convenience. It helps both those who are learning Arabic, and those who own computers (and/or gadgets) without a convenient method for typing and displaying Arabic letters.
For a student, a possible path for development is:
The student reads and writes with the alphabet that he/she already knows. No new alphabets (or fonts) are necessary for reading and writing at the computer screen and keyboard.
This lets the student compare and contrast the needs of an Arabic text with the requirements of English, French, etc. e.g:
Otherwise, it takes more knowledge - both about the possible words with those consonants, and about the context where that sentence is spoken. In fact, there are seven such qiraats (q-ruh-ut) for reading the Quran, as revealed to our prophet Muhammed (s.a.s), and reported in a hadith. Each of these seven qiraats is memorized by the hafiz (huh-fiz), the Quran-memorizers.
After having mastered the previous step, try to get acquainted with the Arabic letter shapes. This is very easy with the faithful mapping of the letters that I provide. You can easily compare the Arabic text of the Quran with the same text transcribed in Latin with this method. The structure/pattern is exactly the same. This should help in a transfer of knowledge.
This is a relief, both for font shortages and for keyboard anomalies. Any keyboard will do. If Latin text is possible, Arabic text is possible, too. e.g: Within the quotes of an HTML page, where the HTML itself is Latin text, the quote is able to contain Arabic text with this method.
I plan to publish, inshaellah, a little typing-helper on this page later, for displaying the typed text as Arabic text, too. i.e: The typist types with these rules, while the text is displayed as ordinary Arabic. No difference.
If you would like to learn all the seven qiraats, then, you may start with the phonetics here, then switch to the corresponding qiraat (the ASIM qiraat) in Arabic texts. And then, you may go on to learn the other six.
Yes, or no. This is the most comprehensive system that I know of. Others had a more limited range, even when not altogether neglectful.
There are a lot of problems with full neglect/ignorance. The least is that it is a many-to-one mapping. e.g: Three different letters may get written as the same letter "h" in such a neglectful text. Four different letters are mapped to the single letter "z" in Arabic-to-Turkish conversions. That loses a lot of information. You end up confusing "the Creator" with "the barber", when the "h" letters are, indeed, different.
There are little patches that place an extra dot, or two, or a sign, above or below the regular Latin letters. This needs special fonts, then, if you would like to employ it with your own texts.
In a Teach-yourself-Arabic text, I noticed that the Latin letters are complemented with a few Greek letters, e.g. theta, to increase the range of letters, presumably to let the Arabic varieties fit in without mapping to arbitrary letters. Thereby, the similar letters in Latin and Greek are employed to convey the shades of the "t" etc.
(I came across this text in late June 2004, after I had coded mine, although it is the oldest among these. I refer to it, for your information, as an alternative out there.)
A tejwid (pronunciation) book has the only rule I liked, and I improved that idea, with other wish-list items of mine, for a full mapping. The idea in that tejwid book was to employ capital-case vs. small-case letters to stand for different letters in Arabic. I think this is fine for most cases. It conveys both the similarity and the weight. e.g: The capital T is a heavier sound than the small-case t.
For the purposes of that tejwid textbook, a unidirectional-translation may suffice, but I would like to further it. The result is the set of rules on this page.
At this site, I encode Arabic text with the Latin alphabet. Mostly, this substitutes Arabic letters with Latin equivalents.
Therefore, if you already know some text in Arabic, especially with the specific qiraat, that is employed (for the vowels and phonetics), you should easily recognize what corresponds to what.
Here is the 112th surah in Quran. The rules, and the replacements-table, follow after it.
bismi/\ll=a'hi/\RR=aHma'ni/\RR=aHi'Ym
Qul* huwe/\ll=a'hu eHad[un=+j]
/e\ll=a'hu/\SSamed[u+j]
lem* yelid*+L
we lem* yuWled*+L
we lem* yekun* lehu kufuwen= eHad[un=]
Equivalently, if only for vocalizing purposes, the display may show this (i.e: if not with Arabic letters, that is):
bismilla'hiRRaHma'niRRaHi'Ym
Qul huwella'hu eHad
ella'huSSamed
lem yelid
we lem yuWled
we lem yekun lehu kufuwen eHad
But this latter style loses possibly important information, even for sound only. e.g: It does not suggest where you are allowed to stop, and where you are not, when reading it. It is not a good idea to stop at points marked with a "+L", for example. The shedde-couples should not be pronounced with any pause between them.
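The six annotated lines of the 112th surah can be reduced mechanically to the six vocalizing lines. Here is a minimal sketch in Python. It assumes the reduction is only this: drop the bracketed groups, drop the plus-sign marks, and drop the signs / \ * =. The name vocalize_only is hypothetical, and the three rules are inferred from this one surah; other texts may need more.

```python
import re

def vocalize_only(line):
    """Reduce an annotated line to the sound-only form (rules inferred from the sample)."""
    line = re.sub(r"\[[^\]]*\]", "", line)  # bracketed groups such as [un=+j]
    line = re.sub(r"\+[A-Za-z]", "", line)  # reading-aid marks such as +L
    line = re.sub(r"[/\\*=]", "", line)     # the signs / \ * =
    return line

# The annotated 112th surah, copied from the listing above.
SURAH_112 = [
    r"bismi/\ll=a'hi/\RR=aHma'ni/\RR=aHi'Ym",
    r"Qul* huwe/\ll=a'hu eHad[un=+j]",
    r"/e\ll=a'hu/\SSamed[u+j]",
    r"lem* yelid*+L",
    r"we lem* yuWled*+L",
    r"we lem* yekun* lehu kufuwen= eHad[un=]",
]

for annotated in SURAH_112:
    print(vocalize_only(annotated))
```

The reverse direction is not possible, which is the point made above: the stop marks and the shedde/tenwin signs cannot be recovered from the sound-only lines.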
The capital letters W and Y are silent. They only contribute to vowel-expansion, i.e: more time with the sound of the vowel before them.
This is interpretation/qiraat dependent. In another context, such a W or Y may receive vowels for themselves (after themselves), too. In such a case, they would, for example, get written as "wa" or "yi". i.e: An associated vowel converts these to small case, and lets them have their own sound.
A silent elif is similar. It is noticed as a capital-case vowel after a consonant. e.g: Qa is different from QA. The latter has an elif after Qa, and doubles the time.
The vowel-expansion is achievable without such silent consonants, too. A single-quote (an apostrophe) after a vowel doubles its time. e.g: Qa' takes double the time of Qa.
Such a suffix of a single-quote, is similar to the short vertical-bar, in Arabic, both in shape, and function.
Similarly, a tilde "~" after a vowel, lengthens its sound two to four times.
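The three lengthening rules (silent elif, single-quote, tilde) can be summarized as a small lookup. This is only a sketch in Python. It assumes the syllable ends with the vowel or with its lengthening sign, it treats only a capital A as the silent elif, and it ignores the silent W and Y because no exact length is given for them. The name length_range is hypothetical.

```python
def length_range(syllable):
    """Return (shortest, longest) vowel time, in units of one plain vowel."""
    if syllable.endswith("~"):
        return (2, 4)   # tilde: two to four times
    if syllable.endswith("'"):
        return (2, 2)   # single-quote: double time
    if syllable.endswith("A"):
        return (2, 2)   # assumption: capital A after a consonant is the silent elif
    return (1, 1)       # plain short vowel

for s in ["Qa", "Qa'", "QA", "Qa~"]:
    print(s, length_range(s))
```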
A jezm can be shown with an asterisk "*", although it is mostly not necessary, if ever. If there is a space after a consonant in a Latin text, you stop anyway.
A shedde-diacritic, in Arabic, doubles a consonant. I write such a doubled consonant with two consonants, followed by an equals-sign "=". e.g: "/inn=e"
An "el" followed by a word that starts with a "shemsee" letter is similar to a shedde in pronunciation. I keep the unpronounced "l" letter as "\", the backslash. e.g: "/e\$$=emsi" That is, to find out whether it is a shedde with a shemsee letter, check whether the double consonant (a single consonant, pronounced double) comes after a backslash.
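The backslash check can be sketched with a regular expression in Python. The function name shemsee_letters is hypothetical. The equals-sign is treated as optional here, because the surah sample writes "\SS" without one while the other samples write "\RR=" and "\$$=" with it.

```python
import re

# A backslash, then the same character twice, then an optional equals-sign.
AFTER_BACKSLASH = re.compile(r"\\(.)\1=?")

def shemsee_letters(text):
    """List the consonants that are written double right after a backslash."""
    return [m.group(1) for m in AFTER_BACKSLASH.finditer(text)]

print(shemsee_letters(r"/e\$$=emsi"))               # doubled after a backslash
print(shemsee_letters("/inn=e"))                    # plain shedde, no backslash
print(shemsee_letters(r"/e\ll=a'hu/\SSamed[u+j]"))  # a line from the surah
```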
The tenwin (double-hareke) sounds are written with the "n" sound followed by an equals-sign and, preferably, a space after that. i.e: "n= " For example, "rajulun= kebiYrun= ". When you only pronounce it, you may neglect those equals-signs, but when you count letters, those signs tip you to discard those "n= " entries, because they do not really contain any consonant "n". It is only a tenwin sound.
Similarly, the extra consonant, introduced with a shedde, is noticed with an equals-sign after it. e.g: "jenn=Atu" is written with two "n" letters, although there is a single consonant "n" in it. For consonant-counting, it is fine to neglect whether the equals-sign exists because of a tenwin, or a shedde. In any case, it is easy to tell the difference, too. Before the tenwin's "n" there is a vowel, whereas the shedde has the same consonant doubled. Therefore, these two are never confused.
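The two uses of the equals-sign can be told apart, and discounted, with a few lines of Python. This is a sketch under assumptions: the short vowels are taken to be a, e, i, u (the ones seen in the samples on this page), and the names classify_equals and drop_uncounted are hypothetical.

```python
import re

VOWELS = "aeiu"  # assumption: short vowels as they appear in the samples here

def classify_equals(text):
    """Label every equals-sign as a shedde or a tenwin, in order of appearance."""
    kinds = []
    for m in re.finditer(r"(.)(.)=", text):
        before, marked = m.group(1), m.group(2)
        if before == marked:
            kinds.append("shedde")    # same consonant written twice
        elif marked == "n" and before in VOWELS:
            kinds.append("tenwin")    # vowel, then the tenwin's "n"
        else:
            kinds.append("unknown")
    return kinds

def drop_uncounted(text):
    """Remove the letter before each equals-sign, and the sign, for letter counting."""
    return re.sub(r".=", "", text)

print(classify_equals("rajulun= kebiYrun= "))
print(classify_equals("jenn=Atu"))
print(drop_uncounted("jenn=Atu"))
```

For counting, drop_uncounted does not need to know which kind of sign it meets; the classification is only there to show that the two are never confused.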
The informative subscripts/superscripts are the reading-aid extras that you may find in a Quran text. Here, they are prefixed with a plus-sign. For example, +T stands for the "Ta" superscript letter at the end of those Arabic sentences where a stop is required or possible, and where the vowel before it must be neglected when/if you stop there.
When stopping is not required, I write +T within square brackets, together with the vowel that would get neglected, if stopped there. e.g: [e+T], or [u+T].
When +T is a required pause, I write +T without any brackets around it, e.g: [e]+T, or [u]+T. The vowel, in this case, is only informative, because it is never pronounceable with a +T that requires waiting/stopping. I inform about the vowel, because it is helpful when reading for learning/understanding the Arabic text. (No need, though, if only vocalizing the text, and/or if the reader is able to infer the mansub vs. merfu vs. mejrur without the vowel-tip, too.)
Another important example: +L stands for the lam-elif ligature, as a superscript. It is found at phrase-ends where the reading must continue. It is a not-pausable point. As such, +L is necessarily enforced. i.e: Even if the reader thought of stopping there, he/she continues after seeing the +L.
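The bracket convention for the plus-sign marks can be read off by a simple scan. This is a sketch in Python, assuming that brackets are never nested and that a mark is a plus-sign followed by one letter. The name find_marks and the labels "optional" and "enforced" are hypothetical.

```python
def find_marks(text):
    """Return (mark letter, 'optional' or 'enforced') for each plus-sign mark."""
    marks = []
    inside = False  # True while scanning between "[" and "]"
    for i, ch in enumerate(text):
        if ch == "[":
            inside = True
        elif ch == "]":
            inside = False
        elif ch == "+" and i + 1 < len(text) and text[i + 1].isalpha():
            marks.append((text[i + 1], "optional" if inside else "enforced"))
    return marks

print(find_marks("[e+T]"))          # +T inside brackets: stopping is not required
print(find_marks("[e]+T"))          # +T outside brackets: a required pause
print(find_marks("lem* yelid*+L"))  # +L: the reading must continue
```

An enforced +T means stop, while an enforced +L means do not stop; the scan reports only whether the mark is enforced, and the letter tells which of the two it is.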
I introduce three new coupling-concepts, for fitting Arabic letters, in Latin form.
The first is the lisping letters: the lisping se, and the lisping zel. These are tip-of-the-tongue sounds, and are represented with a c, whose shape suggests the tongue/teeth. The small c represents se; the capital C represents zel, the later of the two in the alphabet.
The O is not a vowel in Arabic. Therefore, I employ it for two new concepts. When it is capital, as in leOall=e, it represents the letter Uyn in Arabic. The Uyn, in fact, is a consonant, with a larynx'izing of elif'ful (or, hemze) sounds. Therefore, it is almost kin with vowels. In fact, when other texts ignore its sound, they simply show the vowel by itself. It is a heavy sound. Therefore, the capital O is the perfect choice, I think.
When o is small case, it is the letter h, as with its Arabic shape-alike, and if it is prefixed with a colon, it is t. This latter letter, tamarbuta, is representable with an umlaut-o, ö, if the font has it. This is its Arabic shape.
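These five special letters can be put in a small table and substituted in a text. This is a sketch in Python, and the Arabic code points are an assumption: they are one reading of the descriptions above (lisping se, zel, Uyn, the h that looks like an o, and tamarbuta), not taken from the replacements-table. The names SPECIAL_LETTERS and map_special are hypothetical.

```python
# Assumed Unicode code points for the letters described above.
SPECIAL_LETTERS = {
    ":o": "\u0629",  # tamarbuta (colon + small o)
    "c": "\u062B",   # lisping se
    "C": "\u0630",   # lisping zel
    "O": "\u0639",   # Uyn
    "o": "\u0647",   # h, shaped like an o
}

def map_special(text):
    """Replace only the special letters; every other character passes through."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith(":o", i):       # check the two-character code first
            out.append(SPECIAL_LETTERS[":o"])
            i += 2
        elif text[i] in SPECIAL_LETTERS:
            out.append(SPECIAL_LETTERS[text[i]])
            i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(map_special("leOall=e"))
```

The colon form is checked before the plain o, so that ":o" is never read as a colon followed by the letter h.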
The example with two alternative writings of the 112th surah is important. The latter presentation will let you vocalize (pronounce) the Quran through Latin/ASCII coding. This corresponds to listening to someone vocalizing the Quran on the radio, or a CD, etc.
When presented with a text, encoded that way, when studying it further, if there is a need for more, non-phonetical, information, you may consult the Arabic text of Quran. For a vocal treatment, with a given qiraat, such a presentation should suffice. (I use the ASIM qiraat, in the examples.)
In Arabic, the vowels and the transcription signs are optional, and even in reading the Quran there are revealed varieties. As such, the more I may encode it phonetically, the more it would represent a single qiraat, to the exclusion of the others. I hope to remain informative about it, though. I even include the no-sound-influence characters.
I continue studying these. The full-success, for this study, is the point where the Latin-encoding contains the full information, and it is readable, too, by a human, without major distraction. This, even when achieved, needs further work, to verify. (Human-performance research, interface improvements.) i.e: Perfection is w.r.t. its requirements, and needs testing.
In any case, though, even if this is not perfect, or not announceable as perfect yet, it is well developed, and good enough for the purposes of this site.
Internet netiquette suggests not writing texts in all-capital letters, because it is suggestive of shouting. But this is irrelevant in our case, where a text may appear in all-caps, or mostly capital letters, only because the Arabic letters in it are that way. It is not about shouting, or not shouting. Employ an exclamation-mark if you need shouting, anyway.