Chapter 1

The speech-sounds of Sanskrit were called varṇāḥ (“colors”) or akṣarāḥ (“indestructible”). These words referred both to the distinct speech-sounds of Sanskrit (that is, roughly, its phonemes) and to its syllables. That is because, in the scripts in which Sanskrit has historically been written, each letter corresponds to a syllable.

Sanskrit is written in a wide variety of scripts. Nowadays it is generally written in the Devanagari script, although historically each region of South and Southeast Asia had a different script for writing Sanskrit. All of these regionally-distinct scripts derive from the ancient Brāhmī script, which was used in inscriptions — although not for Sanskrit — from around the fourth century bce. The Brāhmī script, and its descendants, were specifically designed to capture the distinctions in the speech-sounds of Sanskrit and related languages. For that reason, Sanskrit has always been written exactly as it is pronounced.

The circumstances in which writing arose and spread in South Asia are still not fully understood. It used to be thought that the edicts of Aśōka (third century bce), which were written in a Middle Indic language related to Sanskrit, were the earliest examples of the Brāhmī and Kharōṣṭhī scripts. The Kharōṣṭhī script, which was used in the northwest of the subcontinent, was based on the Aramaic script, which the Achaemenids had introduced in those regions in previous centuries. Evidence is accumulating, however, that Brāhmī was used before Aśōka, and surprisingly, in the far south of the subcontinent (Tamil Nadu and Sri Lanka).

Sanskrit is also sometimes written in a transliteration of these Indic scripts into Roman letters. The principle behind these transliterations is to represent the same speech-sounds that are represented in Indic scripts. But because Sanskrit makes distinctions that European languages generally do not make, such as aspiration and retroflexion, the Roman letters have to be supplemented with diacritics. There are two prevalent systems of transliterating Indic scripts: the ISO 15919 system, which I prefer because of its compatibility with other South Asian languages, and the IAST (International Alphabet of Sanskrit Transliteration), which is specifically designed for Sanskrit. Sometimes Sanskrit is written “informally” in Roman letters, without diacritics, as in “Yudhishthira,” “Rama,” “Lakshmana,” and so on.

§1.1.Phonemes

Phonemes are the fundamental sounds out of which words in a language are constructed. They are discrete and contrastive units of speech. They are discrete in the sense that, within a given language, there are stable criteria that distinguish each phoneme from each of the others and therefore divide up the continuum of speech-sounds into a specified number of phonemes. They are contrastive in the sense that, within a given language, replacing one phoneme in a word with another at the same position will result in an altogether different word. In English, for example, we know that /b/ and /k/ are phonemes because “bar” /bar/ and “car” /kar/ form a “minimal pair.” By contrast, we can guess that [kʰ] is not an English phoneme because [kʰar] and [kar] do not contrast with each other.
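The formal side of the minimal-pair test can be sketched in a few lines of Python. This is only an illustration: the function name `is_minimal_pair` is hypothetical, words are assumed to be given as lists of segments, and the check cannot tell whether the two forms are in fact different words of the language, which a linguist must still establish.

```python
def is_minimal_pair(word_a, word_b):
    """Return True if two segment sequences differ in exactly one position."""
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for a, b in zip(word_a, word_b) if a != b)
    return differences == 1


# The transcriptions used in the text, split into segments.
bar = ["b", "a", "r"]
car = ["k", "a", "r"]
print(is_minimal_pair(bar, car))  # True
print(is_minimal_pair(bar, bar))  # False
```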

Phonemes are sometimes called segments to call attention to their linear sequence.

Sanskrit’s phonemes, like those of most other languages, are distinguished into vowels and consonants.

Sanskrit has no unambiguous word for “phoneme.” The words akṣaram “indestructible” and varṇaḥ “color” are often used with reference to phonemes, but often, also, with reference to syllables, whether spoken or written.

Linguists and philologists often distinguish between at least three kinds of representation. The phonemic representation of a word, contained between slashes, represents the phonemes that the word comprises, for example /kæt/. The phonetic representation of a word, contained between square brackets, represents the way it is pronounced, for example [kʰæt]. The latter is especially useful when orthography is not a reliable guide to pronunciation, as is usually the case in English. Both of these types of representation use the International Phonetic Alphabet. The graphemic representation of a word, contained between angle brackets, represents the way it is written, for example ⟨cat⟩. Sanskritists, however, rarely distinguish between these three kinds of representation, because Sanskrit is already written in a fashion that closely approximates its pronunciation. The choice, rather, is between representing Sanskrit in an Indian script, such as Devanagari, or transliterating it into Roman letters.

§1.2.Features

Features are what distinguish phonemes from each other. Linguists in ancient India discussed a number of distinctive features, and the phoneme inventory of Sanskrit is typically organized in terms of distinctive features (see alphabetical order and the Śivasūtras below). These include:

Sonority: the degree of openness of the stream of air exhaled from the lungs. Linguists recognize a hierarchy of sonority from highest (completely open) to lowest (completely closed). It groups the speech-sounds of any language into the following categories, which also represent distinctive manners of articulation:

vowels svarāḥ (air flows out continuously, and the sound is made by the shape of the tongue in the mouth);
approximants antaḥsthāḥ (air flows out continuously, but the tongue nearly comes into contact with part of the mouth);
nasals (air flows out continuously through the nose, the oral cavity being blocked by the tongue or lips);
fricatives ūṣmāṇaḥ (air flows out through a small aperture formed by the tongue within the mouth); and
stops sparśāḥ (the tongue or lips completely blocks the flow of air).

Length: the relative duration with which a phoneme is pronounced. A phoneme is either short hrasváḥ or long dīrgháḥ. In Sanskrit this distinction applies, for most purposes, only to vowels.

Voicing ghṓṣaḥ: If the vocal cords vibrate when the phoneme is pronounced, it is voiced ghōṣavān; otherwise it is unvoiced aghōṣaḥ. All vowels are voiced. Voicing is thus contrastive only for consonants.

Aspiration prāṇáḥ: This feature is only present in consonants. If a burst of air is released at the same time that the consonant is pronounced, then it is aspirated mahāprāṇaḥ; otherwise it is unaspirated alpaprāṇaḥ.

Place of articulation sthānam: The place in the vocal apparatus where the phoneme is pronounced. Ancient Indian linguists recognized the following places:

velum kaṇṭháḥ: the back of the throat, near the soft palate.
palate tā́lu: the hard palate, at the top of the mouth.
alveolar ridge mūrdhā́: where the roots of the front teeth begin to descend from the hard palate (in fact this is slightly behind the alveolar ridge).
teeth dántāḥ: behind the top front teeth.
lips ṓṣṭhau: the lips.

Pitch sváraḥ: whether the phoneme is pronounced with a certain pitch. This only applies to vowels. Generally the options are high pitch udā́ttaḥ and non-high pitch ánudāttaḥ; see the discussion of accent below. In this textbook, a high pitch will generally be marked with an acute accent, but only in the transliterated version of the text.

Sanskrit has the following vowel sounds (note that the English equivalents are only loose approximations: please listen to examples of these sounds and try to reproduce them yourself):

Letter	IPA	English
a	ɐ	but
ā	aː	mom
i	i	beat
ī	iː	bean
u	u	boot
ū	uː	boon
r̥	ɹ̩	teacher (American); see below
r̥̄	ɹ̩ː	—
l̥	l̩	little
l̥̄	l̩ː	—
ē	eː	may
ai	aɪ̯	eye
ō	oː	go
au	au̯	vowel

The sound r̥/r̥̄ is pronounced in different ways in different regions. In Central, North, and East India (Rajasthan, Punjab, Haryana, Madhya Pradesh, and points east), it is generally pronounced as ri, while in West and South India (Gujarat, Maharashtra, Telangana, Andhra Pradesh, and points south) it is pronounced as ru. English speakers do not generally think of r as a vowel, but try extending the final syllable of the word teacher (in the General American pronunciation). In the ancient phonetics literature, the vowel is described as a combination of the neutral vowel ə and the consonant r, followed again by the neutral vowel ə in quick succession. You should model your pronunciation of this vowel on that of a good Sanskrit speaker.

The sound l̥/l̥̄ is very marginal in Sanskrit, effectively occurring in only one verbal root (kl̥p “be fitting”). The same group of speakers who pronounce r̥ as ri generally pronounce l̥ as lri (yes, it is difficult), and the same group of speakers who pronounce r̥ as ru generally pronounce l̥ as lru.

§2.1.Vowel gradation

There are many contexts in which vowels alternate with each other. Consider the following three words:

víś- “settlement”
vēśá- “settler”
vaíśya- “settler”

These words are related in meaning, and also in formation. We can arrange them in the following way, considering the vowels as “gradations” of each other:

word	gradation
víś-	“zero grade”
vēśá-	guṇáḥ, “full grade”
vaíśya-	vŕ̥ddhiḥ, “lengthened grade”

The terms “zero grade,” “full grade,” and “lengthened grade” were invented by scholars of Indo-European to capture the alternation between various forms of the “same” vowel. We will return to these terms later on.

Indian grammarians used the terms guṇáḥ for “full grade” and vŕ̥ddhiḥ for “lengthened grade.” They did not have a term for the first in the series, “zero grade,” because they considered it the simple form from which the other two were derived. In terms of their segmental makeup, the guṇáḥ vowel is identical to the simple vowel, but with a short a preceding it. Similarly the vŕ̥ddhiḥ vowel is identical to the simple vowel, but with a long ā preceding it. Hence we arrive at the following series:

Simple vowel (zero grade)	Guṇáḥ (full grade)	Vŕ̥ddhiḥ (lengthened grade)
a	a	ā
i	[a + i] = ē	[ā + i] = ai
ī	[a + ī] = ē	[ā + ī] = ai
u	[a + u] = ō	[ā + u] = au
ū	[a + ū] = ō	[ā + ū] = au
r̥	[a + r̥] = ar	[ā + r̥] = ār
r̥̄	[a + r̥̄] = ar	[ā + r̥̄] = ār
l̥	[a + l̥] = al	[ā + l̥] = āl
l̥̄	[a + l̥̄] = al	[ā + l̥̄] = āl

Do not worry too much about the fact that the guṇáḥ of the vowel a is a. This is an artefact of the way the vowel gradation system has been set up by the Indian grammarians. We will return to this topic from a historical perspective later on.
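The gradation series can also be read as a simple lookup table. The following Python sketch only transcribes the table above; the names `GRADES`, `guna`, `vrddhi`, and `nfc` are hypothetical helpers introduced for illustration, and the Unicode normalization step is an assumption made so that differently encoded diacritics still match.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# simple vowel -> (guna grade, vrddhi grade), copied from the table above
_RAW = {
    "a": ("a", "ā"),
    "i": ("ē", "ai"), "ī": ("ē", "ai"),
    "u": ("ō", "au"), "ū": ("ō", "au"),
    "r̥": ("ar", "ār"), "r̥̄": ("ar", "ār"),
    "l̥": ("al", "āl"), "l̥̄": ("al", "āl"),
}
GRADES = {nfc(k): (nfc(g), nfc(v)) for k, (g, v) in _RAW.items()}


def guna(vowel):
    """Full grade of a simple vowel."""
    return GRADES[nfc(vowel)][0]


def vrddhi(vowel):
    """Lengthened grade of a simple vowel."""
    return GRADES[nfc(vowel)][1]
```

With these definitions, `guna("i")` gives ē and `vrddhi("i")` gives ai, which are the vowels of vēśá- and vaíśya- in the example above.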

§2.2.Vowel length

Sanskrit vowels are either long dīrgháḥ or short hrasváḥ. Five vowels have both long and short variants:

Short hrasváḥ	Long dīrgháḥ
a	ā
i	ī
u	ū
r̥	r̥̄
l̥	l̥̄

Generally speaking, the long version is pronounced the same as the short version, except that it lasts twice as long. There is one important exception: a is pronounced as more “closed” sáṁvr̥taḥ than ā, hence it is pronounced as ɐ, while its long version is pronounced as aː.

The following vowels are long, and have no corresponding short vowels, because they are analyzable into two vowel segments, as noted above.

Vowel	Segments
ē	a + i/ī
ai	ā + i/ī
ō	a + u/ū
au	ā + u/ū

There is a third category of length, used only in very specific contexts. This is traditionally called prolation plutíḥ, and the vowels are called prolated plutā́ḥ. They are simply “extra-long” vowels, and they are written with the numeral “3” after them: ā3, ī3, ū3, r̥̄3, l̥̄3, ē3, ō3, ai3, au3.

§2.3.Vowel pitch

As noted above, vowels can either have a high pitch, a non-high pitch, or a falling pitch. See the section on accent below.

Consonants vyàñjanāni are those speech-sounds that cannot form a syllable on their own. They are phonetically distinguished by a relatively more restricted flow of air than the vowels. The consonants of Sanskrit are traditionally divided up based on their manner of articulation, and within those broad categories, based on their place of articulation, and within those categories, based on other features, such as voicing, aspiration, and nasality.

§3.1.Occlusives sparśā́ḥ

These sounds are so called for their occlusive manner of articulation, wherein the flow of air through the oral cavity is completely occluded. (For some of these consonants, called stops, no air at all escapes, whereas for others, called nasals, it escapes through the nasal cavity rather than through the oral cavity.) They are traditionally arranged in a grid:

place of articulation sthā́nam	voiceless ághōṣaḥ		voiced ghṓṣavān
place of articulation sthā́nam	unaspirated álpaprāṇaḥ	aspirated mahāprāṇaḥ	unaspirated álpaprāṇaḥ	aspirated mahāprāṇaḥ	nasal nāsikyaḥ
velar káṇṭhyaḥ	k k	kh kʰ	g g	gh gʱ	ṅ ŋ
palatal tā́lavyaḥ	c t͡ʃ	ch t͡ʃʰ	j d͡ʒ	jh d͡ʒʱ	ñ ɲ
retroflex mū́rdhanyaḥ	ṭ ʈ	ṭh ʈʰ	ḍ ɖ	ḍh ɖʱ	ṇ ɳ
dental dántyaḥ	t t	th tʰ	d d	dh dʱ	n n
labial ṓṣṭhyaḥ	p p	ph pʰ	b b	bh bʱ	m m

Sanskrit thus has five series of occlusives, depending on whether their primary organ is the velum, the palate, the alveolar ridge, the teeth, or the lips. Each series is called a várgaḥ, and they are named in Sanskrit for the first sound in each series, hence kavargaḥ, cavargaḥ, ṭavargaḥ, tavargaḥ, and pavargaḥ, or alternatively for their place of articulation, hence kaṇṭhyavargaḥ, tālavyavargaḥ, mūrdhanyavargaḥ, dantyavargaḥ, and ōṣṭhyavargaḥ respectively.

The technical term that Pāṇini uses for these series in his Aṣṭādhyāyī is the first consonant of each series followed by the vowel u, hence ku, cu, etc.

The first two sounds in each series are voiceless ághōṣāḥ and the last three are voiced ghṓṣavantaḥ. Voicing is distinctive in English, too, so this distinction should be easy to grasp. The first and third in each series are unaspirated alpaprāṇau, and the second and fourth in each series are aspirated mahāprāṇau. Aspiration is not distinctive in English, so if you don’t speak a language with distinctive aspiration, you will have to practice these sounds. English speakers will have a tendency to overdo the aspiration in sounds like kh and gh, but that is preferable to losing the distinction of aspiration altogether.
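Because the grid is regular, the features of an occlusive follow from its row and column. The Python sketch below illustrates this; the function name `features`, the English feature labels, and the dictionary layout are assumptions made for the example, not a standard representation.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


PLACES = ["velar", "palatal", "retroflex", "dental", "labial"]
GRID = [nfc(row).split() for row in (
    "k kh g gh ṅ",
    "c ch j jh ñ",
    "ṭ ṭh ḍ ḍh ṇ",
    "t th d dh n",
    "p ph b bh m",
)]


def features(sound):
    """Derive the features of an occlusive from its position in the grid."""
    sound = nfc(sound)
    for place, row in zip(PLACES, GRID):
        if sound in row:
            column = row.index(sound)
            return {
                "place": place,
                "voiced": column >= 2,         # last three of each series
                "aspirated": column in (1, 3),  # second and fourth
                "nasal": column == 4,           # fifth
            }
    raise ValueError(f"not an occlusive: {sound!r}")
```

For example, `features("gh")` reports a voiced, aspirated, non-nasal velar, and `features("p")` a voiceless, unaspirated labial.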

Aspiration is not phonemically distinctive in English, but it is an important coarticulatory process. Most speakers aspirate voiceless stops at the beginning of a word. You can test this by saying the word “cat” [kʰæt] while holding the palm of your hand, or an index card, up to your mouth. You should feel a puff of air. You probably will not feel the same puff of air if you pronounce a word beginning with a voiced stop, like “get,” or a sibilant, like “skate.” You can, however, consciously avoid aspirating initial voiceless stops like the one in “cat,” so that it is pronounced as [kæt]. Hence the first sound in each series, the voiceless unaspirated stop, takes some training for English speakers to pronounce: Sanskrit kh has slightly more aspiration than English k when the latter comes at the beginning of a word (and is pronounced [kʰ]), and Sanskrit k has slightly less. [This advice does not hold for retroflex sounds, because the t in words like “stop,” with an initial s, is dentalized.]

You will sometimes read that the sound th (for example) is pronounced as in English “hot-house,” but this advice is misleading, because I, and many other English speakers, very often don’t release the final stop consonant of a syllable, and hence my pronunciation of this phrase does not contain the sound tʰ.

Voiced stops are never aspirated in English, so the fourth member of each series will require practice to recognize and produce. These sounds are sometimes said to have “breathy voice.”

The velar káṇṭhyāḥ occlusives are similar to corresponding sounds in English:

k ~ skate
kh ~ kate, with slightly more aspiration
g ~ gate
gh [no English equivalent]
ṅ ~ sing

The palatal tā́lavyāḥ occlusives are similar to the English palatal affricates:

c ~ cheap
ch [the same, but with more aspiration]
j ~ jeep
jh [no English equivalent]
ñ ~ canyon (cf. Spanish ñ)

The occlusives called mū́rdhanyāḥ in Sanskrit are called retroflex in English, which refers to the “curling backward” of the tongue right behind the alveolar ridge. (This place of contact, slightly behind the alveolar ridge, is called mūrdhā́ in Sanskrit, which has led to the English calque “cerebrals” for mū́rdhanyāḥ in older scholarship.) English does not distinguish between dental and retroflex consonants, and most English speakers pronounce the sounds t, d and n somewhere in between a retroflex and dental articulation. As a result, the English sounds t and d are generally borrowed into Indian languages as retroflex sounds (e.g., ḍākṭar for doctor). Retroflexion does, however, occur in English as a coarticulatory process: the sounds t, d, and n are more retroflexed when they are preceded by the consonant r, in those varieties of English (like General American) that pronounce this syllable-final r.

ṭ ~ hurt
ṭh [the same, but with more aspiration]
ḍ ~ yard
ḍh [no English equivalent]
ṇ ~ varnish

The dental dántyāḥ occlusives, as just noted, do not contrast with retroflex occlusives in English, and most English speakers will pronounce t, d and n somewhere between a retroflex and dental articulation. If you grew up in New York City, however, there is a good chance that you dentalize these sounds. For the English equivalents here, just imagine Christopher Walken saying them:

t ~ stop
th ~ top [with slightly more aspiration]
d ~ dog
dh [no English equivalent]
n ~ nine

The labial ṓṣṭhyāḥ occlusives are basically the same as those in English:

p ~ spit
ph ~ pit [with slightly more aspiration]
b ~ bit
bh [no English equivalent]
m ~ mine

§3.2.Approximants antaḥsthāḥ

The Sanskrit word for these sounds means “in-between,” because their sonority is midway between that of vowels and occlusives. They are essentially the consonantal versions of the vowel sounds i, r̥, l̥ and u, with which they alternate:

Approximant	Pronunciation	Corresponding vowel
y	j as in yet	i/ī
r	ɹ as in red	r̥/r̥̄
l	l as in let	l̥/l̥̄
v	v as in vote	u/ū

The place of articulation of these sounds is as follows: y, palatal tā́lavyaḥ; r, retroflex mū́rdhanyaḥ; l, dental dántyaḥ; v, labial ṓṣṭhyaḥ.

The sound r is somewhere between the English r, i.e., an alveolar or retroflex approximant, ɹ, and the Spanish or Italian trilled r, i.e., r. Some degree of friction or trill is implied by the common Sanskrit name for this sound (rēphaḥ “tearing sound”), although the phonetics literature warns against excessive trilling. It is not a uvular trill (as in French, German, Hebrew, etc.), or a tap (as in Spanish pero). Since the pronunciation of r varies widely in English (and since r has complex coarticulatory effects on neighboring vowel sounds in English) you should take care to pronounce Sanskrit r properly in all positions.

Note that the sound v is somewhere in between the English sounds w and v, which are, respectively, a labiovelar approximant and a labiodental fricative. In fact most English speakers pronounce it as v when it appears on its own (as in vātaḥ “wind”) and w when it appears after another consonant (as in aśvaḥ “horse”). You are safe pronouncing it as a less strongly articulated v (i.e., hold your mouth in the position of v, but pronounce it as an approximant rather than a fricative, i.e., without buzzing between the teeth and the lips).

§3.3.Fricatives ūṣmāṇaḥ

Fricatives are sounds where air is passed through a relatively narrow passage in the articulatory organs, resulting in a turbulent airflow, which is probably the meaning of the Sanskrit term ūṣmā́ṇaḥ (literally “heat”). In principle, Sanskrit has the following fricative sounds:

Place of articulation sthā́nam	Sound	Pronunciation
Velum kaṇṭháḥ	x	x
Palate tā́lu	ś	ʃ as in ship
Alveolar ridge mūrdhā́	ṣ	ʂ
Teeth dántāḥ	s	s as in sip
Lips ṓṣṭhau	f	ɸ

You will notice, however, that some of the sounds (those represented as x and f) are represented in gray. That is because they are not phonemes of the Sanskrit language, because they do not form minimal pairs with other speech-sounds. Rather, they are variants of the sound s in certain phonological contexts, just like the visargáḥ introduced below, where you will find further discussion of these sounds.

The three sibilants contrast with each other. While English also distinguishes s [s] from sh [ʃ], it does not distinguish a retroflex sibilant, [ʂ]. The same is true of most modern Indian languages. Hence many speakers pronounce ś and ṣ in very similar ways. However, the latter is retroflex, and the distinction can be heard if sufficient attention is paid to it.

Sanskrit also has one pseudo-fricative sound, namely h. This sound is very similar to the English h (e.g. hat), but with one major difference: it is voiced rather than voiceless (hence pronounced as ɦ rather than h). The closest way to approximate this sound, if you don’t have it in your language, is to learn how to pronounce the voiced aspirated stops (bh, dh, etc.), and simply leave out the part where the flow of air is occluded in the oral cavity.

§3.4.Dependent sounds ayōgavāhāḥ

The final class of speech-sounds comprises the “dependent” sounds, or ayōgavāhāḥ in ancient phonetics literature (the literal meaning of the word, “non-juncture-bearing,” has been interpreted in different ways). They are “dependent” because they never constitute a syllable (and hence, in the syllabic scripts in which Sanskrit was written, a letter) on their own. Rather, they always occur at the end of a syllable, and specifically, after the vowel that constitutes the nucleus of a syllable (see syllables below). There are two main types of dependent sounds: the visargáḥ and the anusvāraḥ. Unlike most of the other consonants, they do not have place of articulation features of their own (there is a tendency cross-linguistically to eliminate place of articulation contrasts at the end of a syllable).

Because the signs for these sounds are considered diacritical marks that cannot be written independently in the Unicode representation of Indic scripts, I will use the simple vowel sign a to “host” them here.

The visargáḥ (“letting loose,” also visarjanī́yaḥ) is written as aḥ. It is a voiceless fricative without a specified place of articulation. It is an allophone, or positional variant, of the sounds s and r. It is pronounced as a slight puff of air, like the English h [h], although the latter never occurs at the end of a syllable, whereas visargáḥ always occurs in that position. It generally takes English speakers some practice to master this sound, although many simply pronounce it as h with a short echo of the preceding vowel (e.g., rāmaḥ as [ɹaː.mɐ.hɐ]).

The visargáḥ has two close relatives, which are very rarely written in printed Sanskrit books, but which used to be relatively common in Sanskrit inscriptions. They are the sounds called upadhmānīyaḥ and jihvāmūlīyaḥ (meaning “puff of air” and “base of the tongue” respectively). The upadhmānīyaḥ was the allophone of visargáḥ before voiceless labial stops (i.e., before p and ph), and it was pronounced as ɸ, i.e., a voiceless labial fricative. The jihvāmūlīyaḥ was the allophone of visargáḥ before voiceless velar stops (i.e., before k and kh), and it was pronounced as x, i.e., a voiceless velar fricative. These pronunciations of the visargáḥ are still in common use among Sanskrit speakers, especially in South India, although specific letters for the upadhmānīyaḥ and jihvāmūlīyaḥ are no longer commonly used.

The anusvāraḥ (“after-sound”) is written as aṁ. It represents a nasal phoneme without a specified place of articulation. While its original position is before fricatives (ś, ṣ, s, and h), it came to be used before approximants as well (y, r, l, v), and it has gained ground as a way of writing (and perhaps of pronouncing) a nasal consonant before any occlusive. Hence anusvāraḥ is pronounced in two distinct ways:

as a nasalization of the preceding vowel (which also makes the vowel long), when it comes before fricatives and approximants (e.g., saṁskr̥tam sɐ̃ːskɹ̩tɐm);
as the nasal corresponding to the place of articulation of a following occlusive (e.g., saṁkaṭam sɐŋkɐʈɐm).
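The two pronunciations just described depend only on the sound that follows the anusvāraḥ. Here is a minimal Python sketch of that choice. The function name `realize_anusvara` and the string label `"nasalized vowel"` are hypothetical, and the sketch deliberately raises an error for contexts the description above does not cover, such as the end of a word.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# Each series of occlusives ends in its own nasal.
SERIES = ["k kh g gh ṅ", "c ch j jh ñ", "ṭ ṭh ḍ ḍh ṇ", "t th d dh n", "p ph b bh m"]
HOMORGANIC_NASAL = {}
for row in SERIES:
    sounds = nfc(row).split()
    for sound in sounds:
        HOMORGANIC_NASAL[sound] = sounds[-1]

# Fricatives and approximants trigger nasalization of the preceding vowel.
FRICATIVES_AND_APPROXIMANTS = set(nfc("ś ṣ s h y r l v").split())


def realize_anusvara(next_sound):
    """Return how anusvara is pronounced before the given sound."""
    next_sound = nfc(next_sound)
    if next_sound in FRICATIVES_AND_APPROXIMANTS:
        return "nasalized vowel"
    if next_sound in HOMORGANIC_NASAL:
        return HOMORGANIC_NASAL[next_sound]
    raise ValueError(f"context not covered by this sketch: {next_sound!r}")
```

Applied to the examples above, `realize_anusvara("s")` (as in saṁskr̥tam) gives the nasalized vowel, and `realize_anusvara("k")` (as in saṁkaṭam) gives the velar nasal ṅ.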

The phonemes of Sanskrit are therefore usually arranged as follows:

svarāḥ		samānāḥ					sandhyakṣarāṇi
	hrasvāḥ	a	i	u	r̥	l̥
	dīrghāḥ	ā	ī	ū	r̥̄		ē	ō	ai	au

vyàñjanāni		sparśāḥ					antaḥsthāḥ	ūṣmāṇaḥ
		aghōṣāḥ		ghōṣavantaḥ			antaḥsthāḥ	ūṣmāṇaḥ
	kaṇṭhyāḥ	k	kh	g	gh	ṅ		h
	tālavyāḥ	c	ch	j	jh	ñ	y	ś
	mūrdhanyāḥ	ṭ	ṭh	ḍ	ḍh	ṇ	r	ṣ
	dantyāḥ	t	th	d	dh	n	l	s
	ōṣṭhyāḥ	p	ph	b	bh	m	v

Among the nasals, only n and m are “true” phonemes, in the sense that they contrast with each other in every position in which they occur. The sounds ṅ and ñ only occur at the end of a syllable, where they are positional variants of either n or m. The sound ṇ is also generally a variant of n, conditioned by longer-range phonological processes, but it occurs in many words without any phonological conditioning, and therefore has more of a claim to being a phoneme than either ṅ or ñ.

§4.1.The Śivasūtras

The traditional list of Sanskrit phonemes is presented in the Śivasūtras, a short text which accompanies Pāṇini’s grammar.

a i u Ṇ
r̥ l̥ K
ē ō Ṅ
ai au C
ha ya va ra Ṭ
la Ṇ
ña ma ṅa ṇa na M
jha bha Ñ
gha ḍha dha Ṣ
ja ba ga ḍa da Ś
kha pha cha ṭha tha ca ṭa ta V
ka pa Y
śa ṣa sa R
ha L

In this list, the letters on the left represent distinct phonemes. (The vowel a has been added to each of the consonant phonemes to facilitate pronunciation.) The final letter of each line, by contrast, does not represent a phoneme, but an “index” letter anubandhaḥ that is used to form abbreviations pratyāhāraḥ. Abbreviations are formed with one letter and one “index” letter, and represent all of the letters in between. Pāṇini uses this system to refer to different classes of phonemes:

aC vowels;
haL consonants;
yaṆ semivowels;
ñaM nasals;
śaL fricatives (the sibilants and h);
ñaY occlusives (stops and nasals).
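The way an abbreviation is expanded can be sketched in Python. This is an illustration rather than a full model of Pāṇini’s usage: the function name `expand` is hypothetical, the supporting vowel a is dropped from the consonants, and two simplifying assumptions are made for the letters that occur twice in the list. The sketch starts from the first occurrence of the starting letter and stops at the first matching index letter from there on, and it lists a repeated sound (h) only once.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# The fourteen lines of the list; the last item of each line is the index letter.
_LINES = [
    "a i u Ṇ", "r̥ l̥ K", "ē ō Ṅ", "ai au C", "h y v r Ṭ", "l Ṇ",
    "ñ m ṅ ṇ n M", "jh bh Ñ", "gh ḍh dh Ṣ", "j b g ḍ d Ś",
    "kh ph ch ṭh th c ṭ t V", "k p Y", "ś ṣ s R", "h L",
]
SUTRAS = [(nfc(line).split()[:-1], nfc(line).split()[-1]) for line in _LINES]


def expand(start, index_letter):
    """List the sounds from `start` up to the line closed by `index_letter`."""
    start, index_letter = nfc(start), nfc(index_letter)
    for first_line, (sounds, _) in enumerate(SUTRAS):
        if start in sounds:
            break
    else:
        raise ValueError(f"unknown sound: {start!r}")
    result = []
    for offset, (sounds, closing) in enumerate(SUTRAS[first_line:]):
        # In the first line, skip the sounds that precede the starting letter.
        chunk = sounds[sounds.index(start):] if offset == 0 else sounds
        for sound in chunk:
            if sound not in result:
                result.append(sound)
        if closing == index_letter:
            return result
    raise ValueError(f"no index letter {index_letter!r} after {start!r}")
```

Under these assumptions, `expand("a", "C")` yields the nine vowels of the list, `expand("y", "Ṇ")` yields y, v, r, l, and `expand("ñ", "M")` yields the five nasals.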

Pāṇini also uses a different type of abbreviation for letters belonging to the same place of articulation or “class” vargaḥ. Hence ku refers to velar consonants, cu refers to palatal consonants, ṭu refers to retroflex consonants, and so on.

A syllable akṣaram is a unit of speech that contains the following elements:

an optional onset, which consists of one or more consonants;
an obligatory rime, which consists of:

an obligatory nucleus, which consists of a vowel; and
an optional coda, which consists of one or more consonants.

A syllable therefore has the pattern C*VC* (where C means “consonant,” V means “vowel,” and * means “zero or more”). A syllable can be thought of as a vowel and the consonants that are “attracted” to it. A word will always have as many syllables as it has vowels. To parse a word, or a larger phrase, into syllables, one must decide whether a given consonant goes with the preceding vowel (as a coda) or with the following vowel (as an onset); the general principle is to associate a consonant with the vowel that immediately follows it, if possible, and otherwise to associate it with the vowel that precedes it.
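The parsing principle can be sketched in Python as follows. The sketch assumes that the word has already been split into phonemes, and it adopts one particular reading of the principle: within a word, only the single consonant immediately before a vowel is taken as its onset, while any remaining consonants close the preceding syllable. The function name `syllabify` is hypothetical.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


VOWELS = set(nfc("a ā i ī u ū r̥ r̥̄ l̥ l̥̄ ē ō ai au").split())


def syllabify(phonemes):
    """Group a list of phonemes into syllables of the shape C*VC*."""
    syllables = []
    pending = []  # consonants seen since the last vowel
    for phoneme in (nfc(p) for p in phonemes):
        if phoneme in VOWELS:
            if syllables:
                # All but the last pending consonant close the previous syllable.
                syllables[-1] += "".join(pending[:-1])
                onset = pending[-1:]
            else:
                # Word-initial consonants can only belong to the first vowel.
                onset = pending
            syllables.append("".join(onset) + phoneme)
            pending = []
        else:
            pending.append(phoneme)
    if not syllables:
        raise ValueError("a syllable needs a vowel")
    syllables[-1] += "".join(pending)  # word-final consonants form a coda
    return syllables
```

Given the phonemes of anuṣaktaḥ, this returns the four syllables a, nu, ṣak, taḥ: the cluster kt is split between the coda of one syllable and the onset of the next.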

The parsing of speech-sounds into syllables is actually a function of their sonority, and hence the nucleus of a syllable represents a local “sonority peak” relative to the onset and coda. Generally, then, consonants closer to the nucleus will have a higher sonority than more marginal consonants. This accounts for the fact that pra is a well-formed syllable, whereas *rpa is not: semivowels like r are more sonorous than stops like p.
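The idea of a sonority peak can be sketched in Python by ranking the sound classes from the hierarchy given earlier and checking that sonority rises to a single peak and then falls. The names `RANK` and `has_single_peak` are hypothetical, the dependent sounds are left out, and the check is stricter than the language itself: Sanskrit has onsets such as a sibilant followed by a stop (as in sthā́nam), which this strict version would reject, so it illustrates the general tendency only.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# Sound classes from least to most sonorous.
CLASSES = [
    "k kh g gh c ch j jh ṭ ṭh ḍ ḍh t th d dh p ph b bh",  # stops
    "ś ṣ s h",                                             # fricatives
    "ṅ ñ ṇ n m",                                           # nasals
    "y r l v",                                             # approximants
    "a ā i ī u ū r̥ r̥̄ l̥ l̥̄ ē ō ai au",                      # vowels
]
RANK = {}
for rank, sounds in enumerate(CLASSES):
    for sound in nfc(sounds).split():
        RANK[sound] = rank


def has_single_peak(phonemes):
    """True if sonority strictly rises to one peak and then strictly falls."""
    ranks = [RANK[nfc(p)] for p in phonemes]
    peak = ranks.index(max(ranks))
    rising = all(a < b for a, b in zip(ranks[:peak], ranks[1:peak + 1]))
    falling = all(a > b for a, b in zip(ranks[peak:], ranks[peak + 1:]))
    return rising and falling


print(has_single_peak(["p", "r", "a"]))  # True
print(has_single_peak(["r", "p", "a"]))  # False
```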

§5.1.Weight

Sanskrit distinguishes syllables according to their weight.

A light laghu syllable contains a short vowel (a, i, u, r̥, or l̥) that is not followed by any consonants. In metrical notation, a light syllable is represented by the symbol ˘ (breve) in transliteration and । r̥juḥ in Indian scripts.

All other syllables are heavy guru, i.e., those that contain a long vowel (ā, ī, ū, r̥̄, ē, ō, ai, or au), as well as those that contain a short vowel followed by one or more consonants. In metrical notation, a heavy syllable is represented by the symbol ˉ (longum) in transliteration and ऽ vakraḥ in Indian scripts.

Onset consonants do not count towards the weight of a syllable. Light syllables are said to contain one mora mātrā, and heavy syllables are said to contain two. Thus the weight of a syllable is a function of both the length of its vowel and the number of coda consonants it has.
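The weight rule can be sketched in Python as a function of the nucleus and the coda alone, since onsets do not count. The names `morae` and `scan` are hypothetical, each syllable is assumed to be supplied as a (nucleus, coda) pair, and the scansion symbols are written as the Unicode characters U+02D8 and U+02C9, which is an assumption about how the breve and longum are encoded.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


SHORT = set(nfc("a i u r̥ l̥").split())
LONG = set(nfc("ā ī ū r̥̄ ē ō ai au").split())
LIGHT, HEAVY = "\u02d8", "\u02c9"  # breve and longum (assumed code points)


def morae(nucleus, coda=()):
    """Return 1 for a light syllable and 2 for a heavy one."""
    nucleus = nfc(nucleus)
    if nucleus not in SHORT | LONG:
        raise ValueError(f"not a vowel: {nucleus!r}")
    # Light only if the vowel is short and no consonant follows it.
    return 1 if nucleus in SHORT and not coda else 2


def scan(syllables):
    """Turn a sequence of (nucleus, coda) pairs into metrical notation."""
    return "".join(LIGHT if morae(n, c) == 1 else HEAVY for n, c in syllables)
```

For ku-mā-raḥ, `scan([("u", []), ("ā", []), ("a", ["ḥ"])])` yields one light syllable followed by two heavy ones.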

The word padam can be considered from the perspective of syntax, morphology, and phonology.

In syntactic terms, a word is a form that enters into a specified relationship with other forms. In traditional grammar, one often speaks about a verbal form kriyāpadam and the forms expressing the participants in the verbal action kārakapadāni; alternatively, one speaks about a head pradhānam and its dependents upasarjanāni.

In the morphological terms that are favored by Pāṇini, a word is that which has a nominal or verbal ending (Aṣṭādhyāyī 1.4.14 suPtiṄantaṁ padam). This understanding reflects the division of Sanskrit words in general into a base prakr̥tiḥ and a suffix pratyayaḥ. Sanskrit, being a heavily inflectional language, makes use of many suffixes in order to convey information about a word, including (for a nominal form) its gender, number, and case, and (for a verbal form) its person, number, mood, tense, and “voice” (parasmaipadám or ātmanēpadám). The base to which the suffixes are added is generally called a “stem” aṅgam, and in the case of nominal forms, the most basic form of the stem is called a nominal base or prātipadikam, while in the case of verbal forms, the most basic form of the stem is a verbal root or dhātuḥ.

Phonologically, a word is a unit that meets two requirements: one of length (it is at least as long as the “minimal phonological word”) and one of prominence (it contains no more than one accented syllable). Together, these requirements distinguish between full-fledged words, on the one hand, and forms that do not count as phonological words on their own, on the other. Closely related to the phenomenon of accentual prominence is the phenomenon of vowel gradation; both are discussed below.

§6.1.Accent sváraḥ

One and only one syllable of a Sanskrit word can have an accent. The accent is called udā́ttaḥ or “elevated” in Sanskrit, which refers to the syllable’s greater prominence relative to the other syllables in the word. This syllable will generally be written in this textbook with an acute accent in transliteration. (For technical reasons the accents will not be displayed in Dēvanāgarī.) The unaccented syllables are called ánudāttaḥ “unelevated.” They will not be marked in this textbook. The accent is realized differently in different traditions of recitation. In the tradition of the R̥gvēda, there is a slight drop in pitch just before the udā́ttaḥ, and a sharp rise and fall immediately after the udā́ttaḥ. Outside of Vedic recitation, however, the accents are almost never pronounced. The accents are, moreover, only written in manuscripts of Vedic texts, and the way in which they are written in these manuscripts differs according to the recitation tradition.

Sanskrit’s accent is morphological, in the sense that the individual morphemes that constitute a word are either accented or unaccented, and the word-level accent is generally a function of these morpheme-level accents. Thus Pāṇini encodes into the anubandhas or “diacritics” of each affix he teaches information about the accentual properties of that affix, and specifically, whether the affix is accented and thus “erases” the accent of the stem, or whether it is unaccented and thus “preserves” the accent of the stem. The following are examples of accented and unaccented affixes in the verbal system (note that verbs are generally unaccented: these remarks apply to accented verbs, which occur in subordinate clauses):

kr̥ + u + tiP (third person singular parasmaipadám) → karṓti “he does”
The suffix is unaccented, as indicated by the anubandha P, and hence the accent appears on the verbal stem, and specifically on the vikaraṇa u, which takes the full-grade or guṇa form.
kr̥ + u + mas (first person plural parasmaipadám) → kurmáḥ “we do”
The suffix is accented, and hence no accent appears on the verbal stem, which additionally appears in the short form kur-.

Another piece of evidence for the morphological nature of the Sanskrit accent is that its appearance, or lack thereof, is conditioned by morphological and syntactic categories. Finite verbs outside of subordinate clauses are unaccented in Sanskrit, which is to say that the “underlying” accent of a finite verb is suppressed, and only surfaces when the verb stands in a subordinate clause.

ágnē yáṁ yajñám adhvaráṁ viśvátaḥ paribhū́r ási sá íd dēvḗṣu gacchati “Agni, the worship and sacrifice that you surround on all sides goes to the gods” R̥gvēda 1.1.4
ási is accented because it is in a subordinate clause, but gacchati is not.

Most students ignore the accent in Sanskrit. You are free to do so, although if you are interested in Vedic Sanskrit, you would do well to learn the accents along with the words.

Conventionally Sanskrit is now spoken with a stress-based accent, almost the same as Latin stress. The stressed syllable is:

the penultimate (second from last), if it is heavy; or
the antepenultimate (third from last), if the penultimate is light.

(See weight above.) Hence rā-mā-ya-ṇam, ma-hā-bhā-ra-taḥ, but ku-mā-raḥ and a-nu-ṣak-taḥ.
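The two-case rule can be sketched in code. This is only a minimal sketch, assuming that the word has already been split into syllables (as in the hyphenated examples above) and that a syllable counts as heavy when it contains a long vowel or diphthong or ends in a consonant. The function names are hypothetical, and the treatment of words with fewer than three syllables (stress on the first syllable, as in Latin) is an assumption that the rule above does not state.

```python
import unicodedata

def nfc(text):
    """Normalize to NFC so that letters with diacritics compare reliably."""
    return unicodedata.normalize("NFC", text)

# Long vowels and diphthongs in the transliteration used here.
LONG_VOWELS = tuple(nfc(v) for v in ["ā", "ī", "ū", "ē", "ō", "ai", "au", "r̥̄"])
# Short vowels: a syllable that ends in one of these is open and light.
SHORT_VOWELS = tuple(nfc(v) for v in ["a", "i", "u", "r̥"])

def is_heavy(syllable):
    """Heavy = contains a long vowel or diphthong, or is closed by a consonant."""
    s = nfc(syllable)
    if any(v in s for v in LONG_VOWELS):
        return True
    return not s.endswith(SHORT_VOWELS)

def stressed_index(syllables):
    """Return the index of the stressed syllable under the conventional rule."""
    n = len(syllables)
    if n < 3:
        return 0  # assumption: short words are stressed on their first syllable
    if is_heavy(syllables[-2]):
        return n - 2  # heavy penultimate
    return n - 3      # light penultimate: stress the antepenultimate

for word in ["rā-mā-ya-ṇam", "ma-hā-bhā-ra-taḥ", "ku-mā-raḥ", "a-nu-ṣak-taḥ"]:
    parts = word.split("-")
    print(word, "->", parts[stressed_index(parts)])
```

The sketch takes syllabified input on purpose, since syllabification is a separate problem from stress placement.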

§6.2.Vowel gradation

This term refers to the phenomenon in Sanskrit wherein related forms of a word will show different forms of the “same” vowel sound. Vowel gradation, or ablaut, is important to the distinctions of nominal and verbal morphology, as well as the process of nominal derivation. Sanskrit grammar thus includes several processes of moving “backwards” and “forwards” along a continuum of vowel gradation. The traditional categories of the “standard” type of vowel gradation in Sanskrit are as follows:

Basic vowel	Guṇáḥ	Vŕ̥ddhiḥ
a	a	ā
i, ī	ē	ai
u, ū	ō	au
r̥, r̥̄	ar	ār

In this system, the guṇáḥ version of the vowel is the basic vowel prefixed with a short a, and the vŕ̥ddhiḥ version of the vowel is the basic vowel prefixed with a long ā. (The guṇa vowel a constitutes an exception to this pattern, since normally a followed by a would result in ā, but as we will see, the generalization that guṇa is meant to capture is the addition of the vowel a to a form that does not already have this vowel.)
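For readers who find a lookup table easier to work with, the traditional series can be written as a small mapping. This is a minimal sketch that simply restates the table above; the names `GRADES`, `guna`, and `vrddhi` are hypothetical.

```python
import unicodedata

def nfc(text):
    """Normalize to NFC so that letters with diacritics compare reliably."""
    return unicodedata.normalize("NFC", text)

# basic vowel -> (guṇa, vr̥ddhi), following the traditional table
GRADES = {
    nfc(basic): (nfc(g), nfc(v))
    for basic, g, v in [
        ("a", "a", "ā"),
        ("i", "ē", "ai"), ("ī", "ē", "ai"),
        ("u", "ō", "au"), ("ū", "ō", "au"),
        ("r̥", "ar", "ār"), ("r̥̄", "ar", "ār"),
    ]
}

def guna(vowel):
    """Return the guṇa grade of a basic vowel."""
    return GRADES[nfc(vowel)][0]

def vrddhi(vowel):
    """Return the vr̥ddhi grade of a basic vowel."""
    return GRADES[nfc(vowel)][1]

print(guna("i"), vrddhi("i"))
```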

This set of distinctions more or less maps onto the way that vowel gradation worked in Indo-European, where the vowel e (which generally corresponds with Sanskrit a) would either appear in a syllable, or not, based on morphological alternations that can ultimately be traced to accentual features (since the presence of the vowel e generally corresponds with an accented syllable). The tripartite system can thus be described in terms of an “ablauting” vowel which appears in three graded forms: ∅ (zero grade), e (full grade), ē (lengthened grade). In the following table, the reconstructed Indo-European forms are marked with an asterisk, and the Sanskrit forms follow them on the right-hand side. (Note that ∅ refers to zero or nothing.)

Zero grade		Full grade		Lengthened grade
PIE	Sanskrit	PIE	Sanskrit	PIE	Sanskrit
∅	∅	e∅	a	ē∅	ā
∅i	i	ei	ē	ēi	ai
∅u	u	eu	ō	ēu	au
∅r̥	r̥	er	ar	ēr	ār

Here are a few examples of the standard series of vowel gradation:

imaḥ “we go” :: ēti “he goes”
Contrast zero-grade i and full-grade ē of the root syllable, both from i.
jinaḥ “victorious” :: jēman “victorious” (also a proper name) :: jaitraḥ “victorious”
Contrast zero-grade ji, full-grade jē, and lengthened-grade jai, all from the root ji.
hutiḥ “offering” :: juhōti “he offers”
Contrast zero-grade hu and full-grade hō of the root syllable, both from hu.
kṣubdham “shaken” :: kṣōbhatē “he shakes”
Contrast zero-grade kṣubh and full-grade kṣōbh of the root syllable, both from kṣubh.
r̥k “a verse of worship” :: arcanam “the act of worship”
Contrast zero-grade r̥c and full-grade arc of the root syllable, both from r̥c.

Note that the vowels of “superheavy” roots, that is, roots ending in either a long vowel followed by a consonant or any vowel followed by two consonants, are generally not subject to guṇáḥ. Thus the vowel in the roots jīv “live,” nind “blame,” and cint “think” is not strengthened to guṇáḥ.

§6.3.Vowel gradation with nasals

A historical perspective also allows us to include several additional cases of vowel gradation under the same system, beginning with nasals, which Indian grammarians did not consider to have guṇáḥ and vŕ̥ddhiḥ forms:

Zero grade		Full grade		Lengthened grade
PIE	Sanskrit	PIE	Sanskrit	PIE	Sanskrit
∅n̥	a	en	an	ēn	ān
∅m̥	a	em	am	ēm	ām

A few examples:

matam “thought” :: manaḥ “mind”
Contrast zero-grade ma and full-grade man, both from the root man.
gatam “gone” :: gamanam “going”
Contrast zero-grade ga and full-grade gam, both from the root gam.

§6.4.Vowel gradation with laryngeals

Sanskrit presents an abnormal kind of vowel gradation in which the forms where we would expect a “basic” or “zero-grade” vowel have i or ī, and the forms where we expect a “full-grade” vowel generally have the long vowel ā. From a historical perspective, however, this is precisely the same kind of vowel gradation that we have encountered already. The difference is simply that the “basic” vowel of these forms in the zero grade was not a semivowel or a nasal, but a laryngeal, a sound which has disappeared as such in all of the daughter languages of Indo-European except Hittite. A laryngeal usually became i or ī in Sanskrit when it appeared between consonants, and it usually lengthened a preceding vowel.

Zero grade		Full grade		Lengthened grade
PIE	Sanskrit	PIE	Sanskrit	PIE	Sanskrit
∅H	i or ī	eH	ā	ēH	ā

Here are a few examples:

gī-tam “sung” :: gā-yati “sings”
Contrast the root syllables gī and gā.
hi-tam “placed” :: da-dhā-ti “places”
Contrast the root syllables hi, from dhi, and dhā.
krī-ṇī-tē “buys” [ātmanēpadám] :: krī-ṇā-ti “buys” [parasmaipadám]
Contrast the syllables of the present-tense formant, or vikaraṇaḥ, ṇī and ṇā.

The traces left by laryngeal consonants account for a few more types of vowel gradation which otherwise appear to be irregular or exceptional. As noted above, the regular alternation between zero- and full-grade forms for roots with a nasal consonant (such as man “think”) involves the patterns a :: am and a :: an. When the root ended in a laryngeal consonant after the nasal, however, the alternation is as follows:

context	Zero grade		Full grade		Lengthened grade
context	PIE	Sanskrit	PIE	Sanskrit	PIE	Sanskrit
before consonants:	∅mH.	ām.	emH.	a.mi	ēmH.	ā.mi
before consonants:	∅nH.	ā.	enH.	a.ni	ēnH.	ā.ni
before vowels:	∅m.H	a.m	em.H	a.m	ēm.H	ā.m
before vowels:	∅n.H	a.n	en.H	a.n	ēn.H	ā.n

(The period here indicates the boundary between syllables.)

The reason for this pattern is the sound change according to which a syllabic nasal, like m̥ or n̥, when followed by a laryngeal in the same syllable, became lengthened to m̥̄ or n̥̄. (Syllable boundaries are marked in the above table by a period, where they are relevant.) This is a special case of the general rule according to which vowels followed by a laryngeal in the same syllable are lengthened. The long syllabic nasals m̥̄ and n̥̄ then became ām and ā, respectively, in Sanskrit. Hence we have examples like the following alternations:

krāntam “bestridden” :: kramaḥ “stride”
Contrast the root syllables krām- and kram-, in the zero and full grade, respectively.
śāntiḥ “tranquility” :: śamanam “tranquilizing”
Contrast the root syllables śām- and śam-, in the zero and full grade.
kāntaḥ “beloved” :: kamiṣyati “will desire”
Contrast the root syllables kām- and kam-, in the zero and full grade.
jātiḥ “birth” :: janitr̥- “begetter”
Contrast the root syllables jā- and jan-, in the zero and full grade.

The root jan derives from Indo-European ǵenh₁: compare Greek γίγνομαι and Latin gignō. Hence the formation jātiḥ is parallel to that of γένεσις, and janitr̥- is parallel to that of genitor.

§6.5.Saṁprasā́raṇam

So far we have considered cases in which we “augment” a sound by prefixing a vowel segment a or ā before it. But there are cases where the ablauting vowel segment (e in Indo-European, and a in Sanskrit) follows rather than precedes the other sound. In these cases, the Indian grammarians generally teach the full grade form, rather than the zero grade form, as the citation form. Thus they teach the root vac “speak” in this form, which historically corresponds to a full-grade form wekʷ-. The corresponding zero-grade form would be uc- (ukʷ-). Indian grammarians have called this kind of variation saṁprasā́raṇam or “extension,” namely, the extension of a semivowel such as y, r, or v into the corresponding vowel i, r̥, or u, with a corresponding loss of the full-grade vowel a. The following gradational patterns hold for roots of certain phonological shapes:

Zero grade (Saṁprasā́raṇam)	Full grade	Lengthened grade
∅ + i = i	i + a = ya	i + ā = yā
∅ + u = u	u + a = va	u + ā = vā
∅ + r̥ = r̥	r̥ + a = ra	r̥ + ā = rā

Here are some examples:

iṣṭam “offered” :: yajatē “he sacrifices”
uktam “spoken” :: vakti “he speaks”
pr̥ṣṭam “asked” :: papraccha “he asked”

§6.6.Ṇ-vŕ̥ddhiḥ

It is important to mention one more type of vowel gradation here, which I will call Ṇ-vr̥ddhi, since it is triggered by suffixes that Pāṇini teaches with the marker (anubandha) Ṇ. It has the following properties:

If the root ends in a vowel, it takes the vŕ̥ddhiḥ;

bhū + ṆiC → bhāváya-
ji + ṢṭraṆ → jaitrá-

If the root ends in a consonant:

it takes vŕ̥ddhiḥ if the vowel preceding that consonant is a;

pac + ṆiC → pācáya-

it takes guṇáḥ otherwise;

cur + ṆiC → cōráya-

if the final consonant is a nasal, then guṇáḥ is prescribed for a series of roots that are taught with an acute accent in the dhātupāṭhaḥ, as well as vadh and jan, while vŕ̥ddhiḥ is prescribed for all other roots.

gam + ṆiC → gamáya-
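The conditions listed above amount to a short decision procedure, which can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, the set of guṇáḥ-taking roots contains only the roots actually named above (Pāṇini's list is longer), and “superheavy” roots are not treated specially.

```python
import unicodedata

def nfc(text):
    """Normalize to NFC so that letters with diacritics compare reliably."""
    return unicodedata.normalize("NFC", text)

VOWELS = tuple(nfc(v) for v in ["a", "ā", "i", "ī", "u", "ū", "r̥", "r̥̄", "ē", "ō"])

# Assumption: only the exceptional roots named in the text are listed here.
GUNA_EXCEPTIONS = {"gam", "jan", "vadh"}

def n_vrddhi_grade(root):
    """Return 'vrddhi' or 'guna' for a root before a suffix marked with Ṇ."""
    root = nfc(root)
    if root.endswith(VOWELS):
        return "vrddhi"          # root ends in a vowel
    if root in GUNA_EXCEPTIONS:
        return "guna"            # listed exceptions
    # Strip the final consonant(s) to find the vowel that precedes them.
    stem = root
    while stem and not stem.endswith(VOWELS):
        stem = stem[:-1]
    # Short a before the final consonant takes vr̥ddhi; anything else takes guṇa.
    return "vrddhi" if stem.endswith("a") else "guna"

for root in ["bhū", "ji", "pac", "cur", "gam"]:
    print(root, "->", n_vrddhi_grade(root))
```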

This seemingly-arbitrary collection of rules reflects a historical development that is known as Brugmann’s Law: between Indo-European and Indo-Iranian, the vowel o was lengthened to ō in an open syllable. This development thus has two conditioning factors, one morphological, and one phonological:

the vowel must be o, which in Indo-European occurred only in certain morphological contexts;
the vowel must be followed by one and only one consonant, for otherwise the syllable in which o occurs would be closed, and Brugmann’s Law would be blocked.

The second condition is where the complications arise, for Indo-European had consonants that Sanskrit does not have, namely, the laryngeal consonants, which we can represent with H. Thus roots that seem to end in a single consonant in Sanskrit might have ended in a double consonant in Indo-European, which explains why Pāṇini needs to make exceptions for certain roots, which historically ended in a laryngeal:

man + ṆiC → mānáya- (theoretically from moneye-)
man ← men did not end in a laryngeal.
śam + ṆvuL → śamaka- (theoretically from ḱomh₂eko-)
śam + ṆiC → śamaya- (theoretically from ḱomh₂eye-)
Both are from śam ← ḱemh₂.
jan + ṆiC → janaya- (theoretically from ǵonh₁eye-)
From jan ← ǵenh₁.

However, a few roots that did not historically end in laryngeals, like gam ← gʷem, became analogically included in the set of roots that take guṇáḥ rather than vŕ̥ddhiḥ before the Ṇit suffixes.

§6.7.Independent words, enclitics, and proclitics

The minimal word in Sanskrit is a bimoraic trochee, that is, a sequence of two moras or mātrās, whether represented as two light syllables or a single heavy syllable. This “minimum weight” requirement is enforced in morphology, for instance, when an augment āgamaḥ is added to a light stem in order to make it into a bimoraic trochee (examples include sú-t, kŕ̥-t, etc.).

Sanskrit also has a number of clitic words. These are not fully-fledged phonological words, but attach onto one end of another word, which we can call their “host.” Enclitics follow their host, and proclitics precede their host. We can furthermore distinguish between “true clitics,” which are unaccented, and “quasi-clitics,” which have an accent but otherwise behave syntactically as clitics. The true enclitics of Sanskrit include the following:

ca (indecl.) “and”;
vā (indecl.) “or”;
iva (indecl.) “as”;
hi (indecl.) “for”;
u (indecl.) [indicating an alternative];
sma (indecl.) [indicating past reference];
the enclitic forms of the personal pronouns: mā, mē, tvā, tē, nau, vām, naḥ and vaḥ;
the forms of the pronominal stem ēna-: ēnam, ēnat, ēnēna, ēnau, ēnē, ēnayōḥ, ēnān, ēnāni, ēnām, ēnayā, and ēnāḥ.

In addition, the accented words which function syntactically as enclitics include:

almost all of the other particles, including ḗva, ápi, khálu, and so on.

There are no “true proclitics” in Sanskrit, but the negative particle ná, as well as all of the preverbs upasargáḥ, precede their host and can be considered “quasi-proclitics.”

The host of a clitic is often but not necessarily the word with which it construes syntactically. For instance, in the following example, the word ca “and” construes syntactically with the word it follows in each case:

tayā sa pūtaś ca vibhūṣitaś ca “he was both purified and adorned by it” Kumārasambhavaḥ 1.28

But when enclitics construe with the entire phrase or sentence, rather than just a single word, there is a strong tendency for them to appear after the first phonological word in the sentence. This is called Wackernagel’s position, after Jacob Wackernagel, who described the phenomenon at length. For example:

mṓ ṣú naḥ sōma mr̥tyávē párā dāḥ “do not hand us over to death, O Sōma” R̥gvēdaḥ 10.59.4a (from Lowe 2011)

Sanskrit, like almost every other language, has phonological rules that govern the way that sounds interact with other sounds in connected speech. The term for “connected speech” is saṁhitā́, and the complex of phonological processes that pertain to the modification of sounds due to their contact with other sounds is called “connection” or “juncture” sandhíḥ.

Unlike most other languages, Sanskrit is typically written in such a way that these modifications are explicitly represented. We might say that Sanskrit is written phonetically rather than phonemically. If an underlying sound is reflected as a different surface sound in diverse phonological contexts, we write the surface sound.

§7.1.Internal and external sandhi

It is important to distinguish the constraints that apply to sounds in combination within a single word from those that apply to sounds in combination within an utterance as a whole. The former is called internal sandhi and the latter is called external sandhi. Internal sandhi thus refers primarily to the juncture of morphemes at the word level, while external sandhi refers to the juncture of words at the sentence level.

There is some flexibility regarding what counts as a “word” for the purposes of sandhi. Between a preverb upasargáḥ and a verbal form, generally the internal sandhi rules apply, although not consistently across the lexicon. Between two constituents of a nominal compound samāsaḥ, the rules of external sandhi generally apply.

To a large extent, internal and external sandhi overlap. There are, however, a number of conceptual and practical differences. (If you have a linguistics background, you will probably recognize in external sandhi the characteristics of postlexical phonology.)

Category-sensitivity. Internal sandhi is often sensitive to whether a sound belongs to a particular morphological category (e.g., whether it belongs to a verbal root, a stem-forming suffix, a derivational suffix, or an inflectional ending). By contrast, external sandhi applies irrespective of morphology.
Structure-preservation. Internal sandhi can only produce sounds that are already represented in the lexicon. By contrast, external sandhi can produce new sounds, for instance visargáḥ, which are not part of the lexical representation of any word.
Exceptions. Internal sandhi often has exceptions in its application, whereas external sandhi applies across-the-board.
Scope. Because phonological words form the input to external sandhi, and phonological words can only end in a small set of permitted final sounds, there is a smaller range of combinations to which external sandhi can apply, relative to internal sandhi. For example, a palatal, voiced, or aspirate consonant will never stand in the left-hand context of an external sandhi process.
Voice assimilation. While the assimilation of voice features between adjacent stops is found in both internal and external sandhi, the voicing of voiceless sounds before all voiced sounds is a distinctive feature of external sandhi, as explained below.

§7.2.Word-final sounds

In Sanskrit, as in many other languages, there are positional restrictions on the occurrence of speech-sounds. In particular, not all sounds can occur at the end of a word. The sounds that can occur at the end of a word are called “permitted finals.”

Similar positional restrictions are found in English, for instance: ŋ can occur at the end of a word (e.g., “sing”) but not the beginning, and h can occur at the beginning of a word (e.g., “hat”) but not the end.

The sounds that can occur at the end of a phonological word are almost identical to the sounds that can occur at the end of an utterance (the so-called pausa form: see below). Nevertheless there is a conceptual and practical distinction. The conceptual distinction is that word-final sounds are constrained by word-level phonology, whereas utterance-final sounds are constrained by utterance-level (or postlexical) phonology. Essentially this means that the output of word-level phonology can serve as input to utterance-level phonology, and in particular, word-final sounds may be further modified based on the sounds that follow them within an utterance. This is the domain of external sandhi. The practical distinction is that the contrast between a final s and r is preserved at the word level, but not at the utterance level. Hence external sandhi is sensitive to whether a final visargáḥ represents an underlying s or r. By contrast, external sandhi does not care whether a final ṭ (for example) represents an underlying j, ś, ṣ or h.

The following constraints operate on speech-sounds at the end of a word:

No complex consonants. A word may not end in more than one consonant. Any consonants that would have been added after the first final consonant are dropped. Thus the following combinations of stem and ending (W§150) result in the following forms:
- tudánt-s → tudán “striking”
- údañc-s → údaṅk-s → údaṅ “upwards”
- áchānts-t → áchān “concealed”
Very occasionally complex consonants involving -rC are retained: ū́rj-s → ū́rk, ámārj-t → ámārṭ.
No aspirate consonants. Aspirate consonants, which are only marginally permitted in syllable-final position to begin with, are not allowed in word-final position. Thus:
- vīrúdh → vīrút f. “herb”
- anuṣṭúbh → anuṣṭúp f. “anuṣṭubh verse”
No palatal obstruents. Palatal obstruents, including all palatal stops (c, ch, j, and jh) and the palatal sibilant (ś) may not occur at the end of a word. In many cases, they are replaced by a velar stop (k), but in some cases, they are replaced by a retroflex stop (ṭ). The different outcomes depend largely on whether the palatal represents an etymological velar or labiovelar stop that has been palatalized in Proto-Indo-Iranian (e.g., -pac- from -pekʷ-, compare Latin coquere), in which case it reverts to a velar, or an etymological palatovelar (e.g. -viś- from -weiḱ-), in which case it becomes a retroflex.
- sráj → srák f. “garland” १ एक॰
- virā́j → virā́ṭ “ruler” १ एक॰
- śvapac → śvapak “dog-eater” १ एक॰
No voiced obstruents. The devoicing of word-final consonants is a relatively widespread phenomenon; it occurs, for example, in German. The sound h counts as a voiced obstruent for the purposes of this constraint: it becomes k, ṭ, or t, depending on its etymological source:
- udbhíd-s → udbhít f. “herb” १ एक॰
- kāmaduh-s → kāmadhuk “wish-granting”
- praruh-s → praruṭ “rising forth”
No ṣ. Palatal ś is already disallowed by the above rule, but retroflex ṣ becomes the corresponding stop (ṭ).
- prāvr̥ṣ-s → prāvr̥ṭ “monsoon”

The foregoing constraints mean that only vowels, voiceless unaspirated stops, nasals, and semivowels can appear at the end of a word. However:

Of the vowels, r̥̄, l̥, and l̥̄ do not actually occur.
Of the nasals, ñ never occurs, and ṇ is rare.
Of the semivowels, y and v cannot occur as word-final sounds, except as the final segment of the diphthongs ē, ai, ō, and au; l occurs very rarely; and r appears as visargáḥ (but see below).

The inventory of permitted finals is therefore: m, n, t, k, p, ṭ, and ṅ, as well as all the vowels (a, ā, i, ī, u, ū, r̥, ē, ō, ai, au). The sounds s and r are also permitted at the end of a word, but in pausa they are always represented by the visargáḥ (ḥ). (See the above note for why it is necessary to represent these sounds differently at the word level.)
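The inventory can be turned into a simple check on whether a transliterated form ends in a permitted way. This is a minimal sketch, assuming the word-level inventory just given (with s, r, and the visargáḥ all accepted) and the ban on final consonant clusters; the function names are hypothetical, and the rare retained -rC clusters (such as ū́rk) are deliberately not handled.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent used to mark the accented syllable

def plain(word):
    """Remove accent marks and return the word in NFC."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))

# The diphthongs ai and au end in i and u, so they are covered by those entries.
VOWEL_FINALS = tuple(plain(v) for v in ["a", "ā", "i", "ī", "u", "ū", "r̥", "ē", "ō"])
CONSONANT_FINALS = tuple(
    plain(c) for c in ["m", "n", "t", "k", "p", "ṭ", "ṅ", "s", "r", "ḥ"]
)

def has_permitted_final(word):
    """True if the word ends in a vowel or in a single permitted consonant."""
    w = plain(word)
    if w.endswith(VOWEL_FINALS):
        return True
    # A permitted final consonant must directly follow a vowel (no clusters).
    return w.endswith(CONSONANT_FINALS) and w[:-1].endswith(VOWEL_FINALS)

for form in ["tudánts", "tudán", "vīrúdh", "vīrút", "sráj", "virā́ṭ"]:
    print(form, has_permitted_final(form))
```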

§7.3.Utterance-final sounds

Just as only certain sounds can appear at the end of a phonological word, so too only certain sounds can appear at the end of an utterance. The form that a word takes when it appears at the end of an utterance is called its pausa form (because it is followed by a pause in the utterance).

The only difference between the word-level and utterance-level constraints on final sounds is that the contrast between s and r is neutralized at the utterance level. Both of these sounds become visargáḥ:

púnar → púnaḥ
mánas → mánaḥ

I often use the pausa form to represent the form of a word prior to the application of external sandhi, although strictly speaking the rule that converts final s and r to visargáḥ is postlexical and thus a rule of external sandhi.

§7.4.External consonant sandhiḥ

To “external consonant sandhiḥ” belong all of those phonotactic processes whereby the final consonant of a word is changed due to the character of the following sound. Most of these processes can thus be thought of as regressive assimilation, i.e., a process whereby a sound on the left edge of the juncture comes to take on some of the features of a sound on the right edge of the juncture.

Assimilation of place. The only instance of assimilation to place of articulation involves a set of sounds, called coronals, that comprise palatal, retroflex, and dental sounds. Dental sounds are typically assimilated to the place of articulation of a following coronal sound, whether it is palatal or retroflex. We will first discuss the dental stop t, and then the dental nasal n.

Assimilation of t to a following palatal:

tat ca → tac ca “and that”
tat + chaviḥ → tacchaviḥ “his beauty”
tat jāyatē → taj jāyatē “that is born”
tat + jharaḥ → tajjharaḥ “its waterfall”

The case of the palatal sibilant ś is a little different, since the final coronal—usually the dental stop t—becomes the palatal stop c, and then the palatal sibilant that induced the change is also turned into the palatal aspirate stop ch. The sibilant, in other words, disappears, but there is a “trace” of it in the aspiration of the resulting palatal stop.

tat śr̥ṇu → tac chr̥ṇu “listen to that”
virāṭ śr̥ṇu → virāc chr̥ṇu “listen, king”

Assimilation to a following retroflex stop (note that there is no assimilation before a following retroflex sibilant):

tat + ṭīkā → taṭṭīkā “that commentary”
tat + ṭhakkuraḥ → taṭṭhakkuraḥ “that chief”
tat + ṣaṇḍaḥ → tatṣaṇḍaḥ “that eunuch”

Since t is already dental, the assimilation rules apply vacuously when the following sound is a dental stop or sibilant. When, however, the following sound is the dental semivowel l, the final t is replaced entirely by that semivowel:

tat + lōkāḥ → tallōkāḥ “those worlds”

As far as the dental nasal n is concerned, it is also generally assimilated to the place of articulation of a following coronal consonant, but with a few differences from the treatment of t. When it is followed by a coronal stop, it becomes the class nasal of that stop. When the following sound is voiced, that is the final result:

tān jayati → tāñ jayati “he conquers them”
mahān ḍāmaraḥ → mahāṇ ḍāmaraḥ “a great noise”

When the following sound is voiceless, however, a sibilant appears between the final n and the coronal stop that stands at the beginning of the next word. The sibilant corresponds to the place of the coronal stop, and the final n is now written as anusvāraḥ. Note that the insertion of a sibilant takes place also when the following sound is a dental stop.

tān calayati → tāṁś calayati “he makes them go”
tān chagān → tāṁś chagān “those goats”
mahān ṭīkākāraḥ → mahāṁṣ ṭīkākāraḥ “the great commentator”
mahān ṭhakkuraḥ → mahāṁṣ ṭhakkuraḥ “the great chief”
mahān taruḥ → mahāṁs taruḥ “a great tree”

One further case is n followed by the dental semivowel l. The final nasal is replaced by l, as in the case of a final t (see above), but with the difference that the resulting l is nasalized and is therefore written with an ardhacandraḥ in Indic scripts. I represent this nasalization with an anusvāraḥ:

tān lōkān → tāṁl lōkān “those worlds”

Assimilation of voice. This is one of the distinctive processes of external consonant sandhiḥ, as it does not occur in internal consonant sandhiḥ. It is a regressive process: a final consonant will take on the voice features of the following sound. Because final consonants are treated as voiceless, this process basically requires final consonants to be voiced before voiced sounds.

ētat atra → ētad atra “this here”
tat + gajaḥ → tadgajaḥ “his elephant”
prāk uktam → prāg uktam “previously stated”
dik + gajaḥ → diggajaḥ “sky-elephant”

Assimilation of nasality. When the following sound is a nasal, a final stop becomes the nasal of whatever class it belongs to:

tat + mātram → tanmātram “element”
dik + nāgaḥ → diṅnāgaḥ “sky-elephant”

Final m. Before any consonant, the labial nasal m is replaced by anusvāraḥ.

tam jayati → taṁ jayati “he conquers him”
tam śāsti → taṁ śāsti “he disciplines him”
tam rōhati → taṁ rōhati “he ascends that”

Finally, there is a relatively minor weight-preservation phenomenon that applies to a final n and ṅ. When this sound is preceded by a short vowel, and when the following word begins with a vowel, the nasal is doubled, so as to ensure that the first word, which ended in a heavy syllable, also ends in a heavy syllable in connected or saṁhitā speech:

pratyaṅ āstē → pratyaṅṅ āstē “he sits facing this direction”
pacan āstē → pacann āstē “he sits cooking”
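The treatment of a final stop described in this section (place assimilation of t, then assimilation of nasality and voice) can be sketched as one function. This is a minimal sketch, assuming unaccented transliterated input and a first word that ends in k, t, or p; the function and table names are hypothetical. The case of t before ḍ is not exemplified above and is included on the assumption that place and voice assimilation combine. A following h is left out because its treatment is not described in this section.

```python
import unicodedata

def nfc(text):
    """Normalize to NFC so that letters with diacritics compare reliably."""
    return unicodedata.normalize("NFC", text)

def table(pairs):
    return {nfc(k): nfc(v) for k, v in pairs}

# Place assimilation of final t, keyed by the first letter of the next word
# (aspirates such as ch or ṭh begin with the same letter as c or ṭ).
PLACE_OF_T = table([("c", "c"), ("j", "j"), ("ṭ", "ṭ"), ("ḍ", "ḍ"), ("l", "l")])
VOICED_OF = table([("k", "g"), ("t", "d"), ("p", "b")])
NASAL_OF = table([("k", "ṅ"), ("t", "n"), ("p", "m")])
NASALS = {nfc(c) for c in ["ṅ", "ñ", "ṇ", "n", "m"]}
# Assumption: h is omitted, since this section does not describe it.
VOICED = {nfc(c) for c in ["a", "ā", "i", "ī", "u", "ū", "ē", "ō",
                           "g", "j", "ḍ", "d", "b", "y", "r", "v", "l"]}

def join_final_stop(first, second):
    """Return the sandhi forms of (first, second); first ends in k, t, or p."""
    first, second = nfc(first), nfc(second)
    stem, stop, nxt = first[:-1], first[-1], second[0]
    if stop == "t":
        if nxt == nfc("ś"):
            # t becomes c, and the sibilant itself becomes ch
            return stem + "c", "ch" + second[1:]
        if nxt in PLACE_OF_T:
            return stem + PLACE_OF_T[nxt], second
    if nxt in NASALS:
        return stem + NASAL_OF[stop], second
    if nxt in VOICED:
        return stem + VOICED_OF[stop], second
    return first, second  # voiceless context: no change

for pair in [("tat", "ca"), ("tat", "śr̥ṇu"), ("prāk", "uktam"), ("tat", "mātram")]:
    print(pair, "->", join_final_stop(*pair))
```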

§7.5.Visargasandhiḥ

The sandhi-behavior of visargáḥ, also called visarjanī́yaḥ, merits a separate treatment. First, it is one of the few sets of phonotactic rules that refers to both left-hand context (what comes before the visargáḥ) and right-hand context (what comes after it). Second, visargáḥ is not itself a phoneme of the Sanskrit language, but merely a positional variant of the phonemes s and r, and as a result, the rules regarding the treatment of visargáḥ in combination refer to several distinct levels of representation. For the same reason, it is important to distinguish whether a visargáḥ represents an underlying s or an underlying r.

The treatment of visargáḥ can be phrased in the following rules:

Before a voiceless stop, visargáḥ becomes the sibilant corresponding in place of articulation to that stop:

-ḥ → -ś / __ [ c ch ]

brāhmaṇāḥ calanti → brāhmaṇāś calanti “the Brāhmaṇas walk”
rāmaḥ ca → rāmaś ca “and Rāma”
induḥ chādayati → induś chādayati “the moon covers”

-ḥ → -ṣ / __ [ ṭ ṭh ]

paṇḍitaḥ ṭīkāṁ karōti → paṇḍitaṣ ṭīkāṁ karōti “The scholar composes a commentary”

-ḥ → -s / __ [ t th ]

sūryaḥ tapati → sūryas tapati “The sun is hot”
siddhāḥ tr̥pyanti → siddhās tr̥pyanti “The siddhas are satisfied”

Since there is no sibilant with a velar or labial place of articulation, the visargáḥ remains before velar and labial voiceless stops. In some orthographic traditions, however, it is written with a distinct sign called jihvāmūlīyaḥ before a velar voiceless stop (and pronounced as x); before a labial voiceless stop it is written with another sign, called upadhmānīyaḥ (and pronounced f).

vr̥kāḥ khādanti “the wolves eat” (optionally vr̥kāx khādanti)
indraḥ pibati “Indra drinks” (optionally indraf pibati)

Before a sibilant—and all Sanskrit sibilants are voiceless—visargáḥ remains. In some orthographic traditions, mainly those of South India, the visargáḥ is replaced by the following sibilant.

nr̥paḥ śāsti “the king governs” (optionally nr̥paś śāsti)
sarpaḥ sarpati “the snake slithers” (optionally sarpas sarpati)

Before any voiced sound, including vowels, what happens to the visargáḥ will depend on the preceding vowel, provided that the visargáḥ represents an underlying phoneme s.

In case the preceding vowel is a:

If the following sound is also the short vowel a, then the final sequence -aḥ becomes -ō, and the following a is elided. Its absence is usually marked with an avagrahaḥ:

-aḥ a- → -ō ’

If the following sound is any other vowel, the visargáḥ is simply dropped:

-aḥ → -a / __ [ ā i ī u ū r̥ r̥̄ l̥ ē ō ai au ]

pārthaḥ ēva → pārtha ēva
mahārājaḥ āstē → mahārāja āstē

If the following sound is a voiced consonant, the final sequence becomes -ō.

-aḥ → -ō / __ [ g gh j jh ḍ ḍh d dh b bh ṅ ñ ṇ n m y r v l h ]

saṁtuṣṭaḥ bhavati → saṁtuṣṭō bhavati
indraḥ hanti → indrō hanti

In case the preceding vowel is ā:

The visargáḥ is simply dropped (before any voiced sound, vowel or consonant).

In case the preceding vowel is anything else:

Generally, the visargáḥ is replaced by r.

-ḥ → -r / [ i ī u ū r̥ r̥̄ ē ō ai au ] __ [ a ā i ī u ū r̥ r̥̄ l̥ ē ō ai au g gh j jh ḍ ḍh d dh b bh ṅ ñ ṇ n m y v l h ]

agniḥ iva → agnir iva
vadhūḥ bhavati → vadhūr bhavati

However, when the following sound is r, the visargáḥ is dropped, with compensatory lengthening (if applicable) of the preceding vowel.

agniḥ rōcatē → agnī rōcatē

If, however, the visargáḥ represents an underlying r, then the r simply remains, except when it is followed by r. In that case, the outcome is exactly the same as an underlying s followed by r: the first r is dropped, with compensatory lengthening of the preceding vowel.

punar asti → punar asti
punar rōcatē → punā rōcatē
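The rules just listed can be collected into a single function. This is a minimal sketch, assuming unaccented transliterated input in which the first word is written with a final visargáḥ, and in which the caller says whether that visargáḥ stands for an underlying s or r. The function names are hypothetical, the optional spellings (x, f, doubled sibilants) are ignored, and lengthening of a final r̥ is not handled.

```python
import unicodedata

def nfc(text):
    """Normalize to NFC so that letters with diacritics compare reliably."""
    return unicodedata.normalize("NFC", text)

SIBILANT_BEFORE = {nfc(k): nfc(v) for k, v in [("c", "ś"), ("ṭ", "ṣ"), ("t", "s")]}
UNCHANGED_BEFORE = {nfc(c) for c in ["k", "p", "ś", "ṣ", "s"]}
VOWEL_INITIALS = {nfc(c) for c in ["a", "ā", "i", "ī", "u", "ū", "ē", "ō"]}
LENGTHEN = {nfc(k): nfc(v) for k, v in [("a", "ā"), ("i", "ī"), ("u", "ū")]}
R_VOCALIC = nfc("r̥")
LONG_A, LONG_O, AVAGRAHA = nfc("ā"), nfc("ō"), "\u2019"

def starts_with_r(word):
    return word.startswith("r") and not word.startswith(R_VOCALIC)

def starts_with_vowel(word):
    return word[0] in VOWEL_INITIALS or word.startswith(R_VOCALIC)

def lengthen_final(stem):
    if stem.endswith(("ai", "au")):
        return stem  # diphthongs are already long
    return stem[:-1] + LENGTHEN.get(stem[-1], stem[-1])

def visarga_sandhi(first, second, underlying="s"):
    """Join first (ending in ḥ) to second; underlying is 's' or 'r'."""
    first, second = nfc(first), nfc(second)
    stem, nxt = first[:-1], second[0]
    if nxt in SIBILANT_BEFORE:           # before c/ch, ṭ/ṭh, t/th
        return f"{stem}{SIBILANT_BEFORE[nxt]} {second}"
    if nxt in UNCHANGED_BEFORE:          # before velars, labials, sibilants
        return f"{first} {second}"
    # From here on the following sound is voiced.
    if underlying == "s" and stem.endswith("a"):
        if nxt == "a" and not second.startswith(("ai", "au")):
            return f"{stem[:-1]}{LONG_O} {AVAGRAHA}{second[1:]}"
        if starts_with_vowel(second):
            return f"{stem} {second}"
        return f"{stem[:-1]}{LONG_O} {second}"
    if underlying == "s" and stem.endswith(LONG_A):
        return f"{stem} {second}"
    if starts_with_r(second):            # no two r sounds in a row
        return f"{lengthen_final(stem)} {second}"
    return f"{stem}r {second}"

print(visarga_sandhi("rāmaḥ", "ca"))
print(visarga_sandhi("punaḥ", "asti", underlying="r"))
```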

However, I find it easier to understand visargasandhiḥ by bearing in mind that the visargáḥ itself is the outcome of a series of phonotactic processes, and hence what visargasandhiḥ really represents is the interaction of three relatively straightforward sets of rules:

The first is a set of assimilation rules.

Voice assimilation: A word-final s or r will take on the voicing features of the sound that follows.

If the following sound is voiceless, then the s or r will also become voiceless. We can represent it already at this stage with visargáḥ, which is a voiceless sound without any distinctive place features.
If the following sound is voiced, then the s or r will also be voiced. Since r is already voiced, it stays the same at this stage. An underlying s, however, is changed to a voiced sibilant Z. As a result of voice assimilation, we end up with three possible representations:

ḥ before a voiceless sound, standing for an underlying r or s;
r before a voiced sound, standing for an underlying r;
Z before a voiced sound, standing for an underlying s.

Place assimilation: This process only applies to the voiceless sound ḥ, which arises in connection with voice assimilation. The visargáḥ is assimilated to the place of articulation of the following stop. The following outcomes are possible:

x (a voiceless velar sibilant) before any voiceless velar stop (k or kh)
ś (a voiceless palatal sibilant) before any voiceless palatal stop (c or ch)
ṣ (a voiceless retroflex sibilant) before any voiceless retroflex stop (ṭ or ṭh)
s (a voiceless dental sibilant) before any voiceless dental stop (t or th)
f (a voiceless labial sibilant) before any voiceless labial stop (p or ph)

Since place assimilation only applies to stop consonants, visargáḥ remains before sibilants.

The second is a set of rules that “resolves” all of the sounds generated above, either back into visargáḥ, or otherwise into other sounds of the Sanskrit language.

The voiceless sibilants x and f are generally replaced with visargáḥ.
The voiced sibilant Z—which is simply the voiced counterpart to the voiceless visargáḥ—is resolved in a number of ways:

Generally, aZ will turn into ō before a voiced consonant.
aZ “swallows up” a following a-vowel, resulting in expressions like sō ’bravīt.
Before a vowel, aZ generally becomes a, and a hiatus remains.
āZ will generally just become ā, and a hiatus remains.
In all other cases, Z becomes r.

Finally, Sanskrit has a constraint on two r sounds occurring in a row, so if the above rules produce any such cases, they need to be resolved by deleting the first r and lengthening the previous vowel.
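
The three sets of rules above can be sketched as a small pipeline. This is only an illustrative sketch, not a complete treatment of visargasandhiḥ: the function name `visarga_sandhi` and its helpers are hypothetical, the input is assumed to be unaccented ISO-15919 text whose first word ends in an underlying s or r, and the case of a word standing before a pause is left out.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so that long vowels and dotted letters are single code points."""
    return unicodedata.normalize("NFC", text)


VISARGA, A_LONG, O_LONG = nfc("ḥ"), nfc("ā"), nfc("ō")
# First letters of the voiceless stops and sibilants (kh, ch, ... begin like k, c, ...).
VOICELESS = set(nfc("kcṭtpśṣs"))
# First letters of vowels; syllabic r̥ (r + combining ring below) is checked separately.
VOWEL_INITIALS = set(nfc("aāiīuūēō"))
SYLLABIC_R = "r\u0325"
# Place assimilation of the visarga before voiceless stops.
PLACE = {"k": "x", "c": nfc("ś"), nfc("ṭ"): nfc("ṣ"), "t": "s", "p": "f"}
LENGTHEN = {"a": A_LONG, "i": nfc("ī"), "u": nfc("ū")}


def visarga_sandhi(word, following):
    """Join a word ending in underlying s or r to the word that follows it."""
    word, following = nfc(word), nfc(following)
    stem, final = word[:-1], word[-1]
    if final not in ("s", "r"):
        raise ValueError("word must end in an underlying s or r")
    syllabic_r = following.startswith(SYLLABIC_R)
    nxt = following[0]
    is_vowel = nxt in VOWEL_INITIALS or syllabic_r

    # 1a. Voice assimilation: visarga, r, or the voiced sibilant Z.
    if nxt in VOICELESS:
        sound = VISARGA
    elif final == "r":
        sound = "r"
    else:
        sound = "Z"

    # 1b. Place assimilation: only the voiceless outcome, only before stops.
    if sound == VISARGA:
        sound = PLACE.get(nxt, VISARGA)

    # 2. Resolution of the sounds that Sanskrit does not keep.
    if sound in ("x", "f"):
        sound = VISARGA
    elif sound == "Z":
        if stem.endswith("a"):
            short_a = nxt == "a" and following[1:2] not in ("i", "u")
            if short_a:        # aZ swallows up a following short a
                stem, sound = stem[:-1] + O_LONG, ""
                following = "\u2019" + following[1:]
            elif is_vowel:     # a remains, with hiatus
                sound = ""
            else:              # ō before a voiced consonant
                stem, sound = stem[:-1] + O_LONG, ""
        elif stem.endswith(A_LONG):   # āZ becomes ā (Z is dropped)
            sound = ""
        else:                  # all other cases: Z becomes r
            sound = "r"

    # 3. No two r sounds in a row: drop the first and lengthen the vowel.
    if sound == "r" and nxt == "r" and not syllabic_r:
        sound = ""
        stem = stem[:-1] + LENGTHEN.get(stem[-1], stem[-1])

    return f"{stem}{sound} {following}"
```

Under these assumptions, `visarga_sandhi("punar", "rōcatē")` yields punā rōcatē and `visarga_sandhi("punar", "asti")` yields punar asti, as in the examples above; with an underlying s, `visarga_sandhi("sas", "abravīt")` yields sō ’bravīt.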

§7.6.Internal consonant sandhiḥ: Voice

Voice assimilation is when one sound takes on the voicing features of another sound, which is usually directly adjacent to it. In internal sandhi, voice assimilation is only triggered by obstruents: that is, when a consonant is followed by a stop or sibilant, voice features are either spread leftwards or rightwards across the entire conjunct; when a consonant is followed by a vowel, semivowel, or nasal, which are always voiced, no voice assimilation takes place. This is in contrast to external sandhi, where the final consonant of one word is always assimilated to the voice features of the following sound, regardless of what the following sound is.

Voice assimilation proceeds differently depending on the other features of the consonants involved. The main forms of voice assimilation in internal sandhi are:

Regressive voice assimilation. This occurs between two obstruents, of which the first is unaspirated. Most often, the first sound is voiced and the second sound is voiceless, and hence the entire conjunct ends up being voiceless, but in a few instances the reverse is the case: the first sound is voiceless, and the second sound is voiced, and the entire conjunct ends up being voiced.

ád-ti → átti “he eats”
vḗd-ti → vḗtti “he knows”
yuk-tá-m → yuktám (voiceless spreads left)
bhuj-tá-m → bhuktám “enjoyed” (voiceless spreads left; for depalatalization, see below)
bubhuj-sā → bubhukṣā “hunger, desire to eat” (voiceless spreads left; for depalatalization, see below)
upa-pad-ti-ḥ → upapattiḥ “making sense” (voiceless spreads left)
upá-p-∅-d-am → upábdam (voiced spreads left)
bhuṅk-dhvē → bhuṅgdhvē “you enjoy” (voiced spreads left)
śak-dhí → śagdhí “help” (voiced spreads left)

Progressive voice assimilation. This occurs between two obstruents, of which the first is aspirated. It is also called Bartholomae’s Law. When the first sound is aspirated, it passes its voice features (and its aspiration) to the following sound, rather than the reverse (as observed above). Generally, it is only voiced aspirates that form a context for this rule, since voiceless aspirates will not generally come into contact with another obstruent to their right. Hence Bartholomae’s Law can be thought of as progressive assimilation of voicing under the condition of an initial voiced aspirate.

budh-tá-ḥ → buddháḥ “awoken”
lubh-tá-ḥ → lubdháḥ “greedy”
labh-tá-ḥ → labdháḥ “obtained”

The sound s behaves regularly when it forms the right-hand context: it devoices a preceding voiced obstruent. In addition, however, it also removes any aspiration from the preceding obstruent:

labh-sya-ti → lapsyati “he will obtain”
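
The regressive rule, Bartholomae’s Law, and the behavior of s in the right-hand context can be put side by side in a short sketch. The feature table and the name `join_obstruents` are hypothetical; the function looks only at the two adjoining obstruents and does nothing about depalatalization, which is a separate process.

```python
# Each row: (voiceless, voiceless aspirate, voiced, voiced aspirate).
SERIES = {
    "velar": ("k", "kh", "g", "gh"),
    "palatal": ("c", "ch", "j", "jh"),
    "retroflex": ("ṭ", "ṭh", "ḍ", "ḍh"),
    "dental": ("t", "th", "d", "dh"),
    "labial": ("p", "ph", "b", "bh"),
}
FEATURES = {}  # sound -> (place, voiced, aspirated)
SOUND = {}     # (place, voiced, aspirated) -> sound
for place, row in SERIES.items():
    for index, stop in enumerate(row):
        key = (place, index >= 2, index % 2 == 1)
        FEATURES[stop] = key
        SOUND[key] = stop


def join_obstruents(first, second):
    """Return the pair of sounds that results when `first` meets `second`."""
    place1, voiced1, aspirated1 = FEATURES[first]
    if second == "s":
        # s devoices and deaspirates the preceding obstruent.
        return SOUND[(place1, False, False)], "s"
    place2, voiced2, _ = FEATURES[second]
    if aspirated1 and voiced1:
        # Progressive: voice and aspiration pass to the second sound.
        return SOUND[(place1, True, False)], SOUND[(place2, True, True)]
    if aspirated1:
        raise ValueError("voiceless aspirates are not covered by this sketch")
    # Regressive: the first sound takes the voicing of the second.
    return SOUND[(place1, voiced2, False)], second
```

For example, `join_obstruents("d", "t")` gives `("t", "t")` as in átti, `join_obstruents("dh", "t")` gives `("d", "dh")` as in buddháḥ, and `join_obstruents("bh", "s")` gives `("p", "s")` as in lapsyati.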

When it forms the left-hand context, however, the outcomes call for some comment, because Sanskrit does not have any voiced sibilants. Generally s is retained before voiceless obstruents, including s; one exception is the form asi “you are” (as-si). Before voiced obstruents, it disappears, with compensatory lengthening of the previous vowel:

ās-dhvē → ādhvē “you sit”
śās-dhi → śādhi “punish”

The sound h, which is voiced and aspirated, partly behaves as any other voiced aspirate—but only partly. When it forms the right-hand context, which only happens in external sandhi, it spreads its voice features leftwards, and receives its place features from the preceding stop:

tát hí → táddhí
anuṣṭúb hí → anuṣṭúbbhí

When it forms the left-hand context, which generally happens when h stands at the end of a verbal root, the outcome depends on whether the h represents an earlier velar or palatal:

When a final h represents an earlier velar, which is most often the case in roots beginning with a dental stop such as dah “burn” and duh “milk,” it is treated as if it were gh:

dah-tá-ḥ → dagdháḥ “burned”
duh-tá-ḥ → dugdháḥ “milked”

When a final h represents an earlier palatal, the outcomes are peculiar: it is as if the underlying palatal aspirate (ȷ́h) spread its aspiration to the following stop, as usual, and then developed into a voiced palatal sibilant (ź). This voiced palatal sibilant turns a following dental sound into a retroflex sound, just as the voiceless palatal sibilant (ś) does. But because there is no voiced palatal sibilant in the phonemic inventory of Sanskrit, this sound disappears. If the vowel preceding h is a, then it becomes ō; otherwise, the vowel is simply lengthened.

ruh-tá-ḥ → ruź-dhá-ḥ → rūḍháḥ “ascended”
muh-tá-ḥ → muź-dhá-ḥ → mūḍháḥ “bewildered”
lḗh-ti → lḗź-dhi → lḗḍhi “he licks”
lih-dhvám → liź-dhvám → līḍhvám “lick”
sáh-tum → sáź-dhum → sṓḍhum “to bear”
váh-tum → váź-dhum → vṓḍhum “to carry”

§7.7.Internal consonant sandhiḥ: Aspiration

Many cases of changes involving aspiration have been discussed above, including the progressive assimilation of aspiration from a voiced aspirate in the left-hand context, and the deaspiration of a consonant due to a following sibilant. Here we may mention one more phenomenon connected with aspiration: Grassmann’s Law, the “throwing backwards” of aspiration that is conditioned by deaspiration. If a root ends in an aspirated consonant, and also begins with a stop consonant, then when the root-final consonant is deaspirated under the influence of a following sibilant, its aspiration is “thrown back” onto the root-initial stop. Here are some examples:

dōh-sya-ti → dhōkṣyati “he will milk”
dah-sya-ti → dhakṣyati “he will burn”
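
A minimal sketch of this “throwing back” follows. It makes several assumptions that go beyond the text: the root is given as a list of sounds, a root-final h of velar origin is written as gh, and only the voiced initials g, d, and b are allowed to receive the aspiration. The name `throw_back` is hypothetical.

```python
# Root-final voiced aspirates as they appear before s (deaspirated and devoiced).
BEFORE_S = {"gh": "k", "dh": "t", "bh": "p"}
# Root-initial stops that can receive the aspiration (an assumption of this sketch).
ASPIRATE = {"g": "gh", "d": "dh", "b": "bh"}


def throw_back(root):
    """Return the shape of `root` before a suffix beginning with s."""
    sounds = list(root)
    final = sounds[-1]
    if final not in BEFORE_S:
        return sounds
    sounds[-1] = BEFORE_S[final]                   # deaspiration before the sibilant
    sounds[0] = ASPIRATE.get(sounds[0], sounds[0])  # aspiration moves to the initial stop
    return sounds
```

With dōh taken as d-ō-gh, `throw_back(["d", "ō", "gh"])` returns `["dh", "ō", "k"]`, the root shape seen in dhōkṣyati; a root such as labh, which does not begin with a stop, simply becomes lap-.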

There is some debate about the motivation of this rule: while it is clearly a synchronic rule of Sanskrit phonology—indeed earlier stages of the language apply the rule only sporadically—it has been adduced in support of a theory that Indo-European roots were actually “biaspirate,” that is, that aspiration was a feature of the entire root, rather than one or another of its consonants.

§7.8.Internal consonant sandhiḥ: Retroflexion

Retroflexion is a phonological process whereby a dental sound (i.e., t, th, d, dh, n, or s) becomes a retroflex sound (i.e., ṭ, ṭh, ḍ, ḍh, ṇ, or ṣ) due to the influence of a preceding sound. One of the special features of retroflexion is that, under certain circumstances, it can operate at a distance: the “target” sound does not need to immediately follow the “trigger” sound.

Retroflexion of stops. The dental stops t, th, d, and dh immediately following the retroflex sibilant ṣ and the palatal sibilant ś become their retroflex equivalents; in the latter case, the palatal sibilant is changed to the retroflex sibilant ṣ. Some cases of a final j behave similarly to a final ś, in that a following dental becomes retroflex, and the triggering palatal becomes retroflex in turn, although other cases of a final j behave similarly to a final c (see below) and do not cause retroflexion.

ti-stha-ti → tíṣṭhati “he stands” (3rd.sg.parasmai./parasmai.prathama.ēka.)
dus-taram → duṣ-taram → duṣṭaram “difficult to overcome” (nom.sg.)
dr̥ś-tá-m → dr̥ṣṭám “seen”
viś-tá-m → viṣṭám “entered”
parā-mr̥j-ta-m → parāmr̥ṣṭam “referred to”

Retroflexion of sibilants. A dental s immediately following one of the ruki sounds becomes a retroflex ṣ. ruki is an acronym for the sounds that trigger retroflexion of a sibilant: R (r, r̥, and r̥̄), U (u, ū, ō and au), K (k), and I (i, ī, ē, and ai). These sounds share the phonological feature high, i.e., they are all articulated with the tongue raised high in the mouth. This process only occurs when the s is followed by a vowel or the sounds t, th, n, m, y or v.

gurú-sú → gurúṣu “teachers” (loc.pl./saptamībahu.)
girí-sú → giríṣu “mountains” (loc.pl./saptamībahu.)
pit-r̥-sú → pitr̥ṣú “fathers” (loc.pl./saptamībahu.)
diś-sú → dik-sú (see depalatalization below) → dikṣú “directions” (loc.pl./saptamībahu.)
bi-bhar-si → bíbharṣi “you carry” (2nd.sg.parasmai./parasmai.madhyama.ēka.)

Note, in particular, that ruki does not apply when s is followed by r (the so-called tisra-rule):

tisráḥ → tisráḥ “three”
usrā́ → usrā́ “daybreak”

ruki applies even when an anusvāraḥ separates the trigger from the target, although generally only in the nominative-accusative plural of neuter stems:

sárpīṁ-si → sárpīṁṣi “butters”
jyṓtīṁ-si → jyṓtīṁṣi “celestial lights”
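
The conditions on ruki stated so far (the trigger set, the required right-hand context, the tisra-rule, and transparency of an anusvāraḥ) can be sketched as follows. The word is assumed to be given as a list of unaccented sounds, and the name `apply_ruki` is hypothetical; the sketch lets any anusvāraḥ be skipped, without restricting this to neuter plurals.

```python
import unicodedata


def nfc(text):
    return unicodedata.normalize("NFC", text)


RUKI = {nfc(s) for s in ["r", "r̥", "r̥̄", "u", "ū", "ō", "au", "k", "i", "ī", "ē", "ai"]}
VOWELS = {nfc(s) for s in ["a", "ā", "i", "ī", "u", "ū", "r̥", "r̥̄", "l̥", "ē", "ō", "ai", "au"]}
# s is retroflexed only before a vowel or t, th, n, m, y, v (so not before r).
RIGHT_CONTEXT = VOWELS | {"t", "th", "n", "m", "y", "v"}
ANUSVARA, RETROFLEX_S = nfc("ṁ"), nfc("ṣ")


def apply_ruki(segments):
    """Retroflex every s that follows a ruki sound in the right context."""
    segs = [nfc(s) for s in segments]
    out = list(segs)
    for i, seg in enumerate(segs):
        if seg != "s" or i == 0 or i == len(segs) - 1:
            continue
        j = i - 1
        if segs[j] == ANUSVARA and j > 0:
            j -= 1  # an anusvāraḥ does not block the trigger
        if segs[j] in RUKI and segs[i + 1] in RIGHT_CONTEXT:
            out[i] = RETROFLEX_S
    return out
```

Joining the result of `apply_ruki(["d", "i", "k", "s", "u"])` gives dikṣu, while the s of tisraḥ is left alone because it is followed by r.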

Note that the operation of ruki between a triggering preverb (e.g., abhí, ní, anú, nír, parí) and the initial s of a verbal form is lexically specified, that is, some verbal roots allow the initial s to be retroflex, while others do not. Those that admit of initial retroflexion are taught in the dhātupāṭha with a retroflex ṣ, and are therefore called ṣōpadēśaḥ (“taught with ṣ”), while those that do not are taught with s and called sōpadēśaḥ:

ni-snātaḥ → niṣṇātaḥ “skilled” (ṣōpadēśaḥ)
vi-sarati → visarati “spreads” (sōpadēśaḥ)

Retroflexion of nasals. The dental nasal n, when it follows the retroflex sounds r, r̥, r̥̄ and ṣ within the same word, becomes the retroflex nasal ṇ. This assimilation, which is called nati, can happen even at a distance, that is, even if there are sounds between the trigger sound and the target sound. The triggering of retroflexion is blocked, however, by coronal stops, which include palatal, retroflex, and dental stops.

rāmāyanam → rāmāyaṇam, where n is retroflexed by r despite the intervention of -āmāya-.
arkēna → arkēṇa, where n is retroflexed by r despite the intervention of -kē-.
īkṣamānam → īkṣamāṇam, where n is retroflexed by ṣ despite the intervention of -amā-.

Contrast the case of arcanam, where the retroflexion of n by r is blocked by the palatal stop c.
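
Because nati works at a distance, it is naturally modeled as a left-to-right scan that remembers whether a trigger is still “active.” The sketch below implements only the conditions stated above (triggers r, r̥, r̥̄, ṣ; blocking by coronal stops) and takes the word as a list of unaccented sounds; the name `apply_nati` is hypothetical.

```python
import unicodedata


def nfc(text):
    return unicodedata.normalize("NFC", text)


TRIGGERS = {nfc(s) for s in ["r", "r̥", "r̥̄", "ṣ"]}
# Coronal stops: palatal, retroflex, and dental.
BLOCKERS = {nfc(s) for s in ["c", "ch", "j", "jh", "ṭ", "ṭh", "ḍ", "ḍh", "t", "th", "d", "dh"]}
RETROFLEX_N = nfc("ṇ")


def apply_nati(segments):
    """Retroflex each n that follows a trigger with no coronal stop in between."""
    out = []
    active = False
    for seg in (nfc(s) for s in segments):
        if seg in TRIGGERS:
            active = True
        elif seg in BLOCKERS:
            active = False
        elif seg == "n" and active:
            seg = RETROFLEX_N
        out.append(seg)
    return out
```

Applied to a-r-k-ē-n-a this yields arkēṇa, whereas in a-r-c-a-n-a-m the palatal c switches the trigger off before the n is reached.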

§7.9.Internal consonant sandhíḥ: Depalatalization

Palatal consonants are uniquely liable to changes in place of articulation. This is because palatal consonants come from two sources in Sanskrit: Indo-European palatovelars (ḱ, ǵ, and ǵh), which became ś, j, and h respectively, and Indo-European velars and labiovelars (k, g, and gh), which were palatalized in certain contexts in the history of Indo-Iranian, and became the sounds c, j, and jh, respectively. In both cases, palatals are generally replaced with either velar or retroflex sounds in combination, but for the sound j, the outcome will depend on whether it represents an earlier palatovelar or an earlier velar. (Compare the different developments of h noted above, in which the outcome depends on whether h represents an earlier palatovelar or an earlier velar.)

The “erstwhile velars” (c and j in some contexts) revert to velars before obstruents, and regular assimilation of voice follows:

vác-ti → vákti “he speaks”
vac-dhí → vagdhí “speak!”
vác-si → vákṣi “you speak”
yuñj-tē → yuṅktē “he joins”
yuñj-dhí → yuṅgdhí “join!”

The “erstwhile palatovelars” (ś and j in some contexts) have a variety of outcomes: ṭ when final; k before s in verbal forms (with retroflexion of the following sibilant), ṭ before s in nominal forms; ḍ before voiced stops (with retroflexion of the following stop if it is a dental); ṣ before t and th (with retroflexion of the following stop).

váś-ti → váṣṭi “he wishes”
viś-su → viṭsu “among the tribes”
viś-bhiḥ → viḍbhiḥ “with the tribes”
mā́rj-ti → mā́rṣṭi “he brushes”
sŕ̥j-ti-ḥ → sŕ̥ṣṭiḥ “creation”
rāj-tra-ḥ → rāṣṭraḥ “polity”
mr̥j-dhí → mr̥ḍḍhí “brush!”

One exception to the above rule about “erstwhile palatovelars” is furnished by the roots dr̥ś “see,” spr̥ś “touch,” diś “point out” and optionally naś “be destroyed” and viś “enter.” Instead of turning the final palatal into ṭ before zero and ḍ before voiced stops in nominal forms, they turn the final palatal into k or g:

díś-su → díkṣu “among the directions”
díś-bhiḥ → dígbhiḥ “with the directions”
díś-s → dík-s → dik “direction” (nom.sg.)

§7.10.Internal consonant sandhíḥ: Assimilation of nasals

Nasal consonants are generally assimilated to the place of articulation of a following sound in internal sandhíḥ. In case the following sound is a sibilant, the nasal becomes anusvāraḥ.

man-sya-tē → maṁsyatē “he will think”
han-sya-ti → haṁsyati “he will kill”
bhu-n-k-tē → bhuṅktē “he eats”
bhu-n-j-ānaḥ → bhuñjānaḥ “eating”

A dental n is palatalized after palatal stops:

yaj-na-m → yajñam “sacrifice”
yāc-nā → yācñā “request”
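
These two statements about nasals amount to a pair of table lookups. In the sketch below, the names `nasal_before` and `n_after` are hypothetical, and leaving a nasal unchanged before anything other than a stop or sibilant is an assumption of the sketch rather than a claim of the text.

```python
import unicodedata


def nfc(text):
    return unicodedata.normalize("NFC", text)


# Nasal of each place of articulation, keyed by the stops of that place.
PLACE_NASAL = {}
for nasal, stops in [("ṅ", "k kh g gh"), ("ñ", "c ch j jh"), ("ṇ", "ṭ ṭh ḍ ḍh"),
                     ("n", "t th d dh"), ("m", "p ph b bh")]:
    for stop in stops.split():
        PLACE_NASAL[nfc(stop)] = nfc(nasal)
SIBILANTS = {nfc(s) for s in ["ś", "ṣ", "s"]}
PALATAL_STOPS = {"c", "ch", "j", "jh"}
ANUSVARA = nfc("ṁ")


def nasal_before(nasal, following):
    """Shape of a nasal before the given sound (internal sandhi)."""
    following = nfc(following)
    if following in SIBILANTS:
        return ANUSVARA
    return PLACE_NASAL.get(following, nfc(nasal))


def n_after(preceding, nasal):
    """A dental n is palatalized after a palatal stop."""
    if nasal == "n" and preceding in PALATAL_STOPS:
        return nfc("ñ")
    return nasal
```

So `nasal_before("n", "s")` returns the anusvāraḥ of maṁsyatē, `nasal_before("n", "k")` the velar nasal of bhuṅktē, and `n_after("j", "n")` the palatal nasal of yajñam.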

§7.11.Combinations of vowels

One of the basic principles of Sanskrit phonotactics is that vowels do not directly adjoin other vowels. The juncture between a vowel and another vowel is called a hiatus, a Latin word literally meaning a “yawn,” and within a word, it only occurs in a small number of words, where it is conventionally marked by a diaeresis on the second vowel (e.g., titaü-). Between words, hiatus sometimes occurs as a secondary outcome of some phonotactic processes, but the general tendency is to avoid hiatus whenever possible. Hence the overarching principle of “vowel sandhi” could be described as hiatus avoidance.

The following processes account for the vast majority of cases where one vowel would directly adjoin another:

§7.12.Combinations of vowels: synaeresis

Synaeresis refers to the process by which two homorganic savarṇaḥ vowels are combined into a single long vowel Aṣṭādhyāyī 6.1.101. This only applies to simple vowels aK, because only simple vowels can be homorganic with each other: complex vowels such as ē, ō, ai, and au each have multiple places of articulation. In the following rules, the vowels are marked with both a brevis and a longum to show that the length of the vowel does not matter.

ā̆ + ā̆ → ā

upa + arjitam → upārjitam “acquired”

ī̆ + ī̆ → ī

abhi + itam → abhītam “gone over”

ū̆ + ū̆ → ū

su + uktam → sūktam “well-said”

§7.13.Combinations of vowels: diphthongization

Diphthongization. When two vowels that are not homorganic come into contact, the outcome depends on their sequence, and in particular, on whether the higher vowel comes first or last. Height is a feature of Sanskrit phonemes corresponding to whether the tongue is raised when pronouncing them; i and u are high, but a is not. When the vowel sequence consists of a low vowel followed by a high vowel, the result is what is called a “rising diphthong,” a single vowel that starts low but ends high, such as ai (ē) and au (ō).

ā̆ + ī̆ → ē

pra + itaḥ → prētaḥ “gone forth,” a ghost

ā̆ + ū̆ → ō

upa + udghātaḥ → upōdghātaḥ “preface”
na + u → nō “not”

ā̆ + r̥ or r̥̄ → ar

mahā + r̥ṣiḥ → maharṣiḥ “great sage”
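
Synaeresis and diphthongization can be sketched together, since both take a pair of simple vowels and return a single outcome. The sketch assumes unaccented input and covers only the combinations listed above; the names `combine` and `join` are hypothetical.

```python
import unicodedata


def nfc(text):
    return unicodedata.normalize("NFC", text)


# Simple vowels grouped by quality; length is ignored, as in the rules above.
CLASS = {nfc(v): c for v, c in [("a", "a"), ("ā", "a"), ("i", "i"), ("ī", "i"),
                                ("u", "u"), ("ū", "u"), ("r̥", "r"), ("r̥̄", "r")]}
LONG = {"a": nfc("ā"), "i": nfc("ī"), "u": nfc("ū")}   # synaeresis
RISING = {"i": nfc("ē"), "u": nfc("ō"), "r": "ar"}      # a-vowel + higher vowel
ALL_VOWELS = sorted(list(CLASS) + [nfc(v) for v in ["ē", "ō", "ai", "au"]],
                    key=len, reverse=True)


def combine(first, second):
    """Outcome of two adjoining simple vowels, or None if neither rule applies."""
    c1, c2 = CLASS.get(nfc(first)), CLASS.get(nfc(second))
    if c1 is None or c2 is None:
        return None
    if c1 == c2 and c1 in LONG:
        return LONG[c1]
    if c1 == "a" and c2 in RISING:
        return RISING[c2]
    return None


def join(left, right):
    """Join a vowel-final form to a vowel-initial form."""
    left, right = nfc(left), nfc(right)
    v1 = next((v for v in ALL_VOWELS if left.endswith(v)), None)
    v2 = next((v for v in ALL_VOWELS if right.startswith(v)), None)
    result = combine(v1, v2) if v1 and v2 else None
    if result is None:
        raise ValueError("not a case of synaeresis or diphthongization")
    return left[:-len(v1)] + result + right[len(v2):]
```

For instance, `join("upa", "arjitam")` returns upārjitam, `join("pra", "itaḥ")` returns prētaḥ, and `join("mahā", "r̥ṣiḥ")` returns maharṣiḥ. Sequences that begin with a high vowel raise an error here, because they belong to a different process.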

§7.14.Combinations of vowels: glide formation

Glide formation is similar to diphthongization, except that the first vowel is high, and the second vowel may be either low or high. When the second vowel is low, this process results in what are sometimes called “falling diphthongs.” The name of this process reflects the fact that the high vowel becomes a glide, that is, a non-syllabic segment with the same features. Pāṇini phrased the glide-formation rule as iKō yaṆ aCi Aṣṭādhyāyī 6.1.77, literally “a high vowel [iK = {i, ī, u, ū, r̥, r̥̄, l̥}] becomes the corresponding semivowel [yaṆ = {y, v, r, l}] before a vowel [aC = {a, ā, i, ī, u, ū, r̥, r̥̄, l̥, ē, ō, ai, au}].” (In fact glide formation does not happen before every vowel: in Pāṇini’s grammar, this rule is bled by the synaeresis rule, discussed above, and hence the vowel in the right-hand context will never be identical to the vowel that is replaced with a semivowel.)

The general rule of glide formation is that the first vowel simply becomes the corresponding semivowel. Here are some examples from internal sandhi:

nadī́ + ā → nadyā́ (“river,” fem.sg.instr.)
vadhū́ + ā → vadhvā́ (“bride,” fem.sg.instr.)
bhṓ + a + ti → bhávati (“becomes,” 3rd.sg.parasmai. present indic.)

And here are some examples from external sandhi:

dadhi # atra → dadhy atra “curd here”
madhu # atra → madhv atra “honey here”
ati + āhitam → atyāhitam “great calamity”
pitr̥ + artham → pitrartham “for the sake of the ancestors”
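
The general rule can be sketched as a lookup from each high vowel to its semivowel, with a check that stands in for the bleeding of glide formation by the rule for homorganic vowels. The input is assumed to be unaccented, the name `glide_formation` is hypothetical, and the `external` flag (which only controls whether a space is kept between words) is a convenience of this sketch.

```python
import unicodedata


def nfc(text):
    return unicodedata.normalize("NFC", text)


# iK -> yaṆ: each high vowel and its corresponding semivowel.
GLIDE = {nfc(v): g for v, g in [("i", "y"), ("ī", "y"), ("u", "v"), ("ū", "v"),
                                ("r̥", "r"), ("r̥̄", "r"), ("l̥", "l")]}
OTHER_VOWELS = [nfc(v) for v in ["a", "ā", "ē", "ō", "ai", "au"]]
ALL_VOWELS = sorted(list(GLIDE) + OTHER_VOWELS, key=len, reverse=True)


def glide_formation(left, right, external=False):
    """Turn the final high vowel of `left` into a semivowel before `right`."""
    left, right = nfc(left), nfc(right)
    high = next((v for v in ALL_VOWELS if left.endswith(v)), None)
    onset = next((v for v in ALL_VOWELS if right.startswith(v)), None)
    if high not in GLIDE or onset is None:
        raise ValueError("needs a final high vowel followed by a vowel")
    if GLIDE.get(onset) == GLIDE[high]:
        # Homorganic vowels merge into a long vowel instead.
        raise ValueError("homorganic vowels: glide formation is bled")
    separator = " " if external else ""
    return left[:-len(high)] + GLIDE[high] + separator + right
```

Thus `glide_formation("dadhi", "atra", external=True)` returns dadhy atra and `glide_formation("pitr̥", "artham")` returns pitrartham, while `glide_formation("abhi", "itam")` raises an error because the two vowels are homorganic.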

However, there are a number of cases in which a high vowel does not simply become the corresponding semivowel, but rather becomes a vowel-semivowel (or semivowel-vowel-semivowel) sequence. We can call this a syllabic glide. This only ever happens in internal sandhi, and only ever at the end of a morpheme. It generally serves to keep the morpheme (in many cases a verbal root) in its own syllable, thus preventing it from syllabifying with the following vowel.

An i-vowel is replaced by the syllabic glide iy, and a u-vowel is replaced by the syllabic glide uv, before an ending beginning with a vowel, in these circumstances:

at the end of a verbal root Aṣṭādhyāyī 6.4.77;
- kṣi + a + nti → kṣiyanti “they reside”
at the end of the present stem forming suffix Śnu (i.e., nu) Aṣṭādhyāyī 6.4.77;
- āp + nu + anti → āpnuvanti “they obtain”
at the end of a reduplicate, before a non-homorganic vowel Aṣṭādhyāyī 6.4.78;
- i + ēṣ + a → iyēṣa “he wanted” (3rd.sg.parasmai. perfect)
at the end of the nominal stem bhrū́ f. “eyebrow” Aṣṭādhyāyī 6.4.77;
- bhrū́ + aḥ → bhrúvaḥ “eyebrows”
at the end of the nominal stem strī́ f. “woman” Aṣṭādhyāyī 6.4.79, although optionally in the accusative singular and plural;
- strī́ + aḥ → stríyaḥ “women”

Pāṇini calls these syllabic glides iyaṄ and uvaṄ.

An r̥ at the end of a verbal root is replaced by ri before the present stem forming suffix Śa (á) of sixth-class roots, before the present stem forming suffix yaK (yá) of the passive, and before the optative endings. This ri, standing at the end of a verbal root, then takes the syllabic glide iy Aṣṭādhyāyī 7.4.28:

ā + dr̥ + a + tē → ādriyátē “he honors”
kr̥ + ya + tē → kriyátē “it is being done”

An r̥̄ at the end of a verbal root is replaced by ir Aṣṭādhyāyī 7.1.100:

kr̥̄ + a + ti → kiráti “he sprinkles”
