Essay on Indian Languages

India is the home of a very large number of languages. In fact, so many languages and dialects are spoken in India that it is often described as a ‘museum of languages’. The language diversity is by all means baffling. In popular parlance it is often described as ‘linguistic pluralism’. But this may not be a correct description. The prevailing situation in the country is not pluralistic but that of a continuum. One dialect merges into the other almost imperceptibly; one language replaces the other gradually. Moreover, along the line of contact between two languages, there is a zone of transition in which people are bilingual.

Thus lan­guages do not exist in water-tight compartments. While linguistic pluralism is a state of mutual existence of several languages in a con­tiguous space, it does not preclude the possibility of inter-connections between one language and the other. In fact, these links have grown over millennia of shared history. While linguistic pluralism continues to be a distinctive feature of the modern Indian state, it will be wrong to assume that there has been no interaction between the different groups.

On the contrary, give-and-take between the language groups has been very common, often resulting in systematic borrowings from one language into another. Cases of assimilation of one language into another are also not uncommon. Let us look at the nature of linguistic diversity observed in India today. According to the Linguistic Survey of India conducted by Sir George Abraham Grierson towards the end of the nineteenth century, there were 179 languages and as many as 544 dialects in the country.

However, this number has to be taken with caution. It may even be misleading in the sense that dialects and languages were enumerated separately, although they were taxonomically part of the same lan­guage. Of the 179 languages as many as 116 were speech-forms of the Sino-Tibetan family, spoken by small tribal communities in the re­mote Himalayan and the northeastern parts of the country.

Even the 1961 census recorded 187 languages. This was despite the fact that the census investigation was far more systematic and the classification was based on modern linguistic criteria. Much of this diversity may be un­derstood properly in the light of some more statistics. For example, 94 out of the 187 languages were spoken by small populations of 10,000 persons or less. In the final analysis, about 97 per cent of the country’s population was found affiliated with just 23 languages.
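The proportions implied by these census figures can be worked out directly. The following is only an illustrative sketch using the numbers quoted above; the variable names are invented for this example, and the 97 per cent figure is approximate in the text itself.

```python
# Figures as quoted in the text for the 1961 census.
total_languages = 187
small_languages = 94           # spoken by 10,000 persons or fewer
major_languages = 23
major_population_pct = 97.0    # "about 97 per cent" of the population

# Share of all recorded languages that had very small speech communities.
small_language_pct = 100 * small_languages / total_languages

# Languages outside the 23 largest, and the population share left to them.
other_languages = total_languages - major_languages
other_population_pct = 100 - major_population_pct

print(f"Languages with 10,000 speakers or fewer: {small_language_pct:.1f}% of all languages")
print(f"Languages outside the 23 largest: {other_languages}")
print(f"Population share of those languages: about {other_population_pct:.0f}%")
```

Read this way, roughly half of the recorded languages had very small speech communities, while the many languages outside the 23 largest together accounted for only a few per cent of the population.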

The diversity of languages and dialects is a reality and it is not the numerical strength of the speakers of a language which is important. The important fact is that there are people who claim a certain lan­guage as their mother tongue. Another related development which contributed to linguistic di­versity was the development of script. Different Indian languages were written in different scripts. This made learning of different languages a difficult exercise. However, with the growth of scripts, written lan­guages have been successful in maintaining their record with the consequence that literary traditions have evolved.

With the develop­ment of a script, oral communication is supplemented by a more powerful form of written communication. In the course of time some of the minor dialect and language groups have lost their identity as they have been assimilated into developed languages. It is a known fact that most of the languages still serve the purpose of oral communica­tion only as speakers of these languages are still illiterate, or preliterate.

It may be assumed that in the beginning various speech communi­ties were confined to their own enclaves, more or less unaware of the existence of other language/dialect groups in the neighbourhood. Sometimes, the boundary between two dialects, or two languages, was knife-edged, as it was described by a hill-line or a river. Within the en­claves, these groups have been communicating through a common language, or dialect, for centuries.

This became the basis of their iden­tity. This traditional association with a language gave them a sense of belonging and thus inculcated in them a feeling of unity with the larger speech community. It may, however, be noted that inter-com­munication through a language or dialect is always limited in space. Individuals in their daily course of life have a limited reach. In situ­ations where the communication is largely oral the sphere of communication is even smaller. Thus, with the passage of time, each speech community gets differentiated from other communities in the neighbourhood.

This process leads to the splitting of the spoken language into diverse dialects. Dialect formation, however, takes place within the same speech area. With the expansion of the speech territory more dialect groups emerge and the distance between them increases. Sometimes the outlying dialects are so isolated from the parent language that they acquire linguistic nuances of their own, sufficient for them to be recognized as an independent branch. A study of historical linguistics reveals that India has gone through all these phases of language development. The present linguistic map of India is naturally a product of these developments.

Language and Dialect:

The faculty of speech is by far the most distinctive human trait. Hu­man beings use a system of language for communication which distinguishes them from the rest of the animal kingdom. It was through language that communication between the different members of a human group started in the early stages of social evolution.

Lan­guage thus facilitated multiple forms of human cooperation. Eventually, a division of labour emerged, a prototype of which is un­imaginable in the animal world. However, there is no gainsaying the fact that animals also communicate with one another. They also pro­duce vocal sounds although their sounds are simple.

One can differentiate between warning calls, mating calls and those expressing anger or affection. This system of communication is simple as it lacks structure. A structured language was the invention of the human mind, and the most effective tool of communication. In this language words could be replaced easily to change the content of meaning.

In its basic characteristics a human language is essentially a signaling system in which a variety of vocal sounds are employed. These vocal sounds are produced by the peculiar constitution of the human speech organs. There is a combination of the different speech organs— the tongue, glottis, vocal cords and the palate—in producing vocal sounds which are essential elements in human articulation of language.

It appears that in the beginning speakers of a language restricted their communication to a relatively small number of vocal sounds out of the many sounds which human beings were capable of making. However, the number of vocal sounds varied from language to language, which indicated variations in social evolution and the material conditions of existence. Most languages are satisfied with the use of twenty or thirty such sounds. But there are other languages which have as many as sixty sounds, or even more.

There are others which have fewer than twenty sounds. These sounds constitute the system of a language. As we know, the purpose of a language is communication, and a language sooner or later tends to become symbolic, more complex and expressive of abstract ideas. The beginnings of all languages were, however, simple.

There are very specific purposes for which language is used. The main purpose is, of course, to express oneself, to convey one’s feelings, and sometimes to express a desire or to pray for help. Human beings also communicate their ideas through body language, either alone or in combination with vocal sounds or words.

Even today, body language continues to be a powerful means of expression. The exchange of ideas and feelings, and calls for prayer or help, continue in the daily course of human life. Thus, an intricate pattern of human cooperation and a feeling of togetherness evolve. It is obvious that a language needs a group of people among whom communication continues through this language.

This group of people who communi­cate in a certain language may be described as the ‘speech community’. In the course of time, several speech communities are formed, each oc­cupying a chunk of a geographically contiguous space (Fig. 6.1).

Language Divergence in Space

Each language eventually expands over a territory, homogeneous in terms of its language structure: vocal sounds, words, sentences and conventionalized symbols. When a language is written in a script, it tends to stabilize its distinguishing features and promotes communication over long distances between people.

Origins of Language:

Origins of language are shrouded in mystery. However, it is possible to reconstruct the bits of this history. It is generally agreed that in the history of social evolution language must have arisen with the discov­ery of the art of tool-making. Understandably, the early tool-making communities must have depended on cooperation between different members of the group on a highly organized basis.

This would have been possible only through the use of a language. Thus the evolution of language must have progressed hand in hand with the evolution of material cultures. As the history of material cultures shows, the change in techniques of tool-making was initially slow, but later on it picked up. Language also evolved at the same pace. Expressions became more and more complex with the passage of time. In fact, at every stage of evolution, there was a direct relationship between material culture and the language in use.

Evidently, progress in material cultures shows that the functions of the brain were becoming more and more complex, and with these changes language also became complex. The way languages evolved from vocal sounds to words and sentences revealed how they became symbolic as humans tended to express abstract rather than concrete ideas.

It is obvious that in the course of evolution many languages were invented independently at different points of time in different regions of the world. They became further ramified as the social space within which inter-communication continued was always limited (Box 6.1). As a result, new groups were formed and new speech communities came into being.

Language Families

This is how the ‘families of languages’ developed. It is also understandable that the early languages were oral and writing became possible much later. In the beginning there was no need for maintaining a written record. When such a need arose writing was in­itially mostly pictorial.

The discovery of the script in the history of development of languages must have taken a painstakingly long time during which picture/signs became conventionalized. Our knowledge of the early scripts is still incomplete. For example, the script of the Indus valley (Harappan) civilization continues to pose difficulties. We have not been able to decipher it simply because we are not familiar with the system of language in which communication was conducted by the Harappan people.

India as a Linguistic Area:

Despite the widely perceived linguistic diversity, India’s unity as a socio-linguistic area is quite impressive. Several linguists have analyzed the basic elements of India as a socio-linguistic area. Describing language as an ‘autonomous system’, Lachman M. Khubchandani recognized the major characteristics of the speech forms of modern India. Each region of the country is characterized by the plurality of cultures and languages “with a unique mosaic of verbal experience”. In Khubchandani’s view, the modern languages of India represent a striking example of the process of diffusion, grammatical as well as phonetic, over many contiguous areas.

However, he considers linguistic plural­ity only as a superficial trait. “Indian masses through sustained interaction and common legacies have developed a common way to interpret, to share experiences, to think.” What has emerged is a kind of organic plurality, although the geographical distribution of speech communities suggests a kind of linguistic heterogeneity.

Some of the basic elements of India’s linguistic unity may be seen in the fuzzy na­ture of language boundaries, fluidity in language identity and complementarity of inter-group and intra-group communication. Khubchandani also emphasized the need of linking languages with the ecology of cultural regions described by him as kshetras.

As a language area India is being put to mutually contradictory linguistic interpreta­tions which confuse the issue. Perhaps a better understanding of the linguistic scene can be developed if the static account of the multiplic­ity of languages is replaced by recognition of the elements of cultural regionalism. Similarly, the issue of linguistic homogeneity which has been argued by several linguists is fraught with complexities. In this context, one can cite the example of the states of the Indian Union carved out on the principle of linguistic homogeneity. The reality is that these states are not necessarily homogeneous in their language composition and cultural attributes.

In an earlier study, Khubchandani examined the evidence on plu­ral languages and plural cultures of India. He dwelt upon the question of language in a plural society. The processes of language modern­ization and language promotion were also analyzed on the basis of a review of the language policies and planning in India. In this work, Khubchandani noted that people in certain regions of India displayed a certain degree of fluidity in their declaration of mother tongue. On this basis, he recognized two zones in which the country could be di­vided: a fluid zone and a stable zone. The fluid zone extended over the north-central region where Hindi, Urdu, Punjabi, Kashmiri and Dogri are spoken.

The stable zone, on the other hand, incorporates western, southern and eastern regions. People in these regions did not reveal any fluidity in their mother tongue declaration. Reference may also be made to the seminal work of Murray B. Emeneau who analyzed the characteristics of India as a language area. Tracing the history of development of the Indo-Aryan and the Dravidian languages he evaluated the shared experiences of the differ­ent speech communities.

Emeneau defined linguistic area as “an area which includes languages belonging to more than one family but shar­ing traits in common which are found not to belong to other members of (at least) one of the families”.

Geographic Patterning of Languages:

The geographic patterning of languages in the South Asian sub-conti­nent can perhaps be understood in the context of the space relations the region had with other parts of Asia. As already pointed out, the sub-continent marks a southward projection of the Asian landmass into the Indian Ocean. The overland connections with West and Cen­tral Asia, Tibet, China and other regions of Southeast Asia helped the process of infiltration of linguistic influences into the South Asian re­gion.

This is evident from the fact that the languages spoken in the peripheral regions of South Asia, such as Baluchistan, Pak-Afghan bor­derlands, Kashmir, Gilgit, Hunza, Baltistan and Ladakh as well as the hilly parts of Himachal Pradesh and the regions in the Northeast have strong affinity with the languages spoken in the regions beyond the Hindu-Kush Himalayas. The remote Himalayan areas became the abode of Tibeto-Chinese languages. Similarly, the Northeastern re­gion continued to receive influences from the neighbouring parts of Myanmar, Thailand and Indo-China. These regions are now the do­main of the Tibeto-Chinese (Sino-Tibetan) or Tibeto-Burman languages.

The people in the plains of North India from Sind to As­sam acquired different branches of the Indo-European family of languages. The peninsular region continued to retain the Dravidian speech-forms even though the north was completely swayed over by the Indo-European languages. Between the Indo-European and the Dravidian one finds the Austric-speaking tribes nestled in the hills of the mid-Indian region.

The linguistic heterogeneity of India can perhaps be brought to some order when one realizes that these speech forms really belong to four language families: Sino-Tibetan (Tibeto-Burman), Austro-Asiatic, Dravidian and Indo-European. Over millennia of usage these language families have found niches for themselves in the Indian social space in different parts of the sub-continent.

Their geo­graphical patterning throws some light on the routes through which these language families reached India. In fact, despite the vast heteroge­neity, Indian languages experienced parallel trends in linguistic and literary development during the long phases of shared history. This has made India ‘a composite region’ in terms of linguistic attributes (Table 6.1).

Broad Classification of Modern Indian Languages

Historical Process of Language Diffusion:

The history of Indian languages is not easy to reconstruct. As an overview of the processes of peopling of India shows, Negroids were the first people to arrive. However, we do not know exactly what their language affiliation was. The subsequent waves of migration were so strong that the Negroids lost their identity completely, leaving behind few traces of either their racial or linguistic past.

The story of the four families of languages may be briefly reca­pitulated here, although it is not easy to establish the chronological sequence in which the speakers of the Austric, Sino-Tibetan and the Dravidian languages came to India. It is almost certain that these fami­lies were already there at the time of the advent of the Indo-Aryan.

It is, however, an established fact that the Sino-Tibetan speech communities were Mongoloids racially. The original Sino-Tibetan, the parent of the early Chinese, is supposed to have developed somewhere in western China around 400 B.C. It is also believed that the diffusion of this language eventually affected the regions lying to the south and the southwest of China: Tibet, Ladakh, northeastern India, Myanmar and Thailand. Perhaps the Vedic Aryans were familiar with this group. They described the Tibeto-Burman-speaking Mongoloids of the Brahmaputra valley and the adjoining regions as Kiratas.

The speakers of the Kirata family of languages are distributed all along the Himalayan axis from Baltistan and Ladakh to Arunachal Pradesh. They occupy the regions surrounding the Brahmaputra val­ley in the northeast from Nagaland to Tripura and Meghalaya. There are striking differences between the languages of the Kirata family dis­tributed over such a vast geographical area. The speakers of the Tibeto-Himalayan branch of the Kirata languages occupy the Himala­yan regions from Baltistan to Sikkim and beyond to Arunachal Pradesh.

The Bhotia group consists of the Balti, Ladakhi, Lahauli, Sherpa and the Sikkim Bhotia dialects. Linguists also recognize a Hi­malayan group consisting of Lahuli of Chamba, Kanauri and Lepcha which is distinguishable on the basis of certain linguistic traits. In the east there is a North-Assam branch including the dialects of Arunachal Pradesh, such as Miri and Mishing. In other parts of the northeast the languages belong to the Assam-Burmese branch and are divided into Bodo, Naga, Kachin and Kuki-Chin groups. The speakers of the Kirata languages came to India in different streams at different points of time. Understandably, the groups in the northwest were unrelated to the groups in the northeast.

Similarly, the Kachin and the Kuki-Chin groups followed separate routes of migration. This is why there is a vast variety of dialects within the Kirata family and the roots of lin­guistic heterogeneity go far beyond the Indian borders into the neighbouring parts of Tibet, Myanmar and Indo-China. Anthropologists as well as linguists believe that the Austric-speaking groups came earlier than the Dravidian-speaking communities. The Austric speech communities were already there in the mid-Indian region before the advent of the Dravidian. The present geography of the Austric dialect groups holds some clues to the historical processes of their diffusion into India.

Generally, the Austric family of lan­guages is recognized as consisting of a Mon-Khmer and a Munda branch. The Mon-Khmer speakers belong to two separate groups, viz., Khasi and the Nicobarese, both separated by a distance of more than 1,500 kilometres which spans over an expanse of the Bay of Bengal. There is no clarity among the scholars about the routes taken by the speakers of the Mon-Khmer dialects. The Khasi speakers themselves are surrounded by other Kirata and Arya dialects in the Meghalayan plateau.

The advent of the Dravidian in India is generally associated with a branch of the Mediterranean racial stock which was already there in India before the rise of the Indus valley civilization. In fact, archaeologists believe that they were the builders of the Harappan civilization along with the Proto-Australoids. The Dravidian speech communities were found over most of the northern and the northwestern region of India before the advent of the Indo-Aryan. However, following the rise of the Indo-Aryan in northwestern India, a linguistic change came and the Dravidian-speaking area shrank in its geographical extent.

The present distribution of the Dravidian dialects in different parts of North India, such as Baluchistan, the Chhotanagpur plateau and eastern Madhya Pradesh, where Brahui, Kurukh-Oraon and Gondi are spoken respectively, suggests the earlier stage of distribution of this family of languages. In fact, Gondi is spoken in many parts of Central India from Madhya Pradesh and Maharashtra to Orissa and Andhra Pradesh.

While Dravidian speech forms were in use for many centuries in the pre-Christian era, literary development in the Dravidian speech community could take place only in the first few centuries after Christ. It is believed that old Tamil, old Kannada and old Telugu had already come into being by 1000 A.D. Malayalam acquired its form a little later.

With the Vedic Sanskrit, a branch of the Indo-European, the Indo-Aryan established itself in northwestern India. It had definite relations with the different Indo-European languages, such as Persian, Armenian, Greek, French, Spanish, German and English. An early form of Indo-European seems to have genetic relations with the Hittite speech of Asia Minor.

Linguists have recognized a primitive form of Indo-European in its earlier stage of development. They called it Indo-Hittite. A branch of the Indo-European which had already established itself in Mesopotamia came to be described by the linguists as Indo-Iranian. It is this Indo-Iranian branch which spread over Iran and the northwestern regions of India by the middle of the second millennium B.C. Among the different families of languages spoken in India the Indo-European seems to be the last to arrive. The advent of Indo-European in the South Asian sub-continent brought about a major change in the linguistic affinity of the people of northern India.

The form of Indo-European which was spoken in India came to be known as Indo-Iranian or Indo-Aryan. Its advent in India is seen with the rise of the Vedic Sanskrit. However, the old Sanskrit changed into Prakrit and several speech forms developed in different parts of northern and western India. The region lying between Saraswati and Ganga, encom­passing the upper Ganga-Yamuna doab and adjoining parts of Haryana, to the west of the Yamuna, became the stage for the trans­formation of classical Sanskrit into a Prakrit form. From this early stage of development of Prakrit came the different Indo-Aryan ver­naculars which are now spoken in north-western, north-central, central and eastern parts of India.

The Suraseni emerged in the core region of the midland (Madhyadesa of the Puranas) as the popular language. Its core area extended over western Uttar Pradesh and the adjoining parts of Haryana. A developed form of this parent language is described by the linguists as Western Hindi (Fig. 6.2).

Evolution of the Indo-Aryan Language: Geographic Patterning

Around the core region of Suraseni other speech forms developed on the west, south and east. These languages formed an outer band around the core language. On the west and the northwest lay the Punjabi and the Pahari dialects. Rajasthani and Gujarati emerged on the southwest. On the east, a form of language, now known as Eastern Hindi, emerged in Kosali (Awadhi). Linguists believe that these outer dialects were all more closely related to each other than any one of them was to the language of the midland.

“In fact, at an early period of the linguistic history of India, there must have been two sets of Indo-Aryan dialects—one the language of the midland and the other the group of dialects forming the outer band.” This first stage was followed by a subsequent phase of expansion. As the population of the midland region increased expansion became a necessity. Thus, on the periphery of the languages of the outer band developed new speech forms which were by and large not related to the language of the midland.

For example, while Punjabi was closely related to the language of the upper doab it got transformed into Lahnda in southwestern Punjab. This language had little relationship with the language of the midland. With increasing distance changes be­came quite pronounced. The geographical distribution of the Indo-Aryan languages may be briefly summarized here as follows: The midland language occupies the Ganga-Yamuna doab and the regions to its north and south. This core region is encircled by different speech forms in eastern Punjab, Rajasthan and Gujarat.

Further beyond in the west and the northwest, there is a band of outer languages: Kashmiri, Sindhi, Lahnda and Kohistani. The languages of this band may be described as constituting the northwestern group of the outer languages. On the southern periphery lies Marathi. In the intermediate band are situated languages such as Awadhi, Bagheli and Chhattisgarhi. On the eastern periphery lie the three dialects of Bihari, viz., Bhojpuri, Maithili and Magadhi. Bihari is surrounded by Oriya in the southeast and Bengali in the east. The languages of the eastern branch of the Indo-Aryan extend further in the east where Assamese occupies the Brahmaputra valley (Fig. 6.3).

Evolution of the Indo-Aryan Language: Geographic Patterning

Linguists believe that the development of the Indo-Aryan lan­guages completed itself through several phases. The Prakrits developed into two stages: Primary Prakrits and Secondary Prakrits. The Primary Prakrits which were the first to evolve out of the classical Sanskrit were synthetic languages with a complicated grammar.

In the course of time they ‘decayed’ into Secondary Prakrits. “Here we find the languages still synthetic, but diphthongs and harsh combinations are eschewed, till in the latest developments we find a condition of almost absolute fluidity, each language becoming an emasculated collection of vowels hanging for support on an occasional consonant. This weakness brought its own nemesis and from, say 1000 A.D., we find in existence the series of modern Indo-Aryan vernaculars, or, as they may be called Tertiary Prakrits.”

The last stage of development of the Prakrits is known as literary Apabhramsa. It is supposed that the modern vernaculars are the direct children of these Apabhramsas. The sequence of change was as follows. The Suraseni Apabhramsa was the parent of Western Hindi and Punjabi. Closely connected with it were Avanti, the parent of Rajasthani, and Gaurjari, the parent of Gujarati. The other intermediate language, Kosali (Eastern Hindi), sprang from Ardha-Magadhi Apabhramsa. The chronological sequence may be roughly reconstructed here (Table 6.2).

Stages in the Development of Indo-Aryan Languages

The different stages through which the Indo-Aryan languages passed are depicted in Figure 6.4.

Time-Scale of the Indo-Aryan Language (Tentative)

In a country where so many languages/dialects are spoken, and many of them are used for oral communication only, linguistic classification may not be an easy exercise. The scientific study of Indian languages, their grammar, phonetics and vocabulary which goes back to the nine­teenth century is still ridden with problems.

For one thing, linguists are still unsure of genetic relationships between one language group and the other. Their knowledge of some of the minor languages is pa­thetically inadequate. This leaves the problem of classification always open to revision. A second set of problems arises from the recognition of the major languages and their specification in the Eighth Schedule of the Indian Constitution.

There were political compulsions under which some languages were given this special status. The Eighth Schedule mentions eighteen languages; twelve of them have their own territory where they receive maximum state patronage and seem to have great potential for development. The Eighth Schedule also in­cludes languages, such as Sanskrit, Sindhi, Nepali and Urdu. The first three do not have a speech territory as such. The speakers of Urdu, Sindhi and Nepali are distributed across several states.

Minor speech communities, such as Manipuri (with a population of 1.27 million) and Konkani (with a population of 1.72 million), have also been given the status of scheduled languages. The anomaly in this approach is that while some of the minority speech groups have found a place in the Eighth Schedule, major tribal languages, such as the Austric Santali (speakers 5.22 million), Bhili (speakers 5.57 million) or Gondi (speakers 2.12 million), have been completely ignored.

The official language policy leaves the issue amenable to political manipulation. Like the Austric languages, all small Tibeto-Burman languages have also been excluded from the Eighth Schedule. Hindi, which has the largest number of speakers, is an aggregate of at least fifty different dialect groups. There are at least seven states which recognize Hindi as their official language. Of the four language families (Tibeto-Burman, Austro-Asiatic, Dravidian and the Indo-Aryan branch of the Indo-European) the most diverse is the Tibeto-Burman, whose speakers communicate in 70-80 different languages. Next is the Indo-Aryan with 19 languages grouped under it.

The Dravidian family incorporates 17 different speech communities, while the Austro-Asiatic has 14 languages. However, this analysis is incomplete because the 1991 census recognized a language as a mother tongue only if it had more than 10,000 speakers at the all-India level.

This resulted in the exclusion of 0.56 million speakers of different minor languages from finding a reference in the census records. There is, therefore, no recognition of these small speech communities. Due to operational difficulties, the 1991 census could not be conducted in the state of Jammu and Kashmir. This resulted in the exclusion of the Dard group of languages (Dardi, Shina, Kohistani and Kashmiri) from the census count.

The compilation of data for the 18 scheduled languages also contributed to the multiple problems of classification. For example, minor speech communities, such as Chakma and Hajong, were clubbed together with Bengali. Secondly, Yerava and Yerukala were amalgamated with Malayalam and Tamil respectively. Such examples are many. It is obvious that the census preferred to ignore the minor dialect groups.

The story of Hindi is equally interesting. For example, 50 dialect groups have been grouped under Hindi with the result that there is no scope for an analytical study of the geographical spread of these dialect groups. In the course of time many speakers of these dialects have tended to declare Hindi as their mother tongue without making refer­ence to the dialect they use.

This is evident from the fact that some 233 million speakers declared Hindi as their mother tongue. The stage is not far off when many of the distinguished dialect groups, such as Braj Bhasha, Awadhi, Bhojpuri, Magadhi, Maithili and Marwari, will lose their identity, at least in the census records. This withdrawal of patronage by the speakers of these dialects is a symptom of their eventual decline, if not death. A broad classification scheme of the four language families is given in Table 6.1 (for detailed classificatory schemes, see Tables 6.3-6.6).

Classification of Austric Languages

Numerical Strength:

Of the four language families Indo-European has by far the largest strength of speakers. In fact, three-fourths of the country’s population claimed one or the other language of the Indo-European family as their mother tongue. The Dravidian family comes next with 22.5 per cent of the total population of the country claiming affinity to it.

The speakers of the other two families, Austro-Asiatic and Tibeto-Burman, consist of small groups. Their overall proportionate share is low: 1.13 and 0.97 per cent respectively. As already indicated, Austro-Asiatic languages are spoken by a host of tribal groups. This is also true for most of the Tibeto-Burman languages.

Among the languages of the Austro-Asiatic family, Santali is the largest speech community, with as many as 5.2 million speakers. Other languages of the Munda branch, such as Ho, or of the Mon-Khmer branch, such as Khasi, have fewer than one million speakers each. Santali speakers account for 55 per cent of the entire strength of the Austro-Asiatic family. Speakers of the Ho, Khasi and Mundari languages account for 10, 9.6 and 9.1 per cent respectively of all Austric speakers.

There are many language groups within the Austro-Asiatic family whose numerical strength is insignificant. Reference may be made to Bhumij, Nicobarese, Gadaba and Juang. Their declining numerical strength shows that conditions are not favourable for their growth. As indicated earlier, the 1991 census excluded from the count all languages whose speakers numbered fewer than 10,000 persons at the all-India level at the time of enumeration. This policy was by and large detrimental to the interests of the tribal languages.

As is evident from Tables 6.3-6.6, there are striking differences in the numerical strength of the languages of the Tibeto-Burman family. The major speech communities include Manipuri/Meithei (1.27 million), Bodo (1.22 million), Tripuri (0.69 million), Garo (0.67 million), Lushai (0.54 million) and Miri/Mishing (0.39 million). The Manipuri and Bodo groups together account for one-third of the total strength of Tibeto-Burman speakers.

As is generally known, the major Dravidian speech communities consist of Telugu, Tamil, Kannada and Malayalam. They have been ranked here in descending order of the strength of their speakers. The Dravidian family also includes minor groups, such as Gondi (2.12 million), Tulu (1.55 million), Kurukh-Oraon (1.42 million), Kui (0.64 million), Koya (0.27 million) and Khond (0.22 million).

Many of them are tribal dialects and face an imminent risk of extinction. Gondi, for example, presents a case of language loss, as its speakers are getting assimilated into the regional languages of the states of their habitation. The same is true for other tribal dialects, unless they have come under the cover of state protection.

In terms of numerical strength of speakers Hindi is the foremost among the Indo-European languages. With 337.27 million speakers who claimed Hindi, or its different dialects, as their mother tongue, Hindi has no comparison with other languages of the family.

Bengali, Marathi and Urdu, which follow in that order, have a numerical strength ranging between 43 and 69 million. Bengali and Marathi account for about 10 per cent each of the total strength of speakers of the Indo-European family. Among the dialects of Hindi, Bhojpuri was claimed by 23.1 million speakers. One can compare Bhojpuri with Assamese (total speakers: 12.96 million) and Punjabi (total speakers: 23.08 million).

The other dialects of Hindi, such as Maithili, Magadhi, Awadhi, Braj Bhasha, Marwari or Chhattisgarhi, figure poorly. In fact, the dialect speakers tend to declare Hindi as their mother tongue. The strength of those who declared these dialects as their mother tongue seems to be diminishing with successive censuses. The progress of the languages of the Dard group, namely Shina, Kohistani and Kashmiri, cannot be monitored since the 1991 census was not conducted in Jammu and Kashmir.

As many as 13 of the Indo-European languages have been listed in the Eighth Schedule of the Indian Constitution. They are: Kashmiri (Dard group), Sindhi (northwestern group), Hindi, Urdu, Punjabi, Nepali, Gujarati (eastern, east-central, central and northern groups), Bengali, Assamese and Oriya (eastern group) and Marathi and Konkani (southern group). While Konkani, with a total strength of 1.76 million speakers, is mentioned in the Eighth Schedule, Santali finds no place, although its speakers numbered 5.22 million at the 1991 census.

Language Domains:

A generalized study of the domains of various languages spoken in India may be helpful in understanding the historical processes leading to their geographic spread and concentration. It may also be helpful in identifying the basic elements of India’s linguistic geography. It may be worthwhile to recapitulate the historical processes that led to the evolution of language regions in India (Box 6.2).

Evolution of Languages

It is understood that the Indo-Aryan was the last to arrive. It was preceded by Dravidian, Sino-Tibetan and Austric. However, there is no clarity about the chronological sequence in which the earlier families came to affect the situation in India. Which came first: Austro-Asiatic, Sino-Tibetan or Dravidian? This question has been only partly answered by the linguists. The Vedic Aryans had knowledge of the Tibeto-Burman-speaking Mongoloids, whom they described as the Kiratas. The Yajurveda and Atharvaveda, as well as the Mahabharata and Manu Samhita, also mention the Kiratas.

Austro-Asiatic Languages:

The domain of the Austro-Asiatic languages lies in the mid-Indian region and extends from Maharashtra to West Bengal. The two outliers of this domain, Khasi and Nicobarese, have their enclaves in Meghalaya and the Nicobar Islands respectively. The two pockets are separated by a vast expanse of the sea. Santali is the foremost among the Munda languages. The Santali speakers are mainly concentrated in Bihar, West Bengal and Orissa. About one-half of them live in Bihar, 35 per cent in West Bengal and 13 per cent in Orissa. The Santals living in Assam, numbering 135,000, also declared Santali as their mother tongue at the 1991 census.

Another significant language of the Munda branch is Munda/Mundari. Of the 1.27 million speakers of Mundari, 54 per cent live in Bihar, 31 per cent in Orissa and only 6 per cent in West Bengal. The domain of the Ho language lies in Bihar and Orissa. Two-thirds of all Ho speakers are confined to Bihar and the remaining one-third to Orissa. The territories of the Kharia, Korku and Savara languages extend over Bihar, Orissa and Madhya Pradesh. However, the Savara speakers are mostly confined to Orissa.

Tibeto-Burman Languages:

The territory of the Tibeto-Burman languages is by and large conterminous with the Himalayas and extends from Baltistan and Ladakh in Jammu and Kashmir to Arunachal Pradesh. It extends further to encompass other northeastern states. The Bhotia and the Himalayan groups of the Tibeto-Burman family are confined to Jammu and Kashmir, Himachal Pradesh, hilly Uttar Pradesh and Sikkim. The Tibetan speakers, however, have a wider spread as they are distributed over many states in India. The Tibetans are of course in exile in India and live in camps and colonies especially created for them in several states. Notable among the languages of the North Assam branch of the Tibeto-Burman family are Miri/Mishing and Adi.

More than 97 per cent of the Adi speakers are confined to Arunachal Pradesh. The Miri/Mishing speakers, on the other hand, are confined to Assam. Of the languages of the Bodo group, Bodo is largely specific to Assam where 97 per cent of its speakers live. The domains of the Garo and Tripuri lie in Meghalaya and Tripura respectively.

About 80 per cent of the Garo speakers are confined to Meghalaya whereas 17 per cent of them are based in Assam. About 93 per cent of the Tripuri speakers are confined to Tripura, although in recent years a section of their population has also moved out to Mizoram and Assam. The Bodo group also includes Karabi/Mikir and Rabha dialects.

Their speakers are mostly concentrated in Assam and Meghalaya. However, a small proportion of Rabha speakers is also found in the northern districts of West Bengal. Likewise, Koch is confined to Meghalaya and Assam, Dimasa to Assam and Nagaland, and Lalung is specific to Assam alone. Most of the speech territory of the Naga group of languages is shared between Nagaland and Manipur. While Ao, Angami, Lotha, Pochury, Phom, Yimchingure and Khiemnungan are exclusive to Nagaland, Kabui and Tangkhul are specific to Manipur. On the other hand, Khezha and Mao are spoken both in Manipur and Nagaland. However, a small proportion of their speakers are also located in Assam.

The Kuki-Chin languages are confined to the states of Manipur and Mizoram. Manipuri (including Meitei) has its domain in the central valley of Manipur where 87 per cent of its speakers live. A section of the Manipuri population (about 10 per cent) has also moved out to Assam. Manipuri speakers were also enumerated in Tripura, Nagaland and other parts of the northeast, although in small numbers. Lushai is confined to Mizoram. Manipur presents a case of linguistic plurality. In fact, the state is the home of many speech communities belonging to both the Naga and Kuki-Chin groups.

Notable among these dialects are Thado, Paite, Halam, Hmar, Kabui, Tangkhul, Gangte, Khezha, Kom, Kuki, Liangmei, Lushai, Mao, Maram, Maring, Vaiphei, Zeliang, Zemi and Zau. Lakher has its domain in Mizoram only. Migration in recent years has taken the Kuki speakers to other parts of the northeast, such as Assam, Nagaland and Tripura. Manipur may be chosen as an example to illustrate the territoriality of minor language groups in a contiguous geographical space. The people of Manipur exhibit a complex pattern of ethnic diversity, where each ethnic or dialect group tends to concentrate in a monolithic world of its own.

Broadly speaking, the population of Manipur consists of two different groups:

(a) Palaeo-Mongoloids consisting of

(i) The Meiteis, and

(ii) The hill tribes; and

(b) Immigrants mostly consisting of Palaeo-Mediterraneans further sub-divided into

(i) The Pangals (Muslim settlers), and

(ii) The Mayangs or Kols.

Each of these groups can be further sub-divided on the basis of language/dialect and racial attributes.

While the Meiteis, Pangals and Mayangs are plain-dwellers, the tribes, such as Kuki-Chins and Nagas, are hill-dwellers. The ethno-lingual situation in Manipur suggests that geographical factors have promoted the emergence of homogeneous dialect territories. Each dialect is confined to a pocket where people communicate in a given dialect. These monolithic dialect territories are contiguous. This linguistic plurality has survived the onslaught of time.

In his doctoral thesis Hemkhothang Lunghdim examined the patterns of communication in the multi-speech area of Manipur. The presence of as many as 29 major speech communities in Manipur has contributed to a type of ethno-centrism for the survival of speeches or patois coupled with other socio-cultural differences.

Ethnic groups have often indulged in competition resulting in inter-ethnic conflicts. In any case ethnic groups strive for the preservation of their dialects even if it results in diminishing interaction between different dialect groups. Over time linguistic plurality has resulted in bilingualism or multilingualism. Many tribes have adopted elements of Meitei-Lon for mutual inter-communication (Box 6.3).

Language Plurality in Manipur

Dravidian Languages:

As is generally known, the four southern states of Andhra Pradesh, Karnataka, Tamil Nadu and Kerala are the home of the major Dravidian languages, viz., Telugu, Kannada, Tamil and Malayalam. The speakers of these languages have also moved out to other states, particularly the neighbouring states of the south in the recent past. The geographical spread of these languages is evident from Table 6.7.

Distribution of Major Dravidian Languages by States, 1991

Evidently, these languages display a high degree of concentration in their home states. The highest degree of concentration is seen in the case of Malayalam, followed by Tamil. On the other hand, the lowest degree of concentration is revealed in the case of Telugu. There are several minor speech communities within the Dravidian family. Notable among them are: Yerukala, Yerava, Tulu, Coorgi, Gondi, Malto and Kurukh-Oraon. The Kurukh-Oraon and Malto are confined to Bihar. They belong to the northern branch of the Dravidian family.

Gondi, which is classified as a language of the central Dravidian branch, is the traditional dialect of the Gonds. However, recent census data show a steep decline in the numerical strength of the Gondi speakers. At the 1991 census, the Gondi-speaking population numbered just 2.12 million. This is an indication of the loss of language due to assimilation into the dominant regional languages. More than 90 per cent of the Gondi (mother tongue) speakers live in Madhya Pradesh (70 per cent) and Maharashtra (21 per cent). The remaining population is found in Andhra Pradesh and Orissa. In other states, their number is too small.

Indo-European Languages:

Both in terms of numerical strength and territorial extent the Indo-European languages surpass all other language families in India. The speech territory extends from Rajasthan in the west to Assam in the east and from Jammu and Kashmir in the north to Goa in the south. In fact, the domain of the Indo-European family extends beyond the borders of Rajasthan on the west and continues over adjoining Pakistan. Notable among the languages of this family are Hindi, Bengali, Marathi, Urdu, Gujarati, Oriya, Punjabi and Assamese. Keeping in view their importance, as many as 13 of the Indo-European languages have been included in the Eighth Schedule of the Indian Constitution.

While Hindi and Urdu are spoken across many states, including southern states, other languages, such as Bengali, Marathi, Gujarati, Oriya, Punjabi and Assamese, are specific to their own states. For example, while Bengali is specific to West Bengal, Marathi and Gujarati have their domain in Maharashtra and Gujarat respectively. The dominance of Hindi is evident from the fact that there were 337 million speakers who claimed it as their mother tongue in the 1991 census. About 81 per cent of the total population of Bihar, 91 per cent of Haryana and 89 per cent of Himachal Pradesh claimed affinity to Hindi.

The respective percentages for Rajasthan and Madhya Pradesh were 89 and 90. The story of the Hindi-speaking population is rather complicated. There are no fewer than 50 dialects which are grouped under Hindi. Some speakers declared these dialects as their mother tongue, while millions of others accepted Hindi as their mother tongue without making any reference to the dialect used by them.

These dialects are actually regional variants of a spoken language of which the standardized form written in the Devanagari script is the official Hindi. Three-fifths of all Hindi speakers are concentrated in the two northern states of Uttar Pradesh and Bihar. Of the remaining, about 17 per cent live in Madhya Pradesh and 12 per cent in Rajasthan. The rest of the Hindi-speaking population is found in Haryana, Delhi, Himachal Pradesh, Maharashtra and West Bengal (Fig. 6.5).

Distribution of Hindi Speakers, 1991

In terms of numerical strength Bengali comes next to Hindi. While its speakers are heavily concentrated in West Bengal, a sizeable proportion of Bengali speakers is also found in the neighbouring states of Assam, Bihar and Tripura. Marathi is next to Bengali. About 93 per cent of its speakers live in Maharashtra alone. However, Marathi is also spoken by a section of the population in Karnataka as well as Madhya Pradesh. Urdu holds the fourth rank among the Indo-Aryan languages. Its core region overlaps with that of Hindi.

Distribution of Urdu Speakers, 1991

While the domain of Punjabi lies in Punjab, it is widely spoken over the entire northwestern region, particularly Haryana, Himachal Pradesh, Jammu and Kashmir and northern Rajasthan. Its territorial extent is wider as Punjabi is spoken in the neighbouring Punjab in Pakistan. Within India migration processes have taken the Punjabi-speaking population to different parts of the country (Fig. 6.7).

Distribution of Punjabi Speakers, 1991

In the east, almost 99 per cent of the Assamese speakers are confined to Assam. The domain of Assamese is surrounded on the north, east, south-east and the south by Tibeto-Burman and Austric languages. There is a slight spill-over of the Assamese population into the neighbouring states of Arunachal Pradesh and Meghalaya (Fig. 6.8).

Major Language of India

In fact, Konkani speakers mainly live in the littoral region in the neighbourhood of Goa. Some of them have also dispersed to other neighbouring states, such as Kerala and Gujarat, as well as the union territory of Dadra and Nagar Haveli. Since the 1991 census was not conducted in the state of Jammu and Kashmir, the home of the Kashmiri language, it is difficult to describe the patterns of geographic distribution of the Kashmiri speakers.

Outside the state a population of 56,000 persons returned Kashmiri as their mother tongue. The Nepali-speaking population is distributed in a number of states, mostly in the neighbourhood of Nepal. Of the 2.07 million Nepali speakers, more than 40 per cent are concentrated in West Bengal and another 21 per cent in Assam. A small proportion (12.35 per cent) of the Nepali speakers is also found in Sikkim. They are scattered all over the northeast, although in small numbers. The domain of Sindhi lies in the Sind province of neighbouring Pakistan. The present Sindhi-speaking population in India, however, consists of people displaced in the wake of Partition in 1947.

Initially, the Sindhis came to the neighbouring regions of Rajasthan and Gujarat. Later, they dispersed to other western and northern states. At the 1991 census, about two-thirds of all Sindhi speakers in India were enumerated in Gujarat and Maharashtra. Of the remaining one-third, the two states of Rajasthan and Madhya Pradesh accounted for 15 per cent each. A small proportion of Sindhi speakers is also found in Delhi and Uttar Pradesh.

Language Scene in Tribal Areas:

The language scene in tribal areas of India deserves a mention. During the 50 years since independence, the tribes have been exposed to diverse influences: economic, political and socio-cultural. The scene in northeastern India is somewhat different. There, the tribes have been empowered to manage their own political affairs.

In other regions of India a certain number of seats have been reserved for the tribes to ensure their representation in the state and the national legislative bodies. These measures have paved the way for their rehabilitation in the national polity. However, a majority of Indian tribes live in the mid-Indian region where their participation in the political processes is nominal.

Their traditional habitats lie divided between several states. They do not have much of a say in policy formulation. This has left an imprint on their traditional culture, language and social structure. Language seems to have suffered the most.

First, the developmental processes initiated since independence seem to have contributed towards the disintegration of tribal economies and their communitarian way of life. The tribes have lost their hold on the forest resources and have been forced to depend on the market forces. In fact, the free market economy has completely encircled the petty tribal commodity trade.

Secondly, expansion of primary education has brought tribal children face to face with a new cultural situation. In the course of schooling they have been exposed to the regional languages of the states of their habitation. This has paved the way for their becoming bilingual. The ultimate effect is on their traditional dialects, which are on the way to decline and eventual death.

It has been noted that the Indian tribes display a very high degree of diversity in their language affinity. Despite the relative isolation of the tribal communities there have been contact areas in which give-and-take between the tribal and non-tribal languages has continued throughout history.

The geographic patterning of tribal languages suggests that along the zone of contact between them and the non-tribes progressive interaction has resulted in the fusion of linguistic elements on either side. This is evident from the fact that while the tribes communicate mainly in the Nishada, Kirata and Dravidian languages, they have also adopted several speech-forms of the Indo-Aryan family. The incidence of bilingualism and multilingualism among the tribes has increased phenomenally.

One may develop an understanding of the linguistic plurality observed in tribal areas by selecting the case of Austric-speaking tribes. In India, the Austric-speaking tribes are grouped into Mon-Khmer and Munda branches of languages. We have already seen that the Munda-speaking zone extends over a vast area from the Aravallis in the west to the Raj Mahal hills in the east.

Language Shift:

A striking feature of the language scene in tribal areas is the growing shift in language affinity of the tribal communities. This fluid situation, in which the tribes are losing their linguistic identity and are being identified with languages spoken by other tribes or the dominant regional languages of the states in which they have been living, is observed in many parts of India.

An evaluation of the census data reveals gaps between the numerical strength of the ethnic tribals, say the Mundas, Santals and the Gonds and those sections of the Munda, Santal and Gond population who declared their own dialects as their mother tongue, say at the 1961 census. This lack of conformity between ethnic identity and language affinity reveals the process of language shift in a significant way. There can be several plausible explanations for this phenomenon. It may be assumed that by 1961 a major shift in the linguistic/dialectal affinity of the Indian tribes had already taken place in certain regions of the country. The 1961 census may, however, be taken as a benchmark.

It may be assumed that as tribal/non-tribal interaction was growing, a section of the tribal population shifted to other dialects/languages with which it had no traditional affinity. This shift to the dominant languages of the regions of their habitation indicated a process of assimilation into the regional languages. The language shift was, however, not necessarily from a tribal to a non-tribal dialect. In fact, several tribal groups shifted over to other tribal dialects as contacts between them were growing fast.

As a result they lost their own traditional dialects. The 1961 census data on linguistic affinity of the tribal communities as revealed in their declaration of a particular language as their mother tongue makes it possible to analyze the following dimensions of language shift among the tribes:

(a) Tribes speaking a dialect with which they are traditionally identified. For example, Santals declaring the Santali as their mother tongue or Gonds declaring Gondi as their mother tongue. This reveals that the tribes in certain regions of the country display continuity in their language affinity. We may call it a case of language retention.

(b) Tribes declaring a regional language as their mother tongue. For example, Santals declaring Bengali as their mother tongue in West Bengal, or Gonds declaring Oriya as their mother tongue in Orissa. This shows the ongoing process of language shift, indicating that the tribal communities are getting assimilated into the dominant regional languages. Tribal regions where the process of regional development has brought tribes face to face with non-tribes have witnessed this phenomenon more significantly.

(c) Tribes on the periphery of their traditional areas have a tendency to declare as their mother tongue a language which is spoken by a dominant tribal group or the official language of a neighbouring state. This process indicates that the tribes are getting exposed to other tribal/regional languages. As a result they are getting assimilated into these languages (Table 6.8).

Language Shift

Language Retention:

While collection of data on mother tongues may be replete with problems, the position as recorded by the 1961 census was that one-half of the tribal population of India retained their own dialects as mother tongues. This was evidence of language retention. The situation, however, varied from tribe to tribe and from region to region.

The geographic patterning of the tribes still claiming affinity to their own tribal dialects revealed three major formations:

(a) Areas of tribal fastness in which tribes were by and large living in a state of exclusivity. For example, in Mizoram, Manipur, Meghalaya, Tripura, Nagaland, Arunachal Pradesh, West Bengal, Rajasthan, and Himachal Pradesh, 70-100 per cent of tribes retained their traditional dialects as their mother tongue.

(b) Areas of tribal-non-tribal inter-mingling, where tribes were living in a state of varied degrees of exposure to non-tribal economic and cultural influences. One can cite the case of Assam, Orissa, Andhra Pradesh, Madhya Pradesh, Maharashtra and Mysore (now Karnataka) where 25-60 per cent of tribes retained their own language.

(c) Areas where tribes have been assimilated into dominant cultures of the regions of their habitation. In these areas less than 10 per cent of the tribal population claimed affinity with their own traditional tribal dialect (Table 6.9).

Geographic Pattern in Language Retention

A number of doctoral theses, written under the direction of this writer at the Centre for the Study of Regional Development of the Jawaharlal Nehru University, explored these questions at length. In two of these theses the question of language shift and retention as registered among the Austric-speaking tribes of the mid-Indian region was examined.

These studies revealed that the Santals and Korkus by and large preferred to retain their traditional dialect. On the other hand, there were other Austric-speaking tribes who displayed a tendency to shift to the regional languages. The language shift was the highest among the Savaras. The Kharias, Mundas and the Hos followed.

Studies also revealed that the Bhils of Rajasthan, Gujarat and Madhya Pradesh preferred to claim a regional language as their mother tongue. However, a study of the household-level data generated through fieldwork from Wanera Para, Umedgarhi, Nai Abadi and Regania villages of Bagidora tehsil of Banswara district, as well as from Banswara town, revealed that the Bhils by and large maintained Bhili as their mother tongue.

On the other hand, the Korkus of Punasa, Richhi and Udaipur villages of Khandwa district tended to switch over to the regional languages. In fact, they declared Nimadi as their mother tongue. However, no generalizations can be made about Korkus on this basis, as in other villages they continue to retain their own dialect and declare it as their mother tongue at successive censuses.

Another finding of this research is that tribes living in rural areas have a greater affinity with their traditional dialects as compared to the tribes in urban areas. In regions of tribal concentration, for example, tribes enjoyed a certain degree of isolation which helped them retain their language and culture.

Their stay in cities and towns, on the other hand, diluted their cultural identity and their language was the first casualty. Investigations at the household level confirmed the language shift among the Mundas of Ranchi town and the Korkus of Khandwa tehsil, East Nimar district. On the contrary, the Bhils of Banswara district, the Santals of Santal Parganas, the Korkus of Khalwa tehsil (East Nimar district) and the Mundas of the rural parts of Ranchi district have continued to retain their language.

The study noted a strong correlation between language shift and a number of associated factors, such as urbanization, proportion of Hinduized tribes, and the proportion of non-primary workers. Literacy also revealed a high positive correlation with language shift. In fact, the school-going tribal children were receiving education through the medium of regional languages.

Obviously, schooling became a powerful instrument of bilingualism and/or language shift. In terms of exposure of the mid-Indian tribes to regional languages, Bhils stand first, followed by Mundas, Korkus and Santals. In any case, loss of language is symptomatic of the loss of cultural identity.

While the incidence of bilingualism among the Indian tribes is very high, it does not mean that they have always retained their traditional languages. This is understandable in view of the fact that most of the tribes are getting exposed to external influences, particularly at the schools and the market places. Interaction at these places is possible through a common language, generally a regional language or pidgin, such as the Sadan/Sadari in Ranchi, which is adopted by the tribes for day-to-day communication. Back at their homes, their own dialect reigns supreme. This may not be true for the displaced tribals whose number is increasing day by day.


Education About Asia: Online Archives

Multilingualism in India

A sign at the border with Pakistan in the state of Punjab featuring the same message written in three different languages (Hindi/Urdu, Punjabi, and English) and four different scripts from the top down: Devanagari (Hindi), Punjabi, Roman (English), and Perso–Arabic (Urdu)

With a growing population of just over 1.3 billion people, India is an incredibly diverse country in many ways. This article will focus specifically on contemporary linguistic diversity in India, first with an overview of India as a multilingual country just before and after Independence in 1947 and then through a brief outline of impacts of multilingualism on business and schools, as well as digital, visual, and print media.

India is home to many native languages, and it is also common for people to speak and understand more than one language or dialect, which can entail the use of different scripts as well. India’s 2011 census documents that 121 languages are spoken as mother tongues, a mother tongue being defined as the first language a person learns and uses.1 Of these languages, the Constitution of India recognizes twenty-two as official or “scheduled” languages. The Eighth Schedule of the Constitution of India, referred to in Articles 344(1) and 351, recognizes the following languages as official languages of states of India: Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Santhali, Sindhi, Tamil, Telugu, and Urdu.2

Six languages also hold the title of classical languages (Kannada, Malayalam, Odia, Sanskrit, Tamil, and Telugu), which are determined to have a history of recorded use for more than 1,500 years and a rich body of literature. Furthermore, for a contemporary language to also be a classical language, it must be an original language and cannot be a variety, such as a dialect, stemming from another language. Just as there are many people who wish for their mother tongues to be recognized as official, scheduled languages, there are also efforts to add Indian languages to the list of classical languages. Once a language has the official status of a classical language, the Ministry of Education organizes international awards for scholars of those languages, sets up language studies centers, and grants funding to universities to promote the study of the language. Interestingly, the Constitution of India lists no national language for the country as a whole.

Of the official, scheduled languages, Modern Standard Hindi, used here as an umbrella term for a family of languages, has the most mother tongue speakers, with around 528 million, or 44 percent of India’s population. Bengali follows with around 97 million speakers, or 8 percent of the population. Marathi has around 83 million speakers, or 7 percent of the population, and Telugu speakers number around 81 million, or almost 7 percent of the population. Speakers who list the remaining official languages as their mother tongues number between 2 and 4 percent of the population, as recorded in the 2011 census. It is interesting to note that, due to India’s large population, native speakers of these regional Indian languages often outnumber native speakers of other major world languages such as Korean (77.2 million native speakers) and Italian (67 million native speakers), as of 2020.3

Languages in India are categorized into language families based on their different linguistic origins, which often include different scripts as well. The main language families include Dravidian, Indo–Aryan, and Sino–Tibetan. Bodo, spoken in northeastern Indian states, is the Sino–Tibetan language with the most speakers (1.4 million). Languages considered to be mother tongues or regional languages in the south of India have grammatical structures and scripts with Dravidian roots, and languages used in the central and northern regions of India are part of the Indo–Aryan family of languages. Many central and northern Indian languages use scripts derived from the Nagari script. Contemporary variations of Hindi use the Devanagari script, and Gujarati, Punjabi, and Marathi use Nagari-derived scripts or versions of Devanagari that include some differences in their alphabets.

a chart of several languages and their language families, and their official recognitions in various states in India

Similarly, Modern Standard Hindi and Urdu are grammatically identical, though they differ in some vocabulary and in script, as Urdu uses a modified form of the Perso–Arabic script. As Hindi and Urdu are often considered to be one language with two scripts, a common belief is that the distinction between speaking and writing Hindi and Urdu falls along a religious divide between Hindus and Muslims, where Hindus are listed as Hindi speakers and Muslims as Urdu speakers in government documents such as the census.4 However, in practice, the distinction between Hindi and Urdu speakers is much more fluid and complex, as linguistic boundaries rely more on geographic location and speech community.

Another aspect of India’s multilingualism is that each mother tongue, or regional language, roughly belongs to one or more states. India’s twenty-eight states have been largely organized along linguistic lines since the 1950s, just after Independence, beginning with the formation of the southeastern state of Andhra Pradesh in 1953 for Telugu speakers. Andhra Pradesh was created after prolonged protests and strikes by Telugu speakers, which included the prominent activist Potti Sreeramulu fasting for the creation of a Telugu state until his death in 1952.5 The new state was created by dividing the Tamil- and Telugu-speaking regions of what, under the British, was called the Madras Presidency. Immediately after Independence, the country had retained political divisions similar to those it had under colonial rule, which newly independent Indians felt did not accurately represent them in the new government. The state reorganization movement for Andhra Pradesh culminated in the government-organized Dhar commission, which was ordered to investigate reordering additional Indian state borders along the lines of linguistic communities, or groups of people who speak the same language. The commission produced the State Reorganization Act of 1956, which called for states to be formed to represent linguistic groups rather than to remain divided as they had been over the course of British rule.

map of india's states and their regional language

Following the formation of the state of Andhra Pradesh and the orders of the State Reorganization Act of 1956, the states of Kerala, Mysore, and Madras were formed. (In 1969, Madras was renamed Tamil Nadu, and in 1973, Mysore was renamed Karnataka.) Also in 1956, the princely state of Hyderabad was divided between Andhra Pradesh, Maharashtra, and Madhya Pradesh. Just as the state of Andhra Pradesh was formed after prolonged protests for the rights of Telugu speakers, the Province of Bombay was divided between Marathi and Gujarati speakers in 1960 into the states of Gujarat and Maharashtra. The bustling port city of Bombay, renamed Mumbai in 1995, became part of the state of Maharashtra. Large reorganization efforts along the lines of language and religious communities continued with the 1966 reorganization of Patiala and East Punjab States Union (PEPSU) and Punjab into three new states: Punjab, Haryana, and Himachal Pradesh. India’s northeastern region also underwent a linguistic and communal reorganization into different states between 1963 and 1987.

Multilingualism in India has therefore played a key role in the country’s contemporary politics. State boundaries were drawn along the lines of language groups, even though regions remain linguistically diverse, because languages in India can be an important way of defining one’s identity. Many people in different Indian regions and in different cultural and religious groups retain distinct identities that set them apart from other communities through language. As India is culturally, ethnically, and religiously diverse, language is one way people maintain group identities. Identity politics have also made mother tongues an object and mode of political struggle. Authors of the State Reorganization Act felt that democratic participation would grow if local populaces could access information and participate in government programs in their mother tongues. Because language is a basis of identity, languages and the areas in which they were spoken were among the most important factors in deciding where the boundaries of new Indian states would lie when they were redrawn after Independence in 1947.

Today, there are twenty-eight states in India and eight union territories, or areas directly governed by the federal government, and each state has at least one official language and many have two, in addition to English. In this way, with unique languages and scripts attributed to each state, India often seems like a collection of distinct countries due to the cultural and linguistic differences between states. Due to this vast diversity in languages and the way language is closely tied to identity, sometimes there are also struggles along religious and political lines that play out through language. Hindu nationalists engage in movements to spread the use of Hindi and Sanskrit as a means to spread the Hindu religion as well. Some have also felt that the distinct regional languages of states should indicate that people who do not speak those languages from birth should not be allowed to reside and work in states where they do not belong to the linguistic community. This was the message of the conservative party, the Shiv Sena, in Maharashtra.

map of india and where Hindi is spoken

English as an Indian Language

Adding to the complexity of the history of languages in South Asia, the English language was also integrated into the social fabric of the region insofar as it played a key role as a unifying language in the Constitution between north and south India at the time of Independence in 1947. Mohandas Karamchand Gandhi (1869–1948), also known by the name Mahatma, which means “Great Soul,” was a famous activist who led many Indians in peaceful protests against British rule. Gandhi and his supporters made concessions for the English language to remain in use in the new nation. They found English incredibly useful for unifying the country despite their support for Hindustani, the name for the mix of Hindi and Urdu commonly used in the northern regions of India, as the national language.

Paradoxically for Gandhi and his supporters, English represented a dividing force that emphasized the distance between educated Indian elites, who were more aligned with British colonizers, and the non-English-speaking, often-uneducated masses. Gandhi maintained that to have a successful Independence movement was to govern through Indian ways, including through Indian regional languages. During the late colonial period, from 1916 to 1928, Gandhi addressed audiences on English linguistic colonization in education. He called for education in regional languages, stating, “The question of vernaculars as media of instruction is of national importance,” and he criticized how “English-educated Indians are the sole custodians of public and patriotic work”; he also said the “neglect of the vernaculars means national suicide.”6 However, the question of a national unifying language at Independence was a complicated one, as Gandhi’s call for Hindustani to wholly replace English was also rejected. Hindustani, later known as a variety of Hindi and Urdu, is not commonly spoken across all of India, and it is considered a northern Indian regional language since it is distinct from the language families and scripts used in south India. English, therefore, served as a utilitarian language to connect disparate and diverse areas of the newly unified country, as it still does today. Now, Indian English, with a unique vocabulary and accent, is a recognized variety of English in the world.

Ultimately, the English language was written into India’s new Constitution as a language to help with the new country’s transition from a British colonial subject to independent governance. Specifically, the 1949 original Constitution of India states in Article 210 that “business in Parliament shall be transacted in Hindi or in English.” Article 343 states that “the official language of the Union shall be Hindi in Devanagari script” and “for a period of fifteen years from the commencement of this Constitution, the English language shall continue to be used for all the official purposes of the Union for which it was being used immediately before such commencement: provided that the president may, during the said period, by order authorize the use of the Hindi language in addition to the English language.”7 The intention of including English as a language for official government purposes along with Hindi was that English would be shed as the new nation matured.

In 1963, the impending transition away from English revived the concerns over the need for a unifying language that had been voiced at the time of Independence. Parliament enacted the Official Languages Act of 1963 to continue the use of English, and section 3 of the act extended the use of English for official purposes along with Hindi.8 India decided to keep English as a unifying language to connect parts of the country where Modern Standard Hindi is not commonly spoken, such as the southern Indian states with different scripts and language roots. While English is also a legacy of the British in India, it remains a tool and window through which to gain wider knowledge and understanding of the country. English also connects India with other English-speaking regions of the world.

Multilingualism in Daily Life

As there are many languages in India, many Indians can speak, read, and/or write in multiple languages, and multilingualism therefore is a part of daily life. Challenges and advantages of linguistic diversity affect the everyday lives of Indians in terms of businesses, educational institutions, and media. Due to the widespread use of English in India, the country is home to many international companies where English is commonly used for work. As English fluency often means higher socioeconomic status in India, middle- and upper-class Indians who have greater access to English have relative ease when working, studying, traveling, and immigrating to areas of the world where English is a lingua franca, or common language.

newspapers in various languages and scripts

English is used in many office settings, especially in the international businesses and multinational corporations housed in India. Businesses in Indian cities hire employees from all over the country, and English is a common language among people with different mother tongues. In shops and supermarkets, many labels are written in English in the Roman script.

Other than English in business and commerce, many people also use Modern Standard Hindi as a common language, especially in the northern regions of India. In many of these places, mixing two or more languages or language varieties together when speaking is a prevalent practice known as “code-switching.”

Example of code-switching in conversation:

Waiter: Aur chahiye? (Do you want anything else?) [Hindi]
Man: Don coffee pahije (We want two coffees) [Marathi]

Code-switching as a practice is distinct from commonly used English words that have been subsumed into other Indian languages, which are called “loan” or “borrowed” words.9 While English and forms of code-switching are incorporated into Indian corporate culture, many people find themselves facing barriers to communication in these settings if they are not fluent in English or Hindi.

Example of loan words:

Teacher to the class: OK, all of you, open like this. Saglyana asa ghya aani, first page ogu-da, first page war kay lihile ? (Everyone take it like this and open to the first page. On the first page, what is written?) [English and Marathi]

The merits of multilingual education have dominated the field of education policy since the colonial period in India. Education in India is delivered in many different languages, but two languages are the most popular: English and Hindi. Additionally, many schools have instruction in students’ mother tongues or the regional state language as well. A plan for trilingual learning, called the three-language formula (TLF), had been under discussion in parliament since 1948, just after Independence, and was adopted into education policy by the Ministry of Education in 1968. The three-language formula requires schoolchildren to “(a) study and to receive content area instruction for twelve years in their mother tongue or the regional (state) language (which for some children will be one and the same); (b) study Hindi or English for ten to twelve years; and (c) study a modern Indian language (i.e., any one of the “scheduled” languages) or a foreign language for three to five years.” However, implementation of the TLF varies widely across the country today, with many English-language schools teaching limited English or only using English books and classroom materials. The idealized linguistic model presented in the 1968 TLF policy is meant to prepare students to be trilingual should they choose to enter the predominantly English-language higher education system and/or a globalized workforce. Proponents of multilingual education in India call attention to the importance of the three-language formula in adequately preparing primary and secondary school students for the linguistic demands of higher education, while also maintaining the rich linguistic heritage of India.

Even outside of education and business, Indians encounter English and multiple other Indian regional languages in media every day. They are especially likely to speak or read English in their daily lives if they are among the middle and upper socioeconomic classes, as increased English fluency aligns closely with socioeconomic class. There are many print media publications such as newspapers and magazines in English. Each city has at least one local English-language publication, and major print and online national publications can be found in Hindi and English. Each state’s news media publications are most commonly consumed in regional languages.

Visual media also cater to regional language-speaking audiences: local news broadcasts are in the regional language, and national news segments or specific programs take place in Hindi. It is also becoming increasingly common for English to be used as part of code-switching practices in visual media, though English-only Indian broadcasts may be available at certain times or through special subscriptions or satellite television programming. Many regions in India have local film industries as well, where movies and television shows are produced in regional languages. Bollywood, the largest film industry in India, is located in Mumbai in the state of Maharashtra and produces films in Hindi. Bollywood movies are popular all over the world and can be viewed with subtitles by non-Hindi-speaking audiences. Interestingly, as of yet, no mainstream visual entertainment media industry in India makes English-only film or television productions, unlike in news and online digital media. It cannot be emphasized enough that because access to and proliferation of English vary substantially along socioeconomic class lines, access to English in business, education, and media is linked to international capital and has a great capacity to increase one’s economic and social position.

India is a multilingual country, and its diversity has proven to be both a strength and a challenge to unifying the nation. This essay has aimed to illustrate how multiple languages have shaped policies from education to the political boundaries of states, and how a colonial footprint and global pressures for greater use of English in international networks have produced the high demand for English in India.

One can see there is a careful balance to multilingualism in India. English and languages like Hindi are deemed necessary for interaction in national and international communities beyond state and national borders, while mother tongues or regional languages are also made relevant through local state governments, institutions, and cultural identity. In this way, the cultivation and practice of multilingualism in India does more than preserve unique, regional identities; it has a great impact on how Indians interact with fellow Indians and much of the world. Multilingualism in India defines the nation within global and national networks and communities for business, education, and media. As language plays an important part in our daily interactions, multilingualism and linguistic diversity in India have shaped the country and the unique cultural practices and policies within it.

  1. “2011 Census Data,” Ministry of Home Affairs, Government of India, https://tinyurl.com/y66egqfl.
  2. See the Eighth Schedule of The Constitution of India, 1949.
  3. Ethnologue, Languages of the World. Web archive (1992) at the Library of Congress at https://tinyurl.com/y6s2yjd3.
  4. Christopher Rolland King, One Language, Two Scripts: The Hindi Movement in the Nineteenth Century North India (Bombay: Oxford University Press, 1994).
  5. Lisa Mitchell, Language, Emotion, and Politics in South India: The Making of a Mother Tongue (Bloomington: Indiana University Press, 2009).
  6. M. K. Gandhi 1917 as cited in Speeches and Writings of M. K. Gandhi 1922, https://tinyurl.com/ybxaft9e, 307.
  7. Articles 210 and 343 of The Constitution of India, 1949.
  8. Official Languages Act, 1963, section 3.
  9. Harold F. Schiffman and Michael C. Shapiro, Language and Society in South Asia (Delhi: Motilal Banarsidass, 1981).
What Languages Are Spoken In India?

Hindi, English, and Bengali are among the most popular languages spoken in India.

  • Hindi is the most spoken language in India with 41% of the population being first language speakers, but the other 59% of the population speak over 30 different languages.
  • Due to their long history, Tamil, Sanskrit, Malayalam, Odia, and Telugu have been designated classical languages.
  • Most Indian languages are classified into one of four groups: Afro-Asiatic, Dravidian, Indo-Aryan, and Sino-Tibetan.

With more than 1.35 billion people, the Republic of India has the second-highest population in the world. It is also the seventh-largest country by land area, at 1.27 million square miles (3.29 million square km). The Census of India of 2001 recognizes 122 major languages and 1599 other languages spoken in the country. Contrary to popular belief, India has no national language, as the Constitution of the country does not grant this title to any language spoken in the country. As per the Official Languages Act, 1963, Hindi and English are the official languages of the Indian Government. Several other languages have official status at the state level. Hindi is the most spoken language in the country, followed by Bengali and Marathi. Below is a list of some of the most popular languages of India:

1. Hindi - 528 million speakers

Following Mandarin, Spanish, and English, Hindi is the fourth most common first language in the world, spoken by about 41% of people in India. A descendant of Sanskrit, Hindi has been influenced by several languages over the centuries, including Dravidian tongues, Arabic, Portuguese, English, Persian, and Turkic. It has several dialects, which differ between the eastern and western variations of the language.

In recent years there has been a push to make Hindi the most-spoken language within India's borders. Measures have included changing the numerals on rupee notes to Devanagari script, which is used to write Hindi in addition to several other native languages, and changing milestone signage on highways in Tamil Nadu, where other languages are more prevalent, from English to Hindi.

2. Bengali - 97 million speakers

Known as Bangla by native speakers, Bengali is the official language of Bangladesh and is most widely spoken in the Indian states of West Bengal, lower Assam, and Tripura. Spoken by 8% of Indian citizens, Bengali ranks as the fifth most-spoken first language in the world. Like Hindi, it evolved from Sanskrit, as well as Pali and Prakrit, with influences from many other languages including Persian, Portuguese, Dutch, French, and English. It is now divided into eight disparate groups depending on geographical location.


3. Marathi - 83 million speakers

Purported to be more than 1,300 years old, Marathi is spoken by about 7% of Indian people and is the official language of states in the western regions of the country including Goa and Maharashtra. Like other Indian languages, Marathi descended from Sanskrit and is made up of at least 42 regional dialects, some of which resemble eastern Hindi in sound and structure. Its roots can also be found in Indian languages like Konkani, Goanese, Deccan, Gowlan, Ikrani, and Varhadi-Nagpuri.

4. Telugu - 81 million speakers

Telugu, a Dravidian language, is found mainly in the Indian states of Andhra Pradesh, Telangana, and Yanam, as well as the Andaman and Nicobar Islands, Karnataka, Tamil Nadu, Maharashtra, Chhattisgarh, and Odisha. Its earliest-known inscriptions appear on coins from 400 BCE, which contain some Telugu words, and the first inscription written entirely in the language was created in 575 CE. It is presumed this was made by the Renati Cholas, who were known for writing royal proclamations in the language rather than traditional Sanskrit.

5. Tamil - 67 million speakers

Like Telugu, Tamil has Dravidian roots and is spoken by close to 6% of Indian citizens, as well as being an official language in Singapore and Sri Lanka and a recognized minority language in countries like Malaysia, Mauritius, and South Africa. It is notable as one of the oldest languages in the modern world, with a literary history dating back at least 2,000 years. It is commonly spoken in southern India, primarily in the state of Tamil Nadu and the Indian Union Territory of Puducherry. Its earliest-known transcriptions date back to 500 BCE with literature appearing in about 300 CE in its original form, Old Tamil.


Which Languages Are Spoken in India?

What language is spoken in India? | Babbel

Illustration by Zemir Bermeo.

The first thing you need to know about India’s linguistic landscape is that it’s impossible to speak about an ‘Indian language’ as if there were only one. Did you know that if two unknown Indians met randomly on the street, there would only be a 36% chance that they would understand each other? Of course, that 36% depends a lot on their ethnicity and place of origin.

For years, classifying the languages spoken in India has been a very complicated task since experts have to differentiate between dialects and mother tongues that share many similarities. This isn’t exactly surprising considering that:

  • India is the seventh largest country in the world
  • Over 1.3 billion people live in India
  • The distance between northern India and southern India is similar to the distance between Canada and Mexico

A census conducted in 2011 showed that India has about 19,569 languages and dialects, of which almost 1,369 are considered dialects and only 121 are recognized as languages (the acceptance criterion being that the language has 10,000 or more speakers). The languages spoken in India belong mainly to two big linguistic families: the Indo-European and the Dravidian; others come mainly from the Austroasiatic and Tibeto-Burman linguistic families.

‘The Indian Language’ Is Actually 22 Separate Official Languages

The Indian constitution recognizes 22 official languages: Bengali, Hindi, Maithili, Nepali, Sanskrit, Tamil, Urdu, Assamese, Dogri, Kannada, Gujarati, Bodo, Manipuri (also known as Meitei), Oriya, Marathi, Santali, Telugu, Punjabi, Sindhi, Malayalam, Konkani and Kashmiri. Tamil and Sanskrit (considered by some academics as a lingua franca in India) are the only two official classical languages.

The states of India were organized based on the common language spoken in each region, and while Hindi is the official language of the central government in India along with English, individual state legislatures can adopt any regional language as the official language of their state.

Many children in India grow up in a bilingual environment, either because their parents speak different languages or because they’re surrounded by a community that originates from another part of the country. The literacy rate in India is 71.2% and most private schools strive to motivate children to learn several languages, sometimes beginning in primary school. Public schools (generally attended by working-class children) teach in the vernacular, but there has been an effort to incorporate more English classes throughout the years.

The ‘Hindi Belt’

The Hindi Belt, or Desh Hindi, refers to the areas of India, mostly in the north, where Hindi is the official language:

  • Himachal Pradesh
  • Uttar Pradesh
  • Madhya Pradesh
  • Uttarakhand (formerly Uttaranchal)
  • Chhattisgarh

The Persian-speaking Turks who invaded the plains of the Ganges and Punjab in the early 11th century named the language spoken there Hindi, the Persian word for “the language of the land of the Indus River.” Hindi is the fourth most natively-spoken language in the world. Almost 425 million people speak Hindi as a first language, and although only 12% of Hindi natives are multilingual, about 120 million people in India speak it as a second language.

From a linguistic point of view, Hindi belongs to the huge family of Indo-European languages, particularly to the Indo-Aryan branch. It stems from Sanskrit, which is written from left to right (like English) and most of its words are pronounced as they’re written.

The Use Of English In India

Although for many English is still a symbol of the British Raj, others enjoy its continued use as an official language in India, especially because it’s (unofficially) recognized as the language of business. Many tourists say that the better your English is, the more money you have in the eyes of Indian merchants.

That said, English doesn’t have a strong presence in the general social life of India, except in the upper classes. For many people in India, English is no longer a foreign language because, after almost 100 years of colonization, Indians made it their own. For cultural and linguistic reasons, Indian English is very different from Standard English, and is best known as “Hinglish.”

One of the most impressive engines of English in India is Bollywood, the mega movie industry. Many movies mix some English into their titles or popular songs. As mentioned above, it’s also used as the language of business, especially in very lucrative sectors such as technology and customer service (like the infamous call centers).

So if you’re planning to travel to India, you can probably get by in most big cities with English — but that’s not guaranteed in rural areas. But what kind of authentic travel experience would it be, anyway, if you didn’t have any linguistic challenges? 

फिर मिलेंगे!

India: The Land of Diverse Languages and Scripts

Jun 29, 2020 | Essays , Institute News

An analysis of the different languages and scripts of a nation can provide a detailed understanding of its past, present, and future.

Did you know that India is one of the most linguistically diverse nations (fourth largest) in the world? Our country has four major language families. The biggest one is Indo-European. Next come the Dravidian languages, which are spoken mostly in the southern parts of India. The Austro-Asiatic (Munda) family is the third, and the fourth is the Tibeto-Burman. Moreover, there are several languages that do not fit into any of these families! Unlike many other places in the world, the people of India write these languages in multiple scripts, making our nation one of the most graphically diverse nations across the globe.

An interesting fact: according to Article 343 of the Indian Constitution, the official language of the Union is Hindi in the Devanagari script. The second part of this Article states that for a period of 15 years from the commencement of the Constitution, the English language shall continue to be used for all the official purposes of the Union. But it did not quite work out like that. Although English was the linguistic inheritance of British colonialism, the imposition of Hindi was a big issue and caused riots and deaths in several parts of the country. As a result, English is now considered a de facto official language of India.

The English that Indians speak has become Indian over time. It cannot be compared to the English that is spoken in other countries, because it has been significantly influenced by the multilingual nature of our nation. Similarly, several other countries have their own versions of the English language. All these instances show how English has localized all over the world, blending in with the languages of different nations.

A detailed and interesting story providing a glimpse of the different languages, scripts, and practices of multilingualism in our country was published on Medium. It is based on one of the sessions (delivered by Nishaant Choksi, faculty in the Humanities and Social Sciences discipline) of the Virtual Seminar Series by IIT Gandhinagar.

Exploring Linguistic Diversity in India: A Spatial Analysis

Rajrani Kalra

2019, Handbook of the Changing World Language Map

Related Papers

IJIRT Journal

Cultural diversity in general, and linguistic diversity in particular, is increasingly gaining importance due to the accelerating mobility and agility of people all over the world. As a result of this human mobility, language diffusion is also taking place, which is adding new directions to the framing of education policy in different countries. The present study revolves around the contemporary scenario of language diversity in India, with special emphasis on West Bengal.

Bikram Lamba

Last week, Home Minister Amit Shah suggested that states should communicate with each other in Hindi instead of English. He also emphasized that Hindi should not be a substitute for local languages. "When citizens of other languages speaking states communicate with each other, it should be in the language of India," the home ministry quoted Shah as saying at a meeting of the Parliamentary Committee on Official Language.

International Encyclopedia of Linguistic Anthropology

Constantine V. Nakassis

In this essay, we reconsider the topic of "Linguistic Diversity in South Asia" (the title of the landmark 1960 volume edited by Charles Ferguson and John Gumperz) from the perspective of contemporary sociolinguistics and linguistic anthropology. Reviewing a number of case studies, we argue that empirical and theoretical accounts of language, diversity, and South Asia cannot be disassociated from the ideologies and political projects that construe, objectify, and performatively realize such terms and their referents. At the same time, however, contemporary linguistic anthropology and sociolinguistics have not disposed of the questions that animated earlier generations' investigations into linguistic diversity in the subcontinent but have reinvigorated and transformed them in sophisticated ways that are empirically sensitive to the realities of social and linguistic life in all its complex reflexivity.

Economic & Political Weekly

T Ravi Kumar

Binay Pattanayak

Royal Class Academy | رويال كلاس للبحوث والدراسات العليا والتحليل الإحصائي بالكويت

India, an influential nation in South Asia, is home to the world’s second largest population. It is a country of great variety, arguably the most diverse nation in the world, whether in religious, cultural, or ethnic terms. The climate and landscape throughout the Indian subcontinent range dramatically from arid deserts to tropical rainforests. India's cultural diversity is in many ways a reflection of its varied climate. Languages, food, clothing, customs, songs and literature differ throughout India's many regions. The notable aspects of India’s soft power span a wide range of spheres, from simple agriculture, to ritual religious practice, to quality technological services. With a rapidly growing consumer base, and a swiftly rising overall economic output fueled by its young and increasingly educated population, India has begun its journey on the path to becoming a dominant world power within the next century, as opposed to the dormant role it has played so far.

International Journal of Research

shaista afzal

Multilingualism is a gift to India. Multilingualism means using several different languages; a multilingual person can speak two or more languages very well. Multilingualism is best explained in terms of heterogeneity: five language families mark India's linguistic heterogeneity. Multilingualism in India is multidimensional and intricate. Every single language varies on the basis of caste, religion, gender, occupation, age, etc., and an individual may use different styles of language in different places. It is present in the life of all citizens. Indian multilingualism is unique because of the dynamic relationship among its languages. The present work is an attempt to find out the nature of multilingualism in India. It also aims to look into the different aspects of Indian multilingualism arising from the high diversity of Indian societies.

Alkafil Choudhury , Juri Saikia

Philology Sciences

Giridhar Rao

India’s National Education Policy 2020 (NEP 2020) promotes mother-tongue based multilingual education. Welcoming this recommendation, this essay looks at the policy in the context of India’s linguistic diversity, and the already existing provisions for multilingual education. We list some of the conceptual and implementation challenges that the language-education recommendations in NEP 2020 face. The essay also overviews a few promising initiatives that show the way forward for a just, equitable, and sustainable policy for a mother-tongue based multilingual education in a democratic polity like India.

Dominated Languages in the 21st Century: Papers from the International Conference on Minority Languages XIV

Abhimanyu Sharma

The present paper deals with the status of linguistic minorities in India and tries to give an overview of the problems plaguing Indian language policy regarding minority languages. India represents a unique case in the current global linguistic scenario, as it is the only country in the world with 23 official languages (2 official cross-regional languages and 21 official regional languages). Despite this fact, minority languages in India cannot be regarded as well protected, as obvious from the high number of languages listed as ‘endangered’ by UNESCO. The paper looks into the various forms of domination and subordination that dictate the language policy and influence the various language communities in India, including linguistic minorities. Moreover, it undertakes an analysis of the various kinds of language conflicts prevalent in the Indian linguistic situation and examines whether the language conflicts emanate from group-specific dominance and unequal status ascriptions, and secondly, whether language is simply a secondary feature in conflicts that are mainly socially, economically and politically motivated. Lastly, the paper addresses the aspect which it sees as a highly questionable part of Indian language policy, i.e. the principle of ‘rationalization’, a method developed by the Government of India to take account of the number of ‘languages’ in India, but which has been widely criticized as a ‘reductionist’ policy because through the process of ‘rationalization’, smaller and minority languages are categorized as ‘dialects’ or ‘variants’ of the so-called major languages and are thus deprived of their own independent status and identity.

Languages of India

  • The Different Languages of India
  • What Language Should You Learn before Visiting India?
  • Indian Script and Alphabet
  • Indian Languages from a Foreigner's Perspective
  • Useful Phrases in Hindi

India is the world's second most populous country and is also very culturally, religiously, and linguistically diverse. With 22 languages that are recognized by the government and hundreds of other languages that are spoken within the country, it can be difficult to know what language the local people will speak when you visit India and which language (if any) you should learn before you travel.

In this article, we will explain everything you need to know about different languages of India including where they are spoken, the writing system and alphabet, how Indian languages are different from English, and some useful travel phrases.

The Different Languages of India

The languages spoken in India have ancient roots and belong to two major language families. The majority of Indian languages belong to the Indo-Aryan family, which is derived from Sanskrit and strongly influenced by Persian and Arabic. Most of North India speaks Indo-Aryan languages such as Hindi, Punjabi, and Bengali.

In southern India, most languages are from the Dravidian language family. This language family includes languages such as Tamil and Malayalam. The Dravidian languages are completely different from the Indo-Aryan languages spoken in the rest of the country.

Within these two language groups, there are many different languages and dialects. Below we will explain some of the most widely spoken Indian languages.

National Languages of India

Despite what many people believe, India does not have a national language. Although there is a lot of debate about what language (if any) should be the national language of India, the two most likely candidates are Hindi and English because they are the most widely spoken across the country.

Hindi: Hindi is natively spoken by about 41% of the people in India and is the primary language spoken in Northern India. Hindi is the first language of people living in the states of Delhi, Haryana, Bihar, Madhya Pradesh, Uttar Pradesh, Jharkhand, Rajasthan, Himachal Pradesh, and Chhattisgarh.

Although people in other states of northern India do not speak Hindi as their first language, they will be able to understand it, as most Indians learn Hindi in school. Some states where Hindi is not natively spoken but is widely understood include West Bengal, Punjab, Gujarat, Maharashtra, Odisha, and Jammu and Kashmir.

English: English was brought to India during its colonization by the British and has remained within the country as the lingua franca that helps Indians of many different native languages communicate. English is often used in the central government, on countrywide news channels, and in business.

English is widely understood and spoken in India and foreigners who are exploring Indian cities should have few problems getting around by just speaking English. However, if you are traveling to rural areas in India it is less likely that the locals will understand English and it will be necessary to bring a guide who speaks the local language.

Regional Languages of India

Almost every state in India has its own language or dialect. Although there are over 780 languages spoken in India, 22 languages are recognized by the government and some of the most widely spoken of these 22 are explained below.

Bengali: Bengali is a North Indian language spoken in the state of West Bengal and is the second most widely spoken language in India with over 83 million speakers. Bengali is considered to be a very poetic language and the national anthem of India was originally written in Bengali.

Telugu: Telugu is a South Indian Dravidian language that is spoken by around 74 million people in India, mainly in the states of Andhra Pradesh and Telangana and in Yanam (a district of Puducherry).

Marathi: Marathi is an Indo-Aryan language that has around 72 million speakers and is the official language of some states in western India including Goa and Maharashtra.

Tamil: Tamil is a Dravidian language that is spoken in the South Indian state of Tamil Nadu. This language has around 67 million speakers and is one of the oldest surviving languages in the world.

Urdu: Urdu is a North Indian language that is a sister language to Hindi. Hindi speakers often use many words from Urdu and most people who speak Hindi can understand those that speak Urdu and vice versa. Urdu is spoken mostly in Jammu and Kashmir and has over 59 million speakers in India.

Kannada: Like Tamil, Kannada is a South Indian Dravidian language and is one of the oldest surviving languages in the world. Kannada is spoken by the people in the state of Karnataka and has over 20 different dialects. Around 50 million people in India speak Kannada.

Other Indian Languages: Other equally important but less widely spoken regional languages of India include Gujarati, Punjabi, Assamese, Oriya, Malayalam, Konkani, Manipuri, Khasi, and Mizo.

Lingual Divide between the North and South

In the 1960s, the Indian government tried to pass a law that would make Hindi the official language of India, but this received major backlash from Indians in regions of South India where people speak languages from the Dravidian family.

The Dravidian languages such as Tamil and Kannada are actually the indigenous languages of ancient India and hold a lot of culture and heritage for those who speak them.

Although Hindi will get you by in most of India, speaking Hindi in southern India especially in the states of Tamil Nadu and Karnataka can be seen as insulting.

What Language Should You Learn before visiting India?

When traveling to India, it is not necessary to learn a new language because the majority of Indians speak English well. In big cities, most foreigners will have no problems getting by with English and in rural areas, you can always travel with a guide.

However, if you want to learn some Indian phrases, you will be able to impress the locals with your interest in their culture and you may make some new friends during your travels.

If you are only visiting one region in India, then the best idea is to learn a few phrases in that region's local language. However, if you are traveling across many regions in India, the best language to learn is Hindi as it is the most widely understood.

Indian Script and Alphabet

Each language in India uses a slightly different script although from an outside perspective it may be difficult to tell the difference between them. Unlike Mandarin, Indian scripts operate off of an alphabet with many different letters each of which makes a different sound.

Hindi and other Indo-Aryan languages use the Devanagari script which was taken from ancient Sanskrit and has 47 primary letters including 14 vowels and 33 consonants.

In Hindi writing, the vowels and consonants of words merge together to form one flowing shape. Written Hindi is easily recognizable by the horizontal line that runs across the top of every word. For example, the word India in Hindi is written इंडिया.

Indian Languages from a Foreigner's Perspective

Learning a new language is always difficult and it is especially daunting when that language is very different from your native language. If you grew up speaking English, then the languages of India will look and sound very foreign to you.

In this section, we will talk about some of the major differences between Indian languages, such as Hindi, and English.

Formality and Honorifics

One of the major differences between Indian languages and English is the use of honorifics. In Indian culture, respect for age and social position are both very important and this is strongly reflected in their languages.

In India, it is seen as impolite to call someone who is older than you or of a higher social standing by their first name. Instead, there are many honorifics (respectful titles) that should be used. The Indian honorifics system is quite complicated but in most of India, if you are referring to someone who is older than you, you can call them auntie or uncle depending on their gender.

In Hindi, you can also sound respectful and polite by using the word ji (pronounced like gee). Ji is similar to the Japanese word san and is an honorific that can be added to the end of a person's name to show respect. For example, if you are talking to another person who is named Deepak and you want to be respectful you can call them Deepakji (pronounced like Dee-pak-gee).

Consonants and Pronunciation

One of the best parts of learning an Indian language is that, unlike English, everything in Hindi is pronounced exactly as it is written and once you know all the sounds learning new words is easy. However, the downside of many Indian languages is that they include sounds that do not exist in English and are difficult for native English speakers to pronounce.

In Hindi, there are three different "r" sounds and two different "t" sounds. These differences in sound don't exist in English and many English speakers will have a hard time differentiating the sounds as well as pronouncing them.

Hindi also includes many consonants that are immediately followed by an "h" sound which don't exist in English. For example, in Hindi, there is a da sound and a dha sound as well as a ka sound and a kha sound. Depending on which one you use you can say a completely different word.

Although pronouncing words in Indian languages can be difficult for foreigners, even if you do get the pronunciation wrong it is likely that the people you are speaking to will still understand what you want to say and just be happy that you are trying to learn their language.

Influence of English

One aspect of Indian languages that makes them easier to learn is that the English language has had a large influence in India. Indian languages and English have a long history of exchanging words and many Indians today will use certain English phrases such as hi, bye, and cheers instead of saying them in their native language.

When traveling in India, it is very normal to hear people speak in a mixture of English and their native Indian language. In northern India speaking in Hinglish is very common and people in cities will carry on conversations while unpredictably switching between English and Hindi within the same sentence. This means that although as travelers you might not understand a lot of the conversations around you, you can still guess and interpret generally what people are saying.

Useful Phrases in Hindi

As Hindi is the most widely spoken language in India, when traveling in India it's a good idea to know some basic phrases that you can practice with the locals.

The first important word to know is thank you, which in Hindi is shukriya (pronounced like shoo-cree-ya). Shukriya is a great word to use whenever you buy something or when someone helps you during your travels.

When traveling in India, it is also a good idea to know the words for yes and no in Hindi. Although everyone in India will understand the words "yes" and "no" in English, they may answer your questions using the Hindi versions, which are haa (pronounced like hah) for yes and nahi or naa (pronounced like nah-hee and nah) for no.

Lastly, because India is a country with a high population, when using public transportation or when visiting busy tourist attractions you may want to say excuse me. In Hindi, there is no direct translation for excuse me, but Indians will often say "I'm sorry" when brushing past people instead. I'm sorry in Hindi is maaf keejiye (pronounced like maf-kee-gee-yay).

For more Hindi travel phrases please check out the article How to Say Hello in Hindi.

Rose Deller

December 13th, 2018.

Language Movements and Democracy in India

In this feature essay, Language Movements and Democracy in India, Mithilesh Kumar Jha draws on his recent book Language Politics and Public Sphere in North India (Oxford UP). In the piece, he argues that capturing the real and continuing tensions and challenges of democratic practices in India requires attention to how they are performed and understood within its numerous vernacular spheres, drawing particularly on the linguistic movements that have asserted the importance of ‘minor’ or ‘non-scheduled’ languages in the nation.

This essay is part of the LSE RB Translation and Multilingualism Week, running between 10 and 14 December 2018.

Language movements have been debated in numerous ways since the beginning of modern vernacular education and classificatory exercises during colonial rule. During the nationalist phase, the question of ‘national’ language became, politically and emotionally, a very charged issue. The Hindi-Urdu debate is well-known and widely explored. In the first few decades after independence, India witnessed numerous linguistic riots, the linguistic reorganisation of states and clashes between supporters of Hindi and those resisting its ‘imposition’ as the ‘national’ language, especially speakers of Tamil and other South Indian languages.

Since then, the language issue is seen as more or less settled, although there have been various studies that critically examine the Hindi-Urdu debates, the making of Hindi as the ‘national’ language, or the making of modern Tamil, Telugu, Bengali, Panjabi and so on. But there are very few studies using language movements to understand the progress and limits of Indian democracy and its various contradictions. At best, language movements are treated merely as an identity issue. If they promote Hindi or other ‘major’ Indian languages, they are welcomed or promoted. But if they promote other ‘minor’ or ‘non-scheduled’ languages – i.e. the languages which are not part of the eighth schedule of the Indian constitution, such as Bhojpuri, Awadhi, Braj, Tulu, Bodo and so on – they are not only discouraged but also suspected.

Image Credit: Jalamb Junction Railway Station, Jalamb, Maharashtra, India, with station name in three languages (English, Hindi and the local language, in this case Marathi) (Ganesh Dhamodkar CC BY SA 2.0)

Language, with the beginning of print and the expansion of nationalism, is at the root of all modern social and political imaginaries. It simultaneously connects the self emotionally and psychologically with community, and that makes language a very powerful tool for social and political mobilisations. In the imaginaries of the nation, the role of a ‘national’ language is of prime ideological importance: the growth and development of one’s language is now seen as the growth and development of self and community. In modern India, Bhartendu Harishchandra’s (1850-85; a Benares-based Hindi writer and poet, also regarded as the father of the Hindi renaissance) idea of nija bhasha unnat ahai sab unnat kee mool (in the development of one’s language lie the roots of all development) became the rallying point for various linguistic communities in north India. However, this makes the language issue in a multilingual country like India even more problematic, especially when ‘minor’ and ‘non-scheduled’ languages begin to assert their demands and concerns. Usually, these movements are seen as parochial and as impediments to the growth and expansion of the ‘national’ language – Hindi. However, millions of speakers of Indian languages continue to make sense of and participate in the democratic process through their vernaculars. Linguistic movements and assertions continuously alter and expand the meaning and practices of democracy in India. Therefore, without engaging with these, one’s understanding of Indian democracy shall always be incomplete or partial.

In the linguistic economy of India, we have the English elite at the top, followed by bilingual or trilingual elites with knowledge of English and one or more Indian languages. They have played a historical role in transmitting ideas like democracy or nation or swaraj (self-rule) in various vernacular spheres. Below them are the vast majority of monolingual masses with very little or no knowledge of Hindi, let alone English. In this kind of linguistic economy, one can very well infer the limits of one’s understanding of Indian democracy or polity if it takes into account the concerns of only one particular community. The majority of linguistic communities in India are still grappling with the questions of modernity, democracy, swaraj, nation and so on. And they are willing to reconcile their concerns with the nation’s, but not at the cost of their mother tongues. This makes the issue of language and democracy in India even more fascinating.

Rammanohar Lohia (1910-67), the socialist ideologue, in his staunch opposition to English, understood the valuable role of Indian languages in the democratisation of state and society. He wanted Indian languages to be elevated to the status of English. However, the linguistic situation in India is very far from this ideal. English continues to be a ticket to enter into the ruling class of India. And it continues to reproduce a wide gulf between the elite and the masses. Shall India ever overcome this contradiction? Do linguistic movements have the potential to radically alter the privileges associated with a particular language?

Language, although in a limited sense, did provide a modern secular tool for people to connect together by transcending the boundaries of caste, religion, class and gender. And a critical understanding of the rise and assertion of linguistic movements in different parts of the country will help one understand processes of domination and subordination. With the standardisation of a language, many languages, even those with rich literary histories, have lost their status, but the speakers of these languages are conscious of their distinctiveness. And when the opportune time comes, they do assert this. Many linguistic movements emerged as a challenge to their appropriation by a standard language. In north India, the speakers of Maithili, Bhojpuri, Awadhi and Braj are making such claims.

There is another aspect to these linguistic movements. There are tendencies towards reproducing the age-old and existing hierarchies within them, even when these movements have been fighting against their appropriation by a standard or ‘major’ language. Within their own spheres, they also try to marginalise their own ‘varieties’ or ‘sub/dialects’. Often these movements are appropriated by the dominant castes and classes. But it is also in these spheres that such dominations are challenged and countered. For example, in the Maithili movement, the leadership has been exclusively in the hands of upper caste Brahmins and Kayasthas. But such hegemony is being increasingly questioned in the movement’s contemporary phase. To democratise the state and its institutions, it is essential to democratise society. Can it be done without democratising vernacular spheres where real battles between democratic and undemocratic forces are fought every day?

Language movements in India provide a valuable source for understanding the trajectories of ideas like democracy, swaraj and nation in modern India. Indeed, deeper engagements with Indian languages and their literary spheres will not only broaden our understanding of Indian democracy and its various challenges, but also the entanglements of these communities with modernity. Were imaginaries in these vernacular spheres distinct from national imaginaries? How did these hierarchical societies and communities reconcile with modern ideals like democracy or equal citizenship? In other words, the real entanglements of democracy in India can be better explained by closely engaging with modern Indian languages and their public spheres. These spheres are not necessarily democratic, but without making them such, trajectories of Indian democracy shall always be incomplete.

Dr. Mithilesh Kumar Jha teaches political science in the Department of Humanities and Social Science, IIT Guwahati. His most recent publication is Language Politics and Public Sphere in North India: Making of the Maithili Movement (OUP 2018).

Note: This feature essay gives the views of the author, and not the position of the LSE Review of Books blog, or of the London School of Economics. 

Preserving India’s Cultural Heritage: The Role of Languages

Languages are crucial for maintaining Indian culture, as they communicate knowledge and identity. With over 1,600 languages spoken, India has one of the highest linguistic diversities globally. Preserving cultural heritage is vital for representing collective knowledge, traditions, and customs. Preserving ancient languages like Sanskrit and Pali provides insights into India’s rich cultural history and appreciation of diverse traditions and customs.

Influence of ancient languages on cultural heritage

Ancient languages have significantly shaped India’s cultural heritage, with Sanskrit influencing literature, religion, and philosophy. Classical languages like Tamil, Telugu, and Kannada preserve regional heritage and traditions. These languages uphold distinct identities and honour historical roots, while the literatures of Tamil, Bengali, and Marathi reflect the diverse cultural fabric of India. Efforts must be made to safeguard and promote these languages to ensure the continuation of India’s rich cultural legacy.

Linguistic diversity as a reflection of India’s cultural richness

India’s linguistic diversity, with over 1,600 languages spoken, reflects its cultural richness. Each language represents a unique culture, customs, and traditions, providing insight into India’s heritage. Preserving these languages is crucial for safeguarding India’s cultural legacy and connecting future generations with their ancestral roots. Indigenous languages preserve diverse practices and traditions, while tribal languages foster identity and pride. Indian languages also preserve customs, rituals, and folklore, passing down intricate traditions and local history from generation to generation. Therefore, safeguarding and promoting these languages is essential for ensuring the continued preservation of India’s cultural heritage.

Role of languages in literature and the arts

Languages are crucial for expressing, preserving, and transmitting cultural heritage in India. The diverse linguistic landscape has led to a rich tapestry of literature and artistic expressions, each with its unique cultural nuances and traditions. These languages form the foundation of the country’s cultural heritage, ensuring its continuity and relevance in an ever-changing world. Literary masterpieces like Sanskrit epics, Tamil literature’s Thirukkural, and Kannada’s Pampa Bharata provide a window into different regions’ traditions, customs, and values.

Preserving ancient texts and scriptures in various languages is essential for safeguarding India’s cultural heritage. Institutions like libraries, universities, and research centres play a vital role in conserving and digitising these texts, ensuring their longevity and accessibility. Languages also influence literary styles, genres, and themes, with Hindi literature often characterised by devotional and spiritual themes and Tamil literature showcasing a distinct poetic tradition in Sangam literature.

Promoting cultural heritage through performing arts is essential for preserving the rich traditions of India. The diverse forms of performing arts, such as dance, music, and theatre, showcase the cultural identity of different regions. By actively encouraging and supporting performing arts, India can ensure its cultural heritage’s continued growth and relevance for future generations.

Languages as identity markers and cultural symbols

Languages are essential identity markers and cultural symbols in India, with a rich linguistic diversity across different regions. Each language represents a distinct cultural heritage and contributes to the unique identity of its speakers. Hindi is an official language of the Union, while Tamil, Bengali, and Telugu have their own scripts and literary traditions. These languages preserve India’s cultural heritage and transmit traditional knowledge, folklore, and values from one generation to another. Linguistic diversity promotes pride and unity among citizens, fostering understanding and empathy. Languages also serve as signifiers of social, ethnic, and religious affiliations, carrying traditions, values, and identities.

Language plays a crucial role in religious practices and rituals, transmitting sacred texts and guiding worshippers in their quest for divine enlightenment. Safeguarding languages is essential for ensuring the longevity and vitality of religious practices and rituals within a cultural context. Preserving unique dialects and variations within languages is crucial for the cultural heritage of India, providing insight into the history, traditions, and cultural practices of different communities. By safeguarding these distinct linguistic variations, India ensures the preservation of its rich cultural tapestry and reflects its speakers’ unique experiences and perspectives.

Challenges and efforts in preserving languages for cultural heritage

Preserving languages for cultural heritage in India presents challenges due to its vast linguistic diversity, with over 1,600 languages spoken across its regions. Rapid globalisation and the increasing popularity of English as a lingua franca threaten the vitality of indigenous languages. Government initiatives, language institutes, and non-governmental organisations play crucial roles in documenting and reviving endangered languages.

The rapid spread of the English language has led to a decline in regional languages that are closely linked to India’s cultural heritage. The migration of people from rural to urban areas has also resulted in the loss of indigenous languages, as migrants are forced to assimilate into dominant linguistic groups. The dominance of major languages poses a further challenge: the prevalence of English and other widely used languages often draws attention away from smaller regional languages, diminishing their cultural heritage and excluding a significant portion of the population from equal opportunities. Addressing the dominance of major languages is essential for preserving diverse cultures within Indian society.

Impact of globalisation and urbanisation on local languages

India’s cultural diversity is rooted in its rich linguistic heritage, making language preservation essential for safeguarding unique traditions and customs. Globalisation and urbanisation have significantly impacted local languages, leading to a decline in usage and adoption. This issue is exacerbated by migration from rural to urban areas, endangering the survival and preservation of India’s rich linguistic diversity. Initiatives to preserve and revive endangered languages are crucial, with organisations and government bodies working to document, promote, and revitalise these languages. Government support is crucial for language documentation and conservation projects. At the same time, grassroots organisations and NGOs play a vital role in promoting linguistic diversity, raising awareness, and organising revitalisation programs, workshops, and cultural events.

Language preservation is vital for India’s cultural heritage, as it transmits traditions, values, and historical narratives, fostering identity, cultural diversity, and intergenerational communication. India’s multiculturalism contributes to its diverse traditions, and protecting and promoting these languages is essential for preserving the rich cultural heritage. Policy support, educational initiatives, and public awareness campaigns can help safeguard India’s linguistic diversity and promote multilingualism. Collaboration between educational institutions, government bodies, and communities is essential for implementing language preservation programs, supporting documentation efforts, and promoting multilingualism.

(The writer is an Assistant Director on deputation with the National Gallery of Modern Art, Ministry of Culture, New Delhi)

  • Dr Pandiri Harsha Bhargavi

Short Essay on Languages in India

India’s heritage in languages and literature is one of the richest in the world. Some of the languages that were spoken in India in ancient times and had a rich literature have become extinct, while others have remained important.

Although Sanskrit is no longer a language of everyday speech, it is still the language of many religious rituals and of a large body of literature. The old languages have left their mark on the other languages which we speak today.

There are two main groups of languages: the Indo-European (Indo-Aryan) and the Dravidian. These two groups have not developed in isolation from each other. Sanskrit was the language of the Indo-Aryans who came to India.

Sanskrit was gradually standardized and given a highly scientific grammar by Panini. Sanskrit was the language of religion, philosophy and learning. It was used by the upper castes, the Brahmans and the Kshatriyas. The common people spoke a number of dialects which are called Prakrits.

Buddhist literature was written in Pali, one of the Prakrits. Ashoka had his rock and pillar edicts inscribed in the popular languages. Among the Dravidian languages, Tamil is the most ancient. In the Gupta period, Sanskrit again became the predominant language of learning.

The various spoken languages that developed in the different regions of India in the medieval period are called Apabhramshas. During the periods of the Turks and the Mughals, Arabic and Persian entered India, and Persian became the court language. A new language, Urdu, grew out of the dialects of Hindi and Persian and became the common language of towns all over northern India and the Deccan. Its literature, in both poetry and prose, became very rich.

The Eighth Schedule of the Constitution of India lists twenty-two languages. Hundreds of other languages are spoken in different parts of the country. This variety of languages has made India a multilingual country.

The Linguistic Identity of India

July 21, 2018 | Expert Insights

India is a nation where over 1.2 billion people speak 19,569 different languages or dialects as a "mother tongue". The largest language census revealed interesting data concerning linguistics, identity and nationalism.

Languages spoken in India belong to several language families, the major ones being the Indo-Aryan languages, spoken by 78.05% of Indians, and the Dravidian languages, spoken by 19.64%. The remaining 2.31% of the population speak languages of the Austroasiatic, Sino-Tibetan, Tai-Kadai and other minor language families. Three millennia of political and social contact have resulted in mutual influence among the language families of India and South Asia, alongside the noteworthy effects of Persian and English as contact languages.

Linguistic records begin with the appearance of the Brāhmī script from about the 3rd century BCE. The oldest recorded script from the Indus Valley Civilisation is yet to be deciphered.

James Prinsep, an archaeologist with the East India Company, deciphered the Brahmi script and thus unlocked a trove of knowledge regarding the history of India. Although Indologists developed historical narratives which skewed the socio-political thought of Indians for decades, their efforts to uncover the linguistic history of India should not be underestimated.

Mahatma Gandhi’s plans to make Hindi the sole official language of the Republic met with resistance, especially in South India, and were in violation of the federal nature of India. English, the language of the British Raj, was not easily accepted either. Thus, the Official Languages Act of 1963 ruled that English and Hindi would be used for official purposes, but there is no national language of the Union. Instead, 22 languages were recognised by the Eighth Schedule, and state governments were granted the right to choose their own official languages. Of the 22 scheduled languages, 15 are Indic, four are Dravidian, two are Tibeto-Burman, and one is Munda.

According to a report of the census directorate, the country has 22 scheduled languages and 100 non-scheduled languages, each spoken by one lakh (100,000) or more people. However, there are around 42 languages which are spoken by fewer than 10,000 people. These are considered endangered and may be heading towards extinction, a home ministry official said.

The languages or dialects considered endangered include 11 from the Andaman and Nicobar Islands, seven from Manipur and four from Himachal Pradesh. Major states of India, from Odisha, Assam and West Bengal to Karnataka, Andhra Pradesh and Maharashtra, have languages on the verge of extinction.

The Central Institute of Indian Languages, Mysore, has been working for the protection and preservation of the endangered languages of the country under a central scheme. Grammatical descriptions, dictionaries, language primers, anthologies of folklore and encyclopaedias of all languages or dialects, especially those spoken by fewer than 10,000 people, are being prepared.

Hindi was identified as the country's most spoken language, with more than 43% of the population (more than 528 million people) able to communicate in it. Since the liberalisation of the economy in the 1990s, there has been increased migration across the states, particularly to the economically thriving southern states, where Hindi was not commonly spoken. Hindi managed to add 100 million new speakers between 2001 and 2011, a 25% increase. It can be estimated that there was a further increase in Hindi speakers between 2011 and 2017.

Ganesh Devy, a linguist and founder of the Bhasha Trust research organization, was not satisfied with the data, saying, "The census has subsumed many languages in Hindi. This includes Bhojpuri, which is spoken by more than 50 million people. Bhojpuri is not Hindi; it is a different language." He further underlined that, "Many other languages spoken in the states of Rajasthan, Haryana, Arunachal Pradesh, Chhattisgarh, Himachal Pradesh and Uttarakhand, with millions of speakers, were also categorized as Hindi to inflate the figures."

Owing to religious identity politics, those who speak Urdu choose Hindi as their mother tongue to avoid harassment. Furthermore, the number of Tamil, Telugu and Malayalam speakers has significantly dropped.

The census shows a sharp decline in the number of speakers of southern Indian languages, except for Kannada, which saw a marginal rise to 3.73%. On the other hand, Janaki Nair examines Kannada nationalism on the basis that it, like all nationalisms, attempts to produce a solidarity between all Kannada speakers in order to efface the specificities of caste and class.

Language has figured prominently in politics, education, economics and post-independence conflicts over distribution of resources and territories in India.  Minority language speakers are often discriminated against, owing to their lack of fluency in the official language of the state. Nationalism and linguistics have a syncretic relationship which has the power to unite or divide a country of multiple cultures, religions and identities, not only in conversation, but also in text and prose.

Our assessment is that the data from the language census can be used to understand underlying socio-economic issues. We believe that language, a comprehensive method of expression, has the power to influence ideas and policies; however, it ought not to be used to dominate minority communities. We feel that where policies on unemployment, poverty and infrastructure are ineffective, language is used as fuel for nationalist or regionalist identities.

A comprehensive survey on Indian regional language processing

  • Review Paper
  • Published: 12 June 2020
  • Volume 2, article number 1204 (2020)

  • B. S. Harish (ORCID: orcid.org/0000-0001-5495-0640) &
  • R. Kasturi Rangan (ORCID: orcid.org/0000-0002-7310-1035)

With the recent information explosion, content on the internet is multilingual, and most of it is in the form of natural language. Processing these natural languages for various language processing tasks is challenging. The Indian regional languages are considered to be low-resourced when compared to other languages. In this survey, the various approaches and techniques contributed by researchers for Indian regional language processing are reviewed. Tasks such as Machine Translation, Named Entity Recognition, Sentiment Analysis and Parts-Of-Speech tagging are reviewed with respect to rule-based, statistical and neural approaches. The challenges that motivate work on language processing problems are presented. The sources of datasets for the Indian regional languages are described. The future scope and the essential requirements for enhancing the processing of Indian regional languages for various language processing tasks are discussed.

1 Introduction

Any language that has evolved naturally among humans through use over time is called a natural language. People exchange their knowledge, emotions and feelings with others by means of natural language. There are different native languages in various parts of the world, each with its own alphabet, signs and grammar. If there is one nation where old and morphologically rich varieties of regional languages exist, it is India [ 57 ]. It is comparatively easier for computers to process data represented in the English language through standard ASCII codes than data in other natural languages. However, building a machine's capability to understand other natural languages is arduous and is carried out using various techniques. There are many research works and applications, such as (1) Chatbot, (2) Text-to-speech conversion, (3) Language Identification, (4) Hands-free computing, (5) Spell-check, (6) Summarizing electronic medical records, (7) Sentiment Analysis and so on, developed to handle these natural languages for real-time needs. In this paper, various methods used to develop the aforementioned applications, especially for Indian Regional Languages (IRL), are presented.

Nowadays, the internet is no longer monolingual; content in other regional languages is growing rapidly. According to the 2001 census, there are approximately 1000 documented languages and dialects in India. Much research is being carried out to enable users to work and interact with computers in their own regional natural languages [ 3 ]. Google offers searching in 13 languages and provides transliteration in Indian Regional Languages (IRL) like Kannada, Hindi, Bengali, Tamil, Telugu, Malayalam, Marathi, Punjabi, and Gujarati [ 51 ]. The tasks that have received the most attention for IRL are Machine Translation (MT), Sentiment Analysis (SA), Parts-Of-Speech (POS) Tagging and Named Entity Recognition (NER). Machine translation is inter-lingual communication in which machines translate the source language to the target language while preserving its meaning [ 75 ]. Sentiment analysis is the identification of the opinions expressed and the orientation of thoughts in a piece of text [ 47 ]. POS Tagging is a process in which each word in a sentence is labeled with a tag indicating its appropriate part of speech [ 15 ]. Named Entity Recognition identifies the proper names in structured or unstructured documents and then classifies the names into sets of predefined categories of interest. Mostly, machine learning algorithms and natural language processing techniques are used to develop applications for IRL. Language processing techniques have been widely and deeply investigated for English. However, not much work has been reported for IRL, owing to their richness in morphology and complexity in structure. The generic model for language processing is shown in Fig. 1 .

1.1 Generic block diagram

Fig. 1 Generic model for language processing

The generic model for language processing consists of various stages viz., machine transliteration, preprocessing, lexical and morphological analysis, POS tagging, feature extraction and evaluation. The raw text block in the diagram represents the natural language which is in unstructured form. The contributions of aforementioned techniques for success of the language processing tasks are as follows:

1.1.1 Tokenization

In natural language processing applications, the raw text initially undergoes a process called tokenization. In this process, the given text is split into lexical units, which are the most basic units. After tokenization, each lexical unit is termed a token. Tokenization can be at sentence level, word level or n-gram level, depending on the category of the problem [ 91 ]. Hence, there are 3 kinds of tokenization: a) sentence level tokenization, b) word level tokenization and c) n-gram tokenization. Sentence level tokenization deals with challenges like sentence ending detection and sentence boundary ambiguity. In word level tokenization, words are the lexical units, hence the whole document is tokenized into a set of words. Word level tokenization is used in various language processing and text processing applications. In n-gram tokenization, each token consists of n consecutive words, where ‘n’ indicates the number of words taken together as a lexical unit. If ‘n=1’ the lexical unit is called a unigram; similarly, if ‘n=2’ the lexical unit is a bigram, and if ‘n=3’ it is a trigram. During n-gram tokenization (where \(n \ge 2\) ), consecutive tokens overlap in their terms so that each token contains n words. Figure 2 presents all 3 ways of tokenization for a set of sentences in Kannada, which is one of the Indian Regional Languages.

Fig. 2 An example of tokenization in the Kannada language
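
The three kinds of tokenization described above can be sketched in Python as follows. This is a minimal illustration rather than the method of any surveyed work: the function names, the punctuation sets and the sample Kannada sentences are assumptions made for the example. Words are split on whitespace instead of with a `\w+` pattern, because `\w` in Python's standard `re` module does not match the combining vowel signs of Indic scripts and would break words apart.

```python
import re

# Punctuation stripped from the edges of word tokens (assumed, non-exhaustive).
# U+0964 is the danda used as a sentence-final mark in several Indic scripts.
PUNCTUATION = ".,;:?!\u0964\"'()"

def sentence_tokenize(text):
    """Naive sentence splitter: break after '.', '?', '!' or danda plus whitespace."""
    parts = re.split(r"(?<=[.?!\u0964])\s+", text.strip())
    return [p for p in parts if p]

def word_tokenize(sentence):
    """Split on whitespace and strip punctuation from the edges of each word."""
    tokens = (w.strip(PUNCTUATION) for w in sentence.split())
    return [t for t in tokens if t]

def ngram_tokenize(words, n):
    """Return overlapping n-word tokens; n=1 gives unigrams, n=2 bigrams, etc."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Illustrative input: two short Kannada sentences.
text = "ನಾನು ಶಾಲೆಗೆ ಹೋಗುತ್ತೇನೆ. ಅವನು ಮನೆಗೆ ಬಂದನು."
sentences = sentence_tokenize(text)   # sentence level
words = word_tokenize(sentences[0])   # word level
bigrams = ngram_tokenize(words, 2)    # n-gram level with n = 2
```

A sentence boundary detector of this kind will mis-split on abbreviations and decimal numbers, which is the sentence boundary ambiguity mentioned above.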

1.1.2 Machine transliteration

In natural language processing, machine transliteration plays a vital role in applications like cross-language machine translation, named entity recognition, information retrieval, etc. Transliteration is the process of converting a word or character from the source language's alphabetical system to the target language's alphabetical system, without losing the phonetics of the source language's word or character. Before transliteration, words are divided into syllabic units using Unicode and character encoding standards. Then each of the syllabic units of a word is converted to the target language [ 50 ]. For example:

figure a

There are 3 main types of transliteration; grapheme, phoneme and hybrid [ 8 ]. The grapheme based transliteration model directly transliterates the source language word to the target language grapheme without phonetic knowledge. The phoneme based transliteration model uses the phonetics of the source language word to transliterate to target language grapheme. This model conserves the tone of the word or character and brings out proper transliteration of target language graphemes. Examples of phonetics dictionary for some words of different languages are presented in Fig. 3 using International Phonetic Alphabet (IPA) [ 26 ]. The hybrid transliteration model uses both the source language grapheme and phoneme to produce the target language grapheme.

figure 3

Examples of pronunciation dictionary of few languages
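To make the grapheme based model concrete, the sketch below maps a handful of Devanagari characters to Latin letters using lookup tables only, with no phonetic knowledge. The tables and the romanization are assumptions made for illustration (they follow no particular standard and cover only five consonants and four vowel signs), and the code does not attempt schwa deletion.

```python
# Toy mapping tables covering only a few Devanagari characters (illustrative, not a standard scheme).
CONSONANTS = {
    "\u0915": "k",  # KA
    "\u0928": "n",  # NA
    "\u092E": "m",  # MA
    "\u0930": "r",  # RA
    "\u0932": "l",  # LA
}
VOWEL_SIGNS = {
    "\u093E": "aa",  # vowel sign AA
    "\u093F": "i",   # vowel sign I
    "\u0940": "ii",  # vowel sign II
    "\u0947": "e",   # vowel sign E
}
VIRAMA = "\u094D"  # suppresses the inherent vowel of the preceding consonant

def transliterate(word):
    """Grapheme-by-grapheme transliteration of Devanagari to Latin letters."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            # A consonant carries an inherent 'a' unless a vowel sign or virama follows.
            if nxt not in VOWEL_SIGNS and nxt != VIRAMA:
                out.append("a")
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch != VIRAMA:
            out.append(ch)  # pass unknown characters through unchanged
    return "".join(out)

print(transliterate("\u0930\u093E\u092E"))  # RA + AA + MA -> raama
```

Because the mapping is purely graphemic, the final inherent vowel is kept ("raama" rather than "raam"); a phoneme based or hybrid model would use pronunciation knowledge to decide such cases.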

1.1.3 Preprocessing techniques

Once the raw text of natural language is tokenized and transliterated, some of the preprocessing techniques are used in enhancing the efficiency of the applications, as per the requirement. Some of the major techniques are as follows:

Stemming Stemming normalizes a given word into its root/stem by cleaving off the suffixes and prefixes of the word. A root word is modified to express different grammatical categories (tense, voice, person, gender, etc.) in a sentence; this modification is called inflection. However, the root obtained by stemming may not be a valid word in the language. Several stemming techniques have been developed for Indian Regional Languages (IRL), based on the longest-match method, n-gram method, brute-force method, etc. [ 14 ].

figure b
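As an illustration of the longest-match method mentioned above, the following sketch strips the longest suffix found in a hand-made list. The suffix list is a hypothetical English one chosen for readability; a stemmer for an Indian regional language would use a language specific list instead.

```python
# Hypothetical English suffix list, used only to illustrate the longest-match idea.
SUFFIXES = ["ations", "ation", "ings", "ing", "ness", "ed", "es", "s"]

def stem(word, suffixes=SUFFIXES, min_stem_len=3):
    """Strip the longest suffix that leaves at least min_stem_len characters."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no suffix matched, so the word is returned unchanged

print(stem("cats"))     # cat
print(stem("running"))  # runn
```

The second result, "runn", shows the point made above: the stem produced by suffix stripping need not be a valid word.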

Stop-word removal There are some words which frequently occur in documents yet convey no additional meaning. Hence, removal of these non-informative words eases language processing tasks. There are several techniques to remove stop-words like dictionary based stop-words removal, DFA [ 45 ] (Deterministic Finite Automata) based stop-word removal, etc.
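A dictionary based stop-word filter can be sketched as a set lookup. The stop-word list here is a small hypothetical English one; in practice the list is language specific.

```python
# Hypothetical stop-word list; real lists are language specific and much longer.
STOP_WORDS = {"the", "is", "a", "an", "of", "in", "and", "to"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word dictionary."""
    return [t for t in tokens if t.lower() not in stop_words]

print(remove_stop_words(["Kannada", "is", "a", "language", "of", "India"]))
# ['Kannada', 'language', 'India']
```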

Lemmatization It is similar to the stemming technique but the word is reduced to an acceptable form in the language after the removal of suffixes and prefixes. The reduced word is called “Lemma”; it is valid and accepted by the language. For example, “runs”, “running”, “ran” are all the different forms of the root word called “run” in English language; thus “run” is the lemma for all those formerly mentioned forms of it. Researchers are also working on lemmatization techniques for IRL, which can help in building various applications on multilingual platforms [ 67 ].
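In its simplest form, the difference from stemming can be sketched with a lookup table that maps inflected forms to a valid lemma. The table below is a hypothetical one holding only the forms from the example above; real lemmatizers combine dictionaries with morphological rules.

```python
# Hypothetical lemma dictionary holding only the forms from the example above.
LEMMA_DICT = {"runs": "run", "running": "run", "ran": "run"}

def lemmatize(word, lemma_dict=LEMMA_DICT):
    """Return the dictionary lemma, or the word itself when it is not listed."""
    return lemma_dict.get(word.lower(), word)

print([lemmatize(w) for w in ["runs", "running", "ran"]])  # ['run', 'run', 'run']
```

Unlike suffix stripping, the lookup handles the irregular form "ran" and always returns a valid word.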

POS Tagging As the name indicates, it refers to the process of tagging words in a sentence with parts of speech like Noun, Pronoun, Verb, Adjective, etc. This process will be different for different languages owing to differences in grammatical structure. POS tagging helps in the natural language processing applications like cross-lingual machine translation, Documents-Tagging using named entity recognition, sentiment analysis, etc. However, understanding the grammatical structure of a sentence for automatic POS tagging is a challenging task. In India, many researchers are working towards proposing POS tagging for various regional languages [ 15 ]. It helps in development of various applications for native languages.

Unicode Normalization Unicode is the universal character encoding standard, which assigns each text character a unique code point, conventionally written as a hexadecimal value. This representation is used for information processing. Some sequences of Unicode characters are equivalent to a single abstract character, and these multiple representations of one abstract character complicate processing. To eliminate such non-essential differences, Unicode normalization is performed during preprocessing. Unicode normalization transforms equivalent sequences of characters into the same representation. For example, the string “fi” can be represented either by the characters “f” and “i” ( \(U+0066,U+0069\) ) or by the ligature “ﬁ” ( \(U+FB01\) ). In Indian regional languages too, especially in Hindi, Nukta based characters have multiple representations, as shown in the example below.

figure c
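Both cases can be checked with Python's standard `unicodedata` module. The sketch below is illustrative; the choice of the letter QA as the Nukta example and the helper name `same_text` are assumptions, not taken from the figure. Note that the ligature is only a compatibility equivalent of “fi”, so it is unified by the NFKC form rather than NFC.

```python
import unicodedata

def same_text(a, b, form="NFC"):
    """Compare two strings after applying the same Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

ligature = "\uFB01"           # LATIN SMALL LIGATURE FI
qa_single = "\u0958"          # DEVANAGARI LETTER QA (single code point)
qa_sequence = "\u0915\u093C"  # DEVANAGARI LETTER KA followed by the NUKTA sign

print(qa_single == qa_sequence)             # False: the raw strings differ
print(same_text(qa_single, qa_sequence))    # True: identical after NFC
print(same_text(ligature, "fi", "NFKC"))    # True: compatibility normalization
print(same_text(ligature, "fi", "NFC"))     # False: NFC leaves the ligature as is
```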

1.1.4 Statistical based approaches

After preprocessing, the model is trained for language processing using either the machine learning/statistical approaches or rule based language processing approaches. In machine learning approach, feature extraction method is used to extract features from the preprocessed data. Later, these features are used for the purpose of training learning algorithms. These machine learning algorithms constitute statistical based models.

For example, statistical approach uses probability distribution function to choose the best translation in machine translation task of language processing. During this translation, Multi/bi-lingual corpus is used [ 40 ]. In another approach called Example Based Machine Translation (EBMT), corpus of the translated examples is used to train the model. The test input is matched with the corpus example and matched words of test input sentence are recombined later in an analogical manner for proper translation [ 85 ]. Similarly, there are different machine learning based solutions for other applications which are mentioned earlier.

1.1.5 Rule based approaches

This approach existed before statistical models were created for language processing. Lexical and morphological analyses, using techniques like regular expressions, suffix stripping and so on, are applied after preprocessing. In the rule based approach, a set of rules and patterns guides the machine in translating the language. E.g.: English has the sentence structure SVO (Subject, Verb, Object) while Hindi has SOV (Subject, Object, Verb). Researchers believe natural language translation is incomplete without the support of external knowledge, such as reasoning and basic knowledge of the language. Hence, the rule based approach uses thesauri and data sources like WordNet.

For example, Rule Based Machine Translation (RBMT) technique translates source language to target language using various set of rules and bilingual dictionary in machine translation task of language processing. Similarly, other techniques like Knowledge Based Machine Translation (KBMT), Principle Based Machine Translation (PBMT) make use of parsers for the lexical, phrasal and grammatical information of the language [ 42 ].
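A toy version of rule based translation, combining one reordering rule (SVO to SOV) with a bilingual dictionary, might look as follows. This is only a sketch: the role labels are assumed to be given, and the three-entry dictionary with romanized Hindi glosses is hypothetical.

```python
# Hypothetical bilingual dictionary with romanized Hindi glosses (illustrative only).
EN_HI = {"Ram": "Ram", "eats": "khata hai", "mango": "aam"}

def svo_to_sov(tagged):
    """Reorder (word, role) pairs from Subject-Verb-Object to Subject-Object-Verb."""
    subject = [w for w, role in tagged if role == "S"]
    verb = [w for w, role in tagged if role == "V"]
    obj = [w for w, role in tagged if role == "O"]
    return subject + obj + verb

def translate(tagged, dictionary=EN_HI):
    """Apply the reordering rule, then look each word up in the dictionary."""
    return " ".join(dictionary.get(w, w) for w in svo_to_sov(tagged))

print(translate([("Ram", "S"), ("eats", "V"), ("mango", "O")]))  # Ram aam khata hai
```

Real rule based systems need many such rules, plus morphological analysis to pick the right inflected form, which is why they demand considerable linguistic knowledge.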

1.1.6 Neural based approaches

Besides the rule based and statistical approaches, researchers have also used neural based approaches for language processing tasks in search of better results. In this approach, the input data is processed through artificial neurons, and the network of connected neurons forms a neural network. Neural based approaches give better results in complex situations, such as training on huge datasets and fast learning, owing to features like uniformity, computational power, learning ability and generalization ability. For instance, the machine translation task aims to find the best matching target language sentence for a source sentence. From the probabilistic point of view, this is the target sentence that maximizes \(P(y \mid x)\), where “y” is the target sentence and “x” is the source sentence. The neural based approach builds one large end-to-end neural network model to complete the same task. The core idea is to encode a variable length sequence of words into a fixed length vector, which summarizes the whole source sentence. A decoder then translates this vector into the target language. The encoder-decoder model is trained to maximize the conditional probability \(P(y \mid x)\) [ 24 , 60 ].
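The encoder-decoder idea described above is commonly written as follows. This is a standard formulation given here as a sketch; the symbols \(c\), \(T_x\) and \(T_y\) are introduced for illustration and do not appear in the cited works as quoted here. The encoder compresses the source words \(x_1, \dots, x_{T_x}\) into a fixed length vector \(c\), and the decoder generates the target words one at a time, conditioned on \(c\) and on the words produced so far:

\[
c = \mathrm{Encoder}(x_1, \dots, x_{T_x}), \qquad
P(y \mid x) = \prod_{t=1}^{T_y} P(y_t \mid y_1, \dots, y_{t-1}, c)
\]

\[
\hat{y} = \arg\max_{y} P(y \mid x)
\]

Training adjusts the network parameters to increase \(P(y \mid x)\) on sentence pairs from the parallel corpus; at translation time the system searches for the target sentence \(\hat{y}\) with the highest probability.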

1.1.7 Post-processing

In this phase, the results generated by the techniques of language processing models are made much more refined or efficient. The results from the models are checked for spelling corrections, sentence arrangements, grammatical errors, missed translations, etc. [ 42 ].

1.1.8 Evaluation

The results of the applications are measured and evaluated to assess the efficiency of the models using statistical measures like Accuracy, Precision, Recall and F-Score (the harmonic mean of Precision and Recall). BLEU (BiLingual Evaluation Understudy) scores are calculated for machine translation tasks to check the quality of translations [ 59 ]. Other measures are also used. The UNK (Unknown Word) count measures the Out-Of-Vocabulary (OOV) words in a translation task, while the WER (Word Error Rate) metric is used to compare machine translated output against human translated output.
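Two of these measures are simple enough to sketch directly. The functions below are illustrative implementations under common definitions: precision, recall and F-score over sets of predicted and gold items, and WER as the word level edit distance divided by the reference length. The function names and example inputs are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compare two collections of items, e.g. predicted vs. gold named entities."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # items that are both predicted and correct
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def word_error_rate(reference, hypothesis):
    """Word level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref) if ref else 0.0

p, r, f1 = precision_recall_f1({"a", "b", "c"}, {"b", "c", "d", "e"})
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

In the example, two of the three predictions are correct (precision 2/3) and two of the four gold items are found (recall 1/2); the hypothesis sentence drops one of six reference words, so its WER is 1/6.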

2 Challenges

There are several challenges faced in all the stages of the language processing tasks because of differences in grammar and phonetics. The challenges faced in Indian regional language processing are as follows:

Tokenization of the text. Some of the regional languages don't have common delimiters like white space or punctuation.

E.g.: Urdu language.

figure d

Language structure i.e. order of the words in the sentences will differ from one language to another [ 16 ].

E.g.: Subject Verb Object (SVO) (English), Subject Object Verb (SOV) (Kannada).

figure e

Ambiguity in translation or transliteration of regional language words.

E.g.: In English-Hindi translation, the word “Mount” is translated but “Everest” remains the same. In English-Kannada, both words are simply transliterated.

figure f

Some languages have homographs, words whose meaning changes with context [ 56 ].

E.g.: In “Heart attack” and “Dog attacks cat”, “attack” is the homograph.

Some languages have multiple scripts.

E.g.: Punjabi (Gurmukhi, Shahmukhi).

Grammatical variations between languages lead to ambiguity.

Judging the speaker's intention is difficult. Meanings of sentences or words vary with the speaker's intention (like sarcasm, sentiment, metaphor, etc.).

Code-mixed language processing is challenging, as a user uses multiple languages in a sentence or an utterance.

E.g.: User tweet : “listening to Bombae Haelutaitae from Rajakumara”

3 Motivation

India is a multilingual country. The Indian constitution lists 22 languages, referred to as scheduled languages. These languages are given status, recognition and official encouragement. Of the entire population, barely 10% of Indians use English to transact, and most prefer regional languages, which have evolved over centuries. Given this diversity of languages, language processing applications are a boon to people in their day-to-day transactions. However, the understanding and generation of these natural languages, i.e. their processing by machine, is complex. Therefore, we review the work carried out by researchers on various techniques developed for processing Indian Regional Languages.

4 Review in detail

Georgetown University and IBM jointly developed the first machine translation application in 1954, translating more than sixty Russian sentences into English. This was the first milestone in the field of natural language processing. After that, real progress in NLP was much slower. Until 1980, NLP techniques were complex and based on handwritten rules. Following the introduction of Moore's law by Gordon Moore, the former CEO of Intel, the computational power of systems increased and paved the way for machine learning algorithms based on statistical models, which led to a revolution in NLP.

4.1 Machine transliteration

In 1994, Arbabi worked on Arabic-English transliteration using a phoneme based model [ 10 ]. Later, in 2008-2010, researchers developed statistical transliteration techniques which are language independent. Many works have been proposed for Indian regional languages too. In [ 9 ], Antony et al. addressed the problem of transliterating English to Kannada using an SVM kernel model trained on over 40k names of Indian towns. It is based on the sequence labeling method. The transliteration module uses an intermediate code designed to preserve phonetic properties. The authors also compared their results with the Google Indic transliteration system and reported better results. The process of converting words to their pronunciation is called grapheme-to-phoneme (g2p) conversion. Statistical g2p models are trained on language specific pronunciation dictionaries, which are expensive and time consuming to build and require the intervention of language experts. To address these issues, [ 26 ] worked on a g2p model for low resource languages using Phoible [ 53 ] phonological inventory data (having 37 phonological features such as nasal, consonantal, sonorant, etc.). Words of a low resource language are converted to their pronunciation using the phonological information of high resource languages that are linguistically and phonologically similar. In [ 29 ], Dhore et al. focused on a direct phonetic based transliteration approach for Hindi and Marathi to English, without training on a bilingual database. They used a hybrid stress analysis approach for deletion of the schwa, a vowel sound present in many unaccented syllables of words, which is removed after transliteration. Ekbal et al. [ 36 ] made a substantial contribution to developing transliteration systems from Indian languages to English, especially for Bengali-English transliteration. They proposed a modified joint source-channel model, which is based on regular and non-probabilistic expression. It uses linguistic knowledge to transliterate person names from Bengali to English. In [ 50 ], Lakshmi et al. worked on back-transliteration of Kannada, in which Romanized Kannada words are transliterated back to Kannada script. A bilingual corpus (around 1 lakh words) and a Bidirectional Long Short-Term Memory (BLSTM) network are used in this back-transliteration, which obtained good results.

4.2 Preprocessing techniques

4.2.1 Stemming

It is a process of reducing morphologically variant terms into a single term, without performing complete morphological analysis. Ramanathan et al., [ 71 ] presented their light weight stemmer on Indian regional language Hindi. This work is based on stripping of word endings by longest matching suffix of words, using manually created suffix list consisting of 65 suffixes. Pandey et al., [ 58 ] proposed an improvised unsupervised stemmer for Hindi, which is a probabilistic approach to achieve better stemming. They used EMILLE corpus and WordNet of Hindi for training and testing, respectively. This approach showed better results than light weight stemmers. Ramachandran et al., [ 70 ] applied longest match suffix removal technique for the Tamil language stemmer. Saharia et al., [ 78 ] worked on Assamese language stemmer based on suffix removal technique. For Gujarati, [ 61 ] presented a light weight stemmer which is based on hybrid technique of both unsupervised morphological parsing method [ 41 ] and rule based method (manual listing of handcrafted suffixes). Similarly [ 5 , 86 ] worked on Gujarati language stemmer based on hybrid approach. In [ 20 , 52 ] researchers presented Bengali stemmers based on longest suffix matching technique, distance based statistical technique and unsupervised morphological analysis technique.

4.2.2 Lemmatization

The aim of lemmatization is to obtain a meaningful root word by removing unnecessary morphemes. English and other European languages, which are less highly inflected than Indian languages, have more stemmers and lemmatizers available [ 63 ]. Compared with other Indian regional languages, Hindi words have a finite set of morphological inflections [ 6 ]. Hence, [ 63 ] worked on optimizing the lemmatization technique for Hindi words using a rule based and knowledge based approach. Here, knowledge refers to the storage of grammatical features; in lemmatization, it refers to the storage of root words. [ 67 ] worked on Kannada, a south Indian regional language with more inflected word forms than Hindi. They used a Kannada dictionary to lemmatize words under a rule based approach.

4.2.3 Parts of speech (POS) tagging

In language understanding, POS tagging plays a vital role. It helps in achieving language processing tasks more efficiently. POS tagging is a disambiguation task and the goal of tagging is to find the exact role of a word in the sentence.

figure g

where PN, V and N are tags from the tagset, representing the grammatical identities of words.

In Hindi, Singh et al. [ 84 ] presented a POS tagger with detailed morphosyntactic analysis, skillful handling of suffixes and a decision tree based learning algorithm. Dalal et al. [ 18 ] used a Maximum Entropy Markov Model, a statistical model which considers multiple features simultaneously, such as context based features, word features, dictionary features and corpus-based features, to predict the tag for a word. Avinesh et al. [ 68 ] used Conditional Random Field and transformation based learning statistical methods for POS tagging of Hindi, Telugu and Bengali. [ 83 ] presented a POS tagger for Hindi using the Hidden Markov Model (HMM). Working on local word grouping for Hindi, Ray et al. [ 73 ] presented a POS tagging algorithm based on morphological analysis and lexical rules. In Bengali, [ 19 ] built a POS tagger based on HMM and Maximum Entropy (ME) methods. They also found that accuracy increased with the addition of Morphological Analysis (MA). Ekbal et al. [ 37 ] worked on Bengali POS taggers based on Conditional Random Field (CRF) and Support Vector Machine (SVM). They found the performance of SVM to be better [ 32 ]. For the south Indian language Tamil, [ 28 ] proposed a statistical Support Vector Machine method for POS tagging. Similarly, [ 81 ] presented a POS tagger for Tamil built on a combination of rule based morphological analysis and statistical methods. For Kannada, another south Indian regional language, Antony et al. [ 7 ] worked on a POS tagger based on a lexicon dictionary and the support vector machine method. Later, Shambavi et al. [ 15 ] presented a POS tagger built on Hidden Markov Model and Conditional Random Field methods. In social media, multilingual users interact using words from multiple languages in a sentence or an utterance; this is called code-mixing. [ 44 ] worked on English-Hindi social media code-mixed text and experimented with POS tagging of these corpora using four machine learning algorithms (Conditional Random Fields, Sequential Minimal Optimization, Naive Bayes and Random Forests).

As POS tagging is needed for many of the language processing tasks, researchers used multilayer perceptron neural network for more efficiency. [ 60 ] presented neural based POS tagger for Hindi language and claimed it to be the first work on neural based Hindi POS tagger. Comparatively, neural method works better than CRF and HMM statistical methods. Todi et al., [ 87 ] worked on Unknown or Out-of-vocabulary words, which is the major challenge in POS tagging task. This challenge is addressed by character embedding and word embedding solutions with simple RNN, LSTM and biLSTM methods. Narayan et al., [ 54 ] presented neural based solution for the disambiguation of corpus problem in Hindi language. All these POS taggers are presented in Table 1 .
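Several of the taggers reviewed above rely on Hidden Markov Models. The sketch below shows how such a tagger picks the most probable tag sequence with the Viterbi algorithm. It is a minimal illustration, not a reconstruction of any cited system: the three-tag toy tagset (PN, N, V), all probabilities and the flat fallback probability `unk` for unseen words are hand-set assumptions, whereas a real tagger estimates these from an annotated corpus.

```python
def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable tag sequence for words under a first-order HMM."""
    # best[t][tag] = (probability of the best path ending in tag at position t, previous tag)
    best = [{tag: (start_p[tag] * emit_p[tag].get(words[0], unk), None) for tag in tags}]
    for word in words[1:]:
        column = {}
        for tag in tags:
            column[tag] = max(
                (best[-1][p][0] * trans_p[p][tag] * emit_p[tag].get(word, unk), p)
                for p in tags
            )
        best.append(column)
    # Backtrack from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

# Hypothetical hand-set probabilities for a toy model.
tags = ["PN", "N", "V"]
start_p = {"PN": 0.6, "N": 0.3, "V": 0.1}
trans_p = {
    "PN": {"PN": 0.1, "N": 0.6, "V": 0.3},
    "N": {"PN": 0.1, "N": 0.2, "V": 0.7},
    "V": {"PN": 0.2, "N": 0.5, "V": 0.3},
}
emit_p = {
    "PN": {"Ram": 0.9},
    "N": {"mango": 0.8, "Ram": 0.05},
    "V": {"eats": 0.9},
}

print(viterbi(["Ram", "mango", "eats"], tags, start_p, trans_p, emit_p))
# ['PN', 'N', 'V']
```

The word "Ram" can be emitted by both PN and N in this toy model; the transition and start probabilities resolve the ambiguity, which is the disambiguation role of POS tagging described above.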

4.3 Approaches for language processing tasks

4.3.1 Rule based approaches [ 25 , 65 ]

If language processing tasks are achieved using lexical rules, morphological analysis and linguistic knowledge after preprocessing, the approach is termed a rule based solution/approach. The tasks are handled by decisions taken by the lexical rules, which should be specific and clear. Each language has its own linguistic rules, and all of these must be taken into consideration to achieve the language processing tasks efficiently. Machine Translation (MT) is one of the most difficult and major tasks in language processing. Rule based MT is of three types. The first is Dictionary based or Direct translation, where multilingual dictionaries are used for translation and which is easy to implement. [ 42 ] presented Hindi to Punjabi MT based on the direct rule based method. A dictionary based English to Kannada/Telugu translation tool is proposed by [ 75 ]. The second is Transfer based translation, which concentrates on the grammatical structure of the source and target languages. The third is Interlingual translation, in which the source language is translated to an intermediate representation called Interlingua (E.g.: Universal Networking Language (UNL)) [ 46 ], from which the target language is generated. This representation is independent of languages. Dave et al. [ 25 ] worked on English to Hindi MT using Interlingua. A rule based sentence simplification technique is proposed by [ 65 ] for the English to Tamil translation task.

In language processing, Named Entity Recognition (NER) refers to the process of identifying the proper nouns in the text and classifying them into named entity classes like person, location, date, organization, numbers etc. and is a major task. The linguistic handcrafted rules are used in rule based NER. As NER is a classification task of given language entities into any one of the named entity classes, machine learning methods perform better than rule based methods. Hence, machine learning methods are used widely by the researchers [ 55 ]. [ 43 ] presented conditional based or rule based NER system for Punjabi language. They developed and used gazetteer lists like prefix list, suffix list, last name list and so on for proper name identification.

Most business decisions are based on the choices of customers; thus gauging sentiment and sarcasm is crucial for proper decision making. In language processing, Sentiment Analysis (SA) is also a major task [ 48 ]. In rule based SA, dictionaries of words annotated with each word’s semantic orientation or polarity are used. For Indian regional languages, Balamurali et al. [ 12 ] worked on Cross-Lingual Sentiment Analysis (CLSA) using WordNets of Hindi and Marathi. CLSA is the task of analyzing sentiment where the languages used for training and testing differ. WordNets avoid the need to translate between the test language texts and the training language texts. [ 47 ] presented an SA system for Bengali and Hindi using lexicons, distributional thesauri (DTs) and sentence level co-occurrences.

4.3.2 Performance and limitations

Though rule based approach considers the morphological analysis and linguistic knowledge, it falls short while making complex rules and processing resource deficient languages. It is also a tedious approach because it demands high linguistic acquaintance of languages and updation of rules with evolution of language. During machine translation task, especially in dictionary based, there is no consideration of structure of source text sentence beyond morphological analysis of words (idioms and phrases, slogans). In transfer based translation, there must be compatibility between the languages. Interlingua is time consuming as it does double translations; however it is supportive of multi languages [ 46 , 51 , 72 , 79 ]. NER for Indian regional languages is difficult as it lacks capitalization and has complex phonetics [ 62 ]. The language processing task, namely Sentiment Analysis (SA) for Indian regional languages, is difficult due to language constructs, morphological variations and grammatical differences [ 47 ]. Further, the lack of WordNets for regional languages renders an uphill task for SA. The rule based approach for SA task is applied whenever the goal is to analyze the sentiment at the document or sentence level, because this approach helps in contextual based analysis of sentiment using various rules. But for cases where contextual factors are not essential or contribute less, feature based statistical methods are preferable.

4.3.3 Statistical based approaches [ 4 , 33 , 35 , 38 , 39 , 72 , 89 , 90 ]

If the preprocessed data are analyzed with statistical metrics to achieve the desired result in language processing tasks, the approach is called statistical. This approach looks for statistical relations in the preprocessed data (such as distance metrics, probability metrics, etc.). Here the features of the data guide the statistical models towards efficient results. In the translation task, the document is translated on the basis of the probability distribution \(P(k \mid e)\), which represents the probability of translating a sentence “ e ” in the source language ‘E’ (E.g.: English) to a sentence “ k ” in the target language ‘K’ (E.g.: Kannada). Parallel corpora play a vital role in statistical language processing tasks. Unnikrishnan P et al. [ 89 ] proposed a Statistical Machine Translation (SMT) system for English to Kannada and Malayalam, where they concentrated on reordering source language sentences to match the structure of target language sentences, root-suffix separation for both source and target words, and efficient use of morphological information. [ 72 ] worked on SMT for the English-Hindi translation task. Incorporating sentence reordering (as per the target language) and better suffix-root separation of words enhanced the efficiency of their SMT system.
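In the classical formulation of statistical translation, the best target sentence is chosen by applying Bayes' rule to the translation probability. The decomposition below is the standard noisy-channel form, given as a sketch; the cited systems may combine further models. With \(e\) the source sentence and \(k\) a candidate target sentence:

\[
\hat{k} = \arg\max_{k} P(k \mid e) = \arg\max_{k} P(e \mid k)\, P(k)
\]

Here \(P(e \mid k)\) is the translation model, estimated from the parallel corpus, and \(P(k)\) is the language model of the target language, estimated from monolingual text. The denominator \(P(e)\) is dropped because it is the same for every candidate \(k\).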

Named Entity Recognition (NER) task performs better in statistical approach. [ 4 , 33 , 39 , 90 ] worked on developing NER for regional languages like Hindi, Bengali, Kannada and Tamil using Conditional Random Field (CRF) method. The SVM statistical method is used by [ 35 ] and [ 31 ] on Hindi and Bengali languages. The Hidden Markov Model (HMM) method is used for Kannada and Bengali NER task by [ 30 , 38 ]. [ 34 ] presented NER system by hybrid of these methods in Bengali language and found better results.

Statistical Sentiment Analysis (SA) uses machine learning algorithms trained on known datasets. Rohini et al. [ 77 ] worked on SA for movie reviews in Kannada using a Decision tree classifier. The same reviews were translated to English and their polarity was analyzed with the same classifier. Location based SA was carried out to identify trends during the Indian election campaign in 2014, using a Twitter dataset [ 2 ]. The authors used a Naive Bayes classifier to classify tweets as positive or negative. [ 80 ] worked on classifying Tamil, Hindi and Bengali tweets as positive, negative or neutral using SentiWordNet for feature extraction and a Naive Bayes classifier.

4.3.4 Performance and limitations

The major advantage of the statistical approach is that it doesn't require extensive linguistic knowledge. This is a boon for languages with fewer resources and leads to efficient processing of language tasks. Languages can be grouped into similarly structured and differently structured languages, i.e. according to whether the order of Subject, Verb and Object is the same. In the Machine Translation (MT) task, statistical translation is more efficient for languages with different structures than for similarly structured languages. For similarly structured languages, the rule based method is efficient and performs better. For statistical MT, good parallel corpora are required. However, dictionaries are more widely available than parallel corpora and bilingual dictionaries [ 72 ]. The main factors deciding translation efficiency are the quality and coverage of the corpora and dictionaries, whether the method is rule based or statistical. Proper probability estimation is also difficult, as it requires sufficient training [ 46 ]. The NER task produces more efficient results with statistical methods than with rule based methods. As mentioned earlier, coverage of annotated corpora is the key factor in the efficiency of statistical methods [ 55 ]. Even though the statistical method for Sentiment Analysis (SA) has the upper hand over the lexicon based method, it falls behind on highly inflected regional languages. Hence, researchers use WordNets for feature extraction and machine learning classifiers for subsequent classification into positive, negative or neutral classes. Dependency on WordNets is also a drawback, because WordNets are lacking for regional languages [ 77 ]. [ 49 ] compared semantic approaches (E.g.: Baseline algorithm) and machine learning approaches on web based Kannada documents for sentiment analysis and found that the machine learning approach (using the Weka software suite) performs better.

4.3.5 Neural based approaches

Recently, many researchers have worked on neural based solutions for language processing tasks. These give better results for some tasks but below par results for others, owing to the lack of resources for some regional languages. Many artificial neural network architectures, like the Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM), are used for building learning models. Neural based machine translation is called Neural Machine Translation (NMT). Most proposed NMT systems follow the encoder-decoder approach. An encoder maps the variable length source sentence into a fixed length vector, which the decoder then translates into a target language sentence [ 11 ]. Revanuru et al. [ 76 ] worked on NMT for 6 Indian language pairs: Telugu-Hindi, Konkani-Hindi, Gujarati-Hindi, Punjabi-Hindi, Tamil-Hindi and Urdu-Hindi. They claim to be the first to apply NMT to Indian regional languages. Their neural architecture uses a bi-directional LSTM, and the BLEU metric is used for evaluation. In comparison with Google Translate, their model outperformed by a BLEU score of 29 for Punjabi-Hindi translation, 17 for Urdu-Hindi translation and 30 for Gujarati-Hindi translation on the dataset provided by the Indian Language Technology Proliferation and Deployment Center (TDIL-DC), C-DAC. Sentiment Analysis (SA) is a language processing task wherein emotions are studied computationally. Ravi et al. [ 74 ] worked on sentiment classification of Hinglish (code-mixed) text, that is, Romanized Hindi. They claim that the combination of gain ratio based feature selection and a Radial Basis Function Neural Network performs well for sentiment classification on their dataset. Similarly, [ 66 ] worked on code-mixed Hindi-English texts at the level of sub-word compositions, using LSTM for sentiment analysis. Akhtar et al. [ 1 ] proposed a hybrid deep learning approach in which features are extracted by a Convolutional Neural Network (CNN) and sentiments are then classified by an SVM classifier. To show that the method is independent of language, Choudhary et al. [ 17 ] proposed a Siamese Network Architecture composed of twin Bi-directional LSTM Recurrent Neural Networks (Bi-LSTM RNN) for sentiment analysis, which they tested on both Hindi and benchmark English datasets. They considered both resource rich languages (English and Spanish) and resource poor languages (Hindi and Telugu) for training, to overcome problems like out-of-vocabulary words and spelling errors. This approach uses resource rich languages to aid sentiment analysis of resource poor languages. Bhargava et al. [ 13 ] worked on monolingual tweets in Hindi, Bengali and Tamil. Their experiments address binary classification (positive/negative) of tweets using combinations of RNN, CNN and LSTM neural networks. [ 82 ] applied a Convolutional Neural Network (CNN) to code-mixed Bengali-English data for sentiment analysis. They also extended their experiments to monolingual Telugu data using the CNN architecture.

Further, the advancement of deep learning has led to the use of various neural networks in many language processing tasks. Neural models alleviate the feature engineering problem of non-neural methods, which depend on handcrafted features. However, neural models have a large number of parameters and need large amounts of data to generalize well; otherwise the model overfits. Hence neural models are less effective for low resource languages. Recently, owing to the availability of large corpora, researchers have developed pretrained neural models, which are trained on large corpora/datasets and can then be applied to different datasets for similar tasks. Basically, these models provide pretrained word vectors built from large corpora. Such pretrained models save the time needed to build a model from scratch. Pretrained models used for language processing tasks include CoVe (Context Vectors), ELMo (Embeddings from Language Models) [ 64 ], BERT (Bidirectional Encoder Representations from Transformers), etc., which are commonly evaluated on benchmarks such as GLUE (General Language Understanding Evaluation). BERT is a recent, efficient pretrained model developed by Google [ 27 ]. A few months ago, BERT was adopted by Google Search, and it is trained on over 70 languages. Among these are a few major Indian regional languages. Better pretrained models, and experiments with them, are yet to be developed for Indian regional languages using large language resources.

4.3.6 Performance and limitations

The neural approach evolved to provide efficient solutions to various tasks. The concept of neurons in the neural approach mimics the functions of biological neurons, with features such as self-learning, fault tolerance and noise immunity. Many architectures, such as LSTM, RNN and CNN, have evolved in the recent past and achieved commendable efficiency in various tasks, especially in language processing. However, they perform poorly when resources or datasets are scarce, and they are prone to overfitting. They also require sufficient hardware support for fast execution. Neural Machine Translation (NMT) performs better than other state-of-the-art methods, but care is needed when handling unknown and rare words. Multitask learning and multilingual models are suggested for translation of low-resource languages [11]. Sentiment analysis for Indian languages gives better results with the neural approach, yet understanding the words of a low-resource language is challenging because words are agglutinative and often differ in meaning with usage. During preprocessing, emoticons and punctuation are usually removed, but these matter a lot when analyzing the sentiment or sarcastic nature of a word or sentence in any language (e.g., in "What!" and "What?" the word "What" is the same, yet the meanings differ owing to the punctuation) [13]. This affects NMT too, because punctuation can change the meaning of a sentence significantly (e.g., "Hang him, not leave him" versus "Hang him not, leave him"). Table 2 summarizes the research discussed on Indian regional languages for different language processing tasks (transliteration, lemmatization, machine translation, sentiment analysis and named entity recognition) using rule-based, statistical and neural models.
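The effect of removing punctuation during preprocessing can be illustrated with a small sketch. The `tokenize` function below is a hypothetical helper, not a method from the surveyed works, and its regular expression is meant only for the English examples quoted above; text in Indic scripts generally needs a tokenizer that keeps combining vowel signs attached to their base characters.

```python
import re


def tokenize(text, keep_punctuation):
    """Lowercase the text and split it into word tokens.

    When keep_punctuation is True, each punctuation mark is kept as its own
    token; otherwise punctuation is dropped, as in common preprocessing.
    """
    pattern = r"\w+|[^\w\s]" if keep_punctuation else r"\w+"
    return re.findall(pattern, text.lower())


pairs = [
    ("What!", "What?"),
    ("Hang him, not leave him", "Hang him not, leave him"),
]

for first, second in pairs:
    same_without = tokenize(first, False) == tokenize(second, False)
    same_with = tokenize(first, True) == tokenize(second, True)
    # Without punctuation the two inputs collapse to identical token lists.
    print(first, "|", second, "| stripped equal:", same_without, "| kept equal:", same_with)
```

Once punctuation is stripped, each pair becomes indistinguishable to any downstream model, which is the loss of information the paragraph above describes.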

Beyond these focused language processing tasks, researchers have worked on other tasks for Indian regional languages too. Tummalapalli and Mamidi [88] explored question classification for Hindi and Telugu using neural networks, considering both character-level and word-level embeddings, with good results. Rajan et al. [69] classified Tamil text documents using a vector space model and neural networks. Their experimental results show that both methods are competent, with the neural network performing slightly better. They used a Tamil corpus obtained from CIIL, Mysore, India.
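To make the vector space model concrete, the sketch below represents each document as a TF-IDF weighted term vector and assigns a new document the label of its most similar training document under cosine similarity. This is an illustrative nearest-neighbour variant under stated assumptions, not the exact procedure of Rajan et al. [69]; the toy documents, labels and function names are hypothetical, and English tokens stand in for a real tokenized Tamil corpus.

```python
import math
from collections import Counter

# Toy labelled, pre-tokenized documents (hypothetical; not the CIIL corpus).
TRAIN = [
    (["match", "team", "score", "win"], "sports"),
    (["team", "player", "goal", "match"], "sports"),
    (["election", "party", "vote", "minister"], "politics"),
    (["minister", "policy", "vote", "party"], "politics"),
]


def idf_table(docs):
    """Smoothed inverse document frequency for every term in the collection."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """Sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(doc)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def classify(doc, train=TRAIN):
    """Return the label of the training document most similar to doc."""
    idf = idf_table([d for d, _ in train])
    query = tfidf(doc, idf)
    best = max(train, key=lambda item: cosine(query, tfidf(item[0], idf)))
    return best[1]


print(classify(["vote", "party", "leader"]))
print(classify(["goal", "score", "team"]))
```

A neural classifier would replace the similarity lookup with learned weights over the same term vectors, which is one way to read the comparison reported above.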

The EMILLE (Enabling Minority Language Engineering) [22] corpus was created as a collaboration between the Central Institute of Indian Languages (CIIL), Mysuru, India and the EMILLE project at Lancaster University, UK. The EMILLE/CIIL corpus is freely available for non-profit research and comprises monolingual, parallel and annotated corpora. Monolingual corpora have been constructed for 14 South Asian languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Oriya, Punjabi, Sinhala, Tamil, Telugu and Urdu. They include written data and, for some of these languages, spoken data. The parallel corpus consists of 200,000 words of English text and its translations into Hindi, Punjabi, Bengali, Gujarati and Urdu. Annotated corpora, in particular with part-of-speech tagging, are available for Hindi and Urdu. These corpora are encoded in Unicode.

The IJCNLP-2008 [23] dataset for the Named Entity Recognition (NER) task was created during the workshop on NER for South and South East Asian languages organized by IIIT Hyderabad, and it contains data for Hindi, Bengali, Oriya, Telugu and Urdu. The scarcity of resources in regional languages for various computational tasks, especially NER, was the motivation behind the creation of these datasets.

The Tab-delimited Bilingual Sentence Pairs [24] datasets have been developed by Tatoeba, a non-profit organization, by collecting sentences in various languages. They focus on building a large number of linguistic datasets of sentences in various low-resource languages together with their translations. The datasets can be used for translating a low-resource language to English. A tab character delimits the source sentence and its translation. Each dataset contains at least 100 sentences and their translations. Figure 4 gives an example of the data in the dataset.

Figure 4. Examples from the Hindi-English translation dataset
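Reading such a tab-delimited file can be sketched as follows. The two sample lines are made up for illustration and only follow the layout described above (source sentence, a tab, then the translation); the function name `read_pairs` is hypothetical, and real files may carry additional columns, which this sketch ignores.

```python
import csv
import io

# Hypothetical sample in the described layout: English sentence, tab, Hindi
# translation. A real dataset file would be opened with encoding="utf-8".
SAMPLE = "Hello!\tनमस्ते।\nThank you.\tधन्यवाद।\n"


def read_pairs(handle):
    """Yield (source, translation) tuples from a tab-delimited text stream."""
    # QUOTE_NONE treats quotation marks inside sentences as ordinary text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) >= 2:  # skip blank or malformed lines
            yield row[0].strip(), row[1].strip()


pairs = list(read_pairs(io.StringIO(SAMPLE)))
for source, translation in pairs:
    print(source, "->", translation)
```

For a file on disk, the same function could be called as `read_pairs(open(path, encoding="utf-8", newline=""))`, where `path` is whatever location the downloaded dataset was saved to.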

The Centre for Development of Advanced Computing (C-DAC) [21] is an R&D organization under the Ministry of Electronics and Information Technology (MeitY) of the Government of India. As India is a multilingual nation, the organization has developed many multilingual tools and solutions to reduce the barriers between Indian languages. These tools and solutions are available to users for research work. C-DAC also provides corpora and dictionaries for Indian languages.

These are some of the dataset sources that aid the exploration of new avenues in language processing tasks. Because resources are scarce for many regional languages, researchers have also contributed their own datasets and carried out their language processing tasks on them.

6 Discussion and future directions

According to the research firm Common Sense Advisory, 72.1% of online customers spend their time on sites in their own language, while 72.4% of customers prefer to buy a product with information in their own language. People understand most precisely when they are addressed in their mother tongue. These are some of the reasons why it is essential to make machines understand and communicate with users in their own language. This paper has explored various methods for language processing tasks in Indian regional languages. Since Indian regional languages are morphologically rich and agglutinative, with sentence structures that are difficult to analyze, less research has been attempted on them than on English. There is still a need for good-quality resources such as dictionaries, WordNets and corpora for the less-resourced Indian languages. Even though good language processing systems have been developed for some Indian languages with large numbers of speakers, many areas, such as code-mixed language processing and opinion extraction, remain untouched for many Indian regional languages. Neural approaches are yet to be tried in many language processing tasks, and neural unsupervised machine translation has not yet been tried for various low-resource Indian languages. Another area of potential interest is transfer learning, where knowledge for a less-resourced task is obtained from a resource-rich domain or task. This reduces the problem of overfitting in neural networks. Transfer learning is already used in image processing tasks but is yet to be explored in NLP applications and tasks. Visual Question Answering [92] is another task in which the language processing of questions has not yet been explored for Indian languages.

7 Conclusion

In this paper, various state-of-the-art techniques and approaches used for language processing tasks are reviewed in detail, with particular attention to Indian regional languages. Methods such as tokenization, machine transliteration, lemmatization, stemming and POS tagging, which are the building blocks of many natural language processing tasks, are reviewed. The major approaches (lexicon/rule-based, statistical and neural) to tasks such as machine translation, sentiment analysis and named entity recognition are discussed. Research works tackling the problems of low-resource languages, especially Indian languages, are described in detail. The challenges faced in making machines understand natural languages and generate natural language are described, and the dataset sources available for some Indian language processing tasks are presented. Beyond the descriptive review, promising future avenues are listed: enabling machines to understand low-resource Indian languages by generating corpora, multilingual models, transfer learning and other natural language generation tasks for Indian languages. With these pointers to future work and the survey of current methods, we believe that research on Indian regional language processing will be aided.

References

Akhtar MS, Kumar A, Ekbal A, Bhattacharyya P (2016) A hybrid deep learning architecture for sentiment analysis. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers, pp 482–493

Almatrafi O, Parack S, Chavan B (2015) Application of location-based sentiment analysis using twitter for identifying trends towards indian general elections 2014. In: Proceedings of the 9th international conference on ubiquitous information management and communication, ACM, p 41

Amarappa S, Sathyanarayana S (2013) Named entity recognition and classification in kannada language. Int J Electron Comput Sci Eng 2(1):281–289

Amarappa S, Sathyanarayana S (2015) Kannada named entity recognition and classification using conditional random fields. In: 2015 international conference on emerging research in electronics, computer science and technology (ICERECT), IEEE, pp 186–191

Ameta J, Joshi N, Mathur I (2012) A lightweight stemmer for gujarati. arXiv preprint arXiv:1210.5486

Anand Kumar M, Dhanalakshmi V, Soman K, Rajendran S (2010) A sequence labeling approach to morphological analyzer for tamil language. Int J Comput Sci Eng 2(06):2201–2208

Antony P, Soman K (2010) Kernel based part of speech tagger for kannada. In: 2010 international conference on machine learning and cybernetics, IEEE, vol 4, pp 2139–2144

Antony P, Soman K (2011) Machine transliteration for indian languages: a literature survey. Int J Sci Eng Res IJSER 2:1–8

Antony P, Ajith V, Soman K (2010) Kernel method for english to kannada transliteration. In: 2010 international conference on recent trends in information, telecommunication and computing, IEEE, pp 336–338

Arbabi M, Fischthal SM, Cheng VC, Bart E (1994) Algorithms for arabic name transliteration. IBM J Res Dev 38(2):183–194

Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473

Balamurali A, Joshi A, Bhattacharyya P (2012) Cross-lingual sentiment analysis for indian languages using linked wordnets. In: Proceedings of COLING 2012: Posters, pp 73–82

Bhargava R, Arora S, Sharma Y (2019) Neural network-based architecture for sentiment analysis in indian languages. J Intell Syst 28(3):361–375

Bijal D, Sanket S (2014) Overview of stemming algorithms for indian and non-indian languages. arXiv preprint arXiv:1404.2878

Br S, Kumar R (2012) Kannada part-of-speech tagging with probabilistic classifiers. Int J Comput Appl 975:888

Broadwell GA, Butt M, King TH (2005) It aint necessarily s (v) o: Two kinds of vso languages. In: Proceedings of the LFG 05 conference. http://csli-publications.stanford.edu/LFG/10/lfg05.html. Stanford, CSLI Publications

Choudhary N, Singh R, Bindlish I, Shrivastava M (2018) Emotions are universal: Learning sentiment based representations of resource-poor languages using siamese networks. arXiv preprint arXiv:1804.00805

Dalal A, Nagaraj K, Sawant U, Shelke S (2006) Hindi part-of-speech tagging and chunking: a maximum entropy approach. In: Proceedings of the NLPAI machine learning contest, vol 6

Dandapat S, Sarkar S, Basu A (2007) Automatic part-of-speech tagging for bengali: an approach for morphologically rich languages in a poor resource scenario. In: Proceedings of the 45th annual meeting of the ACL on interactive poster and demonstration sessions, Association for Computational Linguistics, pp 221–224

Dasgupta S, Ng V (2006) Unsupervised morphological parsing of bengali. Lang Resour Eval 40(3–4):311–330

DatasetSource-CDAC (2003) https://www.cdac.in/index.aspx?id=products_services . Accessed on 12 Dec 2019

DatasetSource-EMILLE (2003) https://www.lancaster.ac.uk/fass/projects/corpus/emille/ . Accessed on 12 Dec 2019

DatasetSource-IJCNLP (2008) http://ltrc.iiit.ac.in/ner-ssea-08/ . Accessed on 12 Dec 2019

DatasetSource-Manythingsorg (2015) http://www.manythings.org/anki/ . Accessed on 12 Dec 2019

Dave S, Parikh J, Bhattacharyya P (2001) Interlingua-based english-hindi machine translation and language divergence. Mach Transl 16(4):251–304

Deri A, Knight K (2016) Grapheme-to-phoneme models for (almost) any language. In: Proceedings of the 54th annual meeting of the association for computational linguistics (Volume 1: Long Papers), pp 399–408

Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805

Dhanalakshmi V, Shivapratap G, Soman Kp RS (2009) Tamil pos tagging using linear programming

Dhore M, Dixit S, Dhore R (2012) Hindi and marathi to english ne transliteration tool using phonology and stress analysis. In: Proceedings of COLING 2012: Demonstration Papers, pp 111–118

Ekbal A, Bandyopadhyay S (2007) A hidden markov model based named entity recognition system: Bengali and hindi as case studies. In: International conference on pattern recognition and machine intelligence, Springer, pp 545–552

Ekbal A, Bandyopadhyay S (2008a) Bengali named entity recognition using support vector machine. In: Proceedings of the IJCNLP-08 workshop on named entity recognition for South and South East Asian languages

Ekbal A, Bandyopadhyay S (2008b) Part of speech tagging in bengali using support vector machine. In: 2008 international conference on information technology, IEEE, pp 106–111

Ekbal A, Bandyopadhyay S (2009) A conditional random field approach for named entity recognition in bengali and hindi. Linguist Issues Lang Technol 2(1):1–44

Ekbal A, Bandyopadhyay S (2010a) Named entity recognition using appropriate unlabeled data, post-processing and voting. Informatica 34(1):459

Ekbal A, Bandyopadhyay S (2010b) Named entity recognition using support vector machine: a language independent approach. Int J Electr Comput Syst Eng 4(2):155–170

Ekbal A, Naskar SK, Bandyopadhyay S (2006) A modified joint source-channel model for transliteration. In: Proceedings of the COLING/ACL on main conference poster sessions, Association for Computational Linguistics, pp 191–198

Ekbal A, Haque R, Bandyopadhyay S (2007a) Bengali part of speech tagging using conditional random field. In: Proceedings of seventh international symposium on natural language processing (SNLP2007), pp 131–136

Ekbal A, Naskar SK, Bandyopadhyay S (2007b) Named entity recognition and transliteration in bengali. Lingvisticae Invest 30(1):95–114

Ekbal A, Haque R, Bandyopadhyay S (2008) Named entity recognition in bengali: A conditional random field approach. In: Proceedings of the third international joint conference on natural language processing, Vol II

Godase A, Govilkar S (2015) Machine translation development for Indian languages and its approaches. Int J Nat Lang Comput 4:55–74

Goldsmith J (2001) Unsupervised learning of the morphology of a natural language. Comput Ling 27(2):153–198

Goyal V, Lehal GS (2011) Hindi to punjabi machine translation system. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies: systems demonstrations, Association for Computational Linguistics, pp 1–6

Gupta V, Lehal GS (2011) Named entity recognition for punjabi language text summarization. Int J Comput Appl 33(3):28–32

Jamatia A, Gambäck B, Das A (2015) Part-of-speech tagging for code-mixed english-hindi twitter and facebook chat messages. In: Proceedings of the international conference recent advances in natural language processing, pp 239–248

Jha V, Manjunath N, Shenoy PD, Venugopal K (2016) Hsra: Hindi stopword removal algorithm. In: 2016 international conference on microelectronics, computing and communications (MicroCom), IEEE, pp 1–5

Kaur B, Veer D (2016) Translation challenges and universal networking language. Int J Comput Appl 133(15):36–40

Kumar A, Kohail S, Ekbal A, Biemann C (2015a) Iit-tuda: System for sentiment analysis in indian languages using lexical acquisition. In: International conference on mining intelligence and knowledge exploration, Springer, pp 684–693

Kumar H, Harish B, Kumar S, Aradhya V (2018) Classification of sentiments in short-text: an approach using msmtp measure. In: Proceedings of the 2nd international conference on machine learning and soft computing, ACM, pp 145–150

Kumar KA, Rajasimha N, Reddy M, Rajanarayana A, Nadgir K (2015b) Analysis of users sentiments from kannada web documents. Procedia Comput Sci 54:247–256

Lakshmi BS, Shambhavi B (2019) Bidirectional long short-term memory for automatic english to kannada back-transliteration. In: Emerging research in computing, information, communication and applications, Springer, pp 277–287

Madankar M, Chandak M, Chavhan N (2016) Information retrieval system and machine translation: a review. Procedia Comput Sci 78:845–850

Majumder P, Mitra M, Parui SK, Kole G, Mitra P, Datta K (2007) Yass: Yet another suffix stripper. ACM Trans Inf Syst 25(4):18

Moran S, McCloy D (eds) (2019) PHOIBLE 2.0. Max Planck Institute for the Science of Human History, Jena, https://phoible.org/

Narayan R, Chakraverty S, Singh V (2014) Neural network based parts of speech tagger for hindi. IFAC Proc Vol 47(1):519–524

Nayan A, Rao BRK, Singh P, Sanyal S, Sanyal R (2008) Named entity recognition for indian languages. In: Proceedings of the IJCNLP-08 workshop on named entity recognition for South and South East Asian Languages

Olinsky C, Black AW (2000) Non-standard word and homograph resolution for asian language text analysis. In: Sixth international conference on spoken language processing

Pal U, Chaudhuri B (2004) Indian script character recognition: a survey. Pattern Recogn 37(9):1887–1899

Pandey AK, Siddiqui TJ (2008) An unsupervised hindi stemmer with heuristic improvements. In: Proceedings of the second workshop on Analytics for noisy unstructured text data, ACM, pp 99–105

Papineni K, Roukos S, Ward T, Zhu WJ (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting on association for computational linguistics, Association for Computational Linguistics, pp 311–318

Parikh A (2009) Part-of-speech tagging using neural network. In: Proceedings of ICON

Patel P, Popat K, Bhattacharyya P (2010) Hybrid stemmer for gujarati. In: Proceedings of the 1st workshop on South and Southeast Asian Natural Language Processing, pp 51–55

Patil N, Patil AS, Pawar B (2016) Survey of named entity recognition systems with respect to indian and foreign languages. Int J Comput Appl 134(16):88

Paul S, Tandon M, Joshi N, Mathur I (2013) Design of a rule based hindi lemmatizer. In: Proceedings of Third international workshop on artificial intelligence, soft computing and applications, Chennai, India, Citeseer, pp 67–74

Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. arXiv preprint arXiv:1802.05365

Poornima C, Dhanalakshmi V, Anand K, Soman K (2011) Rule based sentence simplification for english to tamil machine translation system. Int J Comput Appl 25(8):38–42

Prabhu A, Joshi A, Shrivastava M, Varma V (2016) Towards sub-word level compositions for sentiment analysis of hindi-english code mixed text. arXiv preprint arXiv:1611.00472

Prathibha R, Padma M (2015) Design of rule based lemmatizer for kannada inflectional words. In: 2015 international conference on emerging research in electronics, computer science and technology (ICERECT), IEEE, pp 264–269

PVS A, Karthik G (2007) Part-of-speech tagging and chunking using conditional random fields and transformation based learning. Shallow Parsing for South Asian Languages 21

Rajan K, Ramalingam V, Ganesan M, Palanivel S, Palaniappan B (2009) Automatic classification of tamil documents using vector space model and artificial neural network. Expert Syst Appl 36(8):10914–10918

Ramachandran VA, Krishnamurthi I (2012) An iterative stemmer for tamil language. In: Asian conference on intelligent information and database systems, Springer, pp 197–205

Ramanathan A, Rao DD (2003) A lightweight stemmer for hindi. In: The proceedings of EACL

Ramanathan A, Hegde J, Shah RM, Bhattacharyya P, Sasikumar M (2008) Simple syntactic and morphological processing can help english-hindi statistical machine translation. In: Proceedings of the third international joint conference on natural language processing, Vol I

Ranjan P, Basu HVSSA (2003) Part of speech tagging and local word grouping techniques for natural language parsing in hindi. In: Proceedings of the 1st international conference on natural language processing (ICON 2003), Citeseer

Ravi K, Ravi V (2016) Sentiment classification of hinglish text. In: 2016 3rd international conference on recent advances in information technology (RAIT), IEEE, pp 641–645

Reddy MV, Hanumanthappa M (2013) Indic language machine translation tool: english to kannada/telugu. In: Multimedia processing, communication and computing applications, Springer, pp 35–49

Revanuru K, Turlapaty K, Rao S (2017) Neural machine translation of indian languages. In: Proceedings of the 10th annual ACM India compute conference, ACM, pp 11–20

Rohini V, Thomas M, Latha C (2016) Domain based sentiment analysis in regional language-kannada using machine learning algorithm. In: 2016 IEEE international conference on recent trends in electronics, information and communication technology (RTEICT), IEEE, pp 503–507

Saharia N, Sharma U, Kalita J (2012) Analysis and evaluation of stemming algorithms: a case study with assamese. In: Proceedings of the international conference on advances in computing, communications and informatics, ACM, pp 842–846

Saini S, Sahula V (2015) A survey of machine translation techniques and systems for indian languages. In: 2015 IEEE international conference on computational intelligence and communication technology, IEEE, pp 676–681

Se S, Vinayakumar R, Kumar MA, Soman K (2015) Amrita-cen@ sail2015: sentiment analysis in indian languages. In: International conference on mining intelligence and knowledge exploration, Springer, pp 703–710

Selvam M, Natarajan A (2009) Improvement of rule based morphological analysis and pos tagging in tamil language via projection and induction techniques. Int J Comput 3(4):357–367

Shalini K, Ravikumar A, Vineetha R, Aravinda RD, Anand KM, Soman K (2018) Sentiment analysis of indian languages using convolutional neural networks. In: 2018 international conference on computer communication and informatics (ICCCI), IEEE, pp 1–4

Shrivastava M, Bhattacharyya P (2008) Hindi pos tagger using naive stemming: harnessing morphological information without extensive linguistic knowledge. In: International conference on NLP (ICON08), Pune, India

Singh S, Gupta K, Shrivastava M, Bhattacharyya P (2006) Morphological richness offsets resource demand-experiences in constructing a pos tagger for hindi. In: Proceedings of the COLING/ACL on main conference poster sessions, Association for Computational Linguistics, pp 779–786

Somers H (1999) Example-based machine translation. Mach Transl 14(2):113–157

Suba K, Jiandani D, Bhattacharyya P (2011) Hybrid inflectional stemmer and rule-based derivational stemmer for gujarati. In: Proceedings of the 2nd workshop on South Southeast Asian natural language processing (WSSANLP), pp 1–8

Todi KK, Mishra P, Sharma DM (2018) Building a kannada pos tagger using machine learning and neural network models. arXiv preprint arXiv:1808.03175

Tummalapalli M, Mamidi R (2018) Syllables for sentence classification in morphologically rich languages. In: Proceedings of the 32nd Pacific Asia conference on language, information and computation

Unnikrishnan P, Antony P, Soman K (2010) A novel approach for english to south dravidian language statistical machine translation system. Int J Comput Sci Eng 2(08):2749–2759

Vijayakrishna R, Sobha L (2008) Domain focused named entity recognizer for tamil using conditional random fields. In: Proceedings of the IJCNLP-08 workshop on named entity recognition for South and South East Asian Languages

Webster JJ, Kit C (1992) Tokenization as the initial phase in nlp. In: COLING 1992 Volume 4: The 15th international conference on computational linguistics

Wu Q, Teney D, Wang P, Shen C, Dick A, van den Hengel A (2017) Visual question answering: A survey of methods and datasets. Comput Vis Image Underst 163:21–40

Acknowledgements

This work is supported by the Vision Group on Science and Technology (VGST), Department of IT, BT and Science and Technology, Government of Karnataka, India. [File No.: VGST/2019-20/GRD No.:850/397]

Author information

Authors and affiliations.

Department of Information Science and Engineering, JSS Science and Technology University, Mysuru, Karnataka State, India

B. S. Harish & R. Kasturi Rangan

Corresponding author

Correspondence to B. S. Harish.

Ethics declarations

Conflict of interest.

The authors declare that they have no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Harish, B.S., Rangan, R.K. A comprehensive survey on Indian regional language processing. SN Appl. Sci. 2, 1204 (2020). https://doi.org/10.1007/s42452-020-2983-x

Received: 24 December 2019

Accepted: 29 May 2020

Published: 12 June 2020

DOI: https://doi.org/10.1007/s42452-020-2983-x

  • Language processing
  • Machine translation
  • Named entity recognition
  • POS tagging

  • Top Colleges in India
  • Top Universities in Uttar Pradesh 2024
  • Top Universities in Bihar
  • Top Universities in Madhya Pradesh 2024
  • Top Universities in Tamil Nadu 2024
  • Central Universities in India
  • CUET PG Admit Card 2024
  • IGNOU Date Sheet
  • CUET Mock Test 2024
  • CUET Application Form 2024
  • CUET PG Syllabus 2024
  • CUET Participating Universities 2024
  • CUET Previous Year Question Paper
  • CUET Syllabus 2024 for Science Students
  • E-Books and Sample Papers
  • CUET Exam Pattern 2024
  • CUET Exam Date 2024
  • CUET Syllabus 2024
  • IGNOU Exam Form 2024
  • IGNOU Result
  • CUET PG Courses 2024

Engineering Preparation

  • Knockout JEE Main 2024
  • Test Series JEE Main 2024
  • JEE Main 2024 Rank Booster

Medical Preparation

  • Knockout NEET 2024
  • Test Series NEET 2024
  • Rank Booster NEET 2024

Online Courses

  • JEE Main One Month Course
  • NEET One Month Course
  • IBSAT Free Mock Tests
  • IIT JEE Foundation Course
  • Knockout BITSAT 2024
  • Career Guidance Tool

Top Streams

  • IT & Software Certification Courses
  • Engineering and Architecture Certification Courses
  • Programming And Development Certification Courses
  • Business and Management Certification Courses
  • Marketing Certification Courses
  • Health and Fitness Certification Courses
  • Design Certification Courses

Specializations

  • Digital Marketing Certification Courses
  • Cyber Security Certification Courses
  • Artificial Intelligence Certification Courses
  • Business Analytics Certification Courses
  • Data Science Certification Courses
  • Cloud Computing Certification Courses
  • Machine Learning Certification Courses
  • View All Certification Courses
  • UG Degree Courses
  • PG Degree Courses
  • Short Term Courses
  • Free Courses
  • Online Degrees and Diplomas
  • Compare Courses

Top Providers

  • Coursera Courses
  • Udemy Courses
  • Edx Courses
  • Swayam Courses
  • upGrad Courses
  • Simplilearn Courses
  • Great Learning Courses

Access premium articles, webinars, resources to make the best decisions for career, course, exams, scholarships, study abroad and much more with

Plan, Prepare & Make the Best Career Choices

Indian Culture Essay

India is renowned throughout the world for its tradition and culture. It is a country with many different cultures and traditions, and one of the world's oldest civilisations arose here. Good manners, etiquette, civilised dialogue, customs, beliefs, and values are essential elements of Indian culture. India is a special country because its citizens, who come from many cultures and traditions, are able to live together in harmony. Here are a few sample essays on ‘Indian culture’.

100 Words Essay on Indian Culture

India's culture is among the oldest in the world, dating back roughly 5,000 years, and is often regarded as one of the world's earliest and greatest. The phrase "Unity in Diversity" describes India as a diverse nation where people of many religions coexist while maintaining their distinct customs. People of different religions have different languages, culinary customs, ceremonies, and so on, and yet they all live in harmony.

Hindi is the official language of the Union of India, with English also used for official purposes. The Constitution recognises 22 scheduled languages, and several hundred other languages are regularly spoken across India's states and territories. India is also the country where religions like Hinduism and Buddhism first emerged.

200 Words Essay on Indian Culture

India is a land of diverse cultures, religions, languages, and traditions. The rich cultural heritage of India is a result of its long history and the various invasions and settlements that have occurred in the country. Indian culture is a melting pot of various customs and traditions, which have been passed down from generation to generation.

Religion

Religion plays a significant role in Indian culture. The major religions practiced in India are Hinduism, Islam, Buddhism, Sikhism, and Jainism. Each religion has its own set of beliefs, customs, and practices. Hinduism, the oldest religion in India, is the dominant religion and has a vast array of gods and goddesses. Islam, Buddhism, Sikhism, and Jainism are also widely practiced and have a significant number of followers in the country.

Food

Indian cuisine is known for its diverse range of flavors, its use of spices and herbs, and its variety of cooking techniques. Each region in India has its own unique style of cooking and distinct dishes. Some of the most famous Indian dishes include biryani, curry, tandoori chicken, and dal makhani. Indian cuisine is also famous for its street food, which is a popular and affordable way to experience the range of flavors that Indian food has to offer.

500 Words Essay on Indian Culture

Indian culture is known for its rich art and architecture. The ancient Indus Valley Civilization, which flourished around 2500 BCE, had a sophisticated system of town planning and impressive architectural structures. Indian art is diverse and includes painting, sculpture, and architecture. Among its most famous examples are the cave paintings of Ajanta, the earliest of which date back to about the 2nd century BCE, and the rock-cut cave temples of Ellora. Indian architecture is also famous for its temples, palaces, and forts, which reflect the rich cultural heritage of the country.

Music and dance are an integral part of Indian culture. Indian music is diverse and ranges from classical to folk to modern. The classical music of India is known for its use of ragas, melodic frameworks built from a set of notes and used to create a melody. The traditional Indian dance forms include Kathak, Bharatanatyam, and Kathakali. These dance forms are known for their elaborate costumes, expressive gestures, and intricate footwork.

My Experience

I had always been fascinated by the rich culture and history of India. So, when I finally got the opportunity to visit the country, I was beyond excited. I had heard so much about the diverse customs and traditions of India, and I couldn't wait to experience them firsthand. The moment I stepped off the plane and into the city, I was greeted by the overwhelming smell of spices and the hustle and bustle of the streets. I knew right away that I was in for an unforgettable journey.

My first stop was the ancient city of Varanasi, also known as Banaras. As I walked through the streets, I was struck by the vibrant colors and the sound of temple bells and chants. I visited the famous Kashi Vishwanath Temple and was amazed by the intricate architecture and the devotion of the worshippers.

From Varanasi, I traveled to Jaipur, also known as the Pink City. Here, I visited the famous Amber Fort, whose construction began in the late 16th century. The fort was a perfect example of the rich architecture of India and the level of craftsmanship that existed at the time.

As I continued my journey, I also had the opportunity to experience the food of India. From the spicy curries of the south to the tandoori dishes of the north, I was blown away by the range of flavors and the use of spices.

I also had the chance to experience the music and dance of India. I attended a Kathak dance performance and was mesmerized by the intricate footwork and the expressiveness of the dancers. I also had the opportunity to attend a classical music concert and was struck by the beauty of the ragas and the skill of the musicians.

My journey through India was truly an unforgettable experience. I had the chance to experience the diverse customs and traditions of India and was struck by the richness of the culture. From the ancient temples to the vibrant street markets, India is a treasure trove of history and culture. I knew that this would not be my last trip to India, as there is so much more to explore and experience.

Explore Career Options (By Industry)

  • Construction
  • Entertainment
  • Manufacturing
  • Information Technology

Data Administrator

Database professionals use software to store and organise data such as financial information, and customer shipping records. Individuals who opt for a career as data administrators ensure that data is available for users and secured from unauthorised sales. DB administrators may work in various types of industries. It may involve computer systems design, service firms, insurance companies, banks and hospitals.

Bio Medical Engineer

The field of biomedical engineering opens up a universe of expert chances. An Individual in the biomedical engineering career path work in the field of engineering as well as medicine, in order to find out solutions to common problems of the two fields. The biomedical engineering job opportunities are to collaborate with doctors and researchers to develop medical systems, equipment, or devices that can solve clinical problems. Here we will be discussing jobs after biomedical engineering, how to get a job in biomedical engineering, biomedical engineering scope, and salary. 

Ethical Hacker

A career as ethical hacker involves various challenges and provides lucrative opportunities in the digital era where every giant business and startup owns its cyberspace on the world wide web. Individuals in the ethical hacker career path try to find the vulnerabilities in the cyber system to get its authority. If he or she succeeds in it then he or she gets its illegal authority. Individuals in the ethical hacker career path then steal information or delete the file that could affect the business, functioning, or services of the organization.

GIS officer work on various GIS software to conduct a study and gather spatial and non-spatial information. GIS experts update the GIS data and maintain it. The databases include aerial or satellite imagery, latitudinal and longitudinal coordinates, and manually digitized images of maps. In a career as GIS expert, one is responsible for creating online and mobile maps.

Data Analyst

The invention of the database has given fresh breath to the people involved in the data analytics career path. Analysis refers to splitting up a whole into its individual components for individual analysis. Data analysis is a method through which raw data are processed and transformed into information that would be beneficial for user strategic thinking.

Data are collected and examined to respond to questions, evaluate hypotheses or contradict theories. It is a tool for analyzing, transforming, modeling, and arranging data with useful knowledge, to assist in decision-making and methods, encompassing various strategies, and is used in different fields of business, research, and social science.

Geothermal Engineer

Individuals who opt for a career as geothermal engineers are the professionals involved in the processing of geothermal energy. The responsibilities of geothermal engineers may vary depending on the workplace location. Those who work in fields design facilities to process and distribute geothermal energy. They oversee the functioning of machinery used in the field.

Database Architect

If you are intrigued by the programming world and are interested in developing communications networks then a career as database architect may be a good option for you. Data architect roles and responsibilities include building design models for data communication networks. Wide Area Networks (WANs), local area networks (LANs), and intranets are included in the database networks. It is expected that database architects will have in-depth knowledge of a company's business to develop a network to fulfil the requirements of the organisation. Stay tuned as we look at the larger picture and give you more information on what is db architecture, why you should pursue database architecture, what to expect from such a degree and what your job opportunities will be after graduation. Here, we will be discussing how to become a data architect. Students can visit NIT Trichy , IIT Kharagpur , JMI New Delhi . 

Remote Sensing Technician

Individuals who opt for a career as a remote sensing technician possess unique personalities. Remote sensing analysts seem to be rational human beings, they are strong, independent, persistent, sincere, realistic and resourceful. Some of them are analytical as well, which means they are intelligent, introspective and inquisitive. 

Remote sensing scientists use remote sensing technology to support scientists in fields such as community planning, flight planning or the management of natural resources. Analysing data collected from aircraft, satellites or ground-based platforms using statistical analysis software, image analysis software or Geographic Information Systems (GIS) is a significant part of their work. Do you want to learn how to become remote sensing technician? There's no need to be concerned; we've devised a simple remote sensing technician career path for you. Scroll through the pages and read.

Budget Analyst

Budget analysis, in a nutshell, entails thoroughly analyzing the details of a financial budget. The budget analysis aims to better understand and manage revenue. Budget analysts assist in the achievement of financial targets, the preservation of profitability, and the pursuit of long-term growth for a business. Budget analysts generally have a bachelor's degree in accounting, finance, economics, or a closely related field. Knowledge of Financial Management is of prime importance in this career.

Underwriter

An underwriter is a person who assesses and evaluates the risk of insurance in his or her field like mortgage, loan, health policy, investment, and so on and so forth. The underwriter career path does involve risks as analysing the risks means finding out if there is a way for the insurance underwriter jobs to recover the money from its clients. If the risk turns out to be too much for the company then in the future it is an underwriter who will be held accountable for it. Therefore, one must carry out his or her job with a lot of attention and diligence.

Finance Executive

Product manager.

A Product Manager is a professional responsible for product planning and marketing. He or she manages the product throughout the Product Life Cycle, gathering and prioritising the product. A product manager job description includes defining the product vision and working closely with team members of other departments to deliver winning products.  

Operations Manager

Individuals in the operations manager jobs are responsible for ensuring the efficiency of each department to acquire its optimal goal. They plan the use of resources and distribution of materials. The operations manager's job description includes managing budgets, negotiating contracts, and performing administrative tasks.

Stock Analyst

Individuals who opt for a career as a stock analyst examine the company's investments makes decisions and keep track of financial securities. The nature of such investments will differ from one business to the next. Individuals in the stock analyst career use data mining to forecast a company's profits and revenues, advise clients on whether to buy or sell, participate in seminars, and discussing financial matters with executives and evaluate annual reports.

A Researcher is a professional who is responsible for collecting data and information by reviewing the literature and conducting experiments and surveys. He or she uses various methodological processes to provide accurate data and information that is utilised by academicians and other industry professionals. Here, we will discuss what is a researcher, the researcher's salary, types of researchers.

Welding Engineer

Welding Engineer Job Description: A Welding Engineer work involves managing welding projects and supervising welding teams. He or she is responsible for reviewing welding procedures, processes and documentation. A career as Welding Engineer involves conducting failure analyses and causes on welding issues. 

Transportation Planner

A career as Transportation Planner requires technical application of science and technology in engineering, particularly the concepts, equipment and technologies involved in the production of products and services. In fields like land use, infrastructure review, ecological standards and street design, he or she considers issues of health, environment and performance. A Transportation Planner assigns resources for implementing and designing programmes. He or she is responsible for assessing needs, preparing plans and forecasts and compliance with regulations.

Environmental Engineer

Individuals who opt for a career as an environmental engineer are construction professionals who utilise the skills and knowledge of biology, soil science, chemistry and the concept of engineering to design and develop projects that serve as solutions to various environmental problems. 

Safety Manager

A Safety Manager is a professional responsible for employee’s safety at work. He or she plans, implements and oversees the company’s employee safety. A Safety Manager ensures compliance and adherence to Occupational Health and Safety (OHS) guidelines.

Conservation Architect

A Conservation Architect is a professional responsible for conserving and restoring buildings or monuments having a historic value. He or she applies techniques to document and stabilise the object’s state without any further damage. A Conservation Architect restores the monuments and heritage buildings to bring them back to their original state.

Structural Engineer

A Structural Engineer designs buildings, bridges, and other related structures. He or she analyzes the structures and makes sure the structures are strong enough to be used by the people. A career as a Structural Engineer requires working in the construction process. It comes under the civil engineering discipline. A Structure Engineer creates structural models with the help of computer-aided design software. 

Highway Engineer

Highway Engineer Job Description:  A Highway Engineer is a civil engineer who specialises in planning and building thousands of miles of roads that support connectivity and allow transportation across the country. He or she ensures that traffic management schemes are effectively planned concerning economic sustainability and successful implementation.

Field Surveyor

Are you searching for a Field Surveyor Job Description? A Field Surveyor is a professional responsible for conducting field surveys for various places or geographical conditions. He or she collects the required data and information as per the instructions given by senior officials. 

Orthotist and Prosthetist

Orthotists and Prosthetists are professionals who provide aid to patients with disabilities. They fix them to artificial limbs (prosthetics) and help them to regain stability. There are times when people lose their limbs in an accident. In some other occasions, they are born without a limb or orthopaedic impairment. Orthotists and prosthetists play a crucial role in their lives with fixing them to assistive devices and provide mobility.

Pathologist

A career in pathology in India is filled with several responsibilities as it is a medical branch and affects human lives. The demand for pathologists has been increasing over the past few years as people are getting more aware of different diseases. Not only that, but an increase in population and lifestyle changes have also contributed to the increase in a pathologist’s demand. The pathology careers provide an extremely huge number of opportunities and if you want to be a part of the medical field you can consider being a pathologist. If you want to know more about a career in pathology in India then continue reading this article.

Veterinary Doctor

Speech therapist, gynaecologist.

Gynaecology can be defined as the study of the female body. The job outlook for gynaecology is excellent since there is evergreen demand for one because of their responsibility of dealing with not only women’s health but also fertility and pregnancy issues. Although most women prefer to have a women obstetrician gynaecologist as their doctor, men also explore a career as a gynaecologist and there are ample amounts of male doctors in the field who are gynaecologists and aid women during delivery and childbirth. 

Audiologist

The audiologist career involves audiology professionals who are responsible to treat hearing loss and proactively preventing the relevant damage. Individuals who opt for a career as an audiologist use various testing strategies with the aim to determine if someone has a normal sensitivity to sounds or not. After the identification of hearing loss, a hearing doctor is required to determine which sections of the hearing are affected, to what extent they are affected, and where the wound causing the hearing loss is found. As soon as the hearing loss is identified, the patients are provided with recommendations for interventions and rehabilitation such as hearing aids, cochlear implants, and appropriate medical referrals. While audiology is a branch of science that studies and researches hearing, balance, and related disorders.

An oncologist is a specialised doctor responsible for providing medical care to patients diagnosed with cancer. He or she uses several therapies to control the cancer and its effect on the human body such as chemotherapy, immunotherapy, radiation therapy and biopsy. An oncologist designs a treatment plan based on a pathology report after diagnosing the type of cancer and where it is spreading inside the body.

Are you searching for an ‘Anatomist job description’? An Anatomist is a research professional who applies the laws of biological science to determine the ability of bodies of various living organisms including animals and humans to regenerate the damaged or destroyed organs. If you want to know what does an anatomist do, then read the entire article, where we will answer all your questions.

For an individual who opts for a career as an actor, the primary responsibility is to completely speak to the character he or she is playing and to persuade the crowd that the character is genuine by connecting with them and bringing them into the story. This applies to significant roles and littler parts, as all roles join to make an effective creation. Here in this article, we will discuss how to become an actor in India, actor exams, actor salary in India, and actor jobs. 

Individuals who opt for a career as acrobats create and direct original routines for themselves, in addition to developing interpretations of existing routines. The work of circus acrobats can be seen in a variety of performance settings, including circus, reality shows, sports events like the Olympics, movies and commercials. Individuals who opt for a career as acrobats must be prepared to face rejections and intermittent periods of work. The creativity of acrobats may extend to other aspects of the performance. For example, acrobats in the circus may work with gym trainers, celebrities or collaborate with other professionals to enhance such performance elements as costume and or maybe at the teaching end of the career.

Video Game Designer

Career as a video game designer is filled with excitement as well as responsibilities. A video game designer is someone who is involved in the process of creating a game from day one. He or she is responsible for fulfilling duties like designing the character of the game, the several levels involved, plot, art and similar other elements. Individuals who opt for a career as a video game designer may also write the codes for the game using different programming languages.

Depending on the video game designer job description and experience they may also have to lead a team and do the early testing of the game in order to suggest changes and find loopholes.

Radio Jockey

Radio Jockey is an exciting, promising career and a great challenge for music lovers. If you are really interested in a career as radio jockey, then it is very important for an RJ to have an automatic, fun, and friendly personality. If you want to get a job done in this field, a strong command of the language and a good voice are always good things. Apart from this, in order to be a good radio jockey, you will also listen to good radio jockeys so that you can understand their style and later make your own by practicing.

A career as radio jockey has a lot to offer to deserving candidates. If you want to know more about a career as radio jockey, and how to become a radio jockey then continue reading the article.

Choreographer

The word “choreography" actually comes from Greek words that mean “dance writing." Individuals who opt for a career as a choreographer create and direct original dances, in addition to developing interpretations of existing dances. A Choreographer dances and utilises his or her creativity in other aspects of dance performance. For example, he or she may work with the music director to select music or collaborate with other famous choreographers to enhance such performance elements as lighting, costume and set design.

Social Media Manager

A career as social media manager involves implementing the company’s or brand’s marketing plan across all social media channels. Social media managers help in building or improving a brand’s or a company’s website traffic, build brand awareness, create and implement marketing and brand strategy. Social media managers are key to important social communication as well.

Photographer

Photography is considered both a science and an art, an artistic means of expression in which the camera replaces the pen. In a career as a photographer, an individual is hired to capture the moments of public and private events, such as press conferences or weddings, or may also work inside a studio, where people go to get their picture clicked. Photography is divided into many streams each generating numerous career opportunities in photography. With the boom in advertising, media, and the fashion industry, photography has emerged as a lucrative and thrilling career option for many Indian youths.

An individual who is pursuing a career as a producer is responsible for managing the business aspects of production. They are involved in each aspect of production from its inception to deception. Famous movie producers review the script, recommend changes and visualise the story. 

They are responsible for overseeing the finance involved in the project and distributing the film for broadcasting on various platforms. A career as a producer is quite fulfilling as well as exhaustive in terms of playing different roles in order for a production to be successful. Famous movie producers are responsible for hiring creative and technical personnel on contract basis.

Copy Writer

In a career as a copywriter, one has to consult with the client and understand the brief well. A career as a copywriter has a lot to offer to deserving candidates. Several new mediums of advertising are opening therefore making it a lucrative career choice. Students can pursue various copywriter courses such as Journalism , Advertising , Marketing Management . Here, we have discussed how to become a freelance copywriter, copywriter career path, how to become a copywriter in India, and copywriting career outlook. 

In a career as a vlogger, one generally works for himself or herself. However, once an individual has gained viewership there are several brands and companies that approach them for paid collaboration. It is one of those fields where an individual can earn well while following his or her passion. 

Ever since internet costs got reduced the viewership for these types of content has increased on a large scale. Therefore, a career as a vlogger has a lot to offer. If you want to know more about the Vlogger eligibility, roles and responsibilities then continue reading the article. 

For publishing books, newspapers, magazines and digital material, editorial and commercial strategies are set by publishers. Individuals in publishing career paths make choices about the markets their businesses will reach and the type of content that their audience will be served. Individuals in book publisher careers collaborate with editorial staff, designers, authors, and freelance contributors who develop and manage the creation of content.

Careers in journalism are filled with excitement as well as responsibilities. One cannot afford to miss out on the details. As it is the small details that provide insights into a story. Depending on those insights a journalist goes about writing a news article. A journalism career can be stressful at times but if you are someone who is passionate about it then it is the right choice for you. If you want to know more about the media field and journalist career then continue reading this article.

Individuals in the editor career path is an unsung hero of the news industry who polishes the language of the news stories provided by stringers, reporters, copywriters and content writers and also news agencies. Individuals who opt for a career as an editor make it more persuasive, concise and clear for readers. In this article, we will discuss the details of the editor's career path such as how to become an editor in India, editor salary in India and editor skills and qualities.

Individuals who opt for a career as a reporter may often be at work on national holidays and festivities. He or she pitches various story ideas and covers news stories in risky situations. Students can pursue a BMC (Bachelor of Mass Communication) , B.M.M. (Bachelor of Mass Media) , or  MAJMC (MA in Journalism and Mass Communication) to become a reporter. While we sit at home reporters travel to locations to collect information that carries a news value.  

Corporate Executive

Are you searching for a Corporate Executive job description? A Corporate Executive role comes with administrative duties. He or she provides support to the leadership of the organisation. A Corporate Executive fulfils the business purpose and ensures its financial stability. In this article, we are going to discuss how to become corporate executive.

Multimedia Specialist

A multimedia specialist is a media professional who creates audio, videos, graphic image files and computer animations for multimedia applications. He or she is responsible for planning, producing, and maintaining websites and applications.

Quality Controller

A quality controller plays a crucial role in an organisation. He or she is responsible for performing quality checks on manufactured products. He or she identifies the defects in a product and rejects the product. 

A quality controller records detailed information about products with defects and sends it to the supervisor or plant manager to take necessary actions to improve the production process.

Production Manager

A QA Lead is in charge of the QA Team. The role of QA Lead comes with the responsibility of assessing services and products in order to determine that they meet the quality standards. He or she develops, implements and manages test plans.

Process Development Engineer

Process Development Engineers design and implement manufacturing, mining, and other production systems using technical knowledge and expertise in the industry. They use computer modeling software to test technologies and machinery. An individual who opts for a career as a Process Development Engineer is responsible for developing cost-effective and efficient processes. They also monitor the production process and ensure it functions smoothly and efficiently.

AWS Solution Architect

An AWS Solution Architect is someone who specializes in developing and implementing cloud computing systems. He or she has a good understanding of the various aspects of cloud computing and can confidently deploy and manage their systems. He or she troubleshoots the issues and evaluates the risk from the third party. 

Azure Administrator

An Azure Administrator is a professional responsible for implementing, monitoring, and maintaining Azure Solutions. He or she manages cloud infrastructure service instances and various cloud servers as well as sets up public and private cloud systems. 

Computer Programmer

Careers in computer programming primarily refer to the systematic act of writing code and moreover include wider computer science areas. The word 'programmer' or 'coder' has entered into practice with the growing number of newly self-taught tech enthusiasts. Computer programming careers involve the use of designs created by software developers and engineers and transforming them into commands that can be implemented by computers. These commands result in regular usage of social media sites, word-processing applications and browsers.

Information Security Manager

Individuals in the information security manager career path oversee and control all aspects of computer security. The IT security manager job description includes planning and carrying out security measures to protect business data and information from corruption, theft, unauthorised access, and deliberate attack.

ITSM Manager

Automation Test Engineer

An Automation Test Engineer job involves executing automated test scripts. He or she identifies the project’s problems and troubleshoots them. The role involves documenting the defect using management tools. He or she works with the application team in order to resolve any issues arising during the testing process. 


Essay on India

India is the largest democratic country. It is a big country divided into 29 states and 7 union territories. These states and union territories have been created so that the government can run the country more easily. India also has many different kinds of physical features in different parts of the country that are spread over its states and union territories. India is a very diverse country as well, which means that the people around the country are different in many ways. Even though India is such a diverse place, it is united as one country. 

Political Divisions

India is the seventh-largest country and has the second-largest population in the world. Here is the map of India showing 29 states and 7 union territories. These political divisions are made so that the government can run the country more easily. Though we live in different states, everyone is an Indian first.

[Image will be uploaded soon]

Physical Features

The Indian subcontinent has many different physical features shared with its neighbours that are also in the subcontinent – Pakistan, Nepal, Bhutan and Bangladesh. The physical features of India form six different natural regions. 

The Northern Mountains

The Northern Plains

The Great Indian Desert

The Southern Plateau

The Coastal Plains

The Island Regions

The Northern Mountains: These are the Himalayas, the highest mountain range in the world. They form a natural boundary between India and a large part of Asia. Two neighbouring countries, Nepal and Bhutan are situated in these mountains. 

The Northern Plains: They are located to the south of the Himalayas. They extend into Pakistan in the west. Bangladesh is situated on the eastern part of the plains. 

The Great Indian Desert: The western part of India is a desert with less rainfall. This desert is called the Thar Desert. 

The Southern Plateau: This plateau region lies to the south of the Great Northern Plains and is called the Deccan Plateau. The Vindhya and Satpura ranges in the north, the Western Ghats and the Eastern Ghats surround the Deccan Plateau. 

The Coastal Plains: The Eastern coastal plain lies between the Bay of Bengal and the Eastern Ghats. The western coastal plain lies between the Arabian Sea and the Western Ghats.

The Island Regions: The island regions of India are two archipelagos on either side of peninsular India. The Lakshadweep Islands are in the Arabian Sea and the Andaman and Nicobar Islands are in the Bay of Bengal.

The Rivers of India

The Indian subcontinent has many rivers. Some important rivers are the Indus, Ganga, Yamuna, Brahmaputra, Sutlej, the Narmada and Tapi rivers. 

These physical features and rivers link the people of India.

National Symbols

The National Flag of India is a tricolour of deep saffron at the top, white in the middle and dark green at the bottom, in equal proportions. The saffron stands for courage, sacrifice and the spirit of renunciation, the white for purity and truth, and the green for faith and fertility. In the centre of the white band is a navy-blue wheel, the Ashoka Chakra, modelled on the wheel of law on the Sarnath Lion Capital.

The National Emblem of India is a replica of the Lion of Sarnath and symbolizes India’s reaffirmation of its ancient commitment to world peace and goodwill. 

The National Anthem of India is Jana Gana Mana and the National song is Vande Mataram.

The National Animal of India is the tiger, which symbolizes grace, strength and power.

The National Bird of India is the peacock, which symbolizes beauty, majesty and pride.

The National Flower of India is the lotus, which symbolizes purity, wealth, richness, knowledge and serenity.

The National Tree of India is the banyan tree. Because of its characteristics and longevity, the tree is considered immortal and sacred, and it is an integral part of the myths and legends of India.

The National Fruit is the mango, one of the most widely cultivated fruits of the tropical world.

Indian food is diverse. The geography of a region influences the food that people eat. The staple food of people is what grows in their regions. In North India, the staple food is wheat. In East and South India, the staple food is rice. In West India, the staple food is millet. Dals are eaten in almost the entire country and prepared in different ways.

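Indians speak many different languages. The Eighth Schedule of the Constitution of India lists 22 languages. However, India has around 800 languages. Hindi, written in the Devanagari script, is the official language of the Union, and English is also used for official purposes.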

India is a country of many different religions and each has different festivals. Some important festivals are Baisakhi, Diwali, Eid, Ganesh Chaturthi, Dussehra and Christmas. 

Unity in Diversity

The people of India, their foods, festivals and languages – all these make India a very diverse country. However, there are also things that unite the people of India:

The National symbols like the Indian flag and the National Anthem.

The Constitution of India, which was written in the early years of our Independence. It unites the Indians because it has rules and laws that are the same for all people. 

The Constitution says that all Indians are equal in the eyes of the law.

All Indians who are over the age of 18 and have registered as voters can vote in elections.


FAQs on India Essay

Q1. Describe the National Flag of India.

Ans. The National Flag of India is a tricolour of deep saffron at the top, white in the middle and dark green at the bottom, in equal proportions. The saffron stands for courage, sacrifice and the spirit of renunciation, the white for purity and truth, and the green for faith and fertility. In the centre of the white band is a navy-blue wheel, the Ashoka Chakra, modelled on the wheel of law on the Sarnath Lion Capital.

Q2. What is the population of India?

Ans. The population of India is 1 billion 325 million. India has the second-largest population in the world.

Q3. What are the important Festivals Celebrated in India?

Ans. Some of the important festivals celebrated in India are Diwali, Dussehra, Eid and Christmas.

Q4. Why is India called the largest Democratic Country?

Ans. India is the largest democratic country because the citizens of India have the right to elect their representatives who form and run the government.

Essay on India For Students and Children

500+ Words Essay on India

India is a great country where people speak many different languages. It has no single national language; Hindi and English are the official languages of the Union government. India is full of different castes, creeds, religions, and cultures, but its people live together. That is why India is famous for the saying “unity in diversity”. India is the seventh-largest country in the world.

Geography and Culture

India has the second-largest population in the world. India is also known as Bharat, Hindustan and sometimes Aryavart. It is surrounded by water on three sides: the Bay of Bengal in the east, the Arabian Sea in the west and the Indian Ocean in the south. The tiger is the national animal of India. The peacock is the national bird of India. The mango is the national fruit of India. “Jana Gana Mana” is the national anthem of India. “Vande Mataram” is the national song of India. Hockey is popularly regarded as the national sport, although no sport has been officially declared one. People of different religions such as Hinduism, Buddhism, Jainism, Sikhism, Islam, Christianity and Judaism have lived together since ancient times. India is also rich in monuments, tombs, churches, historical buildings, temples, museums, scenic beauty, wildlife sanctuaries, places of architecture and much more. Many great leaders and freedom fighters have come from India.

Flag of India

The Indian flag has three colors.

The uppermost color, saffron, stands for courage and sacrifice. The middle color, white, stands for peace and truth. The lowest color, green, stands for fertility. The white band carries a blue Ashoka Chakra with twenty-four equally spaced spokes. India has 29 states and 7 union territories.


My Favorite States from India are as follows –

Rajasthan

Rajasthan has a glorious history. It is famous for its many brave kings, their deeds, and their art and architecture. Its sandy desert terrain is one reason nuclear tests were held here. Rajasthan is full of deserts, mountain ranges, lakes, dense forests, attractive oases, and temples. It is also known as the “Land of Sacrifice”. In Rajasthan, you can see the heritage of all the kings who ruled there by visiting Udaipur, Jodhpur, Jaisalmer, Chittaurgarh, and other cities.

Madhya Pradesh

Madhya Pradesh is larger in area than Italy and slightly smaller than Oman. It attracts tourists to its many sights. In Madhya Pradesh, you can see temples, lakes, forts, art and architecture, rivers, jungles, and many other things. You can visit Indore, Jabalpur, Ujjain, Bhopal, Gwalior and many other cities. Khajuraho, Sanchi Stupa, Pachmarhi, Kanha National Park and Mandu are must-visit places.

Jammu and Kashmir

Jammu and Kashmir is known as heaven on earth and is often called a tourists' paradise. There are many reasons to visit: undisturbed landscapes, motorable roads, towns lying on the banks of the river Jhelum, harmony, romance, scenery, temples and much more.

In Jammu and Kashmir, you can enjoy boating, skiing, skating, mountaineering, horse riding, fishing, snowfall, etc. You can see a variety of places such as Srinagar, Vaishno Devi, Gulmarg, Amarnath, Patnitop, Pahalgam, Sonamarg, Lamayuru, Nubra Valley, Hemis, Sanasar, Anantnag, Kargil, Dachigam National Park, Pulwama, Khilanmarg, Dras, Baltal, Bhaderwah, Pangong Lake, Magnetic Hill, Tso Moriri, Khardung La, Aru Valley, Suru Basin, Chadar Trek, Zanskar Valley, Alchi Monastery, Darcha Padum Trek, Kishtwar National Park, Changthang Wildlife Sanctuary, Nyoma, Dha Hanu, Uleytokpo, Yusmarg, Tarsar Marsar Trek and many more.

Kerala

Known as ‘God’s Own Country’, Kerala is a state in the southwest of India. Bordered by beaches, covered by the hills of the Western Ghats and filled with backwaters, it is a tourist destination that attracts people with its natural beauty. The most important attractions in Kerala are its museums, sanctuaries, temples, backwaters, and beaches. Popular places to visit include Munnar, Kovalam, Kumarakom, and Alappad.

India is a great country with different cultures, castes, creeds and religions, yet its people live together. India is known for its heritage, its spices, and of course, the people who live here. That is why India is famous for the saying “unity in diversity”. India is also well known as a land of spirituality, philosophy, science, and technology.


What the World Has Learned From Past Eclipses

Clouds scudded over the small volcanic island of Principe, off the western coast of Africa, on the afternoon of May 29, 1919. Arthur Eddington, director of the Cambridge Observatory in the U.K., waited for the Sun to emerge. The remains of a morning thunderstorm could ruin everything.

The island was about to experience the rare and overwhelming sight of a total solar eclipse. For six minutes, the longest eclipse since 1416, the Moon would completely block the face of the Sun, pulling a curtain of darkness over a thin stripe of Earth. Eddington traveled into the eclipse path to try and prove one of the most consequential ideas of his age: Albert Einstein’s new theory of general relativity.

Eddington, a physicist, was one of the few people at the time who understood the theory, which Einstein proposed in 1915. But many other scientists were stymied by the bizarre idea that gravity is not a mutual attraction, but a warping of spacetime. Light itself would be subject to this warping, too. So an eclipse would be the best way to prove whether the theory was true, because with the Sun’s light blocked by the Moon, astronomers would be able to see whether the Sun’s gravity bent the light of distant stars behind it.

Two teams of astronomers boarded ships steaming from Liverpool, England, in March 1919 to watch the eclipse and take the measure of the stars. Eddington and his team went to Principe, and another team led by Frank Dyson of the Greenwich Observatory went to Sobral, Brazil.

Totality, the complete obscuration of the Sun, would be at 2:13 local time in Principe. Moments before the Moon slid in front of the Sun, the clouds finally began breaking up. For a moment, it was totally clear. Eddington and his group hastily captured images of the Hyades, a star cluster in the constellation of Taurus that lay near the Sun that day. The astronomers were using the best astronomical technology of the time, photographic plates, which are large exposures taken on glass instead of film. Stars appeared on seven of the plates, and solar “prominences,” filaments of gas streaming from the Sun, appeared on others.

Eddington wanted to stay in Principe to measure the Hyades when there was no eclipse, but a ship workers’ strike made him leave early. Later, Eddington and Dyson both compared the glass plates taken during the eclipse to other glass plates captured of the Hyades in a different part of the sky, when there was no eclipse. On the images from Eddington’s and Dyson’s expeditions, the stars were not aligned. The 40-year-old Einstein was right.

“Lights All Askew In the Heavens,” the New York Times proclaimed when the scientific papers were published. The eclipse was the key to the discovery—as so many solar eclipses before and since have illuminated new findings about our universe.

Telescope used to observe a total solar eclipse, Sobral, Brazil, 1919.

To understand why Eddington and Dyson traveled such distances to watch the eclipse, we need to talk about gravity.

Since at least the days of Isaac Newton, who published his law of universal gravitation in 1687, scientists thought gravity was a simple force of mutual attraction. Newton proposed that every object in the universe attracts every other object in the universe, and that the strength of this attraction is related to the masses of the objects and the distances between them. This is mostly true, actually, but it’s a little more nuanced than that.

On much larger scales, like among black holes or galaxy clusters, Newtonian gravity falls short. It also can’t accurately account for the movement of large objects that are close together, such as how the orbit of Mercury is affected by its proximity to the Sun.

Albert Einstein’s most consequential breakthrough solved these problems. General relativity holds that gravity is not really an invisible force of mutual attraction, but a distortion. Rather than some kind of mutual tug-of-war, large objects like the Sun and other stars respond relative to each other because the space they are in has been altered. Their mass is so great that they bend the fabric of space and time around themselves.


This was a weird concept, and many scientists thought Einstein’s ideas and equations were ridiculous. But others thought it sounded reasonable. Einstein and others knew that if the theory was correct, and the fabric of reality is bending around large objects, then light itself would have to follow that bend. The light of a star in the great distance, for instance, would seem to curve around a large object in front of it, nearer to us—like our Sun. But normally, it’s impossible to study stars behind the Sun to measure this effect. Enter an eclipse.

Einstein’s theory gives an equation for how much the Sun’s gravity would displace the images of background stars. Newton’s theory predicts only half that amount of displacement.
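As a rough illustration of that factor of two, the sketch below evaluates the standard textbook expressions for starlight grazing the edge of the Sun: 4GM/(c²b) for general relativity and 2GM/(c²b) for the Newtonian estimate. The constants and the function name `deflection_arcsec` are assumptions introduced here for illustration; they are not the figures used by the 1919 expeditions.

```python
import math

# Standard reference constants (assumed here, not taken from the article).
GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3/s^2
C = 299_792_458.0           # speed of light, m/s
R_SUN = 6.957e8             # nominal solar radius, m


def deflection_arcsec(impact_parameter_m: float, relativistic: bool = True) -> float:
    """Deflection angle of starlight passing the Sun, in arcseconds.

    impact_parameter_m is the closest distance of the light ray from the
    Sun's centre. General relativity uses a factor of 4, the Newtonian
    estimate a factor of 2.
    """
    factor = 4.0 if relativistic else 2.0
    angle_rad = factor * GM_SUN / (C ** 2 * impact_parameter_m)
    return math.degrees(angle_rad) * 3600.0


# Light that just grazes the solar limb.
einstein = deflection_arcsec(R_SUN)
newton = deflection_arcsec(R_SUN, relativistic=False)
print(f"Einstein: {einstein:.2f} arcsec, Newton: {newton:.2f} arcsec")
```

With these values the relativistic prediction comes out near 1.75 arcseconds, a shift small enough that it could only be measured by comparing photographic plates of the same stars taken with and without the Sun in front of them.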

Eddington and Dyson measured the Hyades cluster because it contains many stars; the more stars to distort, the better the comparison. Both teams of scientists encountered strange political and natural obstacles in making the discovery, which are chronicled beautifully in the book No Shadow of a Doubt: The 1919 Eclipse That Confirmed Einstein's Theory of Relativity, by the physicist Daniel Kennefick. But the confirmation of Einstein’s ideas was worth it. Eddington said as much in a letter to his mother: “The one good plate that I measured gave a result agreeing with Einstein,” he wrote, “and I think I have got a little confirmation from a second plate.”

The Eddington-Dyson experiments were hardly the first time scientists used eclipses to make profound new discoveries. The idea dates to the beginnings of human civilization.

Careful records of lunar and solar eclipses are one of the greatest legacies of ancient Babylon. Astronomers (or astrologers, really, but the goal was the same) were able to predict both lunar and solar eclipses with impressive accuracy. They worked out what we now call the Saros cycle, a period of 18 years, 11 days, and 8 hours after which eclipses repeat. One Saros cycle is equal to 223 synodic months, a synodic month being the time it takes the Moon to return to the same phase as seen from Earth. They also figured out, though they may not have understood it completely, the geometry that enables eclipses to happen.
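The Saros arithmetic can be checked in a few lines. This is a minimal sketch that assumes the modern mean length of the synodic month (about 29.530589 days) and assumes that the 18 calendar years in question contain four leap days; neither figure comes from the Babylonian records themselves.

```python
# Mean synodic month in days (modern value, assumed for illustration).
SYNODIC_MONTH_DAYS = 29.530588853
SAROS_MONTHS = 223

# Total length of one Saros cycle in days.
saros_days = SAROS_MONTHS * SYNODIC_MONTH_DAYS

# Assumption: the 18 calendar years spanned include 4 leap days.
days_in_18_years = 18 * 365 + 4

# What is left over after 18 years, split into whole days and hours.
leftover_days = saros_days - days_in_18_years
whole_days = int(leftover_days)
hours = (leftover_days - whole_days) * 24

print(f"One Saros cycle is about {saros_days:.2f} days")
print(f"That is 18 years, {whole_days} days and about {hours:.1f} hours")
```

Under these assumptions the remainder comes out at 11 days and a little under 8 hours, in line with the figure quoted above. If the 18 years happen to contain five leap days, the same span is counted as 18 years and 10 days instead.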

The path we trace around the Sun is called the ecliptic. Our planet’s axis is tilted with respect to the ecliptic plane, which is why we have seasons, and why the other celestial bodies seem to cross the same general path in our sky.

As the Moon goes around Earth, it, too, crosses the plane of the ecliptic, twice in each orbit. The ascending node is where the Moon moves into the northern ecliptic. The descending node is where the Moon enters the southern ecliptic. When a new Moon falls close to a node, a solar eclipse can happen. Ancient astronomers were aware of these points in the sky, and by the apex of Babylonian civilization, they were very good at predicting when eclipses would occur.

Two and a half millennia later, in 2016, astronomers used these same ancient records to measure the change in the rate at which Earth’s rotation is slowing, which is to say, the amount by which our days are lengthening over thousands of years.

By the middle of the 19th century, scientific discoveries came at a frenetic pace, and eclipses powered many of them. In 1868, two astronomers, Pierre Jules César Janssen and Joseph Norman Lockyer, separately measured the colors of sunlight, Janssen during a total eclipse that August and Lockyer that October. Each found evidence of an unknown element: helium, named for Helios, the Greek god of the Sun. In another eclipse in 1869, astronomers found convincing evidence of another new element, which they nicknamed coronium, before learning a few decades later that it was not a new element but highly ionized iron, indicating that the Sun’s atmosphere is exceptionally, bizarrely hot. This oddity led to the prediction, in the 1950s, of a continual outflow that we now call the solar wind.

And during solar eclipses between 1878 and 1908, astronomers searched in vain for a proposed extra planet within the orbit of Mercury. Provisionally named Vulcan, this planet was thought to exist because Newtonian gravity could not fully describe Mercury’s strange orbit. The matter of the innermost planet’s path was settled, finally, in 1915, when Einstein used general relativity equations to explain it.

Many eclipse expeditions were intended to learn something new, or to prove an idea right—or wrong. But many of these discoveries have major practical effects on us. Understanding the Sun, and why its atmosphere gets so hot, can help us predict solar outbursts that could disrupt the power grid and communications satellites. Understanding gravity, at all scales, allows us to know and to navigate the cosmos.

GPS satellites, for instance, provide accurate measurements down to inches on Earth. Relativity equations account for the effects of Earth’s gravity and of the satellites’ motion on the clocks they carry. General relativity holds that clocks on satellites, which experience weaker gravity, run faster than clocks under the stronger gravity on Earth’s surface, while special relativity holds that the satellites’ orbital speed makes their clocks run slower. The gravitational effect is the larger one, so from the point of view of the ground, satellite clocks run fast by about 38 microseconds per day. We can use different satellites in different positions, and different ground stations, to accurately pin down our positions on Earth. Without those calculations, GPS would be far less precise.
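The size of the relativistic clock correction can be estimated with a back-of-the-envelope calculation. The sketch below is a simplification under stated assumptions: a circular orbit with a radius of about 26,560 km, a ground clock at rest on a spherical, non-rotating Earth, and first-order formulas only. It is not the procedure used in the actual GPS system.

```python
# Standard reference constants (assumed here, not taken from the article).
GM_EARTH = 3.986004418e14   # gravitational parameter of Earth, m^3/s^2
C = 299_792_458.0           # speed of light, m/s
R_EARTH = 6.371e6           # mean radius of Earth, m
R_ORBIT = 2.656e7           # approximate GPS orbital radius, m (assumption)
SECONDS_PER_DAY = 86_400

# General relativity: a clock higher in Earth's gravity well runs faster.
gravitational_gain = (
    GM_EARTH / C ** 2 * (1 / R_EARTH - 1 / R_ORBIT) * SECONDS_PER_DAY
)

# Special relativity: a moving clock runs slower. For a circular orbit
# the squared orbital speed is GM / r.
orbital_speed_sq = GM_EARTH / R_ORBIT
velocity_loss = orbital_speed_sq / (2 * C ** 2) * SECONDS_PER_DAY

# Net daily drift of the satellite clock relative to the ground, in microseconds.
net_us = (gravitational_gain - velocity_loss) * 1e6

print(f"Gravitational gain: {gravitational_gain * 1e6:.1f} microseconds/day")
print(f"Velocity loss:      {velocity_loss * 1e6:.1f} microseconds/day")
print(f"Net drift:          {net_us:.1f} microseconds/day")
```

Light travels roughly 300 metres in a microsecond, so an uncorrected drift of tens of microseconds per day would translate into position errors of kilometres within a day.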

This year, scientists fanning out across North America and in the skies above it will continue the legacy of eclipse science. Scientists from NASA and several universities and other research institutions will study Earth’s atmosphere; the Sun’s atmosphere; the Sun’s magnetic fields; and the Sun’s atmospheric outbursts, called coronal mass ejections.

When you look up at the Sun and Moon on the day of the eclipse, or just observe the Moon’s shadow darkening the ground beneath the clouds, which seems more likely, think about all the discoveries still waiting to happen, just behind the shadow of the Moon.


COMMENTS

  1. Essay on Indian Languages

    Essay on Indian Languages. India is the home of a very large number of languages. In fact, so many languages and dialects are spoken in India that it is often described as a 'museum of languages'. The language diversity is by all means baffling. In popular parlance it is often described as 'linguistic pluralism'.

  2. Multilingualism in India

    Languages in India are categorized into language families based on their different linguistic origins, which often include different scripts as well. The main language families include Dravidian, Indo-Aryan, and Sino-Tibetan. Bodo is the Sino-Tibetan language spoken in northeastern Indian states with the most speakers (1.4 million).

  3. Languages of India

    Languages spoken in the Republic of India belong to several language families, the major ones being the Indo-Aryan languages spoken by 78.05% of Indians and the Dravidian languages spoken by 19.64% of Indians; both families together are sometimes known as Indic languages. Languages spoken by the remaining 2.31% of the population belong to the Austroasiatic, Sino-Tibetan, Tai-Kadai, and a ...

  4. What Languages Are Spoken In India?

    Hindi is the most spoken language in India with 41% of the population being first language speakers, but the other 59% of the population speak over 30 different languages. Due to their long history, Tamil, Sanskrit, Malayalam, Odia, and Telugu have been designated classical languages. Most Indian languages are classified into one of four groups ...

  5. India: A linguistic civilisation

    India: A linguistic civilisation. Unlike Europe, the speakers of hundreds of different languages in India agreed to belong to a single nation because the Constitution promised them freedom of expression, making it mandatory on the state to encourage languages 'without harming other languages'. There is a raging debate in the country about ...

  6. Which Languages Are Spoken In India?

    A census conducted in 2011 showed that India has about 19,569 languages and dialects, of which almost 1,369 are considered dialects and only 121 are recognized as languages (the acceptance criterion being that the language has 10,000 or more speakers). The languages spoken in India belong mainly to two big linguistic families: the Indo-European ...

  7. India: The Land of Diverse Languages and Scripts

    Unlike many other places in the world, the people of India write these languages in multiple scripts, making our nation one of the most graphically diverse nations across the globe. An interesting fact - According to Article 343 of the Indian Constitution, the official language of the Union is Hindi in the Devanagari script.

  8. Multilingualism in India

    India is one of the most linguistically diverse countries in the world, with over 19,500 languages spoken throughout the nation. This diversity offers a unique opportunity for Indians to be multilingual, which means being able to use more than one language in communication. According to the 2011 Census of India, more than 25% of the population ...

  9. Exploring Linguistic Diversity in India: A Spatial Analysis

    India is a land characterized by "unity in diversity" amidst a multicultural society. This is represented by variety in culture such as different languages, religions, castes, house types, dance forms, and dietary patterns (Noble and Dutt, India: cultural patterns and processes. Westview Press, Boulder, 1982).

  10. PDF Exploring Linguistic Diversity in India: A Spatial Analysis

    tion spoke these 22 scheduled languages with another 100 nonscheduled languages spoken by a minimum of 10,000 people in different regions (Census of India 2001). The 2001 Census of India has declared 22 scheduled languages (Table 1). India is characterized by an astounding degree of linguistic diversity since each

  11. Translation in India: Multilingual practices and cultural histories of

    Translation is not taking place in between monolingual realities but rather within multilingual realities" (Meylaerts 2016, 519; emphasis in original). Translation is, therefore, no longer seen in ontological opposition to multilingualism, made redundant by the simultaneous presence and knowledge of multiple languages.

  12. Exploring Linguistic Diversity in India: A Spatial Analysis

    Indian multilingualism is unique because of the dynamic relationships among its languages. The present work is an attempt to find out the nature of multilingualism in India. It also aims to look into the different aspects of Indian multilingualism arising from the high diversity of Indian societies.

  13. Languages of India

    The Different Languages of India. The languages spoken in India have ancient roots and belong to two major language families. The majority of Indian languages belong to the Indo-Aryan family, which is derived from Sanskrit and influenced strongly by Persian and Arabic. Most of North India speaks Indo-Aryan languages such as Hindi, Punjabi, and ...

  14. PDF Linguistic history and language diversity in India: Views and counterviews

    1. Introduction. The aim of this paper is to offer the non-linguists at this seminar a linguistic perspective on the issue of human diversity and ancestry in India. The paper is an overview of the major views and evidence gleaned from the available literature. The paper is organized as follows: Section 1 introduces the linguistic diversity ...

  15. Language Movements and Democracy in India

    In this feature essay, Language Movements and Democracy in India, Mithilesh Kumar Jha draws on his recent book Language Politics and Public Sphere in North India (Oxford UP). In the piece, he argues that capturing the real and continuing tensions and challenges of democratic practices in India requires attention to how they are performed and understood within its numerous vernacular spheres ...

  16. Preserving India's Cultural Heritage: The Role of Languages

    Languages are crucial for maintaining Indian culture, as they communicate knowledge and identity. With over 1,600 languages spoken, India has one of the highest linguistic diversities globally.

  17. Short Essay on Languages in India

    The old languages have left their mark on the other languages which we speak today. There are two main groups of languages: the Indo-European (Indo-Aryan) and the Dravidian. These two groups have not developed in isolation from each other. Sanskrit was the language of the Indo-Aryans who came to India. Sanskrit was gradually standardized and ...

  18. The Linguistic Identity of India

    The Linguistic Identity of India. India is a nation where over 1.2 billion people speak 19,569 different languages or dialects as a "mother tongue." The largest language census revealed interesting data concerning linguistics, identity and nationalism. Background. Languages spoken in India belong to several language families, the major ones ...

  19. A comprehensive survey on Indian regional language processing

    There are different native languages existing in various parts of the world, each with its own alphabet, signs and grammar. If there is a nation where old and morphologically rich varieties of regional languages exist that is India ... the 26th international conference on computational linguistics: technical papers, pp 482-493. Almatrafi O ...

  20. A Brief Background of The Language Issue in India

    Table of Contents. THE BACKGROUND OF THE LANGUAGE ISSUE IN INDIA. THE LANGUAGE ISSUE TODAY. APPENDIX A Discussion Information. APPENDIX B Topics for Discussion on Language Policy in India. APPENDIX C Excerpts from the Discussions. REFERENCES. Note: Because this is being made available on the internet, I have changed the original paper so that the informants' full names are not given---their ...

  21. Indian Culture Essay in English

    200 Words Essay on Indian Culture. India is a land of diverse cultures, religions, languages, and traditions. The rich cultural heritage of India is a result of its long history and the various invasions and settlements that have occurred in the country. Indian culture is a melting pot of various customs and traditions, which have been passed ...

  22. India Essay for Students in English

    The Constitution of India mentions 22 languages. However, India has around 800 languages. Hindi is the official language of India. Festivals. India is a country of many different religions and each has different festivals. Some important festivals are Baisakhi, Diwali, Eid, Ganesh Chaturthi, Dussehra and Christmas. Unity in Diversity. The ...

  23. Essay on India For Students and Children

    500+ Words Essay on India. India is a great country where people speak different languages, but the official language of the Union is Hindi. India is full of different castes, creeds, religions, and cultures, but they live together. That is the reason India is famous for the common saying "unity in diversity". India is the seventh-largest country in ...

  24. What the World Has Learned From Past Eclipses

    Clouds scudded over the small volcanic island of Principe, off the western coast of Africa, on the afternoon of May 29, 1919. Arthur Eddington, director of the Cambridge ...