What are the key principles of Chomsky’s Theory of Language Development, specifically regarding Universal Grammar?

Noam Chomsky’s Theory of Language Development has been a topic of great interest and debate in the field of linguistics since its introduction in the 1950s. At the core of this theory is the concept of Universal Grammar, which proposes that all human beings possess an innate ability to acquire and use language. Chomsky’s work has revolutionized our understanding of language acquisition and has had a profound impact on the fields of linguistics, psychology, and education. In this essay, we will explore the key principles of Chomsky’s Theory of Language Development, with a focus on the concept of Universal Grammar and its implications for language acquisition.

Universal grammar is a theory in linguistics that suggests that there are properties shared by all possible natural human languages. Usually credited to Noam Chomsky, the theory suggests that some rules of grammar are hard-wired into the brain and manifest without being taught. There is still considerable debate over whether such a thing exists and, if so, what it consists of.



If humans growing up under normal conditions (not conditions of extreme deprivation) always develop a language with property X (for example, distinguishing nouns from verbs, or distinguishing function words from lexical words) then property X is a property of universal grammar in this most general sense (here not capitalized).

There are theoretical senses of the term Universal Grammar as well (here capitalized). The most general of these would be that Universal Grammar is whatever properties of a normally developing human brain cause it to learn languages that conform to universal grammar (the non-capitalized, pretheoretical sense). Using the above examples, Universal Grammar would be the property that the brain has that causes it to posit a difference between nouns and verbs whenever presented with linguistic data.

As Chomsky puts it, “Evidently, development of language in the individual must involve three factors: (1) genetic endowment, which sets limits on the attainable languages, thereby making language acquisition possible; (2) external data, converted to the experience that selects one or another language within a narrow range; (3) principles not specific to FL.” [FL is the faculty of language, whatever properties of the brain cause it to learn language.] So (1) is Universal Grammar in the first theoretical sense, (2) is the linguistic data to which the child is exposed, and (3) comprises general principles that are not specific to language.

Sometimes aspects of Universal Grammar in this sense seem to be describable in terms of general facts about cognition. For example, if a predisposition to categorize events and objects as different classes of things is part of human cognition, and nouns and verbs show up in all languages as a direct result, then this aspect of Universal Grammar could be said to be not specific to language but part of cognition more generally. To distinguish properties of languages that can be traced to other facts about cognition from properties that cannot, the abbreviation UG* can be used. Chomsky often uses UG for those aspects of the human brain which cause language to be the way it is (i.e., Universal Grammar in the sense used here); in this discussion, however, UG* is reserved for those aspects which are furthermore specific to language. Thus UG, as Chomsky uses it, is simply an abbreviation for Universal Grammar, whereas UG* as used here denotes a subset of Universal Grammar.

In the same article, Chomsky casts the theme of a larger research program in terms of the following question: “How little can be attributed to UG while still accounting for the variety of I-languages attained, relying on third factor principles?” (I-languages meaning internal languages, the brain states that correspond to knowing how to speak and understand a particular language, and third factor principles meaning (3) in the previous quote).

Chomsky has speculated that UG might be extremely simple and abstract, for example only a mechanism for combining symbols in a particular way, which he calls Merge. To see that Chomsky does not use the term “UG” in the narrow sense UG* suggested above, consider the following quote from the same article:

“The conclusion that Merge falls within UG holds whether such recursive generation is unique to FL or is appropriated from other systems.”

i.e. Merge is part of UG because it causes language to be the way it is, is universal, and is not part of (2) (the environment) or (3) (general properties independent of genetics and environment). Merge is part of Universal Grammar whether it is specific to language or whether, as Chomsky suggests, it is also used, for example, in mathematical thinking.

The distinction is important because there is a long history of argument about UG*, whereas most people working on language agree that there is Universal Grammar. Many people assume that Chomsky means UG* when he writes UG (and in some cases he might actually mean UG*, though not in the passage quoted above).

Some students of universal grammar study a variety of grammars to abstract generalizations called linguistic universals, often in the form of “If X holds true, then Y occurs.” These have been extended to a range of traits, from the phonemes found in languages, to what word orders languages choose, to why children exhibit certain linguistic behaviors.

The idea can be traced to Roger Bacon’s observation that all languages are built upon a common grammar, substantially the same in all languages even though it may undergo accidental variations in them, and to the 13th-century speculative grammarians who, following Bacon, postulated universal rules underlying all grammars. The concept of a universal grammar or language was at the core of the 17th-century projects for philosophical languages, and the 18th century saw the emergence of a vigorous school of universal grammarians in Scotland. Later linguists who have influenced this theory include Noam Chomsky and Richard Montague, who developed their versions of the theory as they considered objections, based on the argument from the poverty of the stimulus, to constructivist approaches to linguistic theory. The application of the idea to second language acquisition (SLA) is represented mainly by the McGill linguist Lydia White.

Most syntacticians concede that there are parametric points of variation between languages, although heated debate occurs over whether UG constraints are universal because they are “hard-wired” (Chomsky’s Principles and Parameters approach), because they are a logical consequence of a specific syntactic architecture (the Generalized Phrase Structure approach), or because they result from functional constraints on communication (the functionalist approach).



History

During the early 20th century, language was usually understood from a behaviourist perspective, suggesting that language learning, like any other kind of learning, could be explained by a succession of trials, errors, and rewards for success. In other words, children learned their mother tongue by simple imitation, listening to and repeating what adults said.

The Scottish school of universal grammarians of the 18th century, to be distinguished from the philosophical language project, included authors such as James Beattie, Hugh Blair, James Burnett, James Harris, and Adam Smith. The article on “Grammar” in the first edition of the Encyclopedia Britannica (1771) contains an extensive section titled “Of Universal Grammar.”

The idea rose to notability in modern linguistics in the 1950s to 1970s, with theorists such as Noam Chomsky and Richard Montague, as part of the “Linguistics Wars”.


Chomsky’s theory

Linguist Noam Chomsky argued that the human brain contains a limited set of rules for organizing language, which implies that all languages share a common structural basis. This set of rules is known as universal grammar.

Speakers proficient in a language know which expressions are acceptable in their language and which are unacceptable. The puzzle is how speakers come to know these restrictions, since expressions that violate them are absent from the input and are never marked as ungrammatical. This absence of negative evidence (that is, of evidence that an expression belongs to the class of ungrammatical sentences in one’s language) is the core of the poverty of the stimulus argument. For example, in English one cannot relate a question word like ‘what’ to a predicate within a relative clause (1):

(1) *What did John meet a man who sold?

Such expressions are not available to language learners: they are, by hypothesis, ungrammatical for speakers of the local language, so those speakers never utter them, nor do they tell learners that such expressions are unacceptable. Universal grammar offers a solution to the poverty of the stimulus problem by making certain restrictions universal characteristics of human languages. Language learners are consequently never tempted to generalize in an illicit fashion.


Presence of creole languages

The presence of creole languages is sometimes cited as further support for this theory, especially by Bickerton’s controversial language bioprogram theory. Creoles are languages that are developed and formed when different societies come together and are forced to devise their own system of communication. The system used by the original speakers is typically an inconsistent mix of vocabulary items known as a pidgin. As these speakers’ children begin to acquire their first language, they use the pidgin input to effectively create their own original language, known as a creole. Unlike pidgins, creoles have native speakers and make use of a full grammar.

According to Bickerton, the idea of universal grammar is supported by creole languages because certain features are shared by virtually all of these languages. For example, their default point of reference in time (expressed by bare verb stems) is not the present moment, but the past. Using pre-verbal auxiliaries, they uniformly express tense, aspect, and mood. Negative concord occurs, but it affects the verbal subject (as opposed to the object, as it does in languages like Spanish). Another similarity among creoles is that questions are created simply by changing a declarative sentence’s intonation, not its word order or content.

However, extensive work by Carla Hudson-Kam and Elissa Newport suggests that creole languages may not support a universal grammar, as has sometimes been supposed. In a series of experiments, Hudson-Kam and Newport looked at how children and adults learn artificial grammars. Notably, they found that children tend to ignore minor variations in the input when those variations are infrequent, and reproduce only the most frequent forms. In doing so, they tend to standardize the language that they hear around them. Hudson-Kam and Newport hypothesize that in a pidgin situation (and in the real-life situation of a deaf child whose parents were disfluent signers), children systematize the language they hear based on the probability and frequency of forms, not on the basis of a universal grammar, as has been suggested. Further, it seems unsurprising that creoles would share features with the languages from which they are derived and thus look similar grammatically.
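The contrast between adult probability matching and child regularization can be sketched in a toy Python simulation. This is not Hudson-Kam and Newport's actual experimental material; the forms ("ka", "po") and the 70/30 proportions are invented purely for illustration:

```python
import random
from collections import Counter

# Invented input: a marker "ka" appears 70% of the time and an inconsistent
# variant "po" 30% of the time, mimicking unreliable pidgin-like input.
input_forms = ["ka"] * 70 + ["po"] * 30

def adult_learner(forms, rng):
    """Probability matching: reproduce forms at roughly their input rates."""
    return [rng.choice(forms) for _ in range(100)]

def child_learner(forms):
    """Regularizing: reproduce only the single most frequent form."""
    majority_form, _count = Counter(forms).most_common(1)[0]
    return [majority_form] * 100

rng = random.Random(0)
adult_output = adult_learner(input_forms, rng)   # mixes "ka" and "po"
child_output = child_learner(input_forms)        # all "ka"
```

On this toy model the child's output is perfectly consistent even though the input was not, which is the regularization effect the experiments describe.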



Criticism

Since their inception, universal grammar theories have been subjected to vocal and sustained criticism. In recent years, with the advent of more sophisticated brands of computational modeling and more innovative approaches to the study of language acquisition, these criticisms have multiplied.

Geoffrey Sampson maintains that universal grammar theories are not falsifiable and are therefore pseudoscientific. He argues that the grammatical “rules” linguists posit are simply post-hoc observations about existing languages, rather than predictions about what is possible in a language. Similarly, Jeffrey Elman argues that the unlearnability of languages assumed by Universal Grammar rests on a too-strict, “worst-case” model of grammar that does not correspond to any actual grammar. In keeping with these points, James Hurford argues that the postulate of a language acquisition device (LAD) essentially amounts to the trivial claim that languages are learnt by humans, and thus that the LAD is less a theory than an explanandum looking for theories.

Sampson, Elman, and Hurford are hardly alone in suggesting that several of the basic assumptions of Universal Grammar are unfounded. Indeed, a growing number of language acquisition researchers argue that the very idea of a strict rule-based grammar in any language flies in the face of what is known about how languages are spoken and how they evolve over time. For instance, Morten Christiansen and Nick Chater have argued that the relatively fast-changing nature of language would prevent the slower-changing genetic structures from ever catching up, undermining the possibility of a genetically hard-wired universal grammar. In addition, it has been suggested that people learn probabilistic patterns of word distributions in their language rather than hard and fast rules (see the distributional hypothesis). It has also been proposed that the poverty of the stimulus problem can be largely avoided if we assume that children employ similarity-based generalization strategies in language learning, generalizing about the usage of new words from similar words that they already know how to use.

Another way of defusing the poverty of the stimulus argument is to assume that if language learners notice the absence of classes of expressions in the input, they will hypothesize a restriction (a solution closely related to Bayesian reasoning). In a similar vein, language acquisition researcher Michael Ramscar has suggested that when children erroneously expect an ungrammatical form that then never occurs, the repeated failure of expectation serves as a form of implicit negative feedback that allows them to correct their errors over time. This implies that word learning is a probabilistic, error-driven process, rather than a process of fast mapping, as many nativists assume.
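The Bayesian variant of this idea can be made concrete with a toy model. All the numbers below are invented for illustration: suppose a "broad" grammar allows 10 construction types and a "narrow" grammar bans one of them, allowing only 9. By the size principle, each observed sentence is slightly likelier under the narrower grammar, so never hearing the banned construction gradually counts as implicit negative evidence:

```python
# Toy Bayesian learner comparing two hypothetical grammars:
#   H_narrow: bans one construction (9 construction types possible)
#   H_broad:  allows it too        (10 construction types possible)
# Each observed sentence is assumed to be drawn uniformly from the
# constructions its grammar allows (the "size principle").

def posterior_narrow(n_sentences, prior_narrow=0.5):
    """Posterior probability of the restrictive grammar after observing
    n_sentences, none of which uses the banned construction."""
    like_narrow = (1 / 9) ** n_sentences
    like_broad = (1 / 10) ** n_sentences
    numerator = prior_narrow * like_narrow
    denominator = numerator + (1 - prior_narrow) * like_broad
    return numerator / denominator
```

With these invented numbers the learner starts undecided (posterior 0.5 with no data) and becomes increasingly confident that the restriction is real as more sentences arrive without the banned construction ever appearing.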

Finally, in the domain of field research, the Pirahã language is claimed to be a counterexample to the basic tenets of Universal Grammar. Among other things, this language is alleged to lack all evidence of recursion, including embedded clauses, as well as quantifiers and color terms. Other linguists have argued, however, that some of these properties have been misanalyzed and that others are actually expected under current theories of Universal Grammar. While most languages studied in this respect do indeed seem to share common underlying rules, research is hampered by considerable sampling bias. The linguistically most diverse areas, such as tropical Africa and the Americas, along with the highly diverse Indigenous Australian and Papuan languages, have been insufficiently studied. Furthermore, language extinction has disproportionately affected areas where the most unconventional languages are found.



Universal Grammar is made up of a set of rules that apply to most or all natural human languages. Most of these rules come in the form of “if a language has feature X, it will also have feature Y.” Rules that are widely considered to be part of UG include:

  • If a language is head-initial (like English), it will have prepositional phrases; if it is head-final (like Japanese), it will have postpositional phrases.
  • If a language has a word for purple, it will have a word for red.
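Implicational universals of this “if X then Y” form can be checked mechanically against a table of language features. The following Python sketch uses an invented, purely illustrative feature table (real typological databases, such as WALS, are far larger and messier):

```python
# Invented feature table: each language maps feature names to booleans.
LANGUAGES = {
    "English":  {"head_initial": True,  "prepositions": True,
                 "word_purple": True,  "word_red": True},
    "Japanese": {"head_initial": False, "prepositions": False,
                 "word_purple": True,  "word_red": True},
    "ToyLang":  {"head_initial": True,  "prepositions": True,
                 "word_purple": False, "word_red": True},
}

def implication_holds(features_by_language, x, y):
    """True if every language that has feature x also has feature y."""
    return all(feats[y]
               for feats in features_by_language.values()
               if feats[x])

# Languages that would falsify "purple implies red", if any existed:
counterexamples = [name for name, feats in LANGUAGES.items()
                   if feats["word_purple"] and not feats["word_red"]]
```

Note that an implicational universal is only falsified by a language with X but without Y; languages lacking X (like ToyLang above for the color universal) are simply irrelevant to it.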