10.1126/science.291.5501.114
Crossref journal-article
American Association for the Advancement of Science (AAAS)
Science (221)
Abstract

Universal grammar specifies the mechanism of language acquisition. It determines the range of grammatical hypotheses that children entertain during language learning and the procedure they use for evaluating input sentences. How universal grammar arose is a major challenge for evolutionary biology. We present a mathematical framework for the evolutionary dynamics of grammar learning. The central result is a coherence threshold, which specifies the condition for a universal grammar to induce coherent communication within a population. We study selection of grammars within the same universal grammar and competition between different universal grammars. We calculate the condition under which natural selection favors the emergence of rule-based, generative grammars that underlie complex language.

Bibliography

Nowak, M. A., Komarova, N. L., & Niyogi, P. (2001). Evolution of Universal Grammar. Science, 291(5501), 114–118.

Authors 3
  1. Martin A. Nowak (first)
  2. Natalia L. Komarova (additional)
  3. Partha Niyogi (additional)
References 41 Referenced 269
  1. S. Pinker Words and Rules (Basic Books New York 1999).
  2. G. A. Miller The Science of Words (Scientific American Library New York 1996).
  3. N. Chomsky in The View from Building 20 K. Hale S. J. Keyser Eds. (MIT Press Cambridge MA 1993) pp. 1–52.
  4. Gold E. M., Inform. Control 10, 447 (1967). (10.1016/S0019-9958(67)91165-5) / Inform. Control by Gold E. M. (1967)
  5. N. R. Hornstein D. W. Lightfoot Explanation in Linguistics (Longman London 1981).
  6. R. Jackendoff The Architecture of the Language Faculty (MIT Press Cambridge MA 1997). (10.1163/9789004373167_003)
  7. N. Chomsky Rules and Representations (Columbia Univ. Press New York 1980). (10.1017/S0140525X00001515)
  8. K. Wexler P. Culicover Formal Principles of Language Acquisition (MIT Press Cambridge MA 1980).
  9. Gibson E., Wexler K., Ling. Inquiry 25, 407 (1994). / Ling. Inquiry by Gibson E. (1994)
  10. D. Lightfoot The Development of Language: Acquisition Changes and Evolution (Blackwell/Maryland Lecture in Language and Cognition Oxford 1999).
  11. Manzini R., Wexler K., Ling. Inquiry 18, 413 (1987). / Ling. Inquiry by Manzini R. (1987)
  12. P. Niyogi The Informational Complexity of Learning (Kluwer Academic Boston 1998). (10.1007/978-1-4615-5459-2)
  13. D. Osherson M. Stob S. Weinstein Systems That Learn (MIT Press Cambridge MA 1986). (10.7551/mitpress/6609.001.0001)
  14. Universal grammar and the innateness of grammatical principles of human language are controversial issues. Here we use a very general definition of universal grammar denoting “mechanism of language acquisition,” which is certainly innate. The continuity hypothesis states that universal grammar is available to the child at all stages of development (15), whereas the maturationist hypothesis holds that universal grammar is changing during language acquisition (16).
  15. N. Hyams Language Acquisition and the Theory of Parameters (Reidel Dordrecht Netherlands 1986). (10.1007/978-94-009-4638-5)
  16. A. Radford Syntactic Theory and the Acquisition of English Syntax (Blackwell Oxford 1990).
  17. S. Pinker The Language Instinct (Morrow New York 1994). (10.1037/e412952005-009)
  18. M. Gopnik in Language Logic and Concepts R. Jackendoff P. Bloom K. Wynn Eds. (MIT Press Cambridge MA 1999) pp. 263–284. (10.7551/mitpress/4118.003.0014)
  19. J. Maynard Smith E. Szathmary The Major Transitions in Evolution (Freeman Spektrum Oxford 1995).
  20. Brandon R., Hornstein N., Biol. Philos. 1, 169 (1986). (10.1007/BF00142900) / Biol. Philos. by Brandon R. (1986)
  21. Pinker S., Bloom P., Behav. Brain Sci. 13, 707 (1990). (10.1017/S0140525X00081061) / Behav. Brain Sci. by Pinker S. (1990)
  22. Newmeyer F., Lang. Commun. 11, 3 (1991). (10.1016/0271-5309(91)90011-J) / Lang. Commun. by Newmeyer F. (1991)
  23. Hashimoto T., Ikegami T., Biosystems 38, 1 (1996). (10.1016/0303-2647(95)01563-9) / Biosystems by Hashimoto T. (1996)
  24. M. D. Hauser The Evolution of Communication (Harvard Univ. Press Cambridge MA 1996).
  25. J. R. Hurford M. Studdert-Kennedy C. Knight Eds. Approaches to the Evolution of Language (Cambridge Univ. Press Cambridge 1998).
  26. Nowak M. A., Krakauer D. C., Proc. Natl. Acad. Sci. U.S.A. 96, 8028 (1999). (10.1073/pnas.96.14.8028) / Proc. Natl. Acad. Sci. U.S.A. by Nowak M. A. (1999)
  27. V. Vapnik The Nature of Statistical Learning Theory (Springer New York 1995). (10.1007/978-1-4757-2440-0)
  28. Valiant L. G., Commun. ACM 27, 436 (1984). (10.1145/1968.1972) / Commun. ACM by Valiant L. G. (1984)
  29. A grammar mediates a mapping between form and meaning. The countably infinite number of possible linguistic expressions can be represented as strings over a finite syntactic alphabet $\Sigma_1$. The set of all possible strings over $\Sigma_1$ is denoted by $\Sigma_1^*$. Similarly, one can enumerate all possible meanings $\Sigma_2^*$ as strings over a primitive semantic alphabet $\Sigma_2$. Therefore $\Sigma_1^*$ is the set of all possible linguistic expressions and $\Sigma_2^*$ is the set of all possible meanings. A grammar $G_i$ generates a subset of $\Sigma_1^* \times \Sigma_2^*$, that is, a (potentially infinite) set of sentence-meaning pairs. Mathematically, $G_i$ is specified by a measure $\mu_i$ on $\Sigma_1^* \times \Sigma_2^*$. We can define $a_{ij} = \mu_i(G_i \cap G_j)$ to be simply the proportion of sentence-meaning pairs that $G_i$ and $G_j$ have in common. Hence $a_{ij}$ is the probability that a user of $G_i$ speaks an utterance that a user of $G_j$ can understand.
  30. Equation 1 is similar to the standard quasi-species equation but has frequency-dependent fitness functions. For quasi-species theory, see M. Eigen and P. Schuster [The Hypercycle. A Principle of Natural Self-Organisation (Springer Berlin 1979)].
  31. This result can also be discussed in the principles and parameters framework. Universal grammar is determined by genetically inherited principles, which limit the number of candidate grammars and specify the learning mechanism. The parameters have to be learned by evaluating input sentences. We can calculate the maximum number of parameters consistent with the coherence threshold. Suppose there are $k$ independent parameters that can be represented as binary switches. Therefore $n = 2^k$. For the memoryless learner we obtain $k < \log_2(b/C_1)$. For the batch learner we obtain $k < b/(C_2 \log 2)$. The innate principles have to reduce the number of parameters $k$ to fulfill these conditions. A different approach is optimality theory. There are $k$ constraints. Each grammar is given by a specific ordering of these constraints. Hence $n = k!$ [A. Prince P. Smolensky in Technical Report RuCCS TR-2 (MIT Press Cambridge MA 1993) pp. 234–272;
  32. Tesar B. B., Smolensky P., Lingua 106, 161 (1998)]. (10.1016/S0024-3841(98)00033-3) / Lingua by Tesar B. B. (1998)
  33. Let us consider a generalization that allows us to define grammars of varying intrinsic fitness. Each user of $G_i$ is characterized by an encoding matrix $P$ and a decoding matrix $Q$. Here $P_{kl} = \mu(s_k, m_l) / \sum_i \mu(s_i, m_l) = \mu(s_k \mid m_l)$, which is simply the probability of using the expression $s_k$ to convey the meaning $m_l$. Similarly, $Q_{kl} = \mu(s_k, m_l) / \sum_j \mu(s_k, m_j) = \mu(m_l \mid s_k)$ is the probability of interpreting the expression $s_k$ to mean $m_l$. The need to communicate meanings is related to events in the shared world of the linguistic community. Therefore one can define a measure $\sigma$ on the set of possible meanings ($\Sigma_2^*$) that speakers and hearers might wish to communicate with each other. Given this, we can define $a_{ij} = \mathrm{tr}[P^{(i)} \Lambda (Q^{(j)})^T]$, where $\Lambda$ is a diagonal matrix such that $\Lambda_{ii} = \sigma(m_i)$. This is the probability that an event occurs and is successfully communicated from a user of $G_i$ to a user of $G_j$. $F(G_i, G_i)$ is the probability that users of $G_i$ will have a successful communication with each other. Communication might break down in one of two ways: (i) poverty: an event happens whose meaning cannot be encoded by $G_i$, and (ii) ambiguity: an event happens whose meaning has an ambiguous encoding in $G_i$, leading to a possibility of misunderstanding. Thus $F(G_i, G_i)$ is a number between 0 and 1 and denotes the fitness of $G_i$. Maximum fitness $F(G_i, G_i) = 1$ is achieved by grammars that can express every possible meaning (zero poverty) and have no ambiguities.
  34. To study the effect of finite (small) population sizes the deterministic Eq. 1 is replaced by a stochastic process. In this case we observe that the population adopts one of the candidate grammars (that admits a stable equilibrium) for some time and then jumps to another equilibrium. If the candidate grammars differ in their fitness then the stochastic process performs an evolutionary optimization on the space of all grammars.
  35. Denote by $x_i$ the fraction of individuals who use $G_i$ of universal grammar $U_1$; denote by $y_i$ the fraction of individuals who use $G_i$ of universal grammar $U_2$. $U_1$ and $U_2$ contain, respectively, $n_1$ and $n_2$ candidate grammars. Some of the candidate grammars can be part of both universal grammars. The universal grammars $U_1$ and $U_2$ can also differ in the number of sample sentences, $b_1$ and $b_2$, that are being considered. Therefore we have to take into account the rate of producing offspring with grammatical communication; this rate is given by the declining function $r(b)$. An alternative interpretation is that $r(b)$ describes the cost that is associated with learning. The dynamics are described by $\dot{x}_i = r(b_1) \sum_{j=1}^{n_1} x_j f_j^{(1)} Q_{ji}^{(1)} - \phi x_i$ for $i = 1, \ldots, n_1$ and $\dot{y}_i = r(b_2) \sum_{j=1}^{n_2} y_j f_j^{(2)} Q_{ji}^{(2)} - \phi y_i$ for $i = 1, \ldots, n_2$. We have $f_i^{(m)} = \sum_{j=1}^{n_1} x_j F(G_i^{(m)}, G_j^{(1)}) + \sum_{j=1}^{n_2} y_j F(G_i^{(m)}, G_j^{(2)})$, $m \in \{1, 2\}$, and $\phi = \sum_{i=1}^{n_1} f_i^{(1)} x_i r(b_1) + \sum_{i=1}^{n_2} f_i^{(2)} y_i r(b_2)$, where the superscripts 1 and 2 refer to $U_1$ and $U_2$, respectively.
  36. In general, it is advantageous to reduce the size of the search space because a smaller $n$ leads to a larger accuracy of grammar acquisition. The situation is more complex, however. Consider two universal grammars $U_1$ and $U_2$ with $n_1 > n_2$. Suppose $U_1$ is resident and $U_2$ is an invading mutant. If $n_1$ exceeds the coherence threshold, then $U_2$ will always out-compete $U_1$. If $n_1$ is below the coherence threshold, then $U_2$ can only invade if the specific grammar adopted by the population of $U_1$ speakers is also part of $U_2$; otherwise $U_1$ can resist invasion by $U_2$. The selective difference between $U_1$ and $U_2$ is small if both $n_1$ and $n_2$ values are either well above or well below the coherence threshold. Hence selection is strongest close to the coherence threshold (if $n_1 \approx n_2$).
  37. This problem has been solved before in a different context. How many words $N$ can be stably maintained in a population if each child hears $b$ words during its language acquisition period and has a probability $\rho$ to memorize a new word after one encounter? The answer is $N < b\rho$ [
  38. Nowak M. A., Plotkin J. B., Jansen V. A. A., Nature 404, 495 (2000)]. (10.1038/35006635) / Nature by Nowak M. A. (2000)
  39. The implicit assumption here is, of course, that the rule-based grammars can generate at least these $N$ sentence types. In a principles and parameters framework, condition 4 implies that the number of parameters $k$ has to be less than $N$.
  40. W. von Humboldt Linguistic Variability and Intellectual Development (Univ. of Pennsylvania Press Philadelphia 1972).
  41. Support from the Packard Foundation the Leon Levy and Shelby White Initiatives Fund the Florence Gould Foundation the Ambrose Monell Foundation the Alfred P. Sloan Foundation and the NSF is gratefully acknowledged.
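The communication payoff defined in note 33 can be made concrete with a toy computation. The following is a minimal sketch, not the authors' code: it assumes a finite set of expressions and meanings, represents the measure as a nested list `mu[k][l]` for expression $s_k$ and meaning $m_l$, and treats an all-zero column or row as contributing zero probability (a convention chosen here to model poverty). The function names and the three example grammars are hypothetical.

```python
def encoding_matrix(mu):
    """P[k][l] = mu(s_k, m_l) / sum_i mu(s_i, m_l): probability of using s_k for m_l.

    Assumption: a meaning with no expression (all-zero column) gets a zero column.
    """
    n_s, n_m = len(mu), len(mu[0])
    col = [sum(mu[k][l] for k in range(n_s)) for l in range(n_m)]
    return [[mu[k][l] / col[l] if col[l] > 0 else 0.0 for l in range(n_m)]
            for k in range(n_s)]


def decoding_matrix(mu):
    """Q[k][l] = mu(s_k, m_l) / sum_j mu(s_k, m_j): probability of reading s_k as m_l.

    Assumption: an unused expression (all-zero row) gets a zero row.
    """
    n_m = len(mu[0])
    row = [sum(r) for r in mu]
    return [[mu[k][l] / row[k] if row[k] > 0 else 0.0 for l in range(n_m)]
            for k in range(len(mu))]


def payoff(mu_i, mu_j, sigma):
    """a_ij = tr[P(i) Lambda Q(j)^T] = sum over k, l of P_kl * sigma(m_l) * Q_kl."""
    P = encoding_matrix(mu_i)   # speaker uses grammar i
    Q = decoding_matrix(mu_j)   # hearer uses grammar j
    return sum(P[k][l] * sigma[l] * Q[k][l]
               for k in range(len(P)) for l in range(len(sigma)))


# Toy grammars over two expressions and two meanings (illustrative values only).
sigma = [0.5, 0.5]                       # both meanings equally likely to arise
perfect = [[0.5, 0.0], [0.0, 0.5]]       # one expression per meaning
ambiguous = [[0.5, 0.5], [0.0, 0.0]]     # one expression used for both meanings
poor = [[1.0, 0.0], [0.0, 0.0]]          # second meaning cannot be expressed

for name, g in [("perfect", perfect), ("ambiguous", ambiguous), ("poor", poor)]:
    print(name, payoff(g, g, sigma))
```

In this toy setting the unambiguous, fully expressive grammar reaches the maximum self-payoff of 1, while the ambiguous and the impoverished grammars each fall to one half, matching the two failure modes named in note 33.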
Dates
Type When
Created July 27, 2002, 5:53 a.m.
Deposited Jan. 13, 2024, 4:46 a.m.
Indexed July 2, 2025, 2:45 p.m.
Issued Jan. 5, 2001
Published Jan. 5, 2001
Published Print Jan. 5, 2001
Funders 0

None

@article{Nowak_2001, title={Evolution of Universal Grammar}, volume={291}, ISSN={1095-9203}, url={http://dx.doi.org/10.1126/science.291.5501.114}, DOI={10.1126/science.291.5501.114}, number={5501}, journal={Science}, publisher={American Association for the Advancement of Science (AAAS)}, author={Nowak, Martin A. and Komarova, Natalia L. and Niyogi, Partha}, year={2001}, month=jan, pages={114–118} }