## Chomsky Hierarchy
The academic study of languages has focused heavily on syntax, for both natural languages and formal languages. A natural language, such as English, is what we normally think of when we speak of languages. A **formal language** is a mathematical entity that follows strict rules. A natural language has syntax too, but its rules are frequently bent in everyday use. A formal language is usually defined by its syntax in a rigid mathematical manner.
Noam Chomsky is a linguist who taught at MIT. While there he developed a classification of languages that has become known as the **Chomsky Hierarchy** (also called the Chomsky-Schützenberger Hierarchy). It rests on the idea that there are four classes of formal grammars, capable of generating increasingly complex languages from a syntactic point of view.
An **automaton** is a machine that performs a function according to a predetermined set of coded instructions, especially one capable of a range of programmed responses to different circumstances.
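
To make the definition concrete, here is a minimal sketch of a very small automaton written as a Python transition table. The example language (binary strings containing an even number of `1`s) and the names `TRANSITIONS`, `START_STATE`, `ACCEPTING_STATES`, and `accepts` are illustrative assumptions, not part of the course material.

```python
# Hypothetical example: a finite automaton that accepts binary strings
# containing an even number of '1' characters.
# The "predetermined set of coded instructions" is the transition table.
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
START_STATE = "even"
ACCEPTING_STATES = {"even"}


def accepts(text: str) -> bool:
    """Run the automaton over text; True if it ends in an accepting state."""
    state = START_STATE
    for symbol in text:
        key = (state, symbol)
        if key not in TRANSITIONS:
            return False  # symbol outside the alphabet {0, 1}
        state = TRANSITIONS[key]
    return state in ACCEPTING_STATES
```

With this table, `accepts("1010")` is true because the string contains two `1`s, while `accepts("1")` is false. The machine remembers nothing except its current state, which is what limits it to the simplest class in the hierarchy.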
This idea of a hierarchy of grammars was fundamental in the development of the theory of formal languages. Of interest to computer science is the fact that each of these grammar types has a corresponding kind of automaton that recognizes exactly the languages that grammar type generates.
| grammar type | automaton |
|--------------|-----------|
| recursively enumerable | Turing machine / lambda calculus |
| context sensitive | linear bounded Turing machine |
| context free | pushdown automaton |
| regular | finite state machine |
The **recursively enumerable languages** correspond, in some sense, to the computable functions: they are the languages that a Turing machine can recognize. Any program you write in a modern programming language falls into this category. The claim that the effectively computable functions are exactly those that can be implemented with a Turing machine is known as the **Church-Turing thesis**.
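
As a rough illustration of what "implemented with a Turing machine" means, the sketch below simulates a one-tape machine in Python. The simulator `run_turing_machine`, the blank symbol `_`, and the sample program `INCREMENT` (which adds one to a binary number) are all assumptions made up for this example. The step limit is only a safety net for the sketch, since a real Turing machine may run forever.

```python
from collections import defaultdict

BLANK = "_"  # assumed blank-tape symbol for this sketch

# Hypothetical program: add 1 to a binary number written on the tape.
# (state, symbol read) -> (next state, symbol to write, head movement)
INCREMENT = {
    ("seek_end", "0"): ("seek_end", "0", +1),
    ("seek_end", "1"): ("seek_end", "1", +1),
    ("seek_end", BLANK): ("carry", BLANK, -1),
    ("carry", "1"): ("carry", "0", -1),
    ("carry", "0"): ("halt", "1", 0),
    ("carry", BLANK): ("halt", "1", 0),
}


def run_turing_machine(program, tape_input, start="seek_end", halt="halt",
                       max_steps=10_000):
    """Simulate the machine and return the tape contents without blanks."""
    # The tape is unbounded in both directions; unseen cells read as BLANK.
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))
    state, head, steps = start, 0, 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached; the machine may not halt")
        # A missing (state, symbol) pair raises KeyError: the machine is stuck.
        state, write, move = program[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(BLANK)
```

For example, `run_turing_machine(INCREMENT, "1011")` returns `"1100"`. The machine first walks right to the end of the input, then walks left turning `1`s into `0`s until it can write a final `1`.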
The remaining languages are characterized by the structure of their syntax.
The **context sensitive languages** are not going to be described in this course.
The **context free languages** have a syntactic structure that forms trees, which makes them very useful in computing. Most programming languages specify a context free grammar for the language and use a program called a parser to produce the corresponding tree structure from source text.
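
The following sketch shows one way a parser can turn text into a tree. It assumes a small, made-up arithmetic grammar (`expr -> term (('+'|'-') term)*`, `term -> factor (('*'|'/') factor)*`, `factor -> NUMBER | '(' expr ')'`) and uses recursive descent, which is only one of several parsing techniques. The function names and the nested-tuple tree format are illustrative choices.

```python
import re

# One token: optional leading whitespace, then a number or an operator/paren.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split text into number, operator, and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_expr(tokens, i):
    # expr -> term (('+' | '-') term)*
    node, i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] in ("+", "-"):
        op = tokens[i]
        right, i = parse_term(tokens, i + 1)
        node = (op, node, right)
    return node, i


def parse_term(tokens, i):
    # term -> factor (('*' | '/') factor)*
    node, i = parse_factor(tokens, i)
    while i < len(tokens) and tokens[i] in ("*", "/"):
        op = tokens[i]
        right, i = parse_factor(tokens, i + 1)
        node = (op, node, right)
    return node, i


def parse_factor(tokens, i):
    # factor -> NUMBER | '(' expr ')'
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok.isdigit():
        return int(tok), i + 1
    if tok == "(":
        node, i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return node, i + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    """Return the parse tree as nested (operator, left, right) tuples."""
    tokens = tokenize(text)
    tree, i = parse_expr(tokens, 0)
    if i != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[i]!r}")
    return tree
```

Here `parse("1 + 2 * 3")` produces `("+", 1, ("*", 2, 3))`: the multiplication sits deeper in the tree, so the grammar itself encodes operator precedence. The nesting of parentheses is handled by the recursion from `parse_factor` back into `parse_expr`.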
The **regular languages** are the languages that can be described by regular expressions, a notation most programmers already know.
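
As a small illustration, the sketch below uses Python's standard `re` module to describe two regular languages that often appear as programming-language tokens. The patterns `IDENTIFIER` and `INTEGER` and the helper `classify` are assumptions chosen for this example, not a definition taken from any particular language.

```python
import re

# Hypothetical token definitions; each pattern describes a regular language.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # letter or _, then letters/digits/_
INTEGER = re.compile(r"0|[1-9][0-9]*")              # decimal integer, no leading zeros


def classify(token: str) -> str:
    """Name the language the whole token belongs to, or 'unknown'."""
    if IDENTIFIER.fullmatch(token):
        return "identifier"
    if INTEGER.fullmatch(token):
        return "integer"
    return "unknown"
```

For instance, `classify("x1")` gives `"identifier"` and `classify("42")` gives `"integer"`, while `classify("007")` and `classify("1x")` both give `"unknown"`. Using `fullmatch` matters here: the question is whether the entire string belongs to the language, not whether some part of it does.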