Formal Grammar
Main Source:
- Book chapter 2.1
- Neso Academy playlist 63-64
Formal Grammar is the formalization of grammar, which is the rules to describe languages.
Chomsky Hierarchy
There are many types of formal grammar, the types of grammar based on Noam Chomsky are regular grammar, context-free grammar, context-sensitive grammar, and unrestricted grammar.
Source: https://youtu.be/WgEsPTAL55Q?si=4LOZ1qtkL1hunWmf&t=70
Every grammar generates its own language, which is recognized by a specific automaton.
Grammar
As explained in grammar, in general, a grammar is defined as quadruple: .
With grammar , we have three categories of symbol, (also known as the start symbol), , and . The category itself doesn't mean anything in language, the actual character that appears in the language are and . The production rules describe that the start symbol can be replaced with . The and itself can be replaced by and , respectively.
For example, starting from
(by )
(by )
(by )
The production rule of a general grammar is in the form of , where is a non-terminal symbol, and is a string of terminal or non-terminal symbol (formally ). The production rule allows the transformation from non-terminal symbols to another non-terminal or terminal symbols, making it possible to expand the language recursively.
The book also notate the production rule in the form of . In other word, one or more terminal or non-terminal symbol can be transformed to zero or more (so includes empty string) terminal or non-terminal symbol.
For example, when the production rule generates another start symbol, such as , then, starting from , we can produce , then further apply the same production rule to obtain , and continue this process until we replace all non-terminal symbol to terminal symbol.
Example
The process of generating language from grammar is also known as derivation. The set of all language generated by some grammar is denoted as . The language generated by the previous grammar example, , consists of only a single string: .
Another example from the video:
Source: https://youtu.be/ejXgLRSIxsA?si=K8kIBdPOzhFgqsQW&t=573
This grammar has a recursive production rule. We can replace with , which mean we can potentially generate infinite amount of "a"'s. The language generated by this grammar creates a pattern , therefore the is a set of string with that patterns.
Regular Grammar
Regular grammar is the grammar that describes regular language, which is the languages recognized by finite automaton like DFAs and NFAs.
Left & Right Grammar
The formal description of regular grammar is same as the general grammar, except the production rule follows either left or right types of grammar. With , and , right and left grammar is in the form:
- Right linear grammar: , example .
- Left linear grammar: , example .
In right grammar, the production rule places non-terminal symbols on the right, while in a left grammar, it is the opposite.
Additionally,
The "|" in the notation is a shorthand to describe two and .
Context-Free Grammar
See the next topic.