Regular Languages (Part 1)
Main Source:
- Book chapter 1.4
- Neso Academy playlist 8-9
- Neso Academy playlist 45-48
Regular Language
Regular language is a language that can be recognized by some finite state machine. It is constructed by three set operations, union, concatenation, and Kleene star.
With alphabet , the class of regular languages over is defined as:
- is a regular language
- For each is a regular language
- For any natural number if are regular languages, then so is .
- For any natural number , if are regular languages, then so is .
- If is a regular language, then so is .
- Nothing else is a regular language unless its construction follows from rules (1) to (5).
So, a regular language can be constructed using the three mentioned operations, including any symbol, and the empty set is also considered a regular language.
Closure Properties
If an operation is said to have closure property, it means that performing the operation on members of the set should yield a result that is also a member of the set. For example, addition in the set of integers own the closure property, because adding two integers will always result in another integer. This is not the case for division, as it might result in fraction.
In regular language, operation that owns closure property are union, concatenation, intersection, set difference, complementation, and Kleene star.
Regular Expression
Regular expression is the formal notation of describing regular languages. It consists of a sequence of characters that represents a pattern or rule used to match and manipulate strings within a given alphabet.
Character and operators:
- Terminal symbols: Terminal symbol including symbols that are contained within the , including (empty string) and (empty set) are regular expression.
- Union (): Union operation is like a logical OR, if we have two regular expression, , union between them would be represented as . This mean we are making a pattern that match either or .
- Concatenation (): With , concatenation is , and it indicates that the two expressions must occur consecutively in the input string.
- Kleene star (): It acts like an iteration that specifies that the preceding expression can occur zero or more times. It matches any number of occurrences of the preceding expression, including no occurrence at all. For example, the regular expression matches strings like or "", "a", "aa", "aaa", and so on.
All the operations behave recursively, meaning we can apply it not only to individual characters or subexpressions but also to larger expressions or groups of expressions. For example, with , we can make it as .
Common Expression
Assume that
- Zero or more: We can use Kleene star for this, means one or more a's.
- One or more: Use one 'a' before the zero or more expression: ( is also right).
- Zero or one: describe optional 'a'.
- Any string: Union all alphabet with a Kleene star: