DV1565 / DV1511:

Compiler and Interpreter Technology

Wednesday, January 18th, 2017

Introducing formal languages and DFA.

Table of Contents
What are formal languages?
How are they defined?
What are machines?
How are they defined?
Relationship between languages and machines.
Hierarchy of language power.
Regular language class
DFA machine class
Translation of regex into DFA
Simulation of DFA

1. General view of compilation

[Figure: a program as a pipeline. Text in a string -> A conversion -> Model of meaning -> Some kind of work -> Some output format. The model of meaning is drawn as a graph of intermediate nodes that feeds into the work stage.]

2. Why we need formal language

Time flies like an arrow. Fruit flies like a banana.
You will be very fortunate to get this person to work for you.
I am pleased to say that this candidate is a former colleague of mine.
I once shot an elephant in my pajamas.

3. The syntax of formal languages.

4. An example of a formal language

6. Formal expression of rules

The barber shaves all men who do not shave themselves
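
As a sketch of what a formal expression of this rule could look like, assume a predicate S(x, y) meaning "x shaves y" and a constant b for the barber, with x ranging over the men in question and the barber among them. These names are introduced here for illustration and are not taken from the slides.

```latex
\forall x \,\bigl( S(b, x) \leftrightarrow \neg S(x, x) \bigr)
\quad\Longrightarrow\quad
S(b, b) \leftrightarrow \neg S(b, b)
\qquad (\text{taking } x = b)
```

Written this way, the instance x = b is visibly a contradiction, which is much harder to spot in the English sentence.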

7. Machines

8. Automata

9. Language Power : Hierarchy

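[Figure: the hierarchy of language power, pairing each machine class with the language class it recognizes.]

Machine class              Language class                     Compiler phase
Turing machine             Recursively enumerable language
Linear bounded automaton   Context-sensitive language
Push-down automaton        Context-free language              Parsing
Finite automaton           Regular language                   Lexical Analysis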

10. Language Power : Scanners

[Figure: the hierarchy diagram from slide 9, repeated.]

11. Language Power : Parsers

[Figure: the hierarchy diagram from slide 9, repeated.]
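
To make the step from finite automaton to push-down automaton concrete, here is a minimal sketch of a recognizer for balanced parentheses, a standard example of a language that is context-free but not regular. The function name `balanced` is hypothetical, and the counter stands in for a stack that only ever holds one kind of symbol.

```c
#include <stdbool.h>

/* Returns true if every '(' in input is closed by a later ')'.
   The unbounded counter plays the role of a push-down stack; a finite
   automaton has only a fixed number of states and cannot track
   arbitrary nesting depth. Other characters are ignored. */
bool balanced(const char *input) {
    unsigned long depth = 0;
    for (; *input != '\0'; input++) {
        if (*input == '(') {
            depth++;                      /* push */
        } else if (*input == ')') {
            if (depth == 0) return false; /* pop from an empty stack */
            depth--;                      /* pop */
        }
    }
    return depth == 0;                    /* accept only with an empty stack */
}
```

For example, `balanced("(()())")` is true while `balanced("(()")` and `balanced(")(")` are false.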

12. Languages and Machines

13. Languages and Machines

                 Lecture 2 / Chapter 3    Lecture 3 / Chapter 3    Lecture 4 / Chapter 4    Lecture 5 / Chapter 4
Machine class    DFA                      NFA                      PDA                      PDA
Language class   Regex                    Family of Regex          LL                       LR
Simulation       Direct construction (C)  Direct construction (C)  Recursive Descent        Shift/Reduce Alg.
Translation      Direct construction      Alg from 3.7.1           Manual implementation    Alg. (bison)
Machine Output   Yes/No                   Token Stream             Parse Tree               Parse Tree

14. Turing Machines

Break (15 mins)

Intermission

15. Language class: Review of Regular Expressions

pushl %ebp;movl %esp,%ebp;.*;leave    (matching instruction-strings)
body/(span/style|div/id)*             (matching node-string paths in html-trees)
Regular Languages are more general than we have seen previously.
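
As a rough illustration, the second pattern can be tried out with the POSIX `regcomp`/`regexec` functions from `<regex.h>`. This sketch treats the path as a plain character string rather than a sequence of tree nodes, adds `^` and `$` anchors that are not on the slide, and uses a hypothetical function name, `matches_html_path`.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the whole of path matches body/(span/style|div/id)*.
   The ^ and $ anchors are an addition so that partial matches are rejected. */
bool matches_html_path(const char *path) {
    regex_t re;
    if (regcomp(&re, "^body/(span/style|div/id)*$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return false;                 /* pattern failed to compile */
    bool ok = regexec(&re, path, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

With this sketch, `matches_html_path("body/span/style")` is true and `matches_html_path("body/span")` is false.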

16. Language class: Definition of Regular Language

Diverging from §3.3 by starting with the DFA definition.

17. Language class: Examples of Regular Languages

More on p. 122 (§3.3.3).

18. Machine class: Deterministic Finite Automata

[Figure: DFA with states 0, 1, 2 and edges 0 -> 0 on [0-9], 0 -> 1 on '.', 1 -> 2 on [0-9], 2 -> 2 on [0-9], drawn together with an input tape (... 3 . 1 1 2 ...).]

19. Machine class: DFA Execution Rules

20. Machine class: Execution Example

[Figure, four frames: the DFA with edges 0 -> 0 on [0-9], 0 -> 1 on '.', 1 -> 2 on [0-9], 2 -> 2 on [0-9], stepping along the input tape one symbol per frame.]
Frame 1 tape: ... 3 . 1 ...
Frame 2 tape: ... . 1 4 ...
Frame 3 tape: ... 1 4 ...
Frame 4 tape: ... 4 ...
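
The four frames can be reproduced with a small trace, sketched here under two assumptions: the machine starts in state 0, and the input is the string "3.14". The function names `dfa_step` and `dfa_trace` are hypothetical.

```c
#include <stdio.h>
#include <ctype.h>

/* One step of the DFA: returns the next state,
   or -1 if the current state has no edge for this symbol. */
int dfa_step(int state, char symbol) {
    int digit = isdigit((unsigned char)symbol);
    switch (state) {
    case 0:  return digit ? 0 : (symbol == '.' ? 1 : -1);
    case 1:  return digit ? 2 : -1;
    case 2:  return digit ? 2 : -1;
    default: return -1;
    }
}

/* Runs the DFA over input, printing the state and the remaining tape
   before each step. Returns the final state, or -1 if the machine
   got stuck before the tape was exhausted. */
int dfa_trace(const char *input) {
    int state = 0;  /* assumed start state */
    for (const char *p = input; *p != '\0'; p++) {
        printf("state %d, remaining tape \"%s\"\n", state, p);
        state = dfa_step(state, *p);
        if (state < 0) return -1;
    }
    printf("state %d, tape exhausted\n", state);
    return state;
}
```

Calling `dfa_trace("3.14")` visits states 0, 0, 1, 2 while the remaining tape shrinks from "3.14" to "4", and finishes in state 2.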

21. Machine class: caveats and variations

22. Translation: Converting REs to DFAs

Regex        DFA                                                 Principle
axb          0 -a-> 1, 1 -x-> 2, 2 -b-> 3                        Sequences are chains
a(x|y)b      0 -a-> 1, 1 -x-> 2, 1 -y-> 3, 2 -b-> 4, 3 -b-> 4    Choices are splits and joins
a*b          0 -a-> 0, 0 -b-> 1                                  Repetition becomes loops
a+b = aa*b   0 -a-> 1, 1 -a-> 1, 1 -b-> 2                        Minimum iterations are prefixes
(ab)*c       0 -a-> 1, 0 -c-> 2, 1 -b-> 0                        Groups expand to subgraphs
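
As a sketch of how one row of the table turns into something executable, the edges for a(x|y)b can be written down as data and walked directly. The struct layout, the function name `accepts_axyb`, and the choice of state 4 as the only accepting state are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

struct edge { int from; char symbol; int to; };

/* Edges of the DFA for a(x|y)b: a chain through 'a',
   a split at state 1, and a join at state 4. */
static const struct edge edges[] = {
    {0, 'a', 1},
    {1, 'x', 2}, {1, 'y', 3},
    {2, 'b', 4}, {3, 'b', 4},
};

/* Returns true if input is exactly "axb" or "ayb". */
bool accepts_axyb(const char *input) {
    int state = 0;
    for (; *input != '\0'; input++) {
        int next = -1;
        for (size_t i = 0; i < sizeof edges / sizeof edges[0]; i++)
            if (edges[i].from == state && edges[i].symbol == *input)
                next = edges[i].to;
        if (next < 0) return false;  /* no edge for this symbol */
        state = next;
    }
    return state == 4;  /* assumption: state 4 is the accepting state */
}
```

Here `accepts_axyb("axb")` and `accepts_axyb("ayb")` are true, while `accepts_axyb("ab")` is false because state 1 has no edge labelled b.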

23. Translation: Correspondence

State   Symbol   Jump
0       [0-9]    0
0       .        1
1       [0-9]    2
2       [0-9]    2

[Figure: the corresponding DFA, with edges 0 -> 0 on [0-9], 0 -> 1 on '.', 1 -> 2 on [0-9], 2 -> 2 on [0-9].]

24. Simulation: Direct style

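#include <stdbool.h>
#include <stdint.h>

const char *tape = "3.14";
uint32_t position = 0;
#define head (tape[position])  /* symbol under the tape head */

bool state2(void);  /* the remaining states follow the same pattern */

/* Advance the head; at the end of the tape, report the accept flag instead. */
bool nextHead(bool accepting) {
    if (tape[position + 1] == 0) return accepting;
    position++;
    return true;
}

bool state1(void) {
    switch (head) {
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        return nextHead(false) && state1();
    case '.':
        return nextHead(false) && state2();
    default:
        return false;
    }
}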
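
For comparison, here is a complete sketch of the direct style covering all three states of the decimal-number DFA. It is not the slide's code: the states are numbered 0 to 2 as in the diagram, the tape is passed as a parameter instead of a global, and end of input is tested on entry to each state rather than in a `nextHead` helper. The name `direct_match` is hypothetical.

```c
#include <stdbool.h>
#include <ctype.h>

static bool state0(const char *tape);
static bool state1(const char *tape);
static bool state2(const char *tape);

/* State 0 (not accepting): loop on digits, move to state 1 on '.'. */
static bool state0(const char *tape) {
    if (*tape == '\0') return false;
    if (isdigit((unsigned char)*tape)) return state0(tape + 1);
    if (*tape == '.') return state1(tape + 1);
    return false;
}

/* State 1 (not accepting): a digit must follow the '.'. */
static bool state1(const char *tape) {
    if (*tape == '\0') return false;
    if (isdigit((unsigned char)*tape)) return state2(tape + 1);
    return false;
}

/* State 2 (accepting): loop on further digits. */
static bool state2(const char *tape) {
    if (*tape == '\0') return true;
    if (isdigit((unsigned char)*tape)) return state2(tape + 1);
    return false;
}

/* Entry point: the machine starts in state 0. */
bool direct_match(const char *input) {
    return state0(input);
}
```

Each state is one function and each edge is one call, so the control flow of the program mirrors the graph. Testing for end of input on entry means an accepting state returns as soon as the tape is exhausted and never calls itself again.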

25. Simulation: Interpreter style

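#include <stdbool.h>
#include <string.h>

struct edge  { int target; const char *labels; };
struct state { bool accept; int num; struct edge edge[2]; };

struct state machine[] = {
    { false, 2, { { 0, "0123456789" }, { 1, "." } } },
    { false, 1, { { 2, "0123456789" } } },
    { true,  1, { { 2, "0123456789" } } }
};

const char *tape = "3.14";
struct state *cur = machine;

/* Follow the edge labelled with the current symbol, if there is one. */
bool check(void) {
    for (int i = 0; i < cur->num; i++)
        if (strchr(cur->edge[i].labels, *tape)) {
            cur = machine + cur->edge[i].target;
            tape++;
            return true;
        }
    return false;
}

/* Consume the whole tape, then report whether the final state accepts. */
bool match(void) {
    while (*tape != 0)
        if (!check()) return false;
    return cur->accept;
}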

26. Simulation:

27. Summary