- Write A Lex Program For Basic Desktop Calculator Using Yacc
- Computer Program For Basic Education 2018/2019
- Training Program For Basic Training
Before 1975 writing a compiler was a very time-consuming process. Then Lesk [1975] and Johnson [1975] published papers on lex and yacc.These utilities greatly simplify compiler writing. Implementation details for lex and yacc may be found in Aho [2006]. Flex and bison, clones for lex and yacc, can be obtained for free from GNU and Cygwin. Cygwin is a 32-bit Windows ports of GNU software. A real English-language program would probably use this flex code to recognize parts of speech, combined with a yacc program to recognize the grammatical structure of the sentence. Ch301.y and ch301.l, the yacc and flex portions of a simple calculator program (copied from the O'Reilly book). A real English-language program would probably use this flex code to recognize parts of speech, combined with a yacc program to recognize the grammatical structure of the sentence. Ch301.y and ch301.l, the yacc and flex portions of a simple calculator program (copied from the O'Reilly book). Example Program for the lex and yacc Programs. This section describes example programs for the lex and yacc commands. Together, these example programs create a simple, desk-calculator program that performs addition, subtraction, multiplication, and division operations. Using the lex Program with the yacc Program. How do I design a calculator using functions in c++? Update Cancel. By Stroustrup has an example of a calculator program. If I was doing it, I would download an open source version of lex and yacc, and use that instead of doing the painful lexing and parsing work on my own. With lex you specify what will be your tokens. Simple calculator compiler using Lex and YACC. The table is translated to a program which reads an input stream, copying it to an output stream and partitioning the input into strings which match the given expressions. With the help of YACC and Lex tool one can write their own compiler. Published in: 2011 3rd International Conference on.
Build code with lex and yacc, Part 1
Meet lex, yacc, flex, and bison
Content series:
This content is part # of # in the series: Build code with lex and yacc, Part 1
This content is part of the series:Build code with lex and yacc, Part 1
Stay tuned for additional content in this series.
Most people never need to know what lex and yacc do. You occasionallyneed to have them installed to compile something you downloaded, but forthe most part, it just works. Maybe an occasional README file refers to'shift/reduce' conflicts. However, these tools remain a valuable part ofthe Unix toolbox, and a bit of familiarity with them will go a long way.
In fact, while lex and yacc are almost always mentioned in the samebreath, they can be used separately. A number of very entertaining iftrivial programs have been written entirely in lex (see Related topics for links to those). Programs using yacc,but not lex, are rarer.
Throughout these articles, the names 'lex' and 'yacc' will be used toinclude also flex and bison, the GNU versions of these utilities. Thecode provided should work in all major versions, such as MKS yacc. It'sall one big happily family!
This is a two-part series. The first article will introduce lex andyacc in more general terms, exploring what they do and how they do it.The second shows an actual application built using them.
What are lex and yacc, and why do people refer to them together?
Lex and yacc are a matched pair of tools. Lex breaks down files intosets of 'tokens,' roughly analogous to words. Yacc takes sets of tokensand assembles them into higher-level constructs, analogous to sentences.Yacc is designed to work with the output of Lex, although you can writeyour own code to fill that gap. Likewise, lex's output is mostly designedto be fed into some kind of parser.
They're for reading files that have reasonably structured formats.For instance, code in many programming languages can be read using lex andyacc. Many data files have predictable enough formats to be read usingthem, as well. Lex and yacc can be used to parse fairly simple andregular grammars. Natural languages are beyond their scope, but mostcomputer programming languages are within their bounds.
Lex and yacc are tools for building programs. Their output is itselfcode, which needs to be fed into a compiler; typically, additional usercode is added to use the code generated by lex and/or yacc. Some simpleprograms can get by on almost no additional code; others use a parser as atiny portion of a much larger and more complicated program.
A more detailed look at each of these programs is in order.
Lex: A lexical analyzer generator
A lexical analyzer isn't a handheld gizmo you get on a sci-fi show.It's a program that breaks input into recognized pieces. For instance, asimple lexical analyzer might count the words in its input. Lex takes aspecification file and builds a corresponding lexical analyzer, coded inC.
Perhaps the best way to approach this is to look at an example.Here's a simple lex program, taken from the man page for flex.
Listing 1. Counting words using lex
This program has three sections, separated by %%
markers. The first and last sections are plain oldC code. The middle section is the interesting one. It consists of aseries of rules that lex translates into the lexical analyzer. Eachrule, in turn, consists of a regular expression and some code to be runwhen that regular expression is matched. Any text that isn't matched issimply copied to standard output. So, if your program is trying to parsea language, it's important to make sure that all possible inputs arecaught by the lexer; otherwise, you get whatever's left over displayed tothe user as though it were a message.
In fact, the code in Listing 1. is a complete program; if you run itthrough lex, compile it, and run the result, it'll do exactly what itlooks like it does.
In this case, it's very easy to see what happens. A newline alwaysmatches the first rule. Any other character matches the second. Lextries each rule in order, matching the longest stream of input it can.If something doesn't match any rules at all, lex will just copy it tostandard output; this behavior is often undesirable. A simple solution isto add a last rule that matches anything, and either does nothing (ifyou're lazy) or emits some kind of diagnostic. Note that lex givespreference to longer matches, even if they're later. So, given theserules:
u { printf('You!n'); }
uu { printf('W!n'); }
and 'uuu' for input, Lex will match the second rule first, consumingthe first two letters, then the first rule. However, if something canmatch either of two rules, order in the lex specification determines whichone is used. Some versions of lex may warn you when a rule cannot bematched.
What makes lex useful and interesting is that it can handle much morecomplicated rules. For instance, a rule to recognize a C identifier mightlook like
[a-zA-Z_][0-9a-zA-Z_]* { return IDENTIFIER; }
The syntax used is plain old regular expression syntax. There are acouple of extensions. One is that you can give names to common patterns.In the first section of the lex program, before the first %%
, you can define names for a few of these:
DIGIT [0-9]
ALPHA [a-zA-Z]
ALNUM [0-9a-zA-Z]
IDENT [0-9a-zA-Z_]
Then, you can refer back to them by putting their names in braces inthe rules section:
({ALPHA}|_){IDENT}* { return IDENTIFIER; }
Each rule has corresponding code that is executed when the regularexpression is matched. The code can do any necessary processing andoptionally return a value. The value will be used if a parser is usinglex's output. For instance, in the case of the simple line-countingprogram, there's no real need for a parser. If you're using a parser tointerpret code in a new language, you will need to return something to theparser to tell it what tokens you've found. You can just use an enum
or a series of #define
directives in a shared include file, or you can have yacc generate a listof predefined values for you.
Lex, by default, reads from standard input. You can point it atanother file quite easily; it's a bit harder to, for instance, read from abuffer. There is no completely standardized way to do this; the mostportable solution is to open a temporary file, write data to it, and handit to the lexer. Here's a sample bit of code to do this:
Listing 2. Handing a memory buffer to the lexer
This code carefully unlinks the temporary file, leaving it open butalready removed, to clean up automatically. A more careful programmer, orone not writing for an audience with limited space for sample code, mightwish to consider the implications of the user's choice of TMPDIR
.
yacc: yet another compiler compiler
So, you've broken your input into a stream of tokens. Now you needsome way to recognize higher-level patterns. This is where yacc comes in:yacc lets you describe what you want to do with tokens. A yacc grammartends to look sort of like this:
Listing 3. A simple yacc grammar
This means that an expression can take any of several forms; forinstance, a variable, a plus sign, and a number could be an expression.The pipe character (|
) indicatesalternatives. The symbols produced by the lexer are calledterminals or tokens. The things assembled from them arecalled non-terminals. So, in this example, NUMBER
is a terminal; the lexer is producing thisresult. By contrast, value
is a non-terminal,which is created only by assembling it from terminals.
Yacc files, like lex files, come in sections separated by %%
markers. As with a lex file, a yacc file comes inthree sections, the last of which is optional, and contains just plain Ccode to be incorporated in the generated file.
Yacc can recognize patterns of tokens; for instance, as in the exampleabove, it can recognize that an expression can consist of a value, eithera plus or minus sign, and another value. It can also take actions; blocksof code enclosed in {}
pairs will be executedwhen the parser reaches that point in an expression. For instance, onemight write:
expression:
value '+' value { printf('Matched a '+' expression.n'); }
The first section of a yacc file defines the objects that the parserwill manipulate and generate. In some cases, it could be empty, but mostoften, it will contain at least a few %token
directives. These directives are used to define the tokens the lexer canreturn. When yacc is run with the -d
option,it generates a header file defining constants. Thus, our earlier lexexample using:
({ALPHA}|_){IDENT}* { return IDENTIFIER; }
might have a corresponding yacc grammar containing the line:
%token IDENTIFIER
Yacc would create a header file (called y.tab.h
by default) containing a line something like:
#define IDENTIFIER 257
These numbers will be outside the range of possible valid characters;thus, the lexer can return individual characters as themselves, or specialtokens using these defined values. This can create a problem when portingcode from one system to another: generally, you are best off re-runninglex and yacc on a new platform, rather than porting the generated code.
By default, the yacc-generated parser will start by trying to parse aninstance of the first rule found in the rules section of the file. Youcan change this by specifying a different rule with %start
, but most often it's more reasonable to putthe top-level rule at the top of the section.
Tokens, types, and values, oh my!
The next question is how to do anything with the components of anexpression. The general way this is done is to define a data type tocontain objects that yacc will be manipulating. This data type is a Cunion
object, and is declared in the firstsection of a yacc file using a %union
declaration. When tokens are defined, they can have a type specified.For a toy programming language, for instance, you might do something likethis:
Listing 4. A minimalist %union declaration
This indicates that, whenever the parser gets a NUMBER
token returned by the lexer, it can expectthat the member named value
of the globalvariable yylval
has been filled in with ameaningful value. Of course, your lexer has to handle this in some way:
Listing 5. Making a lexer use yylval
Yacc allows you to refer to the components of an expression bysymbolic names. When a non-terminal is being parsed, the components thatwent into it are named $1
, $2
, and so on; the value it will pass back up to ahigher-level parser is called $$
. For instance:
Listing 6. Using variables in yacc
Note that the literal plus sign is $2
andhas no meaningful value, but it still takes up a place. There's no needto specify a 'return' or anything else: just assign to the magic name$$
. The %type
declaration specifies that an expression
non-terminal also uses the value
member of theunion.
In some cases, it may be useful to have multiple types of objects inthe %union
declaration; at this point, you haveto make sure that the types you declare in %type
and %token
declarations are the ones you really use. For instance, if you have a%union
declaration that includes both integerand pointer values, and you declare a token to use the pointer value, butyour lexer fills in the integer value... Bad Things can happen.
Of course, this doesn't solve one last problem: the startingnon-terminal expression has nowhere to return to, so you will need to dosomething with the values it produces. One way to do this is to make surethat you do all the work you need to do as you go; another is to build asingle giant object (say, a linked list of items) and assign a pointer toit into a global variable at the end of the starting rule. So, forinstance, if you were to build the above expression parser into ageneral-purpose calculator, just writing the parser would leave you with aprogram that very carefully parsed expressions, then did nothing at allwith them. This is academically very interesting and it has a certainaesthetic appeal, but it's not particularly useful.
This basic introduction should leave you qualified to fool around alittle with lex and yacc on your own time; perhaps building a simplecalculator that does do something with the expressions it parses.One thing you'll find if you play around a bit is that lex and yacc can beconfusing to troubleshoot. In the next installment, we'll look attroubleshooting techniques; as well, we will build a larger and morepowerful parser for a real-world task.
Downloadable resources
Related topics
- Part 2 of this series covers a couple of concrete applications of lex and yacc, showing how to build a simple calculator and a program to parse e-mail messages.
- The Lex & Yacc Page has a number ofinteresting historical references, as well as very good lex and yaccdocumentation.
- The GNU versions of lex and yacc are flex and bison and, as withall things GNU, have excellent documentation including complete usermanuals in a variety of formats.
Comments
Sign in or register to add and subscribe to comments.
Hello friends, recently we were introduced to lex and yacc parsers in our syllabus, but the introduction to lex and yacc [flex and bison] was made in linux environment . Personally, I didn’t felt necessary to Install the Complete Linux OS on my Laptop just because to have gcc/lex/yacc libraries which are a mere 30 MB size , and to allocate 10GB Disk Space + RAM[in Virtual Machine] to Linux is something which i am not very Fond of 🙁 . After searching on Sourceforge, I came across the MinGW [ Win 32 Port of cc/gcc/g++] and the GNU-Win32 ports of flex[lex] and bison[yacc] respectively. So in order to make it easy to use and install i Packaged all of them into a single one click installer – Flex for Windows 7/8/10
Contents / Salient Features of Flex Windows
- In-built GCC/G++/cc Libraries of Linux : The Flex Windows Package contains inbuilt Gcc And g++ libraries [c and c++ compilers] which are ported to windows officially by MinGW and are actively developed by the Linux Open Source Community
- Lex and Yacc Package Binaries : The package contains the latest updated versions of Lex and yacc binaries [flex and bison] which are developed by their developers . The original binaries are included as-it-is in the package so as to ensure smooth and error free compilation and build of your Programs.
- Pre-Configured EditPlus IDE : The package also contains EditPlus IDE which contains pre-defined Blank templates for the Lex/Yacc/C/C++/Java Files, thus each time you want to type a program you can simply use the New Lex / New Yacc template, and the basic code will be inserted thus saving your time and efforts to type :P.
- The EditPlus IDE also contains user Commands for Lex Compile,Yacc Compile,Lex Build , Lex+Yacc Build, Band for Execute. thus, saving your time to type complete commands like “lex abc.l” or cc lex.yy.c y.tab.c -o <object file>” blah blah..” you can simply click the Buttons according to the function you wanna perform and the command will be executed, the command itself will insert the filename,parameters etc 🙂 Amazing! isn’t it ? And to top it off the IDE will capture the command output , errors if any will be shown in the IDE window itself ,thus you can easily change the code,save and compile in split seconds making it easy and pretty much like Geany/Eclipse/Netbeans 🙂
- If it still doesn’t impress you , you still can open CMD and compile and build the hard-old-fashioned way 🙂
- No need to Set PATH variables , the PATH variables , for the gcc/lex/yacc are added by the installer itself, thus saving the efforts required to set PATH 🙂
- Single installer , works on all version of Windows., here’s a Snapshot of the Flex for Windows in Action
Update : For those who are facing errors with Bison/yacc Package in Flex for Windows can also install Lex/Yacc on Ubuntu/LinuxMint easily by referrig this article.
Write A Lex Program For Basic Desktop Calculator Using Yacc
Cautions/Precautions to be taken :
- Make Sure you install the Package in any folder except “Program Files” / “Program Files (x86)” The Package won’t work if installed in Program Files.
- The Windows Lex Version requires “%option noyywrap” , Please make sure this option is present , before you compile Programs.
- The “-ll” and “-ly” arguments won’t work in Windows as they are not required in Windows Environments.
- If the IDE Lags/Code doesn’t respond on compile command , Please re-run the FlexIDE as “Run as Administrator” .
Computer Program For Basic Education 2018/2019
Method to Run Programs through IDE
- Some users have reported difficulty using the package for running the programs or giving the inputs, hence we are simplifying the instructions to run the programs below. In some cases you may find that the program terminates after executing after getting inouts from console if compiled and executed through IDE, In such cases the CMD way is recommended for executing .
Case 1 : Only lex Programs need to be run and built and executed
- Click the compile lex button in the IDE.
- Click the build lex button.
- Click on Execute .
Alternate way through CMD
- Click on Execute CMD directly button in the IDE.
- Compile the Lex File by typing the command lex <filename>.l
- Build the Lex File by gcc/cc command in the CMD e.g gcc lex.yy.c -o <executable name for program>
- Execute the program by typing <executable name for the program>.exe
- The -o <executable name for program> parameter is optional, you can skip the said parameter by directly building by gcc lex.yy.c
and then directly execute your program by typing a.exe
Case 2 : Both Lex and Yacc Programs together have to be linked and Compiled – Executed
- Compile the yacc program by the compile yacc button from the IDE.
- Compile the Lex program by compile lex button.
- Build the program by clicking the “lex-yacc” build button.
- Click on Execute button.
Alternate way through CMD
- Click on the Execute CMD button in the IDE.
- Compile Yacc file by typing command yacc -dy <filename.y>
- Compile the Lex File by typing the command lex <filename>.l
- Build the Lex File by gcc/cc command in the CMD e.g gcc lex.yy.c y.tab.c -o <executable name for program>
- Execute the program by typing <executable name for the program>.exe
- The -o <executable name for program> parameter is optional, you can skip the said parameter by directly building by gcc lex.yy.c y.tab.c
and then directly execute your program by typing a.exe
The binaries are provided as it is , so they cannot contain any errors, However if you face any errors/problems while installing or running, do comment 🙂 we will try to solve it . If you have any doubts/suggestions do leave them behind in comments 🙂
UPDATE : All yacc fixed in this new version — New Update to sequence for Activations by Maria —