| PLY (Python Lex-Yacc) Version 1.2 (November 27, 2002) |
| |
| David M. Beazley |
| Department of Computer Science |
| University of Chicago |
| Chicago, IL 60637 |
| beazley@cs.uchicago.edu |
| |
| Copyright (C) 2001 David M. Beazley |
| |
| $Header: /home/stever/bk/newmem2/ext/ply/README 1.1 03/06/06 14:53:34-00:00 stever@ $ |
| |
| This library is free software; you can redistribute it and/or |
| modify it under the terms of the GNU Lesser General Public |
| License as published by the Free Software Foundation; either |
| version 2.1 of the License, or (at your option) any later version. |
| |
| This library is distributed in the hope that it will be useful, |
| but WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| Lesser General Public License for more details. |
| |
| You should have received a copy of the GNU Lesser General Public |
| License along with this library; if not, write to the Free Software |
| Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
| |
| See the file COPYING for a complete copy of the LGPL. |
| |
| Introduction |
| ============ |
| |
| PLY is a 100% Python implementation of the common parsing tools lex |
| and yacc. Although several other parsing tools are available for |
| Python, there are several reasons why you might want to consider PLY: |
| |
| - The tools are very closely modeled after traditional lex/yacc. |
| If you know how to use these tools in C, you will find PLY |
| to be similar. |
| |
| - PLY provides *very* extensive error reporting and diagnostic |
| information to assist in parser construction. The original |
| implementation was developed for instructional purposes. As |
| a result, the system tries to identify the most common types |
| of errors made by novice users. |
| |
| - PLY provides full support for empty productions, error recovery, |
| precedence specifiers, and moderately ambiguous grammars. |
| |
| - Parsing is based on LR-parsing which is fast, memory efficient, |
| better suited to large grammars, and which has a number of nice |
| properties when dealing with syntax errors and other parsing problems. |
| Currently, PLY builds its parsing tables using the SLR algorithm which |
| is slightly weaker than LALR(1) used in traditional yacc. |
| |
| - Like John Aycock's excellent SPARK toolkit, PLY uses Python |
| reflection to build lexers and parsers. This greatly simplifies |
| the task of parser construction since it reduces the number of files |
| and eliminates the need to run a separate lex/yacc tool before |
| running your program. |
| |
| - PLY can be used to build parsers for "real" programming languages. |
| Although it is not ultra-fast due to its Python implementation, |
| PLY can be used to parse grammars consisting of several hundred |
| rules (as might be found for a language like C). The lexer and LR |
| parser are also reasonably efficient when parsing typically |
| sized programs. |
| |
| The original version of PLY was developed for an Introduction to |
| Compilers course where students used it to build a compiler for a |
| simple Pascal-like language. Their compiler had to include lexical |
| analysis, parsing, type checking, type inference, and generation of |
| assembly code for the SPARC processor. Because of this, the current |
| implementation has been extensively tested and debugged. In addition, |
| most of the API and error checking steps have been adapted to address |
| common usability problems. |
| |
| How to Use |
| ========== |
| |
| PLY consists of two files : lex.py and yacc.py. To use the system, |
| simply copy these files to your project and import them like standard |
| Python modules. |
| |
| The file doc/ply.html contains complete documentation on how to use |
| the system. |
| |
| The example directory contains several different examples including a |
| PLY specification for ANSI C as given in K&R 2nd Ed. Note: To use |
| the examples, you will need to copy the lex.py and yacc.py files to |
| the example directory. |
| |
| A simple example is found at the end of this document |
| |
| Requirements |
| ============ |
| PLY requires the use of Python 2.0 or greater. It should work on |
| just about any platform. |
| |
| Resources |
| ========= |
| |
| More information about PLY can be obtained on the PLY webpage at: |
| |
| http://systems.cs.uchicago.edu/ply |
| |
| For a detailed overview of parsing theory, consult the excellent |
| book "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and |
| Ullman. The topics found in "Lex & Yacc" by Levine, Mason, and Brown |
| may also be useful. |
| |
| Given that this is the first release, I welcome your comments on how |
| to improve the current implementation. See the TODO file for things that |
| still need to be done. |
| |
| Acknowledgments |
| =============== |
| |
| A special thanks is in order for all of the students in CS326 who |
| suffered through about 25 different versions of these tools :-). |
| |
| Example |
| ======= |
| |
| Here is a simple example showing a PLY implementation of a calculator with variables. |
| |
| # ----------------------------------------------------------------------------- |
| # calc.py |
| # |
| # A simple calculator with variables. |
| # ----------------------------------------------------------------------------- |
| |
| tokens = ( |
| 'NAME','NUMBER', |
| 'PLUS','MINUS','TIMES','DIVIDE','EQUALS', |
| 'LPAREN','RPAREN', |
| ) |
| |
| # Tokens |
| |
| t_PLUS = r'\+' |
| t_MINUS = r'-' |
| t_TIMES = r'\*' |
| t_DIVIDE = r'/' |
| t_EQUALS = r'=' |
| t_LPAREN = r'\(' |
| t_RPAREN = r'\)' |
| t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*' |
| |
| def t_NUMBER(t): |
| r'\d+' |
| try: |
| t.value = int(t.value) |
| except ValueError: |
| print "Integer value too large", t.value |
| t.value = 0 |
| return t |
| |
| # Ignored characters |
| t_ignore = " \t" |
| |
| def t_newline(t): |
| r'\n+' |
| t.lineno += t.value.count("\n") |
| |
| def t_error(t): |
| print "Illegal character '%s'" % t.value[0] |
| t.skip(1) |
| |
| # Build the lexer |
| import lex |
| lex.lex() |
| |
| # Precedence rules for the arithmetic operators |
| precedence = ( |
| ('left','PLUS','MINUS'), |
| ('left','TIMES','DIVIDE'), |
| ('right','UMINUS'), |
| ) |
| |
| # dictionary of names (for storing variables) |
| names = { } |
| |
| def p_statement_assign(t): |
| 'statement : NAME EQUALS expression' |
| names[t[1]] = t[3] |
| |
| def p_statement_expr(t): |
| 'statement : expression' |
| print t[1] |
| |
| def p_expression_binop(t): |
| '''expression : expression PLUS expression |
| | expression MINUS expression |
| | expression TIMES expression |
| | expression DIVIDE expression''' |
| if t[2] == '+' : t[0] = t[1] + t[3] |
| elif t[2] == '-': t[0] = t[1] - t[3] |
| elif t[2] == '*': t[0] = t[1] * t[3] |
| elif t[2] == '/': t[0] = t[1] / t[3] |
| |
| def p_expression_uminus(t): |
| 'expression : MINUS expression %prec UMINUS' |
| t[0] = -t[2] |
| |
| def p_expression_group(t): |
| 'expression : LPAREN expression RPAREN' |
| t[0] = t[2] |
| |
| def p_expression_number(t): |
| 'expression : NUMBER' |
| t[0] = t[1] |
| |
| def p_expression_name(t): |
| 'expression : NAME' |
| try: |
| t[0] = names[t[1]] |
| except LookupError: |
| print "Undefined name '%s'" % t[1] |
| t[0] = 0 |
| |
| def p_error(t): |
| print "Syntax error at '%s'" % t.value |
| |
| import yacc |
| yacc.yacc() |
| |
| while 1: |
| try: |
| s = raw_input('calc > ') |
| except EOFError: |
| break |
| yacc.parse(s) |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |