Assignment # 3: Perl
Abbreviation expansion
DUE: March 9, 2000 - 6:00pm
Many interactive programs, such as text editors, allow a user to define abbreviations for commonly used strings. Once an abbreviation has been defined, it can be automatically translated into its expanded form. For example, if the abbreviation EPA abbreviates the expanded form Environmental Protection Agency, then the sentence:
Fred's Chemical Company and Taco Shack has been fined by the EPA for illegally dumping toxic waste.
gets expanded into:
Fred's Chemical Company and Taco Shack has been fined by the Environmental Protection Agency for illegally dumping toxic waste.
Your assignment is to write a Perl program that rewrites input lines, replacing abbreviations with their expanded forms.
Each input line will contain zero or more strings (non-empty sequences of characters), delimited by whitespace (spaces, tabs, etc.). Input lines will be of four types:
When your program encounters a line of type 1, a new local abbreviation context is created. Abbreviations are classified as either local or global. All local abbreviation definitions become undefined when a new context is entered (i.e., when a #CONTEXT# line is reached), while global abbreviations remain defined until explicitly undefined (as described below). A #CONTEXT# line is not required at the beginning of an input data file.
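One possible way to model the two kinds of abbreviation is with a pair of hashes, one of which is emptied whenever a new context begins. The following is only a minimal sketch; the names %global, %local and enter_context are illustrative and are not required by the assignment.

```perl
use strict;
use warnings;

# Two tables mapping abbreviation => expansion (names are illustrative).
my (%global, %local);

# Entering a new context discards every local abbreviation
# and leaves the global table untouched.
sub enter_context {
    %local = ();
}

$global{EPA} = 'Environmental Protection Agency';
$local{FCC}  = "Fred's Chemical Company";

enter_context();    # what a #CONTEXT# line would trigger

print exists($local{FCC})  ? "FCC is still local\n"  : "FCC is undefined\n";
print exists($global{EPA}) ? "EPA is still global\n" : "EPA is undefined\n";
```

Testing with exists rather than for truth matters here, because an abbreviation may legitimately expand to an empty string.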
When your program encounters a line of type 2, it should create a new global or local abbreviation for String1 and produce no output. A global abbreviation applies to all input lines between the definition line and a subsequent undefinition line, while a local abbreviation applies to all input lines between the definition line and either a subsequent undefinition line or a new context line. If any of the strings String2 String3 ··· StringN is currently an abbreviation, replace each such string with its expansion before storing the expansion for String1. If an abbreviation of the same kind (global or local) already exists for String1, discard the old expansion only after the new expansion has been determined, since String1 itself may appear among String2 ··· StringN.
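The order of operations in a definition (expand the right-hand side first, then overwrite) can be sketched as follows. This is one possible approach, not a required design: the helper names lookup and define are hypothetical, and the sketch assumes that local abbreviations take precedence over global ones during this expansion as well.

```perl
use strict;
use warnings;

my (%global, %local);

# Return the current expansion of a word, preferring the local table.
# Returns undef when the word is not an abbreviation.
sub lookup {
    my ($word) = @_;
    return $local{$word}  if exists $local{$word};
    return $global{$word} if exists $global{$word};
    return;
}

# Store a definition. $table is \%local or \%global,
# $name is String1 and @rhs holds String2 .. StringN.
sub define {
    my ($table, $name, @rhs) = @_;
    # Expand using the tables as they are now, and only then overwrite.
    my @expanded = map { my $e = lookup($_); defined $e ? $e : $_ } @rhs;
    $table->{$name} = join ' ', @expanded;
}

define(\%local,  'foo', qw(foo bar baz));   # foo is not yet an abbreviation
define(\%global, 'foo', qw(foo bar baz));   # foo now expands via the local table

print "local:  $local{foo}\n";    # local:  foo bar baz
print "global: $global{foo}\n";   # global: foo bar baz bar baz
```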
When your program encounters a line of type 3, it should delete the specified global or local abbreviation.
Lines of types 1, 2 and 3 may contain no text other than that described, but whitespace is allowed before, between and after the required strings.
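Recognizing the line types while tolerating extra whitespace might be done with regular expressions, as in the sketch below. The spellings #CONTEXT#, #LOCAL_DEF#, #GLOBAL_DEF# and #LOCAL_UNDEF# are taken from the example input in this handout; #GLOBAL_UNDEF# is an assumption made by analogy, and the function name classify is hypothetical.

```perl
use strict;
use warnings;

# Classify one logical line. Returns a list whose first element names the
# kind of line: 'context', 'undef', 'def' or 'text'.
sub classify {
    my ($line) = @_;

    return ('context') if $line =~ /^\s*#CONTEXT#\s*$/;

    # #GLOBAL_UNDEF# is assumed by analogy with #LOCAL_UNDEF#.
    if ($line =~ /^\s*#(LOCAL|GLOBAL)_UNDEF#\s+(\S+)\s*$/) {
        return ('undef', lc($1), $2);
    }

    # String1 #LOCAL_DEF# String2 ... StringN (right-hand side may be empty).
    if ($line =~ /^\s*(\S+)\s+#(LOCAL|GLOBAL)_DEF#(?:\s+(.*?))?\s*$/s) {
        my ($name, $kind, $rest) = ($1, lc($2), $3);
        return ('def', $kind, $name, split(' ', defined $rest ? $rest : ''));
    }

    return ('text', $line);    # anything else is ordinary text
}

print join('|', classify("  foo   #GLOBAL_DEF#  foo bar  ")), "\n";
print join('|', classify("#LOCAL_UNDEF# foo")), "\n";
```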
When your program encounters a line of type 4, it should print out the input line, with each abbreviation expanded. Local abbreviations take precedence over global ones, meaning that the local abbreviation expansion should be used if both a global and local abbreviation are currently defined. Any abbreviation replaced must match an entire word, not just a substring (a part of a word). Words are delimited by whitespace.
No other words should be affected. Abbreviations within expanded text should be ignored; that is, the text produced by an expansion is not expanded again. If the line contains no abbreviations, then it should be printed out unchanged (except that the whitespace doesn't have to be the same).
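The expansion rule above (whole words only, local before global, no re-expansion of expanded text) can be illustrated with a short sketch. The table contents and the name expand_line are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical tables, filled in by hand for the illustration.
my %global = (EPA => 'Environmental Protection Agency', foo => 'a global foo');
my %local  = (foo => 'foo bar baz');

sub expand_line {
    my ($line) = @_;
    my @out;
    # split ' ' breaks on runs of whitespace and skips leading whitespace,
    # so each $word is exactly one whitespace-delimited word.
    for my $word (split ' ', $line) {
        if    (exists $local{$word})  { push @out, $local{$word} }
        elsif (exists $global{$word}) { push @out, $global{$word} }
        else                          { push @out, $word }
    }
    # Each word is looked up once, so expanded text is never expanded again.
    # Empty expansions are dropped so they leave no stray spaces.
    return join ' ', grep { length } @out;
}

print expand_line("the EPA fined foo, foobar and foo"), "\n";
# the Environmental Protection Agency fined foo, foobar and foo bar baz
```

Note that "foo," and "foobar" are left alone: neither is exactly the word foo, because words are delimited only by whitespace.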
Lines of types 2 and 4 continue onto the following line if the last character on the line is a backslash (\). Two or more input lines can therefore make up a single abbreviation definition or output line. Remember that the newline at the end of an input line (e.g., after a line-ending backslash) counts as whitespace.
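Continuation lines could be handled by reading "logical" lines, as in this sketch. The name read_logical_line is hypothetical. As a simplification, the sketch joins any line ending in a backslash without first checking that it is of type 2 or 4, and it replaces the backslash-newline pair with a single space since the newline counts as whitespace.

```perl
use strict;
use warnings;

# Read one logical line from a filehandle, joining physical lines that end
# in a backslash. Returns undef at end of input.
sub read_logical_line {
    my ($fh) = @_;
    my $line = <$fh>;
    return unless defined $line;
    chomp $line;
    while ($line =~ s/\\\z//) {        # strip a trailing backslash, if any
        my $next = <$fh>;
        last unless defined $next;     # backslash on the last line of input
        chomp $next;
        $line .= " $next";             # the newline counts as whitespace
    }
    return $line;
}

# Demonstration on an in-memory file holding "baz \", "foobar" and "bar".
my $data = "baz \\\nfoobar\nbar\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
while (defined(my $line = read_logical_line($fh))) {
    print join(' ', split ' ', $line), "\n";   # prints "baz foobar", then "bar"
}
close $fh;
```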
The output from your program should consist of one output line for each input line (or each set of input lines, all but the last ending in a backslash) of type 4. You can assume that strings on the left-hand side of all abbreviation definitions are single words, and that the right-hand side of a definition is a (possibly empty) sequence of words separated by whitespace. Also, don't worry if the output lines are very long.
For example:
foo
foo #LOCAL_DEF# foo bar baz
bar
foo
baz \
foobar
#CONTEXT#
foo
foo #LOCAL_DEF# foo bar baz
foo #GLOBAL_DEF# foo bar baz
bar
foo
#LOCAL_UNDEF# foo
foo
foobar bar
should output:
foo
bar
foo bar baz
baz foobar
foo
bar
foo bar baz
foo bar baz bar baz
foobar bar
Handing in the assignment
Instructions for submitting your work:
Your work may not be graded if these procedures are not followed exactly.
A large penalty will be assessed if the required output format is not followed exactly.