Assignment # 3: Perl
Abbreviation expansion
DUE: October 11, 1999 - 6:00pm
Many interactive programs, such as text editors, allow a user to define abbreviations for commonly used strings. Once an abbreviation has been defined, it can be automatically translated into its expanded form. For example, if the abbreviation EPA abbreviates the expanded form Environmental Protection Agency, then the sentence:
Fred's Chemical Company and Taco Shack has been fined by
the EPA for illegally dumping toxic waste.
gets expanded into:
Fred's Chemical Company and Taco Shack has been fined by
the Environmental Protection Agency for illegally dumping toxic waste.
Your assignment is to write a Perl program that rewrites input lines, replacing abbreviations with their expanded forms.
Each input line will contain zero or more strings (non-empty sequences of characters), delimited by spaces. Input lines will be of three types:
When your program encounters a line of type 1, it should create a new abbreviation for String1 and produce no output. The abbreviation applies to all input lines between the definition line and a subsequent undefinition line. If any of the strings String2 String3 ··· StringN is itself an existing abbreviation, replace each such string with its expansion before storing the expansion for String1. If an abbreviation already exists for String1, replace the old expansion only after the new expansion has been determined.
When your program encounters a line of type 2, it should delete the specified abbreviation.
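The definition and undefinition rules can be sketched with a hash, as below. This is only one possible approach, not a required design. It assumes the input line has already been recognized and split into an abbreviation and its right-hand-side words; the names %abbrev, define_abbrev and undefine_abbrev are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical table: abbreviation => reference to its list of expansion words.
my %abbrev;

# Define (or redefine) $name. Each right-hand-side word that is itself an
# existing abbreviation is replaced by its stored expansion first.
sub define_abbrev {
    my ($name, @words) = @_;
    my @expansion = map { exists $abbrev{$_} ? @{ $abbrev{$_} } : $_ } @words;
    # Store only after the new expansion is complete, so that a redefinition
    # of $name in terms of itself still sees the old expansion.
    $abbrev{$name} = \@expansion;
    return;
}

# Remove $name from the table; later lines leave that word untouched.
sub undefine_abbrev {
    my ($name) = @_;
    delete $abbrev{$name};
    return;
}

define_abbrev('foo', qw(foo bar baz));
define_abbrev('foo', qw(foo bar baz));
print "@{ $abbrev{foo} }\n";   # prints: foo bar baz bar baz
```

Because the old entry is read before the hash slot is overwritten, defining foo twice in terms of itself grows the expansion instead of losing it.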
When your program encounters a line of type 3, it should print out the input line, with each abbreviation expanded. Any abbreviation replaced must match an entire word, not just a substring (a part of a word). Words are delimited by whitespace.
No other words should be affected. Abbreviations within expanded text should be ignored. If the line contains no abbreviations, then it should be printed out unchanged (ignoring whitespace).
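As an illustration of whole-word matching, the following sketch expands a single type 3 line. It assumes a hash of abbreviations like the one in the EPA example above; %abbrev and expand_line are hypothetical names, and rejoining with single spaces is one reading of "ignoring whitespace".

```perl
use strict;
use warnings;

# Hypothetical table for illustration, matching the EPA example above.
my %abbrev = (EPA => [qw(Environmental Protection Agency)]);

# Expand one output line: split on whitespace, swap each whole word that is
# an abbreviation for its stored expansion, and rejoin with single spaces.
# The line is scanned once, so abbreviations that appear inside expanded
# text are not expanded again.
sub expand_line {
    my ($line) = @_;
    my @out = map { exists $abbrev{$_} ? @{ $abbrev{$_} } : $_ } split ' ', $line;
    return join ' ', @out;
}

print expand_line('the EPA for illegally dumping toxic waste.'), "\n";
# prints: the Environmental Protection Agency for illegally dumping toxic waste.
print expand_line('the EPAs are unaffected'), "\n";
# prints: the EPAs are unaffected
```

Looking up each whitespace-delimited word in the hash, rather than running a substitution over the whole line, is what keeps EPAs (or EPA followed directly by punctuation) from being treated as a match.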
Lines of types 1 and 3 continue onto the following line if they end in a backslash (\). Two or more input lines can therefore form a single abbreviation definition or a single output line.
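One way to handle backslash continuation is to merge physical lines into logical lines before doing anything else. The sketch below assumes the lines have already been chomped and covers only the backslash rule, not hyphenated words; join_continuations is a hypothetical name.

```perl
use strict;
use warnings;

# Join physical lines that end in a backslash into one logical line.
# Takes a list of chomped physical lines and returns the logical lines.
sub join_continuations {
    my @physical = @_;
    my (@logical, $pending);
    for my $line (@physical) {
        $pending = defined $pending ? "$pending $line" : $line;
        # A trailing backslash (and any whitespace before it) is removed,
        # and the next physical line is appended to the same logical line.
        next if $pending =~ s/\s*\\$//;
        push @logical, $pending;
        undef $pending;
    }
    # Input that ends in the middle of a continuation is kept as it stands.
    push @logical, $pending if defined $pending;
    return @logical;
}

my @logical = join_continuations('baz \\', 'foobar', 'bar');
print "$_\n" for @logical;   # prints "baz foobar", then "bar"
```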
Finally, if the last character on a line of type 3 is a hyphen (-), then the rest of the last word on that line is at the beginning of the next line (which must be of type 3), up to the first whitespace on that line. In other words, the first word on the next line must be appended to the last word of the line with the hyphen, and the hyphenated word is part of the first line. Also, if the second line has only one word, and that word again ends in a hyphen, then the hyphenated word from the first line continues, and so on for subsequent lines. A hyphen may also be the character before a line continuation character (\), in which case the last word on that line is concatenated with the first word on the continuation line.
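The hyphen rule might be sketched as follows. This is an illustration under assumptions, not a complete solution: it takes a list of chomped lines already known to be of type 3, drops the trailing hyphen when joining (as in the sample output), and does not handle a hyphen followed by a backslash. The name join_hyphens is hypothetical.

```perl
use strict;
use warnings;

# Rejoin words split across lines by a trailing hyphen.
# Takes chomped type 3 lines and returns the lines with split words rejoined.
sub join_hyphens {
    my @lines = @_;
    my @out;
    while (@lines) {
        my $line = shift @lines;
        # While this line ends in a hyphen and another line follows, drop
        # the hyphen and append the first word of the following line.
        while (@lines && $line =~ s/-$//) {
            my $next = shift @lines;
            my ($first, $rest) = split ' ', $next, 2;
            $line .= $first if defined $first;
            if (defined $rest && length $rest) {
                # The remainder of the following line is a line of its own.
                unshift @lines, $rest;
                last;
            }
        }
        push @out, $line;
    }
    return @out;
}

print "$_\n" for join_hyphens('baz-', 'foobar bar');
# prints "bazfoobar", then "bar"
```

The inner loop keeps going only while the following line consists of a single word, which is how a word split over three or more lines is reassembled.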
The output from your program should consist of one output line for each input line (or each set of input lines, all but the last ending in a backslash) of type 3. You can assume that the string on the left-hand side of every definition is a single word, and that the right-hand side of a definition is a (possibly empty) sequence of words separated by whitespace. Also, don't worry if the output lines are very long.
For example, the input:
foo
foo #DEF# foo bar baz
bar
foo
baz \
foobar
foo #DEF# foo bar baz
bar
foo
baz-
foobar bar
should output:
foo
bar
foo bar baz
baz foobar
bar
foo bar baz bar baz
bazfoobar
bar
Handing in the assignment
Instructions for submitting your work:
Your work may not be graded if these procedures are not followed exactly.
A large penalty will be assessed if the required output format is not followed exactly.