The code described in this README parses LaTeX-formatted text in search of forbidden phrases and prints error messages formatted as if from a compiler.
This code is a weapon against collaborators who would dilute your writing with ambiguous or verbose prose. It was designed to ensure that collaborative papers appear in a consistent style: mine.
I think of all activities as coding. My presentations use Python scripts, not PowerPoint. My writing and research posters use LaTeX and Emacs, not Word or Publisher. When I write code, I try to be creative and sloppy, relying on tools to identify most of my mistakes. (For example, gcc issues warnings, splint can check some memory management, and custom scripts can identify simple portability problems.) When writing text, I had no such tools.
The style checker is one such tool. When I notice a mistake that can be identified by a regular expression, I add a forbidden expression to the style checker's ruleset. Then, when I build my LaTeX-formatted paper, I run the style checker to seek out such phrases.
It has saved me from submitting grammatically sloppy last-minute edits.
Like the warnings printed by a compiler, errors from the style checker should not be taken literally. Use your own judgement to correct sentences: make them shorter, more specific, and more varied. I cannot advise you to rely on this tool in the way I do; my approach to writing is not necessarily a good one for anyone.
[ ], % syntax whitespace before comma seems wrong.
''[\.,] % syntax end quotes go outside punctuation like . and ,
[ ]-[ ] % syntax a hyphen surrounded by space should probably be an emdash '---'
Planetlab % capitalize PlanetLab
planetlab % capitalize PlanetLab
ccdf % capitalize
cisco % capitalize the company name
internet % capitalize unless talking about an internet other than the Internet
ttl % capitalize
[^r][^c][^h] impact % phrase "effect", "result", though nsf likes "research impact"
absolutely essential % phrase essential
few in number % phrase few
the the % phrase apparent double word.
(quite|more|very|most) unique % phrase unique is.
a large number of % phrase you mean "many"
the way in which % phrase should be "how" or ""
live in a vacuum % phrase a tired metaphor that makes me want to vomit.
experiements % spelling
measurment % spelling
secrurity % spelling
taht % spelling
teh % spelling
privledge % spelling "privilege" I misspell it every way possible.
privlege % spelling "privilege" I misspell it every way possible.
priviledge % spelling "privilege" I misspell it every way possible.
queueing % spelling I'd love to spell it this way, but spellchecker whines.
todo % ignoredcommand
texttt % ignoredcommand
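Each rule above has the shape "expression % category explanation". The following is a minimal sketch of how rules of that shape could be read and applied; it is not the code in style-check.rb. The names Rule, parse_rule, and check are hypothetical, the file:line:column message layout is an assumed gcc-like format, and every expression is assumed to compile as a case-sensitive regular expression, which may differ from how the real script treats each category.

```ruby
# Hypothetical sketch; parse_rule and check are illustrative names, not
# functions from style-check.rb.
Rule = Struct.new(:pattern, :category, :explanation)

# Split "expression % category explanation" into its three parts.
# Assumption: every expression is compiled as a case-sensitive regexp.
def parse_rule(line)
  expression, _sep, rest = line.chomp.partition(" % ")
  category, explanation = rest.split(" ", 2)
  Rule.new(Regexp.new(expression), category, explanation)
end

# Return one compiler-style message per rule per offending line.
# Only the first match of each rule on a line is reported.
def check(filename, text, rules, verbose: false)
  messages = []
  text.each_line.with_index(1) do |line, lineno|
    rules.each do |rule|
      match = rule.pattern.match(line)
      next unless match
      message = "#{filename}:#{lineno}:#{match.begin(0) + 1}: #{match[0]}"
      if verbose
        # The explanation is optional, so drop it when the rule has none.
        detail = [rule.category, rule.explanation].compact.join(": ")
        message += " (#{detail})"
      end
      messages << message
    end
  end
  messages
end

rules = [
  "the the % phrase apparent double word.",
  "ccdf % capitalize"
].map { |line| parse_rule(line) }

puts check("paper.tex", "We plot\nthe the ccdf here.\n", rules, verbose: true)
```

Because the messages carry a file name, line, and column, an editor that understands compiler output should be able to jump to each offending phrase.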
In addition to configurable rules, the style checker also seeks out common errors within the LaTeX source itself:
Comments using \begin{comment} and \end{comment} from comment.sty are skipped.
Math mode between unescaped $'s is skipped.
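One way to skip such regions is to overwrite them with spaces before any rule is applied, so that line and column numbers in later messages still refer to the original file. The sketch below illustrates that idea only; blank and mask_skipped are hypothetical names, and the real script may skip these regions differently. It does not handle display math written as $$...$$ or a dollar sign that follows an escaped backslash.

```ruby
# Hypothetical sketch; not taken from style-check.rb.

# Replace every character except newlines with a space, so the text keeps
# its length and its line breaks.
def blank(text)
  text.gsub(/[^\n]/, " ")
end

# Blank out regions the checker should not inspect.
def mask_skipped(source)
  # \begin{comment} ... \end{comment}, as provided by comment.sty.
  masked = source.gsub(/\\begin\{comment\}.*?\\end\{comment\}/m) { |m| blank(m) }
  # Inline math between unescaped dollar signs; \$ is a literal dollar.
  masked.gsub(/(?<!\\)\$.*?(?<!\\)\$/m) { |m| blank(m) }
end

source = <<'TEX'
It costs \$5 and $x the the y$ is ignored.
\begin{comment}
the the
\end{comment}
This line is still checked.
TEX

puts mask_skipped(source)
```

The forbidden "the the" inside the math and inside the comment environment is blanked, while the escaped dollar sign and the surrounding prose stay where they were.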
For more detail, read the ruby code. It's shorter than this file.
To run the script,
style-check.rb *.tex
Or, if you'd like a little justification with your scolding,
style-check.rb -v *.tex
This code will not teach you to use "that" and "which" properly. It will not teach you to hyphenate. It may be used for evil. You may think my rules are stupid.
There are many bugs in the code; it is not guaranteed that the style checker will find all forbidden phrases. It may become confused by nested environments.
Don't ask me to add a feature. Send me a patch.
Don't complain that it's written in Ruby. My language kicks your language's ass.
Don't complain about the ruleset. Invent a mechanism to override ones you don't care for.
Woe is I: The Grammarphobe's Guide to Better English in Plain English, by Patricia T. O'Conner.
Lake Superior State University Banished Words List
http://www.lssu.edu/banished/
Usage in The American Heritage Dictionary
http://www.bartleby.com/61/7.html
alt.usage.english FAQ
http://alt-usage-english.org/fast_faq.shtml
How To Write A Dissertation, by Doug Comer
http://www.cs.purdue.edu/homes/dec/essay.dissertation.html
Plain English Campaign
http://www.plainenglish.co.uk/
Thesis Errors
http://core.ecu.edu/psyc/wuenschk/therr.htm
How to Avoid Colloquial (Informal) Writing
http://www.wikihow.com/Avoid-Colloquial-%28Informal%29-Writing
Henning Schulzrinne's Notes
Rules for Writers, by Diana Hacker.
The New Oxford Guide to Writing, by Thomas S. Kane.
Line by Line: How to Edit Your Own Writing, by Claire Cook.
Michael Haardt wrote GNU diction, which is similar in that it finds and complains about bad phrases, but different in that it also notes questionable phrases (such as any use of "affect") and does not expect to check LaTeX source. Style-check focuses on forbidden phrases and common typographic errors in LaTeX code.
Kurt Partridge for encouraging me to release this thing.
Vibha Sazawal for reminding me often that there is more to writing than style.
Rich Wolski for introducing me to Strunk and White, the gateway drug.
Jacob Martin reported the first bug, packaged style-check for Gentoo, and contributed a ruleset based on Day and Gastel, "How to Write and Publish a Scientific Paper".
Indika Meedeniya noticed a few more bugs and suggested compatibility with gedit.