UNIX Unleashed, Internet Edition
- 11 -
Tools for Writers
By David B. Horvath, CCP and Susan Peppard
The preceding chapters in this section described the heart of text formatting and printing (processing)--nroff, troff, and related macros. In this chapter, I show you many of the commands that UNIX provides to support writers. These commands are as follows:
Not all these commands are available with all versions of UNIX. When you're in doubt, check the man page or contact your system administrator.
Preprocessors for nroff and troff
A number of the tools for writers are actually preprocessors. You describe the object you are building using the language specific to that tool. The tool takes your input and creates output that nroff or troff can process. These tools evolved because the syntax of nroff and troff can be so complicated. These preprocessors are as follows:
Formatting Tables with tbl
tbl is a troff preprocessor used to create tables (columns of data with headings). The code you write is processed by tbl before the file is processed by troff. Often, you pipe the tbl output of a file into troff as follows:
tbl filename | troff -options
tbl takes as input the commands between each .TS/.TE macro pair and converts the input into a printable table. All other input is passed through without modification. The general format of a table is as follows (don't try to process this through tbl):
.TS H
global option;
formatting options 1
formatting options 2.
Column 1 title [tab] column 2 title [tab] column 3 title
.TH
Col 1 Item 1 [tab] Col 2 Item 1 [tab] Col 3 Item 1
Col 1 Item 2 [tab] Col 2 Item 2 [tab] Col 3 Item 2
.TE
You use the H option on .TS when the table might cross page boundaries, and you want the column titles to print on each page. You place the .TH macro after the column titles to separate them from the actual data values. If the table will definitely fit on one page, you can omit the H option on .TS and the .TH.
The global option; can be any one of the values shown in Table 11.1. Note that the ; line terminator is required. The default, with no global option; specified, is to create the table flush with the left margin.
Table 11.1. tbl global options.
You must have at least one line of formatting options (which apply to the entire table). If you specify multiple lines of formatting options, the first applies to the column headers, and the last lines apply to the table items themselves. You can have multiple lines of column headings and multiple lines of column formatting options. The period . at the end of the last line of formatting options is required. Table 11.2 shows the available formatting options; note that they are not case sensitive. You should have one option per column. Table 11.2. tbl formatting options.
You also can use another tbl macro for formatting columns: .T&. You use it when you want to change the column format at a later time.
The columns are separated through the use of the [tab] character. You can change this character by using the tab(x) global option.
To print a horizontal line at any time, use the underscore character _ on a line by itself. Underscores separated by the current tab character cause a single line to be drawn under that column. You can use the equal sign (=) in place of the underscore (_) to draw a double line across the line or column. You can use them in the column heading or data areas.
Listing 11.1 shows the tbl source of a simple table. Figure 11.1 shows the resulting table.
Figure 11.1.
Listing 11.1. tbl source: Simple table.
.sp 3i
.TS H
box tab(@);
c s s
c c c
l l n.
State Statistics
State@Capital@Population
_
.TH
Missouri@Jefferson City@5,192,632
Montana@Helena@823,697
_
Nebraska@Lincoln@1,605,603
=@@=
.TE
In Listing 11.1, note that the tab character has been changed to @. The entire table is enclosed in a box. The first line of the heading spans multiple columns. The first underscore character draws the line under the column headings, and the second draws the line under Montana. The equal signs separated by two tab characters create double underlines for the state and population columns under Nebraska.
Troubleshooting
When you use tbl, keep the following points in mind:
Formatting Equations with eqn/neqn
eqn is a troff preprocessor used to format equations (written in a form that would make your calculus teacher happy, not like you use in a C program). The code you write is processed by eqn before the file is processed by troff. Often, you pipe the eqn output of a file into troff as follows:
eqn filename | troff -options
eqn takes as input the commands between each .EQ/.EN macro pair and converts that input into a printable equation. All other input is passed through without modification. neqn is used with nroff to simulate the equations when the output is a fixed format.
The two general formats of equations are shown in Listing 11.2.
Listing 11.2. eqn source: Simple equations.
.EQ
a + b over d = c
.EN
.EQ
delim ##
.EN
This is text with the same equation # a + b over d = c # in it.
The first form requires the equation to be embedded within a .EQ/.EN macro pair, and the second form defines a pair of delimiters within which eqn or neqn recognizes equations. The results are shown in Figure 11.2.
Figure 11.2.
You use the eqn delimiters for inline equations. Even if you're sure that you will not use inline equations, providing the delimiters is still a good idea. If the defined delimiters are used in your text for other things besides equations, you can always turn them off by using the following command:
.EQ
delim off
.EN
eqn Keywords
eqn was designed to be easy for mathematicians to learn and use. It uses familiar words and abbreviations. For example, if you read a1=b2 aloud (with the 1 and 2 as subscripts), you would say, "a sub one equals b sub two." You write eqn code the same way. The spaces here are important:
#a sub 1 = b sub 2#
The opposite of sub is sup, for superscript. Here's an example:
#a sup 2 > b sup 2#
The following are the eqn keywords:
The following are the eqn keywords for the uppercase Greek characters (mathematical symbols):
The following are the eqn keywords for the lowercase Greek characters (mathematical symbols):
There is no provision for the uppercase letters that are identical to their roman cousins (A, B, E, H, I, K, M, N, O, P, T, X, and Z). If you want an uppercase alpha, just type A.
eqn also includes the following terms, which are printed in roman, not italic, type:
eqn Operators
You have already seen some of the eqn operators--+, -, =, and >. Table 11.3 lists the others.
Table 11.3. eqn operators.
Table 11.4. Diacritical marks.
eqn pays no attention to spaces or new-line characters except as delimiters. After eqn determines what you mean (or what it thinks you mean), it throws away spaces and newlines. To obtain spaces in your output, use a tilde (~) for a one-character space or a caret, also known as circumflex, (^) for a half-character space. If you say "3 plus 2 times 5," your listeners do not know whether you mean the result 25 or 13. eqn has the same problem. Like your listeners, eqn makes an assumption about # a + b * c #. If you provide no more information, eqn groups according to the order in which you enter information. In other words, it assumes parentheses. Although computers do this, mathematicians do not. They believe in precedence, which holds that multiplication always precedes addition. Therefore, 3 + 2 x 5 is 13. Period. Even mathematicians, though, sometimes need parentheses. Because parentheses are used so often in mathematical expressions, with eqn you must use curly braces--{ and }--to indicate grouping in your expressions. Therefore, if you really mean the result 13, you write the following: # 3 + {2 * 5} #
The form for the equation is # a + {b * c} #
The spaces here are important to help eqn determine the meaning of the symbols. You can also change fonts and point sizes in an equation. Table 11.5 describes the keywords used. Table 11.5. Keywords used to change fonts and point sizes.
You can use the gsize option to set a global point size. Similarly, you can use gfont. You should put these options near the top of your file; otherwise, eqn uses the default point size (usually 10).
Defines
If you use a complex term repeatedly, you can define it as something short. A good choice, for example, is &. Then, instead of typing
x= sqrt {SIGMA {x sup 2}} over N
you can type
x= &
define works like this:
.EQ
define & 'sqrt {SIGMA {x sup 2}} over N'
x = &
.EN
You can select any characters you want--fg, xy, and so on--but you should be sensible. Do not choose the special characters that eqn uses or the delimiters, and don't use eqn keywords, even though doing so is permitted.
Precedence
Without braces to force it to look at terms as groups, eqn recognizes the following orders of precedence:
dyad vec under bar tilde hat dot dotdot left right fwd back down up fat roman italic bold size sub sup sqrt over from to
All operations group to the right, except for sqrt, left, and right, which group to the left.
Troubleshooting
Sometimes, despite your best efforts, your equations just do not come out right. In this section, I provide some suggestions for detecting and correcting faulty code and for dealing with some of eqn's more arcane characteristics.
At its best, eqn is quirky. So, before you print a 50-page chapter containing 70 equations, check your equations. This way, you can pinpoint syntax errors such as unmatched delimiters. The checkeq program is a good place to start. You use checkeq like this:
checkeq myeqnfile
If checkeq finds no errors, it displays the following:
myeqnfile
or
myeqnfile:
New delims: ##, line 2
If you have an odd number of delimiters, you see something like the following:
myeqnfile:
New delims: ##, line 2
3 line ##, lines 7-9
3 line ##, lines 9-11
3 line ##, lines 11-13
3 line ##, lines 13-15
3 line ##, lines 15-17
Unfinished ##
If, for some reason, you specified bad delimiters (#$, for example, or #), checkeq announces
myeqnfile
Strange delims at Line 2
or
myeqnfile
Unfinished
checkeq is good for not much more than this use. Because checkeq lets many mistakes slip by, you can also process your equation and send output to /dev/null (so that you do not clutter your directory with a lot of troff output). To do so, use the following:
eqn myeqnfile > /dev/null
When errors are found, you see a message like the following:
eqn: syntax error near line 19, file myeqnfile
context is
!a = (x >>> {sup <<<< 2}) + 1!
The line number is not guaranteed, but it should be close. Again, this approach is not foolproof because, if eqn can process your code, it does. You get no error message, even if your output is garbled or non-existent.
If you get no output at all:
If your file contains only an equation, try inserting a line of text before the equation. If you do not want text, use translated tildes as follows:
.tr ~ ~ ~ ~
.EQ
x sup 2 + y sup 2 = z sup 2
.EN
Oddly enough, this approach does not seem to affect eqn code--even code with tildes in it.
If the vertical spacing is wrong:
Try printing your file with a different macro package--or with no macro package.
If you're using your own package, you might have to pipe your file through eqn before you send it through troff.
Try processing the eqn code and replacing the code with the processed code in your file.
Try using space 0 as the first line of your eqn code:
.EQ
space 0
code
.
.
.
.EN
If you're using the .EQ/.EN macros to delimit your equation, try using delimiters (## or !!).
If your equation is garbled:
Check for omitted spaces and braces. Use them, even if it looks like you really don't need them.
Count braces and parentheses to make sure that you have an even number of each. checkeq and eqn do not always find this problem.
If your equation contains a sup, make sure that you use spaces around its argument, even if you have a brace.
Make sure that you have keywords in the right order. For example, #x highbar tilde# produces no error message, but it prints the bar right on the tilde. The correct order is #x tilde highbar#.
If your equation is long or complicated, use lots of spaces and newlines to make it easy to read.
Drawing Pictures with pic
pic is rarely your first choice as a drawing tool. With pic, you can draw lines and a limited variety of shapes--no color, no shading--but you can create a complex and detailed picture, if you're willing to work at it.
pic was developed before everyone had personal computers with sophisticated, mouse-based drawing packages. Today, troff users with graphics terminals can use mouse-based programs such as xcip. These programs provide many of the capabilities--except for color--of the sophisticated packages, and they do not require a knowledge of pic. xcip produces pic code, which you can edit if you know pic.
pic is no substitute for a sophisticated drawing tool. If you can use one of those tools, do so!
pic is a troff preprocessor used to create pictures (line drawings, actually). The code you write is processed by pic before the file is processed by troff. Often, you pipe the pic output of a file into troff as follows:
pic filename | troff -options
pic takes as input the commands between each .PS/.PE macro pair and converts that input into a line drawing. All other input is passed through without modification. The general format of a picture is as follows:
.PS
box ht 1 wid 1.25
.PE
Note that pic commands such as box do not begin with a period; pic passes lines that begin with a period through to troff untouched.
ms includes a .PF macro for picture flyback. This macro restores you to your last position on the page (vertically and horizontally) before the picture--where you were located before you invoked pic. This feature is rarely used. Some pic users surround their pic code with display macros and specify no-fill mode. Here's an example:
.DS
.nf
.PS
box ht 1 wid 1.25
.
.
.PE
.DE
This example might look like overkill, but mm likes it.
You also can use the .PS macro to do the following:
Whatever you do, do not include any spacing requests--.sp, .ls, .vs, .SP, and .P--inside your pic code. pic does its own spacing, and it gets really annoyed if you interfere. Use the move command instead.
pic keywords
pic is built on a series of keywords to build line drawings; these keywords are shown in Table 11.6.
Table 11.6. pic keywords.
All these primitives accept options and text (except for the text itself and comments, of course). The options are shown in Table 11.7. Table 11.7. pic keywords options.
pic also recognizes trigonometric and other mathematical functions, as follows:
These functions must be followed by an expression in parentheses. In the case of atan2, max, and min, two expressions must follow. rand is followed by empty parentheses and produces a random number between 0 and 1.
Other options and keywords are available for the creation of more complicated items such as object blocks (used for complex objects such as hexagons) and macros (for repeated commands). Look at the man page if you need this information.
More About Placement
To avoid having to think like pic--an exercise that can be dangerous to your mental health--you can refer to parts of objects that you have already drawn. pic recognizes all of the following:
pic also understands compass points. The position notation words and the compass points enable you to specify positions like the following:
line from upper right of 2nd last box to upper left of last box
arrow from 1st circle.e to 2nd circle.w
box at end of last line
move left 1i from start of last box
line from Box.c to Box.s
move down 1i from bottom of 2nd last ellipse
Now you have several ways of specifying positions to draw objects. You can write
.PS
box "Box 1"
move to last box.s down .5
box "Box 2"
.PE
Or you can write
.PS
box "Box 1"
move to bottom of last box down .5
box "Box 2"
.PE
If you want to avoid the wordiness of bottom of last box, you can label your construct as follows:
B1: box "Box 1"
Labels must begin with a capital letter. Using labels enables you to specify the two boxes as follows:
.PS
B1:box "Box 1"
B2:box with .c down 1i from B1.c "Box 2"
.PE
No matter which notation you use, you get the two boxes shown in Figure 11.4.
Figure 11.3.
The notations left, right, .ne, .sw, and others assume that you can tell left from right and east from west. If you are directionally challenged, you should allow extra debugging time when using pic. pic comes to your rescue with a solution. It also understands Cartesian coordinates, as shown in Figure 11.3.
Figure 11.4.
Again, the unit is inches. The important point to remember is that your first object starts at 0,0. In other words, the coordinates are relative. No specific location on a page or in a drawing is always 0,0. This location depends on where you start.
Cartesian coordinates enable you to specify the two boxes shown in Figure 11.4 as follows:
.PS
box at 0,0 "Box 1"
box at 0,-1 "Box 2"
.PE
You might find working with this approach easier.
Controlling Size
pic variables include several that specify the default size of pic objects. Table 11.8 lists these variables and their default values.
Table 11.8. Default values of pic variables.
arrowwid and arrowht refer to the arrowhead. The arrowhead variable specifies the fill style of the arrowhead. You can easily change the value of a variable as follows: boxht = .75; boxwid = .5 Remember: The default unit for pic is inches. You also can control the size of a picture in other ways. You can specify a height or a width--or both--on the .PS line. Specifying only the width is usually better. If you specify both dimensions, your picture may be distorted.
You can also set the variable scale. By default, scale is set at 100 or 1, depending on your version of pic. You can test it by scaling a drawing to 1.5. If you get an error message or a garbled result, use 150. All the dimensions in a pic drawing are divided by the scaling factor. Therefore, if the scale is normally 1 and you set it to 4, your 1-inch lines end up a quarter-inch long. Here's an example:
.PS
scale = 2
box ht 2i wid 2i
.PE
This code produces a box scaled down to half the size of its specifications, that is, a 1-inch square.
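The division that scale performs is easy to check by hand, but a quick calculation can save a wasted printout. The following is only a sketch of the arithmetic described above, not anything pic itself provides; printed_size is a hypothetical helper name, and it assumes a version of pic whose normal scale is 1.

```shell
# printed_size SPEC SCALE
# Hypothetical helper (not part of pic): divide the dimension you wrote
# in your pic code by the scale factor to get the size that is printed.
printed_size() {
    awk -v spec="$1" -v scale="$2" 'BEGIN { printf "%g\n", spec / scale }'
}

printed_size 2 2    # the 2i box drawn with scale = 2
printed_size 1 4    # a 1-inch line drawn with scale = 4
```

The two calls correspond to the 1-inch square and the quarter-inch line mentioned in the text.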
Debugging
When using pic, you are not troubleshooting; you are debugging. Debugging as you code is much easier. Draw the first element of your picture. Before you print it, send the file through pic to see whether any error messages are generated. If your file contains only pic, you can use the following:
pic filename
If your file contains text, just use your normal troff command line. However, instead of sending the file to a printer, redirect your output to /dev/null. pic tries to help you pinpoint your errors with messages similar to the following:
pic: syntax error near line 26
context is
>>> linr <<< left 1i
Occasionally, pic tells you that it has reduced the size of your picture. This result occurs almost always because you made a mistake. Most often, you left out a decimal point, and pic is trying to fit a line 1,625 inches long--you meant 1.625 inches--on an 8.5-inch page. When you get this result, your picture naturally is mangled out of all recognition.
Usually, your debugging involves the placement of objects and the placement or size of text.
pic Tips and Tricks
There are a few things that will make your life easier:
Creating Graphs with grap
grap is a troff preprocessor used to create graphs. The code you write is processed by grap and pic before the file is processed by troff. Often, you pipe the grap output of a file into pic and then into troff as follows:
grap filename | pic | troff -options
grap takes as input the commands between each .G1/.G2 macro pair and converts that input into a printable graph. All other input is passed through without modification. The general format of a graph is as follows:
.G1
grap command
value value
.G2
Because grap has a copy facility similar to that of pic, you can simplify your code even more by putting the data in a separate file.
.G1
copy "test.scores"
.G2
If you want the graph to have a solid line instead of scatter points, simply add a line of code that says draw solid immediately after the .G1 macro.
Adding Bells, Whistles, and Ticks
You can make your graph much more attractive by drawing a frame, adding labels, and specifying ticks. The code in Listing 11.3 produces a more sophisticated graph, which is shown in Figure 11.6.
Listing 11.3. grap source: Simple graph.
.G1
frame invis ht 2 wid 3 left solid bot solid
label left "1990" "Dollars" left .5
label bot "Grand Total: $210,000"
ticks left out at 6000 "6,000", 9000 "9,000", 12000 "12,000", 15000 "15,000",\
18000 "18,000", 21000 "21,000"
ticks bot at 1990 "1990", 1995 "1995", 2000 "2000", 2005 "2005", \
2010 "2010"
draw solid
copy "cost.child"
.G2
Figure 11.5.
The data file "cost.child" is shown in Listing 11.4.
Listing 11.4. grap: Contents of the cost.child file.
1990 4330
1991 4590
1992 4870
1993 5510
1994 5850
1995 6200
1996 6550
1997 6859
1998 7360
1999 7570
2000 8020
2001 8500
2002 10360
2003 10980
2004 11640
2005 13160
2006 13950
2007 14780
Here, the frame is shown only on the bottom and left side. The x and y coordinates have labels. Also, the ticks have been specified explicitly; they are not determined by grap.
You can specify ticks as out. This means that the ticks themselves, but not their labels, appear outside the grap frame. You can also specify ticks as in, in which case they appear inside the frame.
If you have too many dates to fit across the bottom of a graph, you might want to use apostrophes in the labels, as in '10, '20, and '30. To do so, you must tell grap that your label is a literal. If you want to specify the first and last dates in full, you need to use two tick lines. The following code, for example, produces bottom labels of 1900, '05, '10, '15, and so on, up to 1950:
ticks bottom out at 0 "1900", 50 "1950"
ticks bottom out from 05 to 45 by 5 "'%g"
Notice the words at 0 in the preceding example. grap recognizes x,y coordinates, and unlike pic, it understands that 0,0 is the intersection of the x and y axes. To use coordinates, you use the coord command as follows:
coord x first-value, y last-value
Without the coord command, grap automatically pads your first and last values, giving you blank space at the beginning and at the end of your graph. coord suppresses padding. Likewise, use coord if you want an exponential graph rather than a linear graph.
Adding Shapes and Other Features
grap does more than just draw lines. It can print a grid. It also can draw a circle, ellipse, or arrow. In fact, grap can draw just about anything that pic can. It also has a macro facility just like that of pic.
In addition to data explicitly specified in a file, grap works with functions that more or less describe the data. The code in Listing 11.5 produces a sine curve graph, which is shown in Figure 11.7.
Listing 11.5. grap source: Sine curve.
.G1
frame ht 1 wid 3
draw solid
pi=atan2(0,-1)
for i from 0 to 2*pi by .1 do { next at i, sin(i) }
.G2
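If you would rather keep the curve in a data file and read it with copy, you can generate the same points outside grap. The following is only a sketch: sine_points is a hypothetical helper name, and it uses awk (not grap) to step from 0 to 2*pi by .1, just as the for loop in the listing does.

```shell
# sine_points: print "x sin(x)" pairs for x = 0, 0.1, 0.2, ... up to 2*pi.
# Hypothetical helper; the output is meant for a grap data file.
sine_points() {
    awk 'BEGIN {
        pi = atan2(0, -1)                     # same trick as in the grap listing
        for (k = 0; k * 0.1 <= 2 * pi; k++)   # integer counter avoids rounding drift
            printf "%.1f %.4f\n", k * 0.1, sin(k * 0.1)
    }'
}

sine_points | head -n 3    # peek at the first few points
```

Redirect the output to a file of your choosing and name that file in a copy command between .G1 and .G2, with draw solid before it, as in the earlier cost.child example.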
Figure 11.6.
Summary of grap Commands
Table 11.9 summarizes the grap commands. Square brackets indicate that an argument is optional. A pipe between arguments means that you must use only one of the arguments.
Table 11.9. Summary of grap commands.
In addition to the commands listed in Table 11.9, grap provides for predefined strings and built-in functions. Predefined strings include bullet, plus, box, star, dot, times, htick, vtick, square, and delta. Built-in functions include log (base 10), exp (base 10), int, sin, cos, atan2, sqrt, min, max, and rand.
Formatting Programs with cw
cw is a troff preprocessor used to create constant-width text (resembling the output from line printers or text terminal screens). The code you write is processed by cw before the file is processed by troff. Often, you pipe the cw output of a file into troff as follows:
cw filename | troff -options
cw takes as input the commands between each .CW/.CN macro pair and converts that input into constant-width text. All other input is passed through without modification. The general format of constant-width text is as follows:
.CW
text to print in constant-width
more text (often program code or output that should look like it was on a text screen or printed report).
.CN
Note that the mm and mv macro packages contain .CW/.CN macros that perform many of these functions.
Building Literature References with refer
refer is a troff preprocessor used to create footnotes from a citation database. The citation you include is processed by refer before the file is processed by troff. Often, you pipe the refer output of a file into troff as follows:
refer filename | troff -ms -options
refer takes as input the citations between each .[/.] macro pair, searches the database, and inserts standard footnotes. The citation can be incomplete, and if more than one source matches, an error message is produced along with a list of matching titles. From that list, you can select the proper one and use it to expand the citation. All other input is passed through without modification.
The macro packages, such as ms, print the references from the output of refer. The location for the references to be printed is denoted by the following:
.[
$LIST$
.]
The general format of a refer citation is as follows:
.[
author title year etc.
.]
The reference database is in the following form for journal articles:
%T U\s-2NIX\s0 Time-Sharing System: U\s-2NIX\s0 Implementation
%K unix bstj
%A K. Thompson
%J Bell Sys. Tech. J.
%V 57
%N 6
%P 1931-1946
%D 1978
The reference database is in the following form for articles in books:
%A E. W. Dijkstra
%T Cooperating Sequential Processes
%B Programming Languages
%E F. Genuys
%I Academic Press
%C New York
%D 1968
%P 43-112
Listing 11.6 shows a simple troff document with embedded refer macros. Figure 11.8 shows the output.
Listing 11.6. refer/troff source.
This is text with a reference
.[
1978 Lesk
.]
and another one on the same line
.[
Ritchie The C Programming Language
.]
This is another line of text that does not have any references.
.[
$LIST$
.]
Figure 11.7.
If you use the -S option on the refer command line, the Social Science form of references is used in the format (author, year).
Building Permutated Indexes with ptx and mptx (Macros)
ptx is a troff preprocessor used to create permutated indexes from a document. The plain text (see the section on deroff later in this chapter) is processed by ptx before the file is processed by troff. Often, you pipe the ptx output of a file into troff as follows:
ptx filename | troff -mptx -options
ptx takes the input and breaks lines up by what seem like keywords to it. On the command line, you can specify a file that contains words to ignore (using the -i option) or specify a file that contains only the keywords to work with (using the -o option). The default ignore file is usually /usr/lib/eign.
The permutated file is sorted and lines are created in the following format:
.xx "" "before the keyword" "keyword" "after the keyword"
The mptx macro package has the definition for the .xx macro.
ptx is stupid; if you feed it a troff file, it tries to build an index of all the troff primitives you used.
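If deroff is not at hand, you can at least keep the request lines themselves away from ptx. The following is a rough sketch and no substitute for deroff: strip_requests is a hypothetical helper name, and it simply drops every line that begins with a period or an apostrophe. Inline escapes such as font changes are left in place.

```shell
# strip_requests [FILE...]
# Hypothetical, crude stand-in for deroff: discard lines that start with
# "." or "'", which is where troff requests and macro calls begin.
strip_requests() {
    grep -v "^[.']" "$@"
}

# Example with a three-line document; only the text line survives.
printf '%s\n' '.PP' 'Permuted indexes are handy.' '.sp 2' | strip_requests

# A possible pipeline (file names are placeholders):
#   strip_requests chapter.ms | ptx | troff -mptx
```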
Using spell
Preparing a document that looks splendid is a lot of work, and you don't want it marred by spelling mistakes. The spell program catches most of your mistakes. An interactive version of spell, called ispell, also is available on some systems.
spell uses a standard dictionary. It checks the words in your file against this dictionary and outputs a list of words not found in the dictionary. You can also create a personal dictionary to make spell more useful (described in the next section). spell is smart enough to ignore most troff, tbl, and eqn/neqn requests and macros. If your file includes .so or .nx requests, spell searches the sourced in files. To invoke spell, type spell and your filename. All the words that spell does not recognize are displayed on your screen, one word per line. This list of unrecognized words is arranged in ASCII order. That is, special characters and numbers come first, uppercase letters come next, and then lowercase letters. In other words, the words are not in the order in which they occur in your file. Each unrecognized word appears only once. Therefore, if you typed teh for the 15 times, teh appears only once in the spell output. The list of unrecognized words can be very long, especially if your text is full of acronyms or proper names or if you make frequent typographical errors. The first few screens will speed by at what seems like 1,000 miles per hour, and you will not be able to read them at all. To read all the screens, redirect the output of spell to a file as follows: $ spell filename > outputfilename
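To get a feel for the order in which the unrecognized words arrive, you can imitate it with sort. This is only an illustration of ASCII ordering with duplicates removed; it is not how spell is implemented, and ascii_order is a hypothetical helper name.

```shell
# ascii_order: sort standard input in plain ASCII order and drop duplicates.
# LC_ALL=C forces byte-value collation regardless of your locale.
ascii_order() {
    LC_ALL=C sort -u
}

# "teh" is typed twice but is listed once; digits come before uppercase
# letters, and uppercase letters come before lowercase letters.
printf '%s\n' teh Leee 3rd teh xyzzy | ascii_order
```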
After you create the file of unrecognized words, you can handle it in several ways:
Now correct your mistakes. The list probably contains a number of words that are perfectly legitimate. For example, spell refuses to recognize the words diskette and detail. There is no good reason for this refusal, but it may spur you to create a personal dictionary. To correct your mistakes, first edit your file. Next, do one of the following:
Some risk is associated with using the global method. For example, if I run spell on this chapter, teh would appear on the list of unrecognized words. Then if I globally change all occurrences of teh to the, this chapter, or at least this section, would be virtually incomprehensible. The moral is, use global substitutions wisely, and never use them on someone else's files. After you correct your file, run it through spell once more just to be sure. The new output overwrites the old file.
Occasionally, spell finds something like ne. Searching for all instances of ne is not fun, especially in a file with 2,000 lines. You can embed spaces in your search as follows: /[space]ne[space]. However, this approach is rarely helpful because spell ignores punctuation marks and special characters. If you typed This must be the o ne, with a comma directly after ne, the search /[space]ne[space] does not find the error. You can try searching with one embedded space--for example, /[space]ne and /ne[space]--but you still may not find the offender. Try /\<ne\>. This example finds ne as a complete word--that is, surrounded by spaces, at the beginning or end of a line, or followed by punctuation.
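You can run the same whole-word search from the command line before you open the file in your editor. The following sketch assumes that your grep supports the -w (whole word) option, which is common but worth confirming in the man page; find_word is a hypothetical helper name.

```shell
# find_word WORD [FILE...]
# Hypothetical helper: print each line, with its line number, on which
# WORD appears as a complete word.
find_word() {
    word=$1
    shift
    grep -nw -- "$word" "$@"
}

# "one" and "line" contain ne but are not reported; the stray "ne," is.
printf '%s\n' 'one line' 'This must be the o ne, I think.' | find_word ne
```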
Creating a Personal DictionaryIf your name is Leee--with three e's--and you get tired of seeing it in the list of unrecognized words, you can add Leee to a personal dictionary. To create a personalized dictionary, follow these steps:
Your personal dictionary does not have to be in the same directory as your input files. If it is not, you must specify a path on the command line, as follows:
$ spell +/dict/mydict inputfile > w
Creating Specialized Dictionaries
Personalized dictionaries are a great help if you're working on several writing projects, each of which has a specialized vocabulary. For example, if you're working on the XYZZY project, and the same words keep turning up in your w file--words that are perfectly okay in the context of the XYZZY system but not okay in any other files--you can create an xyzzy.dict.
An easy way to automate some of the steps necessary for creating a specialized dictionary is to run spell on your first file. Here's an example:
$ spell ch01 > w
Then you can run it on all the rest of your files. Append the output to w, instead of replacing w. For example,
$ spell ch02 >> w
At this point, you will have a long file that contains all the words that spell does not recognize. First, you need to sort the file and get rid of the duplicates, as in the following example. (Refer to the sort command in Chapter 5 of UNIX Unleashed, System Administrator's Edition, "General Commands.")
$ sort -u w > sorted.w
Here, the -u option stands for unique. sort drops all the duplicates from the list.
Now edit sorted.w, deleting all the misspelled words and all words not specific to your XYZZY project. The words that remain form the basis of xyzzy.dict. Change the name of sorted.w to xyzzy.dict by typing mv sorted.w xyzzy.dict. You can add words to or delete words from this file as necessary.
Repeat this process to create additional specialized dictionaries. And if you want to be nice, you can share your specialized dictionaries with your colleagues.
Using ispell
ispell is an interactive version of spell. It works like the spell checkers that come with word processing applications. That is, it locates the first word in your file that it does not recognize and stops there.
Then you can correct the word or press Enter to continue. ispell uses the same dictionary as spell. To invoke ispell, do one of the following:
Although some people prefer ispell, the unadorned, ordinary spell is more useful if you want to create personal or specialized dictionaries or if you want to make global changes to your input file.

/dev/null: The Path to UNIX Limbo

As you are surely tired of hearing, UNIX views everything as a file, including devices (such as your terminal or the printer you use). Device files are stored neatly in the /dev directory. Occasionally, you specify devices by their filenames (for example, when you're reading a tape or mounting a disk drive), but most often you do not bother to think about device files. You might want to use one device file, however: /dev/null.

The null file in the /dev directory is just what it sounds like: nothing. It is the equivalent of the fifth dimension, the incinerator chute, or the bit bucket. If you send something there, you cannot get it back--ever.

Why would you want to send output to /dev/null? If you just created a complex table (or picture, graph, or equation), you can check your creation without wasting paper. Just direct the output to /dev/null:

    tbl filename > /dev/null
    eqn filename > /dev/null
    pic filename > /dev/null

Only standard output is discarded, so you still see any error messages on your screen. This approach is usually more reliable than using checkeq. And you can use it for text files.

Counting Words with wc

Sometimes you need to count the words in a document. UNIX has the tool for you. The wc command counts lines, words, and characters. It can give you a total if you specify more than one file as input. To count the words in ch01, enter wc -w ch01. You can count lines by using the -l option or characters by using the -c option. Bear in mind, however, that wc counts all your macros as words. Refer to Chapter 5 in UNIX Unleashed, System Administrator's Edition for more details on wc.
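Because wc counts macro lines as words, one rough way to get a figure closer to the actual prose is to drop request and macro lines before counting. The following is only a sketch under stated assumptions: the file name sample.ms and its contents are invented for illustration, and filtering out lines that begin with a period is a crude stand-in for deroff (it does not strip inline escapes such as font changes).

```shell
# Build a small sample troff file (hypothetical content, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' \
    '.H 1 "Introduction"' \
    'The quick brown fox' \
    '.P' \
    'jumps over the lazy dog.' > "$tmpdir/sample.ms"

# Raw count: macro lines are included in the total.
# tr removes the padding that some versions of wc add around the number.
raw_count=$(wc -w < "$tmpdir/sample.ms" | tr -d ' ')

# Approximate prose count: discard lines that start with a period
# (requests and macros) before counting the remaining words.
prose_count=$(grep -v '^\.' "$tmpdir/sample.ms" | wc -w | tr -d ' ')

echo "raw: $raw_count  prose: $prose_count"
rm -r "$tmpdir"
```

If deroff is available on your system, piping its output into wc -w should give a more faithful count than this line filter.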
Using diction, explain, and style to Check Grammar

In addition to the tools you use to develop a document physically, some versions of UNIX provide tools to help develop the English within that document. Many word processors provide these features; they were originally created in the UNIX environment.

Using diction

diction processes troff-formatted files. It finds those sentences in the document that contain phrases matching its database of wordy or obfuscated writing. Each matching phrase is enclosed within brackets ([ and ]) to highlight it to the writer.

Prior to checking the file, diction runs deroff to remove the troff requests and macros. By default, ms macro package macros are removed, and the -mm option switches to the mm macro package. The use of other macro packages or your own could cause diction to break sentences improperly and miss phrases that match the database. By default, the pattern database is contained in /usr/lib/dict.d.

The typical use of diction is as follows:

    diction trofffile > diction.out

When run on the preceding description of the diction command, diction produces the following:

    trofffile
    *[ Prior to ]* checking the file, diction runs deroff to remove the troff requests and macros.
    number of sentences 9 number of phrases found 1

Using explain

explain is an online thesaurus. It takes the phrases identified by diction and suggests more readable replacements. Some versions accept the output of diction as piped input, whereas others require interactive input.

explain takes each phrase entered and looks it up in the thesaurus (usually /usr/lib/explain.d). When it finds a match, it displays the suggested replacement. explain has no command-line arguments. When you're done checking phrases, press Ctrl+D to quit.

The preceding diction command found one phrase, and explain has the following suggestion:

    $ explain
    phrase?
    prior to
    use "before" for "prior to"
    phrase?
    ^D
    $

Using style

style processes troff-formatted files.
It processes the input file and reports the readability, length and structure of sentences, length and usage of words, verb types, and openers of sentences. When run on the preceding description of the diction command, style produces the following:

inputfilename
readability grades:
(Kincaid) 8.3 (auto) 9.2 (Coleman-Liau) 11.2 (Flesch) 8.8 (62.1)
sentence info:
no. sent 8 no. wds 119
av sent leng 14.9 av word leng 4.93
no. questions 0 no. imperatives 0
no. nonfunc wds 70 58.8% av leng 6.46
short sent (<10) 25% (2) long sent (>25) 0% (0)
longest sent 23 wds at sent 6; shortest sent 8 wds at sent 1
sentence types:
simple 62% (5) complex 12% (1)
compound 0% (0) compound-complex 25% (2)
word usage:
verb types as % of total verbs
tobe 31% (5) aux 6% (1) inf 19% (3)
passives as % of non-inf verbs 23% (3)
types as % of total
prep 11.8% (14) conj 4.2% (5) adv 0.8% (1)
noun 34.5% (41) adj 11.8% (14) pron 5.0% (6)
nominalizations 4 % (5)
sentence beginnings:
subject opener: noun (2) pron (2) pos (0) adj (0) art (2) tot 75%
prep 25% (2) adv 0% (0)
verb 0% (0) sub_conj 0% (0) conj 0% (0)
expletives 0% (0)
Prior to checking the file, style runs deroff to remove the troff requests and macros. By default, ms macro package macros are removed, and the -mm option switches to the mm macro package. The use of other macro packages or your own could cause style to break sentences improperly and calculate readability incorrectly. The -a option prints the individual sentences with their readability information.

The typical use of style is as follows:

    style inputfilename > style.out

Using grep

The grep command is an invaluable aid to writers. You use it primarily for checking the organization of a file or collection of files, and for finding occurrences of a character string.

Checking the Organization of a Document

If you're writing a long, complex document--especially one that uses three or more levels of headings--you can make sure that your heading levels are correct and also produce a rough outline of your document at the same time.
For example, if your heading macros take the form

    .H n "heading"

a first-level heading might be

    .H 1 "Introduction to the XYZZY System"

If your chapters are named ch01, ch02, and so on through chn, the following command searches all your chapter files for all instances of the .H macro. It writes each matching line, prefixed with the name of the file that contains it, to a file called outline.

    $ grep "\.H " ch* > outline

You need the backslash to escape the special meaning of the period. You need the space after H so that you don't inadvertently include other macros with names such as .HK or .HA. The quotation marks are used to include that space.

You can view your outline file by using vi, or you can print it. At a glance, you can see whether you mislabeled a heading in Chapter 1, omitted a third-level heading in Chapter 4, and so forth. You also have an outline of your entire document. Of course, you can edit the outline file to produce a more polished version.

Finding Character Strings

If you just finished a 1,000-page novel and suddenly decide--or are told by your editor--to change a minor character's name from Pansy to Scarlett, you might edit every one of your 63 files, search for Pansy, and change it to Scarlett. But Scarlett is not in every chapter--unless you just wrote Scarlett II. So why aggravate yourself by editing all 63 files when you need to change only a few? grep can help you.

To use grep to find out which files contain the string Pansy, enter the following:

    $ grep "Pansy" ch* > pansylist

Here, the quotation marks are not strictly necessary, but always using them is a good habit. In other situations, such as the previous example, you need them.

This command creates a file called pansylist, which looks something like this:

    ch01:no longer sure that Pansy was
    ch01:said Pansy.
    ch07:wouldn't dream of wearing the same color as Pansy O'Hara.
    ch43:Pansy's dead. Pansy O'Hara is dead.
    ch57:in memory of Pansy. The flowers were deep purple and yellow

Now you know which chapters you have to edit: 1, 7, 43, and 57. To change Pansy to Scarlett globally, open each file that contains the string Pansy in vi and enter the following command. Make sure that you are in Command mode, not Insert mode.

    :1,$ s/Pansy/Scarlett/g

The g at the end of this command is important. If the string Pansy occurs more than once in a line, as it does in Chapter 43, g ensures that all instances are changed to Scarlett.
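If all you need is the list of files to edit, grep's -l option prints the name of each matching file once instead of printing every matching line. The sketch below uses a scratch directory and made-up chapter files purely for illustration; the output file name pansyfiles is also an invented name.

```shell
# Create a scratch directory with three tiny, made-up chapter files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'said Pansy.' 'Pansy laughed.' > ch01
printf '%s\n' 'Nothing to change here.' > ch02
printf '%s\n' "Pansy's dead. Pansy O'Hara is dead." > ch43

# -l lists each file that contains the string once, without the matching lines.
grep -l "Pansy" ch* > pansyfiles
cat pansyfiles
```

A list in this form can be handed straight to an editor or a loop, which saves you from picking the chapter names out of the longer pansylist output by eye.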
Searching the Spelling Dictionary for Words

If you are stuck on a word or are unsure of its spelling, you can search the same dictionary that the spell and ispell commands use. The dictionaries are usually stored in /usr/dict/words or /usr/share/lib/dict/words. If you're trying to remember the spelling of the word queue, for example, and can remember only that it begins with que, you can use the following grep command:

    grep "^que" /usr/dict/words

Because grep searches the entire line, and you're looking for a word that begins with que, you use the caret symbol (^) to anchor the search to the beginning of the line.

You can also search the explain command's thesaurus the same way (usually /usr/lib/explain.d).

For more information about grep, see Chapter 5, Volume 1, "General Commands."

Using sed

The UNIX stream editor, sed, provides another method of making global changes to one or more files.
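As a minimal sketch of such a global change, the loop below runs the same substitution over several files and writes each result to a new file, leaving the originals untouched until you have checked the output. The file names and their contents are invented for illustration, and the .new suffix is just one possible naming choice.

```shell
# Set up two made-up chapter files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' "Pansy's dead. Pansy O'Hara is dead." > ch43
printf '%s\n' 'in memory of Pansy.' > ch57

# Apply the same substitution to every chapter. sed writes to standard
# output, so redirect to a new file rather than overwriting the input.
for f in ch43 ch57
do
    sed 's/Pansy/Scarlett/g' "$f" > "$f.new"
done

cat ch43.new ch57.new
```

Writing to a separate file first is deliberate: redirecting sed's output onto its own input file would truncate the file before sed reads it.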
You can use sed in two ways: with the editing commands on the command line or with a sed script. You can create a script that changes all occurrences of the first argument to the second argument, for example. However, you probably don't want to go to all this trouble just to change Pansy to Scarlett. Because you can specify more than one command with sed--in the command-line form and in the sed script form--sed is a useful and powerful tool. Instead of your having to manually edit each chapter requiring the change (with vi), you can set up sed to change all the appropriate chapters.

Using diffmk

diffmk comes from the many diff commands offered by the UNIX system. Its purpose is to apply difference marks to text--that is, to mark text that has changed from one version of a file to another. The text is marked with a vertical bar in the right margin. Sometimes, other characters creep in, especially with tables.

Use diffmk as follows:

    $ diffmk oldfile newfile difffile

The order is important. If you get it wrong, diffmk blithely prints your old file with the difference marks on it. That probably is not what you want.

Often your files are in two different directories--possibly because the files have the same names. Suppose that you have a ch01 in the draft2 directory and in the draft3 directory. You can specify a pathname for diffmk, and you can even write the difference-marked files into a third directory. The third directory must already exist; diffmk does not create it for you. The following command difference-marks files in two directories and writes them into a third directory. It assumes that your current directory is draft3.

    $ diffmk ../draft2/file1 file1 ../diffdir/dfile1

If you have many chapters, you might want to consider a shell script. To create a shell script that difference-marks all the files in the draft3 directory against the files in the draft2 directory, follow these steps:
$ ls > ../difflist

(The list is written to the parent directory so that difflist itself does not appear in it.)

for i in `cat ../difflist`
do
    diffmk ../draft2/$i $i ../diffdir/d$i
done
$ chmod +x diffscript
$ mv diffscript $HOME/bin
$ diffscript

The man Command

The man command consults a database of stored UNIX system documentation--basically everything that is in the system reference manuals--formats the requested entry with nroff, and then writes it to your screen. If you do not have all the documentation on a shelf in your office, the man command can save the day.

man is simple to use:

    man commandname

The output is far from beautiful, and producing it is slow. It is paged to your screen, so you must press Enter when you're ready to go on to the next page. You cannot backtrack, though. After you leave the first screen--that is, the one with the command syntax on it--the only way you can see it again is to run the man command a second time.

If your terminal has windowing or layering capabilities, man is more useful because you can look at its output and type on your command line at the same time.

You can also print the output from man, but you might not know which printer the output is going to. If you work in a multi-printer environment, knowing which printer to use can be a nuisance. Check with your system administrator.

Using SCCS to Control Documentation

Although the Source Code Control System--SCCS for short--was written to keep track of program code, it also makes a good archiving tool for documentation. It saves each version of a text file--code, troff input, and so on--and essentially enables only the owner to change the contents of the file. SCCS is described in detail in Chapter 26, "Introduction to SCCS." You can use SCCS to control versions of a document that you often revise. You can also use SCCS on drafts of a document. If you work with a publications group, and your group does not have a good archiving and document control system, look into SCCS.

deroff, or Removing All Traces of nroff/troff

Sometimes documentation customers want electronic files as well as hard copy. If they cannot handle troff, they may request ASCII files from you. You can comply with this request in one simple way: use deroff.
deroff removes all troff requests, macros, and backslash constructs. It also removes tbl commands (that is, everything between the .TS and the .TE) and equation commands (everything between the .EQ and the .EN or between the defined eqn delimiters). It can follow a chain of included files, so if you sourced in a file with .so or .nx, deroff operates on these files, too. You can suppress this feature by using the -i option, which simply removes the .so and .nx lines from your file.

Other options are -mm and -ml. -mm completely deletes any line that starts with a macro, which means that all your headings disappear. The -ml option invokes -mm and removes all lists. This may be just as well. deroff does not work well with nested lists.

deroff, like nroff and troff, can process multiple files.

To use deroff, enter the following:

    $ deroff options inputfilename > outputfilename

By default, ms macro package macros are removed, and the -mm option switches to the mm macro package.

If you forget to redirect the output to a file, you see the denuded, deroffed file streaking across your screen.

Summary

UNIX provides a wide variety of tools for the software developer. It also provides many tools for the documentation developer. Some of the tools are useful to both. Developing documentation is no easier than anything else within the UNIX system, but these powerful tools can enable you to print just about anything you can imagine. From the simplest print command (lp) to the complexities of preprocessors, troff, and postprocessors, you can control the process and achieve outstanding results.