.\" Automatically generated by Pod::Man v1.37, Pod::Parser v1.32
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. | will give a
.\" real vertical bar. \*(C+ will give a nicer C++. Capital omega is used to
.\" do unbreakable dashes and therefore won't be available. \*(C` and \*(C'
.\" expand to `' in nroff, nothing in troff, for use with C<>.
.tr \(*W-|\(bv\*(Tr
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.if \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.\"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.hy 0
.if n .na
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear. Run. Save yourself. No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "TEXTMAIL 1"
.TH TEXTMAIL 1 "2007-08-02" "perl v5.8.8" "User Contributed Perl Documentation"
.SH "NAME"
\&\fItextmail\fR \- mail filter to replace MS Word/HTML attachments with plain text
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 28
\& usage: textmail [options]
\& options:
\&   -h       - Print the help message then exit
\&   -m       - Print the manpage then exit
\&   -w       - Print the manpage in html format then exit
\&   -r       - Print the manpage in nroff format then exit
\&   -M       - Output in mailbox format (mboxrd)
\&   -T       - Output in raw mail format (for smtp)
\&   -W       - Don't replace MS Word attachments with text
\&   -E       - Don't replace MS Excel attachments with csv
\&   -H       - Don't replace HTML attachments with text
\&   -R       - Don't replace RTF attachments with text
\&   -P       - Don't replace PDF attachments with text
\&   -U       - Don't translate winmail.dat attachments
\&   -L       - Don't reduce appledouble attachments
\&   -I       - Don't delete image attachments
\&   -A       - Don't delete audio attachments
\&   -V       - Don't delete video attachments
\&   -X       - Don't delete MS Windows executable attachments
\&   -B       - Don't recode text that was base64-encoded
\&   -S       - Don't replace spaces in filenames with underscores
\&   -Z       - Do translate signed content (discards signatures)
\&   -O       - Delete all application/octet-stream attachments
\&   -!       - Delete all application/* attachments
\&   -D hdrs  - Delete headers (list of header prefixes and filenames)
\&   -K types - Keep attachments (list of mimetypes and filenames)
\&   -f       - On translation error, keep translation, not original
\&   -?       - Print paths of helper applications then exit
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
\&\fItextmail\fR filters a mail message or mbox, replacing \s-1MS\s0 Word,
\&\s-1MS\s0 Excel, \s-1HTML\s0, \s-1RTF\s0 and \s-1PDF\s0 attachments with the
plain text contained therein.
By default, the following attachments are also deleted: image, audio, video
and \s-1MS\s0 Windows executables.
\&\s-1MS\s0 \f(CW\*(C`winmail.dat\*(C'\fR attachments are replaced by any
attachments contained therein which are then replaced by text or deleted in
the same fashion.
Any of these actions can be suppressed with the command line options.
Mail headers can also be selectively deleted.
.PP
This is useful for increasing the accessibility of mail messages (by reducing
their dependence on proprietary file formats), for dramatically reducing
their size (and the time it takes to download them and the time it takes to
read them), and for dramatically reducing the risk of mail-borne viruses.
Its intended use is as a preprocessor for mailing lists.
This is more friendly than a strict \*(L"No Attachments\*(R" policy.
.SH "OPTIONS"
.IX Header "OPTIONS"
.ie n .IP """\-h""" 4
.el .IP "\f(CW\-h\fR" 4
.IX Item "-h"
Print the help message then exit.
.ie n .IP """\-m""" 4
.el .IP "\f(CW\-m\fR" 4
.IX Item "-m"
Print the manpage then exit.
This is equivalent to executing \f(CW\*(C`man textmail\*(C'\fR but this
works even when the manpage isn't installed.
.ie n .IP """\-w""" 4
.el .IP "\f(CW\-w\fR" 4
.IX Item "-w"
Print the manpage in \s-1HTML\s0 format then exit.
This lets you install the manpage in \s-1HTML\s0 format with a command like:
.Sp
.Vb 2
\&  mkdir -p /usr/local/share/doc/textmail/html &&
\&    textmail -w > /usr/local/share/doc/textmail/html/textmail.1.html
.Ve
.ie n .IP """\-r""" 4
.el .IP "\f(CW\-r\fR" 4
.IX Item "-r"
Print the manpage in nroff format then exit.
This lets you install the manpage with a command like:
.Sp
.Vb 1
\&  textmail -r > /usr/local/share/man/man1/textmail.1
.Ve
.ie n .IP """\-M""" 4
.el .IP "\f(CW\-M\fR" 4
.IX Item "-M"
This option causes the output to be in mboxrd format by adding a mailbox
\&\f(CW\*(C`From\*(C'\fR line at the top if there isn't one already and by
ensuring that there is a blank line at the bottom of the output.
It also performs mailbox quoting on any lines in the body that look like
mailbox \f(CW\*(C`From\*(C'\fR headers.
Use this when the output is to be stored directly in a mailbox file.
It is not necessary when \fItextmail\fR is being used as a mail filter by
\&\fI\fIprocmail\fI\|(1)\fR.
.ie n .IP """\-T""" 4
.el .IP "\f(CW\-T\fR" 4
.IX Item "-T"
This option causes the output to be in raw mail format by removing any
mailbox \f(CW\*(C`From\*(C'\fR line and by not performing mailbox quoting.
Use this when the output is to be sent directly to an \s-1SMTP\s0 server.
It is not necessary when \fItextmail\fR is being used as a mail filter by
\&\fI\fIprocmail\fI\|(1)\fR.
.ie n .IP """\-W""" 4
.el .IP "\f(CW\-W\fR" 4
.IX Item "-W"
By default, \fItextmail\fR replaces \s-1MS\s0 Word attachments with inline
plain text attachments that contain just the plain text within the original
document.
This option leaves \s-1MS\s0 Word attachments intact.
.ie n .IP """\-E""" 4
.el .IP "\f(CW\-E\fR" 4
.IX Item "-E"
By default, \fItextmail\fR replaces \s-1MS\s0 Excel attachments with
\&\s-1CSV\s0 file attachments that contain just the data within the original
document.
This option leaves \s-1MS\s0 Excel attachments intact.
.ie n .IP """\-H""" 4
.el .IP "\f(CW\-H\fR" 4
.IX Item "-H"
By default, \fItextmail\fR replaces \s-1HTML\s0 attachments with inline
plain text attachments that contain just the text within the original
document.
It also reduces text-versus-html alternative attachments to just the text
attachment.
This option leaves \s-1HTML\s0 (and alternative) attachments intact.
.ie n .IP """\-R""" 4
.el .IP "\f(CW\-R\fR" 4
.IX Item "-R"
By default, \fItextmail\fR replaces \s-1RTF\s0 attachments with inline plain
text attachments that contain just the plain text within the original
document.
This option leaves \s-1RTF\s0 attachments intact.
.ie n .IP """\-P""" 4
.el .IP "\f(CW\-P\fR" 4
.IX Item "-P"
By default, \fItextmail\fR replaces \s-1PDF\s0 attachments with inline plain
text attachments that contain just the plain text within the original
document.
This option leaves \s-1PDF\s0 attachments intact.
.ie n .IP """\-U""" 4
.el .IP "\f(CW\-U\fR" 4
.IX Item "-U"
By default, \fItextmail\fR replaces \s-1MS\s0 \s-1TNEF\s0 (i.e.
\&\f(CW\*(C`winmail.dat\*(C'\fR) attachments with the attachments contained
therein which are then translated to text as normal.
This option leaves \f(CW\*(C`winmail.dat\*(C'\fR attachments intact.
This option, together with the \f(CW\*(C`\-!\*(C'\fR option, will cause
\&\f(CW\*(C`winmail.dat\*(C'\fR attachments to be deleted rather than
translated.
.ie n .IP """\-L""" 4
.el .IP "\f(CW\-L\fR" 4
.IX Item "-L"
By default, \fItextmail\fR replaces \f(CW\*(C`multipart/appledouble\*(C'\fR
attachments with just the data fork attachment contained therein which is
then translated to text as normal.
This option leaves appledouble attachments intact.
However, the data fork attachment will still be translated as normal
resulting in a probably inappropriate and possibly broken resource fork
attachment.
Therefore, this option should probably only be used in conjunction with
other options that suppress the translation of the data fork attachment.
.ie n .IP """\-I""" 4
.el .IP "\f(CW\-I\fR" 4
.IX Item "-I"
By default, \fItextmail\fR deletes image attachments.
This option leaves image attachments intact.
.ie n .IP """\-A""" 4
.el .IP "\f(CW\-A\fR" 4
.IX Item "-A"
By default, \fItextmail\fR deletes audio attachments.
This option leaves audio attachments intact.
.ie n .IP """\-V""" 4
.el .IP "\f(CW\-V\fR" 4
.IX Item "-V"
By default, \fItextmail\fR deletes video attachments.
This option leaves video attachments intact.
.ie n .IP """\-X""" 4
.el .IP "\f(CW\-X\fR" 4
.IX Item "-X"
By default, \fItextmail\fR deletes attachments containing \s-1MS\s0 Windows
executables.
That means \f(CW\*(C`application/octet\-stream\*(C'\fR attachments with the
following filename extensions:
\&\f(CW\*(C`com\*(C'\fR, \f(CW\*(C`exe\*(C'\fR, \f(CW\*(C`pif\*(C'\fR,
\&\f(CW\*(C`dll\*(C'\fR, \f(CW\*(C`ocx\*(C'\fR, \f(CW\*(C`scr\*(C'\fR,
\&\f(CW\*(C`vbs\*(C'\fR and \f(CW\*(C`js\*(C'\fR.
This option leaves \s-1MS\s0 Windows executable attachments intact.
To delete \f(CW\*(C`zip\*(C'\fR files as well, you could use either the
\&\f(CW\*(C`\-O\*(C'\fR option or the \f(CW\*(C`\-!\*(C'\fR option.
.ie n .IP """\-B""" 4
.el .IP "\f(CW\-B\fR" 4
.IX Item "-B"
By default, when text is encountered that is
\&\f(CW\*(C`base64\*(C'\fR\-encoded, \fItextmail\fR will recode it as either
\&\f(CW\*(C`7bit\*(C'\fR or \f(CW\*(C`quoted\-printable\*(C'\fR, whichever is
appropriate.
This option suppresses this recoding.
Note that if the text is large enough and contains a high enough proportion
of non-ASCII characters, it will remain \f(CW\*(C`base64\*(C'\fR\-encoded to
minimise space.
.ie n .IP """\-S""" 4
.el .IP "\f(CW\-S\fR" 4
.IX Item "-S"
When translating attachments, \fItextmail\fR replaces bad filename
characters such as space characters with the underscore character.
This option causes underscore characters to subsequently be converted into
space characters.
In other words, you can use this option to preserve space characters in
attachment filenames (other bad filename characters will then be converted
to spaces as well).
.ie n .IP """\-Z""" 4
.el .IP "\f(CW\-Z\fR" 4
.IX Item "-Z"
By default, \fItextmail\fR will not translate
\&\f(CW\*(C`multipart/signed\*(C'\fR attachments.
This option causes \f(CW\*(C`multipart/signed\*(C'\fR attachments to be
replaced by the signed attachment contained therein, discarding the
signature control data.
The no-longer-signed data is then translated to text as normal.
Note that \f(CW\*(C`multipart/encrypted\*(C'\fR attachments are never
translated.
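.Sp
As an illustrative sketch only, assuming an existing mailbox file named
\&\f(CW\*(C`mailbox\*(C'\fR (both filenames here are placeholders), signed
attachments could be unwrapped and translated, with output in mailbox
format, by combining this option with \f(CW\*(C`\-M\*(C'\fR in a command like:
.Sp
.Vb 1
\&  textmail -MZ < mailbox > mailbox-as-text
.Ve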
.ie n .IP """\-O""" 4
.el .IP "\f(CW\-O\fR" 4
.IX Item "-O"
Delete all \f(CW\*(C`application/octet\-stream\*(C'\fR attachments, not
just \s-1MS\s0 Windows executables.
Note that this overrides \f(CW\*(C`\-X\*(C'\fR but \f(CW\*(C`\-K\*(C'\fR
overrides this.
.ie n .IP """\-!""" 4
.el .IP "\f(CW\-!\fR" 4
.IX Item "-!"
Delete all \f(CW\*(C`application/*\*(C'\fR attachments.
Note that this overrides \f(CW\*(C`\-X\*(C'\fR but
\&\f(CW\*(C`\-K\*(C'\fR overrides this.
Also note that translated documents are no longer
\&\f(CW\*(C`application/*\*(C'\fR attachments so they aren't deleted unless
their translation is suppressed with the appropriate command line option.
.ie n .IP """\-D""\fR \fIhdrs" 4
.el .IP "\f(CW\-D\fR \fIhdrs\fR" 4
.IX Item "-D hdrs"
Delete particular headers.
The \fIhdrs\fR argument is a comma-separated list of header name prefixes
and/or the names of files containing header name prefixes (blank lines,
whitespace and shell-style comments are ignored).
For example, \f(CW\*(C`textmail \-DX\-\*(C'\fR deletes all headers whose
names begin with \f(CW\*(C`X\-\*(C'\fR.
.ie n .IP """\-K""\fR \fItypes" 4
.el .IP "\f(CW\-K\fR \fItypes\fR" 4
.IX Item "-K types"
By default, \fItextmail\fR deletes several types of non-text attachment.
The \f(CW\*(C`\-O\*(C'\fR and \f(CW\*(C`\-!\*(C'\fR options delete even
more.
This option specifies, by mimetype and/or filename extension, a list of
attachments not to delete.
This overrides all deletions.
.Sp
The \fItypes\fR argument is a comma-separated list of mimetypes and/or
filename extensions and/or the names of files containing mimetypes and/or
filename extensions (blank lines, whitespace and shell-style comments are
ignored).
Note that each element is interpreted as a complete mimetype if it contains
a slash character; otherwise it is interpreted either as the
\&\f(CW\*(C`*\*(C'\fR in \f(CW\*(C`application/*\*(C'\fR or as a filename
extension.
For example, \f(CW\*(C`textmail \-Wf!Kdoc\*(C'\fR deletes all
\&\f(CW\*(C`application/*\*(C'\fR attachments except \s-1MS\s0 Word
documents.
.ie n .IP """\-f""" 4
.el .IP "\f(CW\-f\fR" 4
.IX Item "-f"
Whenever \fItextmail\fR is unable to translate any attachment into text, it
will leave the attachment intact.
This happens when the requisite translation software can't be found, when it
runs but returns an error code, and when it produces an empty file.
It also happens when \f(CW\*(C`winmail.dat\*(C'\fR attachments are corrupt.
This option causes the empty translation to take the place of the original
attachment.
Only the name of the attachment is preserved.
This is needed to ensure plain text even in the face of an \s-1MS\s0 Word
document that contains no text (e.g. only images).
.ie n .IP """\-?""" 4
.el .IP "\f(CW\-?\fR" 4
.IX Item "-?"
Print the paths of all helper applications then exit.
.SH "EXAMPLES"
.IX Header "EXAMPLES"
A \fI\fIprocmail\fI\|(1)\fR recipe that insists on pure text and no
\&\f(CW\*(C`X\-\*(C'\fR headers (with output in mailbox format):
.PP
.Vb 2
\&  :0 fw
\&  | textmail -Mf!DX-
.Ve
.PP
Do the same but to an existing mailbox file:
.PP
.Vb 1
\&  textmail -Mf!DX- < mailbox > mailbox-as-text
.Ve
.PP
Delete all \f(CW\*(C`application/*\*(C'\fR attachments except for
PostScript and \s-1PDF\s0 (and don't translate \s-1PDF\s0 into text):
.PP
.Vb 1
\&  textmail -!PKps,pdf
.Ve
.PP
Delete all \f(CW\*(C`application/*\*(C'\fR attachments except for zip files
and gzipped tar files:
.PP
.Vb 1
\&  textmail -!Ktar.gz,zip
.Ve
.PP
A \fI\fIprocmail\fI\|(1)\fR recipe that just unpacks
\&\f(CW\*(C`winmail.dat\*(C'\fR attachments but doesn't translate the
attachments contained therein into text and doesn't delete \s-1MS\s0
Windows executables (with output in mailbox format):
.PP
.Vb 2
\&  :0 fw
\&  | textmail -MWEHRPLIAVXS
.Ve
.SH "REQUIREMENTS"
.IX Header "REQUIREMENTS"
\&\s-1MS\s0 Word and \s-1RTF\s0 documents are translated into plain text using
\&\fI\fIantiword\fI\|(1)\fR or \fI\fIcatdoc\fI\|(1)\fR.
If \fItextmail\fR can't find \fI\fIantiword\fI\|(1)\fR or
\&\fI\fIcatdoc\fI\|(1)\fR, then \s-1MS\s0 Word and \s-1RTF\s0 attachments are
left intact.
So make sure that \fI\fIantiword\fI\|(1)\fR or \fI\fIcatdoc\fI\|(1)\fR is
installed and in the \f(CW$PATH\fR.
.PP
\&\s-1MS\s0 Excel documents are translated into \s-1CSV\s0 files using
\&\fI\fIxls2csv\fI\|(1)\fR.
If \fItextmail\fR can't find \fI\fIxls2csv\fI\|(1)\fR, then \s-1MS\s0 Excel
attachments are left intact.
So make sure that \fI\fIxls2csv\fI\|(1)\fR is installed and in the
\&\f(CW$PATH\fR.
.PP
\&\s-1HTML\s0 documents are translated into plain text using
\&\fI\fIlynx\fI\|(1)\fR.
If \fItextmail\fR can't find \fI\fIlynx\fI\|(1)\fR, then \s-1HTML\s0
attachments are left intact.
So make sure that \fI\fIlynx\fI\|(1)\fR is installed and in the
\&\f(CW$PATH\fR.
.PP
\&\s-1PDF\s0 documents are translated into plain text using
\&\fI\fIpdftotext\fI\|(1)\fR.
If \fItextmail\fR can't find \fI\fIpdftotext\fI\|(1)\fR, then \s-1PDF\s0
attachments are left intact.
So make sure that \fI\fIpdftotext\fI\|(1)\fR is installed and in the
\&\f(CW$PATH\fR.
.PP
\&\fItextmail\fR also requires \fI\fIperl\fI\|(1)\fR and
\&\fI\fIpod2man\fI\|(1)\fR and \fI\fIpod2html\fI\|(1)\fR (which come with
\&\fI\fIperl\fI\|(1)\fR) and \fI\fImktemp\fI\|(1)\fR.
.PP
If \fItextmail\fR fails to create a temporary directory, or if it is
instructed to do nothing (i.e. \f(CW\*(C`\-WEHRPULIAVX\*(C'\fR), then it
degenerates into \fI\fIcat\fI\|(1)\fR.
.SH "CAVEAT"
.IX Header "CAVEAT"
The latest version of \fI\fIxls2csv\fI\|(1)\fR at the time of writing (i.e.
catdoc\-0.93.3) loses data.
.PP
If \fItextmail\fR is unable to create a temporary directory (in
\&\f(CW\*(C`/tmp\*(C'\fR), then it degenerates into \fI\fIcat\fI\|(1)\fR.
Without a temporary directory, no attachments will be translated or deleted
no matter what options (even \f(CW\*(C`\-f\*(C'\fR) were given to
\&\fItextmail\fR.
So make sure that \f(CW\*(C`/tmp\*(C'\fR is writable.
Also make sure that \fI\fImktemp\fI\|(1)\fR is available; otherwise, an
insecure temporary directory will be created.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
\&\fI\fIprocmail\fI\|(1)\fR,
\&\fI\fIantiword\fI\|(1)\fR,
\&\fI\fIcatdoc\fI\|(1)\fR,
\&\fI\fIxls2csv\fI\|(1)\fR,
\&\fI\fIlynx\fI\|(1)\fR,
\&\fI\fIpdftotext\fI\|(1)\fR,
\&\fI\fIpod2man\fI\|(1)\fR,
\&\fI\fIpod2html\fI\|(1)\fR,
\&\f(CW\*(C`http://raf.org/minimail/\*(C'\fR
.SH "AUTHOR"
.IX Header "AUTHOR"
20070803 raf
.SH "URL"
.IX Header "URL"
\&\f(CW\*(C`http://raf.org/textmail/\*(C'\fR