cs.utexas.edu!convex!convex!tchrist Thu Jul 16 18:12:32 CDT 1992

From the keyboard of nlane@well.sf.ca.us (Nathan D. Lane):
:Hello all,
:	Having struggled with this all weekend and not figured it out
:even with the help of the Camel Book, the manpage, or the FAQ, I've
:decided it's time to post.  Could someone please tell me how to remove
:characters such as the bullet and foreign characters from a file?  I'm
:trying to convert files from a CPT 8525 word processing system into a
:format that makes more sense on an IBM RS/6000 220 or 340 or a Sun 3
:or Sparc.  I need to remove invalid control characters and characters with
:the 8th bit set.  I don't want to remove linefeed (^J), however.  I'd
:LOVE to have it convert the codes to troff (or so my husband tells me :-)
:..any ideas?  If this is too trivial a question to answer with a post, I'd
:still really appreciate email.  Thanks in advance for *any* replies!

In general, it's hard to know how to fix up a file with word-processing
magic in it unless you've specialized in said magic.

But if all you want to do is throw away the stuff you don't recognize,
you can in-place edit files using perl this way:

    perl -i.bak -p -e 'y/\000-\200-\377//d'  file1 file2 file3 ...

which strips high-bit characters.  If you want to remove all the
nonprintables except for tab and newline, you could do this:

    y/\000-\010\013-\037\177-\377//d;

I skipped characters 011 and 012 because they're \t and \n.
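
To see what that range actually deletes, here is a minimal sketch that
wraps the same y///d in a sub and runs it on a made-up string.  The sub
name strip_nonprintables and the sample text are my own inventions for
illustration, and it assumes a perl recent enough to have "my".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original one-liner): delete
# control characters other than tab (\011) and newline (\012), plus
# DEL (\177) and every character with the 8th bit set.
sub strip_nonprintables {
    my ($text) = @_;
    # Copy first so the caller's string is left untouched.
    (my $clean = $text) =~ y/\000-\010\013-\037\177-\377//d;
    return $clean;
}

# Made-up sample: a high-bit byte (\351), a BEL (\007), a tab, a newline.
my $sample = "caf\351\007 au\tlait\n";
print strip_nonprintables($sample);
```

The tab and the trailing newline should survive while the \351 and the
BEL disappear.  Running a check like this on a scrap of the real data
before turning -i loose on it is probably worth the minute it takes.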

--tom

-- 
    Tom Christiansen      tchrist@convex.com      convex!tchrist

    signal(i, SIG_DFL); /* crunch, crunch, crunch */
        --Larry Wall in doarg.c from the perl source code


cs.utexas.edu!convex!convex!tchrist Thu Jul 16 18:12:45 CDT 1992

I wrote:

:    perl -i.bak -p -e 'y/\000-\200-\377//d'  file1 file2 file3 ...

But that won't even compile.  I meant to write something more like:

:    perl -i.bak -p -e 'y/\000-\011\013-\037\200-\377//d'  file1 file2 file3 ...

The gap at \012 leaves the linefeeds alone, as you asked.

--tom

-- 
    Tom Christiansen      tchrist@convex.com      convex!tchrist

    Real programmers can write assembly code in any language.   :-)  
                    --Larry Wall in  <8571@jpl-devvax.JPL.NASA.GOV>


