Article 7876 of comp.lang.perl:
Xref: feenix.metronet.com comp.lang.perl:7876
Path: feenix.metronet.com!news.utdallas.edu!wupost!howland.reston.ans.net!spool.mu.edu!caen!msuinfo!netnews.upenn.edu!mjd
From: mjd@saul.cis.upenn.edu (Mark-Jason Dominus)
Newsgroups: comp.lang.perl
Subject: perl code to read and parse GNU `info' files
Message-ID: <MJD.93Nov12114811@saul.cis.upenn.edu>
Date: 12 Nov 93 16:48:11 GMT
Sender: news@netnews.upenn.edu
Organization: University of Pennsylvania
Lines: 134
Nntp-Posting-Host: saul.cis.upenn.edu


This is very hacky, but seems work, at least in a limited domain.

To use the functions:  Call

	&open_info_file(INFO_FILENAME);

to open the filehandle `INFO' to the named info file.
Then call 

	&get_next_node;

repeatedly to read the next node in the info file; the variables
	$info_file
	$info_node
	$info_prev
	$info_next
	$info_up

are set if the corresponding fields appear in the node's header, and if
the node has a menu, it is loaded into %info_menu.  When `get_next-node'
returns false, you have reached end-of-file or there has been an error. 

It's normal for an info file to be split into parts; these functions
understand that.
	
    I am using it here to provide help-by-keyword: We have a texinfo
users' guide, and if a user enters the command `help foo' at the shell
prompt, a perl program based on these functions reads the users' guide
info files, finds the main index, looks up `foo' in the index, and
starts up the info browser pointing to the relevant node, if there is
one.

    Almost everything about these functions could be improved.  My
answer to any criticism of the form ``Your functions are bad,
because...'' is `Yes!'.  I post them because I am so lazy that I hope to
con interested people into improving them for me, and because I am so
humble I don't mind the whole world seeing my cruddy code.

Caveat delector.

# Read next node into global variables.  Assumes that file pointer is
# positioned at the header line that starts a node.  Leaves file
# pointer positioned at header line of next node.
# Programmer: note that nodes are separated by a "\n\037\n" sequence.
# Reutrn true on success, false on failure
sub read_next_node {
    undef %info_menu;
    $_ = <INFO>;		# Header line
    if (eof(INFO)) {
	return &start_next_part && &read_next_node;
    }

    ($info_file) = /File:\s*([^,]*)/;
    ($info_node) = /Node:\s*([^,]*)/;
    ($info_prev) = /Prev:\s*([^,]*)/;
    ($info_next) = /Next:\s*([^,]*)/;
    ($info_up)   = /Up:\s*([^,]*)/;

    $_ = <INFO> until /^(\* Menu:|\037)/ || eof(INFO);
    if (eof(INFO)) {
	return &start_next_part;
    } elsif (/^\037/) { 
	return 1;		# end of node, so return success.
    }

    # read menu
    local($key, $ref);
    while (<INFO>) {    
	return 1 if /^\037/;    # end of node, menu is finished, success.
	next unless /^\* \S/;   # next unless lines is a menu item
	if (/^\* ([^:]*)::/) {
	    $key = $ref = $1;
	} elsif (/^\* ([^:]*):\s*([^.]*)[.]/) {
	    ($key, $ref) = ($1, $2);
	} else {
	    print STDERR "Couldn't parse menu item\n\t$_";
	    next;
	}
	$info_menu{$key} = $ref;
    }
    # end-of-file also terminates the node successfully.
    # start up the next file before continuing.
    &start_next_part;
    return 1;
}
	
# Discard commentary before first node of info file
sub start_info_file {
    $_ = <INFO> until (/^\037/ || eof(INFO));
    return &start_next_part if (eof(INFO)) ;
    return 1;
}

# Look for next part of multi-part info file.  
# Return 0 (normal failure) if it isn't there---that just means
# we ran out of parts.  die on some other kind of failure.
sub start_next_part {
    local($path, $basename, $ext);
    if ($info_filename =~ /\//) {
	($path, $basename) = ( $info_filename =~ /^(.*)\/(.*)$/ );
    } else {
	$basename = $info_filename;
	$path = "";
    }
    if ($basename =~ /-\d*$/) {
	($basename, $ext) = ($basename =~ /^([^-]*)-(\d*)$/);
    } else {
	$ext = 0;
    }
    $ext++;
    $info_filename = "$path/$basename-$ext";
    close(INFO);
    if (! (open(INFO, "$info_filename")) ) {
	if ($! eq "No such file or directory") {
	    return 0;
	} else {
	    die "Couldn't open $info_filename: $!";
	}
    }
    return &start_info_file;
}

sub open_info_file {
    ($info_filename) = @_;
    (open(INFO, "$info_filename")) || die "Couldn't open $info_filename: $!";
    return &start_info_file;
}

--

--
 If you never did, / you should.  /  These things are fun / and fun is good.
Mark-Jason Dominus 	  			    mjd@central.cis.upenn.edu 


