The Charniak and Charniak-Johnson Reranking parsers are syntactic parsers of natural languages (e.g. English), both developed at the Brown Laboratory for Linguistic Information Processing (BLLIP); see their resource page for more information (the parsers can be downloaded directly from ftp://ftp.cs.brown.edu/pub/nlparser).
Both parsers are implemented in C++ and distributed as source code, so they must be compiled before use. On modern machines the compilation can fail, and this note describes the fixes that worked for me.
On Linux platforms (I tested on Ubuntu 11.04) a possible solution, reported on the corpora mailing list, is to use the g++ 3.3 compiler instead of the now commonly used 4.x version. An alternative is to use the patch prepared by Nitin Madnani; this changes the source code of the reranking parser so that it can compile on modern 64-bit Linux distributions.
On Mac OS X both parsers compile after small changes to the source code
(thanks to Brett Powley for investigating this with me).
For the Charniak parser (parser05Aug16), add
#include "GotIter.h"
to parser05Aug16/PARSE/BchartSm.C.
To compile the contents of the TRAIN directory,
change part of line 310 of rCounts.C from
(int)sbrk(0)
to
(long)sbrk(0)
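This is necessary because the compiler now builds a 64-bit binary by default.
In a 64-bit build a pointer (void*) is 64 bits wide while an int is still 32 bits, so the compiler rejects the direct cast to (int) as losing precision;
converting to (long), which is also 64 bits wide on Mac OS X and Linux, and then to (int) compiles.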
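The two-step conversion can be tried outside the parser sources. The following is a minimal sketch, not the code from rCounts.C: the helper names pointer_to_int, fits_in_int and pointer_cast_demo are hypothetical, and it uses std::intptr_t instead of long, an assumption made here so that the wide type matches the pointer size on any platform.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper, not part of the parser sources.
// Converts a pointer to an int in two explicit steps, as the (long) fix does.
int pointer_to_int(const void* p) {
    // Step 1: pointer -> integer of pointer width; no bits are lost here.
    const std::intptr_t wide = reinterpret_cast<std::intptr_t>(p);
    // Step 2: explicit narrowing; on 64-bit targets the high bits may be
    // dropped, so the result is only a truncated view of the address.
    return static_cast<int>(wide);
}

// Hypothetical helper: true if the address survives the narrowing unchanged.
bool fits_in_int(const void* p) {
    const std::intptr_t wide = reinterpret_cast<std::intptr_t>(p);
    return wide >= INT_MIN && wide <= INT_MAX;
}

// Small self-check; call it from your own main().
void pointer_cast_demo() {
    // An artificial "address" that is small enough to fit in an int.
    void* small_addr = reinterpret_cast<void*>(static_cast<std::intptr_t>(42));
    assert(fits_in_int(small_addr));
    assert(pointer_to_int(small_addr) == 42);
}
```

Real addresses in a 64-bit process will often not fit in an int, so a value obtained this way should be treated as approximate (for example, for logging) rather than converted back into a pointer.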
For the reranking parser (reranking-parserAug06),
change line 128 of reranking-parser/first-stage/PARSE/parseIn.C from
int id = (int)arg;
to
int id = (long)arg;
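A cast such as the one above is typically needed when a small integer travels through a generic void* parameter. The sketch below assumes that arg is such a void* carrying an integer id, which the variable name suggests but which I have not verified in the parser sources; pack_id and unpack_id are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper, not part of the parser sources.
// Packs a small integer id into a void* so that it can be handed to a
// function that only accepts a generic pointer argument.
void* pack_id(int id) {
    // Widen to pointer size first, then reinterpret as a pointer.
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(id));
}

// Hypothetical helper: recovers the id on the receiving side.
int unpack_id(void* arg) {
    // Widen: pointer -> pointer-sized integer (accepted on 64-bit targets).
    const std::intptr_t wide = reinterpret_cast<std::intptr_t>(arg);
    // The value started life as an int, so it must fit back into one.
    assert(wide >= INT_MIN && wide <= INT_MAX);
    // Narrow: safe here because the range was checked above.
    return static_cast<int>(wide);
}
```

Because the value was an int before it was packed, the narrowing step loses nothing, which is why this kind of change is harmless in a 64-bit build.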
Before compiling the reranking parser, clean the source tree first;
otherwise running the parser is likely to fail with the following error:
./parse.sh: line 6: second-stage/programs/features/best-parses: cannot execute binary file
This happens because the existing second-stage/programs/features/best-parses binary is not rebuilt,
so run make clean before running make.
Then, when running the reranking parser, I got two error messages from zcat:
zcat: second-stage/models/ec50spfinal/features.gz.Z: No such file or directory
zcat: second-stage/models/ec50spfinal/cvlm-l1c10P1-weights.gz.Z: No such file or directory
although the file reranking-parser/parse.sh specified features.gz and cvlm-l1c10P1-weights.gz
and these files existed in the relevant directory.
This did not seem to affect the parsing results. Nevertheless, renaming the files (adding the .Z extension)
and making the corresponding changes to reranking-parser/parse.sh stopped zcat from complaining.
Alternatively, you can use gzcat instead of zcat in the Makefiles; on Mac OS X, zcat expects compress-style .Z files, whereas gzcat reads gzip files.
If you want to run the parser from a directory other than reranking-parser,
edit the parse.sh script and add `dirname $0`/ before
each call and directory, so that the complete command is:
`dirname $0`/first-stage/PARSE/parseIt -l399 -N50 `dirname $0`/first-stage/DATA/EN/ $* | `dirname $0`/second-stage/programs/features/best-parses -l `dirname $0`/$MODELDIR/features.gz.Z `dirname $0`/$MODELDIR/$ESTIMATORNICKNAME-weights.gz.Z