The files are in the ldawn directory of the wnp repository. To access it, follow the instructions here; you'll need to be approved by the repository owner, JBG. You'll also need GSL (the GNU Scientific Library) installed. If you don't have root on a machine and can't add to the normal include directory, look at the "make jbg" entry in the Makefile to see how to point to a different directory. In MSVC, see [this http://www.sourceware.org/ml/gsl-discuss/2004-q2/msg00000.html] to get GSL linked up.
You'll also need some data files, which can be found [here http://www.cs.princeton.edu/~jbg/wn/ldawn/]. They also require some libraries from the py-evo-feat directory in the wnp archive; to use them, add that directory to the Python path.
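One way to put py-evo-feat on the Python path is to extend sys.path at the top of a script. This is a minimal sketch: the helper name add_to_python_path and the checkout location ~/wnp/py-evo-feat are assumptions for illustration, so substitute the directory where your copy of the wnp archive lives.

```python
import os
import sys


def add_to_python_path(directory):
    """Put `directory` at the front of sys.path unless it is already listed."""
    directory = os.path.abspath(os.path.expanduser(directory))
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Hypothetical location: replace with the py-evo-feat directory of your wnp checkout.
PY_EVO_FEAT = add_to_python_path("~/wnp/py-evo-feat")
```

Setting the PYTHONPATH environment variable to the same directory before starting Python has the same effect and avoids editing the scripts.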
- Creates a mixture model of topic walks; still not working completely
- Given the stem of inference synsets (e.g. "inf-synset."), creates a report on the accuracy, report.out.
- The main file, from which all other functions are called
- Reads in the WordNet information and serves as the basis for the topic walks
- The topic walk parameters that exist on top of the WN class
- An individual path through WN that ends in a synset
- The BNC corpus split into paragraphs. Words occurring fewer than 10 times were excluded, as were paragraphs with fewer than five terms (although those terms were still counted toward the word frequencies; this was done because some headers were counted as paragraphs). Uses bnc-par
- The SemCor corpus split into paragraphs. Uses the same vocab and word files as bnc-par.dat.
- The entropy after the Nth round
- The alpha parameter of the model
- The beta parameter of the model
- The Nth topic parameters of the TopicWalk
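The filtering described above for the BNC paragraph file can be sketched in Python as follows. This is only an illustration, not the script that produced the data files: the function name filter_paragraphs and the input format (each paragraph as a list of word strings) are assumptions, and the sketch assumes paragraph length is measured before rare words are removed, which the description does not specify.

```python
from collections import Counter

MIN_WORD_COUNT = 10       # words occurring fewer than 10 times are excluded
MIN_PARAGRAPH_TERMS = 5   # paragraphs with fewer than five terms are excluded


def filter_paragraphs(paragraphs, min_count=MIN_WORD_COUNT,
                      min_terms=MIN_PARAGRAPH_TERMS):
    """Return (kept_paragraphs, vocabulary) for a list of tokenized paragraphs."""
    # Frequencies are counted over ALL paragraphs, including the short ones
    # that are dropped below, so their terms still count toward the threshold.
    counts = Counter(word for paragraph in paragraphs for word in paragraph)
    vocabulary = {word for word, n in counts.items() if n >= min_count}

    kept = []
    for paragraph in paragraphs:
        # Assumption: length is checked on the unfiltered paragraph.
        if len(paragraph) < min_terms:
            continue
        kept.append([word for word in paragraph if word in vocabulary])
    return kept, vocabulary
```

With this ordering, a word that appears eight times in a long paragraph and twice in a short, header-like paragraph still reaches the 10-occurrence threshold, even though the short paragraph itself is discarded.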
How to Run
I'll add more soon. Until then, after compiling with make, run ./ldawn -help to see all the options.
It's easier just to show examples.
- ./ldawn -modelName five -numTopics 5
Run the LDA topic walk with five topics and write the output to "five".
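To run the same command for several topic counts, the call can be wrapped in a small Python script. This is a rough sketch that uses only the -modelName and -numTopics flags shown above; the helper names ldawn_command and run_ldawn are made up for illustration, and it assumes the compiled ldawn binary is in the current directory.

```python
import subprocess


def ldawn_command(model_name, num_topics, binary="./ldawn"):
    """Build the argument list for one ldawn run."""
    return [binary, "-modelName", str(model_name), "-numTopics", str(num_topics)]


def run_ldawn(model_name, num_topics, binary="./ldawn"):
    """Run ldawn and raise CalledProcessError if it exits with an error."""
    return subprocess.run(ldawn_command(model_name, num_topics, binary), check=True)
```

For example, run_ldawn("five", 5) issues the same command as the example above. Run ./ldawn -help first to check which other options your build accepts before adding them to the argument list.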
A part of Wordnet_plus