Main Page

Welcome to the NL-Soar Research Group of the Brigham Young University Department of Linguistics


News

  • Nothing so far...

Bibliography

Publications

Presentations

People

Faculty

Students

  • Joshua Heaton
  • Jeremiah McGhee
  • Ross Hendrickson
  • Carl Christensen
  • Julie Ingleby

Alumni

  • Merrill Hutchison
  • Warren Casbeer
  • LaReina Hingson
  • Jamison Cooper-Leavitt
  • Jon Dehdari
  • Rebecca Rees Madsen
  • Clint Tustison
  • Jason Smith
  • Jarren Bodily
  • Mike Manookin
  • Aric Bills
  • Rudy Smith
  • Nick Stetich
  • William Taysom
  • Tim Richards
  • Anton Rytting
  • Nate Blaylock

Projects

Location

We meet in 4186 JFSB (English Department Conference Room) on Mondays from 12-1. Come join us!

Resources

  • Cognitive modeling of language use...
  • CoNLL 2007 - with links to some recent papers on incremental parsing

Links

Tools

List of tools installed on our server

Data

Help

  • Consult the User's Guide for information on using the wiki software.
  • Using math on this wiki: [1]

Getting started

Useful NLP resources

Java-based


  • Morphology
    • Extended Porter stemmer for English [2]



  • POS tagging
    • Chinese: ICTCLAS, the Chinese Lexical Analysis System developed by the Institute of Computing Technology, Chinese Academy of Sciences, Beijing. Source code and binaries compiled for Windows are available [3]
    • A TBL-based (transformation-based learning) POS tagger, built at UMIST as a standalone Java implementation (ref?).
    • The CLAWS tagger at the UCREL website at Lancaster: [4]



  • Semantics
    • Utool, the Swiss Army Knife of Underspecification, is a tool for performing a variety of tasks related to underspecified processing of scope ambiguities. See [5].
    • Clustering and document classification [6].



  • Graph/structure unification
    • Vlado Keselj's report [7] has Java source code


  • Linguistic experiments
    • MiniJudge, a software tool for designing, running, and analyzing small-scale linguistic judgment experiments [8]. There are now two versions: MiniJudgeJS 1.0 and MiniJudgeJava 0.9.9, a version that runs in Java. Both versions pass the statistics over to R, the free software package.



  • Shallow parsers
    • Spejd [9], a shallow parser which allows for simultaneous partial syntactic parsing and rule-based morphosyntactic disambiguation. The release contains Java sources and binaries for French, English, Polish, and German.
    • UIMA (IBM's Unstructured Information Management Architecture) example code (Java) for recognizing dates, times, and room numbers in many different formats, based on regular expressions and patterns.



  • Bitext/alignment
    • A tool to extract bilingual translation memories for all 231 DGT-TM language pairs (JRC-Acquis corpus) is now available as Java bytecode. See [10] and [11].




  • Pronoun resolution
    • JavaRAP (resolves pronouns only) [12]
  • Social network mining
    • IBM LanguageWare Multi-dimensional Miner for Socio-Semantic Networks is a Java library which provides functionalities to mine data from multi-dimensional networks such as ontologies and social networks (or, more generally, multidimensional networks of people, concepts and digital artifacts) as well as traditional WordNet-type thesauri. In addition to these, low-level functionalities tied into the semantic-based miner allow your applications to perform lexical analysis, context-sensitive concept disambiguation, keyword extraction and focus identification. [13]

    • CMU's AutoMap [14] is software for Network Text Analysis. It supports the extraction and analysis of the structure of social and organizational systems from texts, and individual and team mental models.
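
As a rough illustration of the regular-expression approach mentioned in the UIMA entry under "Shallow parsers" above, the sketch below recognizes dates, times, and room numbers with plain java.util.regex. It does not use the UIMA API and is not taken from the UIMA example code. The class name SimpleEntityPatterns, the three patterns, and the sample date are assumptions made for illustration, and the patterns cover only a few of the many formats a real annotator would handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of UIMA.
class SimpleEntityPatterns {

    // ISO-style dates (2007-06-23) or US-style dates (6/23/2007).
    static final Pattern DATE = Pattern.compile(
            "\\b(?:\\d{4}-\\d{2}-\\d{2}|\\d{1,2}/\\d{1,2}/\\d{4})\\b");

    // Clock times such as 12:00, 9:30 AM, or 1:15pm.
    static final Pattern TIME = Pattern.compile(
            "\\b\\d{1,2}:\\d{2}(?:\\s?[AaPp][Mm])?\\b");

    // Room numbers assumed to be four digits, a space, and a building
    // code of two to five capital letters, e.g. "4186 JFSB".
    static final Pattern ROOM = Pattern.compile("\\b\\d{4}\\s[A-Z]{2,5}\\b");

    // Collects every non-overlapping match of the pattern, left to right.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // The date in this sample sentence is made up for the example.
        String text = "We meet in 4186 JFSB on 6/23/2007 from 12:00 to 1:00 PM.";
        System.out.println("Dates: " + findAll(DATE, text));
        System.out.println("Times: " + findAll(TIME, text));
        System.out.println("Rooms: " + findAll(ROOM, text));
    }
}
```

Keeping each entity type in its own compiled Pattern makes it easy to add further formats later without touching the matching loop.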



NL-Soar Group Private Wiki
