Niels Ott

Computational Linguist
Phantom Readability Library

Traditional Readability Formulas in Java Easily

Phantom is a library for computing readability measures developed by Niels Ott. It was originally a modification of Larry Ogrodnek's Java Fathom, but the released version no longer shares much with the original. Phantom includes a Java port of a syllable counter written in Perl by Laura Kassner. Regular expression-based language processing for sentence counting and tokenization is built in. The library is designed for flexibility: you can supply your own NLP analysis, and using the built-in analysis components is not required.
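
To give a rough idea of what regular expression-based sentence counting and tokenization can look like, here is a minimal standalone sketch. It is not Phantom's actual implementation: the class name RegexTextCounts, both method names and both patterns are made up for illustration, and the heuristics are deliberately naive (an abbreviation such as "e.g." is miscounted as a sentence end).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Phantom: naive regex-based text counts.
class RegexTextCounts {

    // A word: a run of letters, optionally joined by internal apostrophes or hyphens.
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:['-]\\p{L}+)*");

    // A sentence boundary: one or more of . ! ? followed by whitespace or end of text.
    private static final String SENTENCE_BOUNDARY = "[.!?]+(?:\\s+|$)";

    // Returns all word tokens in order of appearance.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Counts the chunks between sentence boundaries that contain at least one word.
    static int countSentences(String text) {
        int count = 0;
        for (String chunk : text.split(SENTENCE_BOUNDARY)) {
            if (WORD.matcher(chunk).find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Phantom computes readability. Is it flexible? Yes!";
        System.out.println(tokenize(text));
        System.out.println(countSentences(text));
    }
}
```

Anything smarter than this, such as a statistical sentence splitter or a proper tokenizer, is the kind of external analysis the library lets you supply instead.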

To play with a demo GUI, check out this little Java Web Start demo! If your browser is unsure what to do with the file, tell it to feed the file into javaws.

The following measures can be computed:

  • Automated Readability Index (ARI)
  • Coleman-Liau Index
  • Flesch-Kincaid
  • Flesch Reading Ease
  • FORCAST
  • Gunning Fog Index
  • Läsbarhetsindex (LIX)
  • Simple Measure of Gobbledygook (SMOG)

The formulas in this library were all checked against the corresponding original publications while I was writing my MA thesis. The thesis gives explanations and references for every formula; just skim through chapter 2.2.
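
For readers who want to see what such formulas boil down to, here is a small standalone sketch of four of the measures, computed from raw counts. It uses the commonly published coefficients and is not taken from Phantom's source; the class name ReadabilityFormulasSketch and the method names are made up for illustration. Treat the thesis and the original publications as the authority on the exact definitions.

```java
// Hypothetical illustration, NOT Phantom's code: formulas from raw counts.
class ReadabilityFormulasSketch {

    // Flesch Reading Ease: higher scores mean easier text.
    static double fleschReadingEase(int words, int sentences, int syllables) {
        double wordsPerSentence = (double) words / sentences;
        double syllablesPerWord = (double) syllables / words;
        return 206.835 - 1.015 * wordsPerSentence - 84.6 * syllablesPerWord;
    }

    // Flesch-Kincaid: approximates a US school grade level.
    static double fleschKincaidGrade(int words, int sentences, int syllables) {
        double wordsPerSentence = (double) words / sentences;
        double syllablesPerWord = (double) syllables / words;
        return 0.39 * wordsPerSentence + 11.8 * syllablesPerWord - 15.59;
    }

    // Automated Readability Index: uses characters per word instead of syllables.
    static double automatedReadabilityIndex(int characters, int words, int sentences) {
        double charactersPerWord = (double) characters / words;
        double wordsPerSentence = (double) words / sentences;
        return 4.71 * charactersPerWord + 0.5 * wordsPerSentence - 21.43;
    }

    // LIX: average sentence length plus the percentage of long words
    // (long words are those with more than six letters).
    static double lix(int words, int sentences, int longWords) {
        return (double) words / sentences + 100.0 * longWords / words;
    }

    public static void main(String[] args) {
        // Made-up counts for a 100-word sample text.
        int words = 100, sentences = 5, syllables = 150, characters = 450, longWords = 20;
        System.out.println(fleschReadingEase(words, sentences, syllables));
        System.out.println(fleschKincaidGrade(words, sentences, syllables));
        System.out.println(automatedReadabilityIndex(characters, words, sentences));
        System.out.println(lix(words, sentences, longWords));
    }
}
```

All four need nothing more than a handful of counts, which is why the quality of the underlying sentence, word and syllable counting matters far more than the arithmetic.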

Usage Examples

The simplest way of computing readability scores is the following:
String text = "...";
Readability r = new Readability(text);
System.out.println(r.calcFlesch());

Readability can also be instantiated with an instance of TextStats instead of a string. A TextStats instance can be obtained from TextAnalyzer, which accepts input at various levels of existing analysis. For example, it can be fed a list of tokens.

System Requirements

The Phantom Readability Library requires Java 1.5 or newer.

Download

Please be aware that this package is released under the terms of the General Public License v.2.

TODO/Open Issues

  • The JavaDoc needs some reworking; it is incomplete and sometimes lousy.
Posted by Niels Ott • 2009-09-25