
languagetool

LanguageTool image with ngrams auto-download

Usage

The server listens on port 8010 inside the container, so this port should be exposed.

docker pull docker.io/chn2guevara/languagetool
[...]
docker run --rm -p 8010:8010 docker.io/chn2guevara/languagetool

Route information can be found at https://languagetool.org/http-api/swagger-ui/#/default. An easy route to test that the server is running is /v2/languages.
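A quick smoke test could be scripted along these lines. This is only a sketch: the helper names (lt_url, lt_languages, lt_check) and the LANGUAGETOOL_HOST / LANGUAGETOOL_PORT variables are hypothetical and not part of the image, and the defaults assume the container was started with -p 8010:8010 as shown above. The /v2/check route with its language and text form parameters comes from the public LanguageTool HTTP API.

```shell
#!/bin/sh
# Hypothetical helpers for poking at a running LanguageTool container.
# Host and port default to the values used by the docker run command above.

# Build the full URL for an API route, e.g. /v2/languages.
lt_url() {
  printf 'http://%s:%s%s\n' "${LANGUAGETOOL_HOST:-localhost}" "${LANGUAGETOOL_PORT:-8010}" "$1"
}

# List the supported languages; a JSON response means the server is up.
lt_languages() {
  curl -s "$(lt_url /v2/languages)"
}

# Check a piece of text. $1 is the language code, $2 is the text.
# --data-urlencode makes curl send a form-encoded POST request.
lt_check() {
  curl -s --data-urlencode "language=$1" --data-urlencode "text=$2" "$(lt_url /v2/check)"
}
```

After sourcing the file, lt_languages should print a JSON list of languages, and lt_check en-US "This are a test." should return a JSON document describing any matches.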

Configuration

Java heap size

You can set any Java-related option, such as the maximum heap size, using the JAVAOPTIONS environment variable.

docker run --rm -it -p 8010:8010 -e JAVAOPTIONS="-Xmx382M" docker.io/chn2guevara/languagetool

HTTPServerConfig

Any environment variable prefixed with LT_ is interpreted as an HTTPServerConfig option.

docker run --rm -it -p 8010:8010 -p 9301:9301 \
  -e LT_prometheusMonitoring=true \
  docker.io/chn2guevara/languagetool
[...]

curl -s localhost:9301 | grep -v '^\s*$\|^\s*\#'
languagetool_check_matches_total{language="en",mode="ALL",} 1.0
languagetool_threadpool_queue_size{pool="lt-server-thread",} 0.0
[...]

n-gram dataset support

To enable n-gram support, mount an additional volume or directory at /ngrams inside the container.

docker run ... -v /foo:/ngrams ...

Automatic download

This image can take care of the initial download of the n-gram data for any supported language, as well as later updates. Mount a directory or volume to /ngrams and use the NGRAM_LANGUAGES environment variable to pass a comma-separated list of language codes:

docker run ... -v /path/to/ngrams:/ngrams -e NGRAM_LANGUAGES="en,es" ...

Manual download

Download and unzip the n-gram data for a language with the following commands, replacing YYYYMMDD with the date of the published archive:

mkdir ngrams
wget https://languagetool.org/download/ngram-data/ngrams-en-YYYYMMDD.zip
(cd ngrams && unzip ../ngrams-en-YYYYMMDD.zip)
rm -f ngrams-en-YYYYMMDD.zip

It's important that the directory structure ends up looking like:

ngrams/
 en/
  ...
 es/
  ...
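Before mounting the directory, the layout can be checked with a small script. The function below is a hypothetical helper, not something shipped with the image: it only confirms that each requested language code has its own subdirectory and makes no attempt to validate the data inside.

```shell
#!/bin/sh
# Hypothetical helper: verify that every requested language has its own
# subdirectory under the ngrams directory.
# $1: path to the ngrams directory
# $2: comma-separated language codes, in the same form as NGRAM_LANGUAGES
check_ngrams_layout() {
  dir="$1"
  status=0
  old_ifs="$IFS"
  IFS=','
  # With IFS set to a comma, the unquoted expansion splits into language codes.
  for lang in $2; do
    if [ -d "$dir/$lang" ]; then
      echo "ok: $dir/$lang"
    else
      echo "missing: $dir/$lang"
      status=1
    fi
  done
  IFS="$old_ifs"
  return "$status"
}
```

For example, check_ngrams_layout ./ngrams "en,es" prints one line per language and returns a non-zero status if any subdirectory is absent.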