Clustering engine for millions of documents and gigabytes of text
Meaningful insights from large quantities of text
Lingo4G identifies clearly-labelled topics in millions of documents and gigabytes of text. In near-real-time, fully automatically, without external knowledge bases.
Explore topics and clusters
Get a bird's-eye view of the topics discussed in your documents. Use the topics and clusters to plan, execute and refine your research.
Build custom apps
Use the REST API to build more complex apps, such as finding content-wise similar documents or nearest-neighbor classification.
Fast, automatic, easy to integrate
Once Lingo4G indexes your collection, it can extract topics, themes and document clusters within seconds.
Topic discovery takes seconds regardless of whether you're processing a hundred documents, hundreds of thousands of documents or the whole indexed collection.
No external taxonomies
Lingo4G processes documents based only on their textual content; no external dictionaries, taxonomies or databases are required.
Stop word discovery
Lingo4G will automatically identify the meaningless words and phrases specific to your data, such as "present invention" for patent data.
Lingo4G exposes a JSON REST API you can call from any programming language.
The Lingo4G Explorer application will let you get started and tune clustering quickly.
Questions & Answers
The natural use case is exploration of large volumes of fairly static human-readable text, such as scientific papers, business or legal documents.
Out of the box, Lingo4G can give an instant overview of the topics discussed in the whole collection or in a requested subset of it, and thus help analysts plan, execute and report on their research.
You can also use the Lingo4G REST API to build more complex applications, such as recommendation of content-wise similar documents or nearest-neighbor classification.
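To give a rough idea of what nearest-neighbor classification can look like on top of a similar-documents query, here is a minimal Python sketch. It does not call Lingo4G. It assumes you have already fetched the most similar labelled documents and reduced them to (label, similarity) pairs. The function name, the pair layout and the similarity-weighted voting are illustrative assumptions, not part of the Lingo4G API.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def classify_by_neighbors(
    neighbors: Sequence[Tuple[str, float]], k: int = 5
) -> Optional[str]:
    """Return the similarity-weighted majority label among the top-k neighbors.

    `neighbors` holds (label, similarity) pairs for already-labelled documents
    returned by a similar-documents query. The pair layout is an assumption of
    this sketch, not the format of any Lingo4G response.
    """
    # Keep only the k most similar labelled documents.
    top_k = sorted(neighbors, key=lambda pair: pair[1], reverse=True)[:k]

    # Each neighbor votes for its label with a weight equal to its similarity.
    votes: Counter = Counter()
    for label, similarity in top_k:
        votes[label] += similarity

    if not votes:
        return None  # No labelled neighbors, so no prediction.
    return votes.most_common(1)[0][0]


# Hypothetical labelled neighbors of one unlabelled document.
example = [
    ("optics", 0.91),
    ("batteries", 0.88),
    ("optics", 0.85),
    ("batteries", 0.40),
]
predicted = classify_by_neighbors(example, k=3)
```

Weighting votes by similarity rather than counting them equally lets a few very close neighbors outweigh several distant ones. Whether that suits your data is something to verify on a labelled sample.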
The early adopters of Lingo4G have been successfully using it with collections of millions of documents spanning over 100 GB of text. If your collection is larger than that, please do get in touch for an evaluation license to see if Lingo4G can handle your data.
One important factor to consider is that currently Lingo4G does not offer distributed processing. This means that the maximum reasonable size of the project will be limited by the amount of RAM, disk space and processing power available on a single virtual or physical server.
Currently, Lingo4G can only process English text. If you'd like to apply Lingo4G to content written in a different language, please contact us.
Lingo4G can run on any platform supporting Java 1.8 or later. While processing cannot currently be distributed to multiple machines, a high-end workstation with fast SSD storage should be capable of handling collections of several tens of gigabytes. For most data sets not exceeding gigabytes, any computer with 4 GB of memory and some disk space will be sufficient. We strongly recommend using SSD drives to store Lingo4G indices. Please see the Requirements section of the Lingo4G manual for more details.
We require one Lingo4G license per physical or virtual server that runs Lingo4G binaries, regardless of the number of cores on the server, the number of users and the number of collections handled by the server.
For large-scale or non-typical deployment scenarios, such as OEM distribution, please get in touch.
There are no restrictions on the number of Lingo4G instances running on one physical or virtual server. The only limit may be the capacity of the server, including RAM size, disk space and the number of CPUs.
Absolutely! Please get in touch for a free evaluation package.
No. Lingo3G and Lingo4G are two separate products we intend to offer and maintain independently. Lingo3G will remain an engine for real-time clustering of small and medium collections, while Lingo4G will address clustering of large data sets. Therefore, Lingo4G is not an upgrade to Lingo3G, but a complementary offering.
Having said that, if you would like to switch from Lingo3G to Lingo4G, we offer a license trade-in option and count the initial Lingo3G license purchase fee towards the Lingo4G license fee.