Awesome Linguistics


A curated list of anything remotely related to linguistics, sorted in alphabetical order.


Libraries, frameworks and tools useful for developing linguistic applications.

Platforms and toolkits

  • CLARIN-D web tools - Tools for Analysing Research Data
  • CorpusExplorer - Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 50 interactive visualizations under a user-friendly interface.
  • Haxe-linguistics - Early-stage linguistic analysis and natural language processing library for Haxe.
  • Natural - General natural language tools for Node.js.
  • Natural Language ToolKit (NLTK) - The most complete platform for building Python programs to work with human language data.
  • Snowball - Snowball is a language in which stemming algorithms can be easily represented.
  • spaCy - Industrial-strength Natural Language Processing in Python.
  • Mate Tools - Available as a web service via WebLicht.
  • UBIAI - Easy-to-use text annotation tool for teams with comprehensive auto-annotation features. Supports NER, relations, and document classification, as well as OCR annotation for invoice labeling.
  • textblob-de - An alternative to spaCy (see above).
  • UralicNLP - An open source Python library for processing morphologically rich and, for the most part, endangered Uralic languages. It can do morphological analysis, generation, lemmatization, disambiguation and lexical lookup for a great many Uralic languages.


Data sets


  • How To Label Data - Guide on managing large scale linguistic annotation projects.
  • Low Resource Languages - A list of resources for conservation, development, and documentation of low resource (human) languages.
  • Language Science Press - Language Science Press is a born-digital scholar-led open access publisher in linguistics.

Deep learning models and transformers

On Wikipedia

On Youtube


Some of the more interesting and complete books.


Non free