The Daily Corpora is a platform for evaluating and exploring linguistically annotated text corpora. Currently, it
contains an annotated version of the German Wikipedia. Depending on available hardware, more corpora may be
added in the future.
You can explore the functionality by clicking through the entries in the top menu. The main functions are:
Articles: Displays the plain text of a Wikipedia article. You can apply sentiment analysis, color-code the part-of-speech tags, or analyze the named entities in the article.
Search: Similar to Google & Co., you can enter a single word or a phrase in a simple search field and search through a corpus.
Search a corpus
Words: Displays details and statistics about a given lemma in the corpus.
Analyze a word
Text categorization: "What is this person's profession?" Enter the name of a famous person, and machine learning models running in the background will predict that person's profession.
Guess a profession
Wordcloud: Draws a nice WordCloud for a given lemma, taking the surrounding tokens and their part-of-speech tags into account.
Draw a WordCloud
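
For readers curious how surrounding tokens and part-of-speech tags could feed into a word cloud, here is a minimal sketch. It is not the platform's actual implementation: the function name `context_weights`, the input format (sentences as lists of lemma/tag pairs), the window size, and the Universal-POS-style tag set are all assumptions made for illustration. The idea is to count the lemmas that occur near the target lemma, keep only selected word classes, and use the counts as word sizes.

```python
from collections import Counter

# Assumed tag set (Universal POS style); the platform's real tag set may differ.
CONTENT_TAGS = {"NOUN", "ADJ", "VERB"}


def context_weights(sentences, target, window=2, keep_tags=CONTENT_TAGS):
    """Count lemmas within `window` tokens of `target`, keeping only selected POS tags.

    `sentences` is a hypothetical input format: a list of sentences,
    each a list of (lemma, pos_tag) pairs.
    """
    counts = Counter()
    for sent in sentences:
        for i, (lemma, _tag) in enumerate(sent):
            if lemma != target:
                continue
            # Clamp the context window to the sentence boundaries.
            start = max(0, i - window)
            end = min(len(sent), i + window + 1)
            for j in range(start, end):
                ctx_lemma, ctx_tag = sent[j]
                # Skip the target itself and any token with an unwanted tag.
                if j != i and ctx_tag in keep_tags:
                    counts[ctx_lemma] += 1
    return counts


# Tiny made-up example corpus (lemmatized German tokens).
sentences = [
    [("der", "DET"), ("alt", "ADJ"), ("Hund", "NOUN"), ("bellen", "VERB"), ("laut", "ADJ")],
    [("ein", "DET"), ("Hund", "NOUN"), ("bellen", "VERB")],
]
weights = context_weights(sentences, "Hund")
```

The resulting counts map each context lemma to a weight that a drawing library could translate into font sizes. Filtering by tag keeps function words such as articles out of the cloud.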