Florian Stieler

Tagging at the touch of a button

With chunkx, we want to sort large amounts of learning content in a user-specific way. To do this, it is essential to automatically analyze the learning content and relate it to each other. We use Natural Language Processing techniques to make this possible. Find out exactly how this works and what it means for our users, our learning app, and corporate training in this article. Have fun reading!

User-specific learning, without programming effort for the authors

chunkx supplements or replaces existing learning with continuous micro-learning. But what does continuous learning mean in an operational context? Typical training courses, e-learning modules, learning videos, webinars, and so on end at a certain point, sometimes with and sometimes without a knowledge test. This end point is a significant risk, in terms of time for the individual and in terms of money for the paying company, because after just a few days we forget a large part of what we have learned. According to Ebbinghaus, after 6 days we retain only 23% of what we learned. Regardless of how much weight this specific figure deserves, with chunkx we are taking on the challenge of flattening the forgetting curve.

Repetition and linkage

A first step towards this is the repetition of learned content. For this purpose, we use learning tasks followed by explanatory feedback in chunkx. The content we learn comes from a wide range of different areas (be it subject-specific content, regulations, new skills, or perennial favorites such as security, safety, and compliance) and must be brought together in user-specific feeds. However, every user has limited time, which is why their feed should ideally prioritize content that they want to or should learn, but do not yet know as well as other content. Given 10 topics with one hour of learning content each and a learner who can invest just a few minutes per week in continuous repetition, the question is how to meet this challenge. And how can additional recommendations for new content be made to learners on this basis?

chunkx creator: tagging the content

To compile learning content as appropriately and individually as possible, the individual pieces of content must be related to each other. At the first level, this is done via channels: all learning content on a topic, e.g. tax law, is assigned to the corresponding channel. But what does it look like within the channel: which content belongs together there, and how? And is there perhaps suitable content outside this channel?

For this, we already enable our authors to tag learning content. Different learning tasks are linked via individual keywords. This enables specific evaluations and allows our algorithm to better understand which content is didactically relevant for users.

While manual tagging is useful, it can also be time-consuming. That’s why we developed tagging at the push of a button: our authors can automatically generate suitable keywords after entering the task texts, remove them again if necessary, and continue to add manual keywords. For this function, we use Natural Language Processing techniques. But what does that mean exactly?

How does auto-tagging through Natural Language Processing work?

The field of Natural Language Processing deals with the analysis of human language. Depending on the requirements, simple statistics are sometimes enough for this. However, to produce intelligent results of the kind we need, more complex methods must be used.

Word embeddings through high dimensional vectors

At the heart of our auto-tagging are so-called “word embeddings”. These translate words into high-dimensional vectors so that texts can also be processed by computers. Such vectors then allow us to make calculations and determine the proximity of words and texts to each other. For example, one can imagine that the terms “tax advisor” and “tax law” are relatively close. Exactly this proximity can be described by vectors. Of course, not all comparisons are so clear-cut. For example, which word should be closer to “tax advisor”: “lawyer” or “finance”? Each of the two words relates to “tax advisor” in a different way, which is why a clear, unambiguous answer is difficult.

Therefore, to capture all possible properties of a word, our word vectors consist of several hundred dimensions. In order for these dimensions to be meaningful, the model that translates the words into vectors must be trained beforehand – in our case with several hundred million words.
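The idea of proximity between word vectors can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four-dimensional toy vectors are invented (real embeddings have several hundred dimensions and come from a trained model), and cosine similarity is one common proximity measure, not necessarily the one used in chunkx.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, invented for this example only.
toy_embeddings = {
    "tax advisor": [0.9, 0.8, 0.1, 0.3],
    "tax law": [0.8, 0.9, 0.2, 0.1],
    "lawyer": [0.3, 0.7, 0.1, 0.8],
    "finance": [0.8, 0.2, 0.6, 0.2],
}


def rank_by_proximity(word, embeddings):
    """Return all other words, sorted from most to least similar to `word`."""
    target = embeddings[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_proximity("tax advisor", toy_embeddings)
for other, score in ranking:
    print(f"{other}: {score:.3f}")
```

With these invented numbers, “tax law” ends up clearly closest to “tax advisor”, while “lawyer” and “finance” score almost the same, which mirrors the ambiguity described above.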

From vectors to tags

To generate the tags in chunkx, we first combine all the explanatory fields of a channel (i.e. title, description, questions, the correct answer, feedback) into a single vector. To obtain thematically meaningful tags and reduce computation time, all unnecessary filler words are filtered out of the texts. Then, to generate tags for a single task, the words of that task are compared to each other as well as to the associated channel. The different sections are weighted differently, and the weights are continuously readjusted.

The words whose distance from the channel vector is lowest, and which do not exceed a certain threshold, are ultimately proposed to the author as tags.
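The steps described above could look roughly like the following sketch. It is a simplified illustration under stated assumptions, not the chunkx implementation: the three-dimensional embeddings, the stop word list, the field weights, and the threshold are all hypothetical values, and the sketch only compares words to the channel vector, leaving out the comparison of words with each other.

```python
import math

# Hypothetical toy embeddings (3 dimensions instead of several hundred).
EMBEDDINGS = {
    "tax": [0.9, 0.1, 0.1],
    "sales": [0.8, 0.3, 0.1],
    "invoice": [0.7, 0.4, 0.2],
    "deadline": [0.2, 0.9, 0.1],
    "coffee": [0.1, 0.1, 0.9],
}
# Assumed filler words and per-field weights; the real values are not known.
STOP_WORDS = {"the", "a", "is", "on", "for", "what", "which", "not"}
FIELD_WEIGHTS = {"title": 2.0, "question": 1.0, "feedback": 0.5}


def tokenize(text):
    """Lowercase, strip punctuation, drop filler and unknown words."""
    words = (w.strip(".,?!:;").lower() for w in text.split())
    return [w for w in words if w not in STOP_WORDS and w in EMBEDDINGS]


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def channel_vector(fields):
    """Weighted average of all word vectors across the channel's fields."""
    pairs = [
        (EMBEDDINGS[word], FIELD_WEIGHTS[name])
        for name, text in fields.items()
        for word in tokenize(text)
    ]
    total = sum(weight for _, weight in pairs)
    dims = len(pairs[0][0])
    return [sum(vec[i] * weight for vec, weight in pairs) / total for i in range(dims)]


def propose_tags(task_text, channel_vec, max_distance=0.1, max_tags=3):
    """Keep words within the distance threshold, closest first."""
    distances = {
        word: cosine_distance(EMBEDDINGS[word], channel_vec)
        for word in set(tokenize(task_text))
    }
    close = [(w, d) for w, d in distances.items() if d <= max_distance]
    return [w for w, _ in sorted(close, key=lambda pair: pair[1])[:max_tags]]


channel = {
    "title": "Tax law",
    "question": "Which sales tax applies on the invoice?",
    "feedback": "The sales tax is stated on the invoice.",
}
task = "What is the deadline for the sales tax invoice? Coffee is not deductible."
tags = propose_tags(task, channel_vector(channel))
print(tags)
```

In this toy setup, words such as “deadline” and “coffee” lie too far from the channel vector and are not proposed, while the tax-related words are. Raising or lowering `max_distance` trades more suggestions against more precise ones.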

Why are tags so important now?

The tag generation technology is not only a time saver for our authors, but also enables intelligent linking of content. For example, consider our learning content on the topic of tax law: a learner has knowledge gaps on the topic of sales tax. We can now repeat this selected content accordingly. However, we can also have our algorithm determine which learning content from other channels available to the learner is close in content to the topic of sales tax. We can then suggest and prioritize this content when the learner works through one of these channels. The selection of content thus adapts to the learner, and the available learning time is used as efficiently as possible: namely, on the content for which there is a need to learn.
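How tags can link content across channels is sketched below. This is an illustration under assumptions, not the chunkx algorithm: it measures content proximity by simple tag overlap (Jaccard similarity) instead of vector distances, and the `LearningItem` type, the sample items, and the `min_overlap` value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningItem:
    """Hypothetical record for one piece of tagged learning content."""
    title: str
    channel: str
    tags: frozenset


def jaccard(a, b):
    """Share of tags two tag sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(gap_tags, items, current_channel, min_overlap=0.2):
    """Rank items from other channels by tag overlap with the gap topic."""
    scored = [
        (jaccard(gap_tags, item.tags), item)
        for item in items
        if item.channel != current_channel
    ]
    relevant = [(score, item) for score, item in scored if score >= min_overlap]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in relevant]


items = [
    LearningItem("Input tax deduction", "Tax law", frozenset({"sales tax", "invoice"})),
    LearningItem("Issuing invoices correctly", "Accounting",
                 frozenset({"invoice", "sales tax", "bookkeeping"})),
    LearningItem("Pricing for online shops", "E-commerce",
                 frozenset({"sales tax", "pricing"})),
    LearningItem("Password hygiene", "IT security", frozenset({"password", "phishing"})),
]

# The learner has gaps around these tags in the "Tax law" channel.
gap = frozenset({"sales tax", "invoice"})
suggestions = recommend(gap, items, current_channel="Tax law")
for item in suggestions:
    print(f"{item.channel}: {item.title}")
```

Here the accounting item shares the most tags with the knowledge gap and is suggested first, while the unrelated security item is left out entirely.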

While classic web-based training or learning videos offer the same content in the same order for all learners, with chunkx the most appropriate content can be selected, and not just within one channel, but across any number of channels.

The more channels and topics are made available to employees of a company in chunkx, the more content overlaps arise and the more continuous micro-learning with chunkx shows its full strength.

Contact us

Have we aroused your curiosity, or are you bored by our tax law example? Then let’s talk about your topics instead, and about how we can best support learning about them with chunkx. Write to us and we’ll get back to you shortly.
