Aggregation and personalization of news text content

Published in Master Thesis, 2019

This work is devoted to exploratory search. Many search engines exist today, but they do not satisfy all user needs. In particular, well-known large search engines are not always convenient for educational purposes, for example when a user wants to study a specific topic from scratch and does not yet know anything about it. Exploratory search aims to support such tasks. In this work we examine the purposes of different exploratory search systems and the obstacles to using them. In addition, we build an index over a collection of Russian-language articles from Habr. Using this index, we test how well several popular search approaches suit exploratory search tasks: TF-IDF indexing, BM25, fastText embeddings, and topic modeling. Below you can see an example of the user interface we created.

Figure. Example of the user interface created for exploratory search over the Habr collection: 1) The user can interact with a personal collection: add or delete documents, or add their own text. 2) The user can also view the whole feed from the source. 3) After choosing a search algorithm, the user gets recommendations based on the personal collection.
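
To make the recommendation step more concrete, the following is a minimal sketch of how BM25, one of the approaches named above, could rank feed documents against a personal collection. It is not the implementation used in the thesis. The function names (`tokenize`, `bm25_scores`, `recommend`), the parameter values `k1=1.5` and `b=0.75`, the whitespace tokenizer, and the idea of using the concatenated collection as the query are all assumptions made for illustration. A real system for Russian-language text would likely need proper tokenization and lemmatization.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query string."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in tokenized:
        df.update(set(doc))

    query_terms = tokenize(query)
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def recommend(collection, feed, top_n=3):
    """Rank feed documents against the user's collection (hypothetical helper).

    The collection texts are concatenated into a single query, so terms that
    occur more often in the collection contribute more to the score.
    Returns indices into `feed`, best match first.
    """
    query = " ".join(collection)
    scores = bm25_scores(query, feed)
    ranked = sorted(range(len(feed)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_n]


feed = [
    "python machine learning tutorial",
    "cooking pasta recipe",
    "deep learning with python",
]
collection = ["learning python"]
print(recommend(collection, feed, top_n=2))
```

Swapping `bm25_scores` for a TF-IDF or embedding-based scorer with the same signature would leave `recommend` unchanged, which is one way an interface could let the user choose the search algorithm.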