Portfolio

Text Style Transfer: Detoxification

Published:

Text style transfer is far less explored than style transfer for images, although its applications can be quite broad. For social good purposes, we explored unsupervised methods for text detoxification in Russian and English. We are also continuing this work by collecting a parallel corpus, which would make it possible to address the problem as a seq2seq task.

Figure. Example use cases where detoxification technology can be applied. (a) Offering the user a more civil version of a message. (b) Preventing chatbots trained on open data from being rude to users.

Fake News Detection using Multilingual Evidence

Published:

Misleading information spreads on the Internet at an incredible speed, which in some cases can lead to irreparable consequences. As a result, it is becoming essential to develop fake news detection technologies. While substantial work has been done in this direction, one limitation of current approaches is that the models focus on a single language and do not use multilingual information. In this work, we propose a new technique based on multilingual evidence that can be used for fake news detection and can improve existing approaches. This approach improved on baseline fake news detection systems and made the results more explainable for users. The graphical abstract of this work is shown below.

Figure. The approach consists of the following steps: 1) Text Extraction from the newly arrived article. 2) Text Translation into several languages. 3) Cross-lingual News Retrieval based on the translated text. 4) Content Similarity Computation between the retrieved articles and the original one. 5) News Classification as true if there is enough evidence, or as fake if there is a contradiction.
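Steps 4 and 5 can be illustrated with a minimal sketch. It assumes the retrieved articles have already been translated into the language of the original, and it uses bag-of-words cosine similarity with a simple support count. The function names, `sim_threshold`, and `min_support` are hypothetical and are not taken from the work itself; the sketch also treats a lack of supporting articles as a contradiction, which is a simplification.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_news(original, retrieved, sim_threshold=0.5, min_support=2):
    """Label the article "true" if enough retrieved articles are similar to it.

    sim_threshold and min_support are illustrative values, not tuned ones.
    Returns the label and the per-article similarity scores, which can be
    shown to the user as an explanation.
    """
    original_vec = bag_of_words(original)
    scores = [cosine_similarity(original_vec, bag_of_words(t)) for t in retrieved]
    support = sum(score >= sim_threshold for score in scores)
    label = "true" if support >= min_support else "fake"
    return label, scores


# Toy usage with made-up texts.
label, scores = classify_news(
    "the city council approved the new budget",
    [
        "the city council approved the new budget today",
        "council approved the new city budget",
        "a cat climbed a tree",
    ],
)
print(label, [round(s, 2) for s in scores])
```

Returning the individual similarity scores along with the label is one simple way to expose the supporting evidence to the user.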

Aggregation and personalization of news text content

Published:

This work is devoted to exploratory search. There are many search engines today, but they do not satisfy all user needs. In particular, users are not always comfortable using well-known large search engines for educational purposes, for example, to study a specific topic from scratch when they know nothing about it yet. Exploratory search is intended to help with such tasks. In this work, we explore the purposes of different exploratory search systems and the obstacles to their use. In addition, we build an index over a collection of Russian-language articles from Habr. Based on this index, we test how well several popular search algorithms suit exploratory search tasks: TF-IDF indexing, BM25, fastText embeddings, and topic modeling. An example of the resulting user interface is shown below.

Figure. Example of the user interface created for exploratory search over the Habr collection: 1) The user can interact with their personal collection: add or delete documents, or add their own text. 2) The user can also see the whole feed from the source. 3) After choosing a search algorithm, the user can get recommendations based on their collection.
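As an illustration of one of the tested algorithms, here is a minimal pure-Python sketch of Okapi BM25 ranking. The tokenizer, the function names, and the parameter values `k1=1.5` and `b=0.75` are assumptions (common defaults), not the settings used in this work, and the sketch uses the non-negative IDF variant and applies no stemming or lemmatization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (\\w also matches Cyrillic letters)."""
    return re.findall(r"\w+", text.lower())


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return document indices sorted by BM25 score, plus the raw scores."""
    if not documents:
        return [], []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term frequency saturated by k1 and normalized by document length via b.
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)

    ranking = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return ranking, scores


# Toy usage with made-up documents.
docs = [
    "topic modeling with latent dirichlet allocation",
    "bm25 is a ranking function for search engines",
    "fasttext builds word embeddings from subwords",
]
ranking, scores = bm25_rank("search ranking", docs)
print(ranking, scores)
```

For recommendations based on a personal collection, the text of the collected documents could play the role of the query; this is only one possible design.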

WiFi Single Access Point Positioning

Published:

Most WiFi positioning methods use several access points to estimate a user's location. However, interference from walls, other obstacles, and signal reflection sometimes makes it impossible to use several access points. Moreover, many private and public places where WiFi positioning can be useful, such as homes, cafes, and malls, have only one access point covering almost all of the space. It is therefore crucial to enable positioning with only one access point, because this removes the restriction on the number of access points and makes WiFi positioning available in all public and private places. Several probabilistic methods exist to determine the user's position from a sequence of signal strength measurements. We used statistical models such as Markov chains to track users' paths. The results of experiments in real-life premises are shown below.

Figure. Example of the results of our approach inside a real-life building with only one WiFi access point.
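One common way to combine a Markov chain over locations with a sequence of signal strength readings is a hidden Markov model decoded with the Viterbi algorithm. The sketch below is only an illustration of that idea, not necessarily the model used in this work: the zones, the transition probabilities, the Gaussian signal model, and all numeric values are hypothetical.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def gaussian_log_pdf(x, mean, std):
    """Log-density of a normal distribution, used as the signal model."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi(rssi_sequence, states, start_prob, trans_prob, rssi_mean, rssi_std):
    """Most likely sequence of zones for a sequence of RSSI readings (in dBm)."""
    if not rssi_sequence:
        return []
    # best[s]: log-probability of the best path that ends in state s.
    best = {
        s: safe_log(start_prob[s]) + gaussian_log_pdf(rssi_sequence[0], rssi_mean[s], rssi_std)
        for s in states
    }
    back_pointers = []
    for rssi in rssi_sequence[1:]:
        new_best, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            prev = max(states, key=lambda p: best[p] + safe_log(trans_prob[p][s]))
            new_best[s] = (
                best[prev]
                + safe_log(trans_prob[prev][s])
                + gaussian_log_pdf(rssi, rssi_mean[s], rssi_std)
            )
            pointers[s] = prev
        best = new_best
        back_pointers.append(pointers)

    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical zones at increasing distance from the single access point.
states = ["near", "mid", "far"]
start_prob = {"near": 1 / 3, "mid": 1 / 3, "far": 1 / 3}
trans_prob = {
    "near": {"near": 0.8, "mid": 0.2, "far": 0.0},
    "mid": {"near": 0.1, "mid": 0.8, "far": 0.1},
    "far": {"near": 0.0, "mid": 0.2, "far": 0.8},
}
rssi_mean = {"near": -40.0, "mid": -60.0, "far": -80.0}  # assumed mean RSSI per zone
path = viterbi([-41, -43, -58, -62, -79, -81], states, start_prob, trans_prob, rssi_mean, 5.0)
print(path)
```

The transition matrix encodes the assumption that a user cannot jump between non-adjacent zones in one step, which helps smooth out noisy readings from a single access point.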