Final Degree Projects in Computer Engineering, Facultat de Matemàtiques, Universitat de Barcelona, Year: 2016, Supervisor: Jordi Vitrià i Marca
News articles are pieces of natural language that follow the 5W1H model: they should answer six questions, namely What, Who, Where, When, Why and How. This project uses that assumption to create an algorithm that builds a representation of a news article and defines a distance between such representations for any pair of political news articles. Building on this, a global distance between entries based on content similarity is constructed. The algorithm is evaluated against the topic modeling algorithm Latent Dirichlet Allocation (LDA). Applications of the system, with their corresponding visualisations, are also presented.
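
To make the idea of a 5W1H representation and a global distance concrete, here is a minimal sketch. It is not the project's actual method: the `Article5W1H` class, the use of Jaccard distance for each question, and the weighted average in `global_distance` are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The six 5W1H questions, used as keys of the (hypothetical) representation.
FACETS = ("what", "who", "where", "when", "why", "how")


@dataclass
class Article5W1H:
    """Hypothetical representation: one set of terms per 5W1H question."""
    facets: dict[str, set[str]] = field(default_factory=dict)


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Jaccard distance between two term sets (assumed per-question distance)."""
    if not a and not b:
        return 0.0  # two empty answers are treated as identical
    return 1.0 - len(a & b) / len(a | b)


def global_distance(x: Article5W1H, y: Article5W1H,
                    weights: dict[str, float] | None = None) -> float:
    """Weighted mean of the per-question distances (assumed aggregation)."""
    weights = weights or {f: 1.0 for f in FACETS}
    total = sum(weights[f] for f in FACETS)
    return sum(
        weights[f] * jaccard_distance(x.facets.get(f, set()),
                                      y.facets.get(f, set()))
        for f in FACETS
    ) / total


# Usage: two toy articles that share "who" but differ in "where".
a = Article5W1H({"who": {"merkel"}, "where": {"berlin"}})
b = Article5W1H({"who": {"merkel"}, "where": {"paris"}})
d = global_distance(a, b)
```

With equal weights, only the "where" question contributes to the distance in this toy case. In practice the weights could be tuned to emphasise the questions that matter most for a given application.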