Learnings from European Conference on Information Retrieval 2018

12th April 2018

The 40th European Conference on Information Retrieval 2018 – Grenoble, France

At the end of March 2018 I attended the annual European Conference on Information Retrieval (now in its 40th year) at Minatec, in the beautiful city of Grenoble, France. Information retrieval is the field which explores the crawling, analysis (and attempted understanding) and indexing of text and information (including the exploration of entities and entity-mapping), and is therefore extremely relevant to those working in the digital marketing industry.

The conference considers the two main sides of information retrieval: ‘push’ (recommender systems), where the platform or application has prior knowledge of the user, or of similar users, and pushes recommendations to meet users’ informational needs, and ‘pull’ (search engines), where the user steers the course mostly alone by querying and browsing. That’s not to say search engines do not use recommender systems or push information retrieval, of course. The point is that search engines were not the only area of interest; so was any system which uses these two main methods to meet the informational needs of its users, for example e-commerce applications or social media platforms.

The conference is one of the main events in the information retrieval research calendar.

Other major ones to watch out for throughout the year include CHIIR (pronounced ‘cheer’), WSDM (pronounced ‘wisdom’; the Web Search and Data Mining Conference), and SIGIR (the ACM’s ‘Special Interest Group on Information Retrieval’).

A more comprehensive list of some of the other related conferences can also be found here.

ECIR hosts presentations and discussions of relevant current and emerging research papers from academia and industry with specialist tracks such as deep learning, neural networks, exploration of user behaviour in social media and broad topic analysis, natural language processing, and information retrieval for news dispersal and interpretation.

Long and short papers were presented, along with keynotes and invited talks, workshops and tutorials focusing on key areas of research. Much of the researchers’ work goes beyond the arena of online marketing toward exploring solutions for societal problems, such as understanding and detecting social media behaviour in the run-up to violence surrounding elections, or identifying ways to utilise information retrieval and deep learning in health research.

Topics overall could broadly be classified as follows:

  • Topic modelling
  • Information retrieval in the medical and health space
  • Search engine results evaluation
  • Social media text mining
  • Information retrieval for news
  • The analysis of broad, dynamic topics in social media
  • Search engine user behaviour analysis
  • Recommender systems (RecSys)
  • Social aspects and personalised search (SOAPS)
  • Neural networks for information retrieval (NN4IR)
  • Deep learning

Given these workshops and talks are delivered by PhD researchers furthering the body of knowledge, leading academics and industry pioneers in this field, we can get some idea of the direction of trends and of current knowledge, and understand more of the still ‘open’ problems. We know there are still many challenging issues in the world of search, with natural language understanding, for instance, still presenting a difficult problem. However, the research papers illustrate some of the steps being taken toward a greater understanding in the field.

Some of the presenting researchers provided slides, and several papers are available to view for either indefinite periods or for a full month following the conference.  These papers might otherwise be locked behind academic paywalls (and some will be, once the month is out).

A major focus was intent-understanding, and how the most relevant result might be retrieved quickly, and in the right context-led ‘moment’ (contextual search) for the users. It is therefore worth taking up the opportunity to read some of the papers and gain more understanding of where the field is headed next, as search engines and IR researchers alike seek to improve their understanding of semantics, of user interaction with results, and of users’ informational needs.

There is far too much to go through in any great depth in this blog post, but I have added further resources throughout so you can continue your own exploration of this intriguing area. This is more of an overview, with a little more focus in some areas where I either attended the talk or had the opportunity to speak with the researcher directly to get more information on their work. It is not comprehensive by any means. Hopefully the many links to further reading materials will provide a bridge for future learning.

Note: this post contains many summaries of paper abstracts. For a complete understanding, it’s recommended to read the full abstract and paper.

Co-located workshops and tutorials ahead of the main ECIR 2018 conference

Ahead of the main conference there were several workshops and tutorials covering specific areas of interest.

These included:

BroDyn 2018 (Broad Dynamic Topics over Social Media)

This workshop considered topics in social media which attract long-standing user interest, as opposed to short-term topics which emerge quickly and then disappear. These broad, dynamic topics might include social media conversations around “Brexit”, “Syrian crisis” or “North Korea”: topics which may be of interest to social media users for months, or even years. They may also include topics lasting for only a few weeks, such as “Hurricane Irma”. The study of sub-topics and conversations which emerge within these major broad topics is also covered in this area of research.

The BroDyn workshop is explained in more detail here.

The workshop was made up of a keynote and several papers explained by the researchers behind them.

BroDyn Keynote

A keynote was delivered by Professor Michalis Vazirgiannis

Graph-Based Event Detection in Streams: The Twitter Case (Professor Michalis Vazirgiannis)

Full Paper: http://ceur-ws.org/Vol-2078/keynote.pdf

Paper Abstract Overview: Exploration of solutions to dissect and map the often real-time nature of tweets around real-world major events such as natural disasters, political campaigns, sporting events and terrorist attacks. The work presented looks at modelling a stream of tweets related to the event as an evolving graph of words and then identifying the major events as their evolutionary patterns emerge. Identifying these important moments is achieved via detection of rapid graph changes. The events are then summarised via the extraction of a few tweets solely from Twitter, describing the chain of events. The researchers aimed to show that their proposed system was also able to capture sub-events and that it outperformed the currently dominant sub-event detection methods.
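To give a feel for the ‘evolving graph of words’ idea, here is a minimal Python sketch. It is a simplification under my own assumptions rather than the method from the keynote: tweets are assumed to be pre-grouped into time windows, the graph is a plain co-occurrence count, and the change measure (one minus a weighted Jaccard similarity) and the threshold of 0.7 are hypothetical choices made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def word_graph(tweets):
    """Build a graph of words: each edge weight counts the tweets containing both words."""
    edges = Counter()
    for tweet in tweets:
        words = sorted(set(tweet.lower().split()))
        for pair in combinations(words, 2):
            edges[pair] += 1
    return edges


def graph_change(previous, current):
    """Return 1 minus the weighted Jaccard similarity of two graphs (0.0 means identical)."""
    pairs = set(previous) | set(current)
    if not pairs:
        return 0.0
    overlap = sum(min(previous[p], current[p]) for p in pairs)
    total = sum(max(previous[p], current[p]) for p in pairs)
    return 1.0 - overlap / total


def detect_events(windows, threshold=0.7):
    """Return the indices of windows whose graph differs sharply from the previous window."""
    graphs = [word_graph(window) for window in windows]
    return [
        index
        for index in range(1, len(graphs))
        if graph_change(graphs[index - 1], graphs[index]) > threshold
    ]


# Hypothetical toy stream: three consecutive time windows of tweets.
windows = [
    ["match kicks off", "match kicks off now"],
    ["match kicks off", "match kicks off again"],
    ["goal scored by striker", "what a goal striker"],
]
print(detect_events(windows))
```

In the toy stream the first two windows share most of their word pairs, while the third introduces an entirely new vocabulary, so only the third window is flagged as a candidate event.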

The full keynote for BroDyn is available to read here.

BroDyn Workshop 2018

Real-time collection of reliable and representative tweets datasets related to news events (Béatrice Mazoyer, Julia Cagé, Céline Hudelot, and Marie-Luce Viaud, 2018)

Full Paper: http://ceur-ws.org/Vol-2078/paper2.pdf

Paper Abstract Summary: This paper looks to collect tweets around news events (via the Twitter API) while simultaneously extracting information from more traditional journalistic news reporting, in order to gain a dual-sided view from both Twitter’s social media users and the reported news.

Contradiction in Reviews: Is it strong or low? (Ismail Badache, Sébastien Fournier, and Adrian-Gabriel Chifu, 2018, Marseille University, France)

Full Paper: http://ceur-ws.org/Vol-2078/paper1.pdf

Paper Abstract Summary: Aims to detect and measure the strength of polarities (opposing opinions) in contradictory user reviews in text online.

Social Media Based Analysis of Refugees in Turkey (Abdullah Bulbul, Cagri Kaplan, and Salah Haj Ismail, 2018)

Full paper: http://ceur-ws.org/Vol-2078/paper3.pdf

The paper is by a research team at Ankara University, Turkey.

Paper Abstract Summary: “A method is proposed to identify refugees’ public-facing social media accounts with a view to understanding their needs for potential future solution planning, and tracing back events. The paper aims to gain understanding which might otherwise not be available due to refugees’ fears in recalling or expressing experiences and needs during inquests or interviews. This first paper initially looks at discussion of the retrieval method and ways to analyse the data and a discussion around future uses and solutions.” (Abdullah Bulbul, Cagri Kaplan, and Salah Haj Ismail, 2018)

Two datasets were also released by the BroDyn 2018 workshop for those looking to carry out research on broad social media topics: one covering the UK election and one covering the US election. The two datasets can be downloaded here.

The BroDyn 2018 Workshop papers are all published online here.

You can also download the full workshop proceedings here.

NewsIR 2018

Another workshop covered the vertical of news information retrieval and again comprised a keynote and invited talks, along with paper presentations.

NewsIR Keynote & Invited Talks

NewsIR 2018 Keynote

The keynote for NewsIR was around AI and automated news and the implications this held for issues around trust, bias and credibility.  The keynote was delivered by Edgar Meij of Bloomberg.

Full Paper: http://ceur-ws.org/Vol-2079/intro1.pdf

AI & Automated News: Implications on Trust, Bias, and Credibility (Edgar Meij, Bloomberg)

Paper Abstract Summary: Potential societal implications around (either partly or fully) automatic and algorithmically generated news and the related matters around trust and bias (as algorithms cannot be held accountable) are discussed in the context of news search and recommendations.  Sentiment analysis and detection of polarity in opinion and automatic media monitoring is also explored.

“Every tool is better than nothing”?: The use of dashboards in journalistic work (Peter Tolmie)

Paper Abstract Summary: This paper looked at the many tools which are available to journalists as dashboards in the news industry and the often ‘magpie-like’ effect from the adoption and use of these tools, whereby the tool is used once or for a while and then set aside.

NewsIR Workshop

All proceedings for the NewsIR 2018 workshop are here

A Plan for Ancillary Copyright: Original Snippets (Martin Potthast, Wei-Fan Chen, Matthias Hagen, Benno Stein)

Paper Abstract Summary: This paper looked at a method by which search engines could potentially create unique text snippets (original snippets) from web pages for search engine results, without the copyright problems which arise when snippet content is taken directly from the web pages themselves.

Visualizing Polarity-based Stances of News Websites (Yoshioka, M., Allan, M.J.J. and Kando, N., 2018)

Paper Abstract Summary: This paper looks at the development of a framework which helps to identify a bias in a news website toward a particular political leaning based on whether news is published with a positive or negative stance regarding a particular topic.  The framework utility was demonstrated in the paper via a case study of the recent US Presidential election.

Shaping the Information Nutrition Label (Tim Gollub, Martin Potthast, Benno Stein)

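Paper Abstract Summary: This paper looks at how an ‘information nutrition label’ for online content, analogous to the nutrition labels on food products, might be shaped so that readers can judge the information they consume clearly and unambiguously.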

Estimating Credibility of News Authors from their WIKI Validated Predictions (Yarrabelly, N., DSAC, I. and Karlapalem, K., 2018)

Paper Abstract Summary: This paper looks at understanding and estimating the credibility of news authors based on their predictions coming true. The news events which they report on are validated against Wikipedia by the proposed model, to give a measure of the percentage of predictions or reported incidents which turned out to be accurate.

Social Media and Information Consumption Diversity (José Devezas, Sérgio Nunes)

Paper Abstract Summary: This paper considers whether users of social media still consume a diverse range of information given their ability to personalise and create individual feeds versus random news consumers and reveals research investigating this issue.

Cross-Reading News (Shahbaz Syed, Tim Gollub, Marcel Gohsen, Nikolay Kolyada, Benno Stein, Matthias Hagen)

Paper Abstract Summary: This paper proposes an application called CrossReading News, which aims to help journalists quickly find related news pieces from a range of sources, curated using information retrieval methods, so that they can get a more rounded view.

Qlusty: Quick and Dirty Generation of Event Videos from Written Media Coverage (Alberto Barrón-Cedeño, Giovanni Da San Martino, Yifan Zhang, Ahmed Ali, and Fahim Dalvi)

Paper Abstract Summary: This paper presents Qlusty, a video application built with the aim of helping to break the news information bubble. It combines four individual modules to generate a video of news collated from various sources.

Named Entity Recognition for Telugu News Articles using Naïve Bayes Classifier (SaiKiranmai Gorla, Sriharshitha Velivelli, N L Bhanu Murthy, Aruna Malapati)

Paper Abstract Summary: Proposes Named Entity Recognition of ‘person’, ‘location’ and ‘organisation’ entities in sentences or documents in the Telugu language, using part-of-speech (POS) tagging and classification of textual content.

Exploring Significant Interactions in Live News (Erich Schubert, Andreas Spitz, Michael Gertz)

Paper Abstract Summary: This paper seeks to detect significant events appearing in news (live) as a result of the identification of co-occurrences of terms when compared to a background corpus (normal).  The researchers visualised the resulting semantic word cloud between related terms as significant events emerge.  They crawled dozens of news sites to give examples of their prototype.
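The comparison of live co-occurrences against a ‘normal’ background can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers’ method: the ratio score, the add-one smoothing, the `min_ratio` threshold and the toy headlines are all assumptions of mine.

```python
from collections import Counter
from itertools import combinations


def pair_counts(documents):
    """Count, for every pair of terms, the number of documents containing both."""
    counts = Counter()
    for document in documents:
        terms = sorted(set(document.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


def significant_pairs(current_docs, background_docs, min_ratio=3.0, smoothing=1.0):
    """Return term pairs that co-occur far more often now than in the background corpus."""
    current = pair_counts(current_docs)
    background = pair_counts(background_docs)
    results = []
    for pair, count in current.items():
        current_rate = count / len(current_docs)
        # Smoothing keeps the background rate above zero for previously unseen pairs.
        background_rate = (background[pair] + smoothing) / (len(background_docs) + smoothing)
        ratio = current_rate / background_rate
        if ratio >= min_ratio:
            results.append((pair, ratio))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: a background corpus and a window of live headlines.
background = [
    "storm warning issued",
    "election results announced",
    "storm passes coast",
    "markets open higher",
]
live = [
    "earthquake hits city",
    "earthquake hits coast",
    "storm warning issued",
]
for pair, ratio in significant_pairs(live, background):
    print(pair, round(ratio, 2))
```

Pairs which are already common in the background, such as ‘storm’ and ‘warning’, score low, while a pair which suddenly appears together in several live headlines stands out.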

Neural Content-Collaborative Filtering for News Recommendation (Dhruv Khattar, Vaibhav Kumar, Manish Gupta, Vasudeva Varma)

Paper Abstract Summary: This paper looks at utilising neural networks to provide recommendations for user news reading combining past user interactions and past content preferences.  The paper claims to beat state of the art recommender systems for user news recommendations.

On Temporally Sensitive Word Embeddings for News Information Retrieval (Yoon, T.W., Myaeng, S.H., Woo, H.W., Lee, S.W. and Kim, S.B., 2018)

Paper Abstract Summary: This paper considers word embeddings in news information retrieval and claims that co-occurrence vectors in news behave differently, and change over time, when compared with other word embedding scenarios. The research was carried out on data provided by Naver, and the researchers claim their findings show that word embeddings need to be expanded and built specifically for news IR.

TREC 2018 News Track

SoAPS 2018 – (Social Aspects in Personalization and Search)

Social Aspects in Personalization and Search explores the emerging areas around user-influenced recommender systems – for example, reviews and ratings, and influencing of opinion via social media with likes and shares.  This aspect of IR also looks at the ways in which users interact with social media and recommendations.   There are some good overviews on how recommender systems have evolved in crowded marketplaces and search.

SoAPS Keynote

Denis Parra explored some of the social aspects in recommender systems in his SoAPS keynote and provided his slides via Slideshare below.

All the slides can be viewed here:

Here are some more of the slides from a presentation which looked at time-aware evaluations, implicit feedback from users and the measurement of freshness in results evaluation:

Lorena Recalde shared her work undertaken with Ricardo Baeza-Yates on the different types of content which users on Twitter are prone to tweet.

Venue Suggestion Using Social Centric Scores (Aliannejadi, M. and Crestani, F., 2018.)

Professor Fabio Crestani presented work on using past visited locations by users to recommend future venues.  A set of relevance scores was presented based on gathered data from the location and venue preferences of users.

Text2Story 2018

The Text2Story workshop looks at mapping text in corpora to events in order to map storylines and understand the intent and needs behind queries.

The workshop aimed to explore ways of understanding emerging stories and event timelines, and the mapping of these to text and semantics as well as understanding sentiment and dual-sided arguments.  This area is complicated much further because data is received from many sources almost simultaneously.  Fact and credibility of authors and sources of storylines in text bodies is considered an important part of this area, as well as the personalisation of events and stories and the recommendation of information based upon personalisation.

Here are some of the areas explored in this workshop:

  • Event Identification
  • Narrative Representation Language
  • Sentiment and Opinion Detection
  • Argumentation Mining
  • Narrative Summarization
  • Storyline Visualization
  • Temporal Aspects of Storylines
  • Evaluation Methodologies for Narrative Extraction
  • Big data applied to Narrative Extraction
  • Resources and Dataset showcase
  • Personalization and Recommendation
  • User Profiling and User Behavior Modeling
  • Credibility
  • Fact Checking
  • Bots Influence

All the papers presented from this workshop can be found here.

Text2Story Keynote

Users2Story – On the Importance of Understanding Searchers’ Information Needs (Udo Kruschwitz, University of Essex)

Paper Abstract Summary: This keynote paper looked at the challenging problem of trying to understand the intent and needs behind queries in both web search and professional search and emphasised the importance of understanding searchers’ information needs.

Word embeddings, information retrieval and textual entailment (Eric Gaussier, University of Grenoble)

Paper Abstract Summary: This keynote paper and talk reviewed the types of word embeddings currently popular in natural language processing and information retrieval, discussed the potential of extending word embeddings with further syntactic information, and explored whether this improved information retrieval.

IREvent2Story: A Novel Mediation Ontology and Narrative Generation (Kattagoni, V. and Singh, N., 2018)

Paper Abstract Summary: This paper looks at a means of detecting and classifying events using an ontology and narrative entity mapping process, along with identification of international actors in the events.

Gossip is more than just story telling: Topic modeling and quantitative analysis on a spontaneous speech corpus (Pápay, B., Kubik, B.G., Cleverbridge, A.G. and Galántai, J.)

Paper Abstract Summary: This paper aims to identify gossip taking place in text corpora, as well as identifying the number of participants, the type of topics discussed and the sentiment displayed. It also explores how gossip evolves.

Job Recommendation based on Job Seeker Skills: An Empirical Study (Valverde-Rebaza, J., Puma, R., Bustios, P. and Silva, N.C., 2018)

Paper Abstract Summary: This paper proposes an improved framework application for job recommendations based on the job seeker skills.

Neural Networks for Information Retrieval Tutorial 2018

This was a full workshop / tutorial which looked at many aspects of current deep-learning practices, including some industry insights and semantic matching using co-occurrence vectors.

A learning to rank tutorial was presented by Bhaskar Mitra of Microsoft.  Entities were explored by Tom Kenter and Christophe Van Gysel of the University of Amsterdam.

Maarten De Rijke presented work analysing various user click models in search which he had developed alongside other researchers: Ilya Markov (University of Amsterdam) and Alexander Chuklin (Google Switzerland and University of Amsterdam).

At the NN4IR website you can find not only the slides from this tutorial at the European Conference on Information Retrieval 2018 (ECIR) but also the slides from the Neural Networks for Information Retrieval tutorials and workshops at the Web Search and Data Mining Conference 2018 (WSDM) and at SIGIR (Special Interest Group on Information Retrieval) 2017. You can also find the link to the Click Models book further down in this post.

There was a huge amount of information presented and the session was very well attended, as you can imagine.

Direct links to each of the sectional slides and tutorials from ECIR 2018 tutorial are below and I would recommend going through ALL of these more than once.  There are some great learnings here around research into click models, and several 2Vec areas explored, including Word2Vec, User2Vec, Prod2Vec and more.  Industry insights are also provided by Bhaskar Mitra and relate primarily to Bing but we can presume to some extent these cover some ‘industry favourites’:

All slides from the ‘Neural Networks for Information Retrieval’ tutorial are available here.

Extreme Multi-label Classification for Large-scale Text Mining Tutorial (XMLC-LSTC)

I did not attend this workshop but the tutorial page and slides are available below.

Tutorial page here

Website here

Tutorial slides here

Main Conference

With regard to the main conference, papers on a wide range of IR topics were presented and only 23% of those submitted were accepted. Papers on word embeddings had the highest acceptance rate as a percentage of papers submitted, indicating a ‘hot topic’. Papers on neural networks were also very well received.

The main conference papers covered the following topics:

  • Neural network
  • Word embedding
  • Recommender system
  • Collaborative filtering
  • Computational linguistics
  • Web search
  • Natural language processing
  • News articles
  • Search tasks
  • Evaluation metrics
  • Query terms
  • Sentiment analysis
  • Social media
  • Deep learning
  • Topic models
  • Knowledge base
  • Deep neural network
  • User study
  • Retrieval performance
  • Learning to rank

There were, of course, many papers presented over the course of the three conference days and I have listed many of them toward the end of this post. I’ve focused on providing a bit more of an overview of a few which I was present for and which were of particular interest to me, but that is not to detract from the impressive work of all of the presenters. As previously mentioned, I also had the opportunity to speak to some of the researchers about their work and I have provided more detail here on those.

Authorship Verification in the Absence of Explicit Features and Thresholds (Oren Halvani, Lukas Graner and Inna Vogel, 2018)

One interesting paper in particular was by Oren Halvani from the Fraunhofer Institute for Secure Information Technology (SIT).

Oren has been in the cyber security and fraud-detection space for a number of years.

His paper looks at the important area around author credibility and authenticity-checking using deep-learning by recognising the unique writing style of one author over another.

Oren’s paper is entitled ‘Authorship Verification in the Absence of Explicit Features and Thresholds’ (Halvani, 2018). Oren explained that authorship verification (AV) techniques can be used to link authors to papers across different platforms and help to identify authors who generate hoaxes and deliberate misinformation in news. The method appears not to rely on the same levels of training data that other systems need in order to determine when two or more papers are produced by the same author. Oren’s team tested their method across a range of different text corpora (including recipes) and found their results were competitive with current state-of-the-art authorship verification baselines. Oren’s algorithm is also very lightweight (only around 8 lines of code).

Information Scent, Searching and Stopping: Modelling SERP Level Stopping Behaviour (David Maxwell and Leif Azzopardi, 2018)

Another interesting paper was David Maxwell and Leif Azzopardi’s paper on search engine user behaviour, presented on colourful animated slides. The paper looks at user behaviour based on ‘information foraging theory’.

David Maxwell proposed that current user models don’t consider people skipping search engine results, and that this should be taken into account. The paper explored ‘stopping during search’ behaviour. It’s recommended the slides are explored in more detail:

This full paper looked at the different scent-following behaviours of naive versus more experienced searchers and the scent paths these users take. David Maxwell suggested a revision of the Complex Searcher Model used to simulate users.

David Maxwell and Leif Azzopardi have also developed an interactive information retrieval simulation framework, so you can run experiments for yourself.

That can be accessed here on GitHub.

A keynote on one of the conference days was given by Radim Rehurek, founder of the popular open source Python library Gensim. He shared his experiences, realities, challenges and learnings from founding the open source project and leading its development, and talked about the reasons why it is still relevant as a platform in 2018.

Fabrizio Silvestri, Software Engineer at Facebook, also spoke on Industry Day about the nature of problem-driven research on very, very big machines using Search2Vec.

Web2Text: Deep Structured Boilerplate Removal (Thijs Vogels, Octavian Eugen Ganea and Carsten Eickhoff, 2018)

Another interesting paper was that delivered by Thijs Vogels who looked at a method for the removal of boilerplate areas in web page text.

Paper Abstract Summary: This paper looks at improving ways to detect and remove boilerplate text from web pages such as header, footer, advertisements by classifying html blocks as either main content or boilerplate. – Full Paper: https://arxiv.org/pdf/1801.02607.pdf

Local is Good: A Fast Citation Recommendation Approach (Haofeng Jia and Erik Saule, 2018)

Another researcher presenting was Erik Saule, who presented an alternative academic paper search engine to Google Scholar.

I spoke with Erik Saule, researcher from UNC Charlotte, US who is behind ‘Local is Good: A Fast Citation Recommendation Approach’, and here is what Erik had to say of his paper:

“We developed several years ago a paper search engine (for academic research papers), called ‘TheAdvisor’ as a starting point for our research. It was designed to order and retrieve academic papers. It was based on seed-lists based on papers which the searcher already knows of, rather than ‘keyword driven’, like the more traditional search engines, and like Google Scholar. The search engine works by performing a ‘random walk’ (modelled on how users are likely to traverse the search engine) process on the citation network of papers. Edges are citations and references and vertices are papers.”

This search engine is targeted at academics and researchers who, it is presumed, already have some knowledge of papers in their space. The intended use case of this system is to speed up focused research in a particular direction by excluding anything irrelevant and unconnected, while ensuring that anything which is relevant (and connected via citations or references) is not missed.

When the target users have a paper in hand, they usually do one of three things:

  1. They will read one of the references of the paper (in the reference section of the paper)
  2. They will read a paper which cites the paper
  3. They will go back to a paper they already know

Our paper at ECIR2018 is about improving upon ‘TheAdvisor’ from a speed of retrieval perspective, without compromising the quality of the results.  This is because previously the existing algorithm was a bit slow and we wanted to make it near real-time.

Now the search engine is still using random walks but instead of retrieving everything it works on a pruned citation graph only presenting papers which are connected to the papers the user already knows.

The speed is improved by preventing the machine from considering any paper that does not have a direct connection to the papers you already know. Essentially, it is a recommender system based on papers already known to the user, which prunes the set of papers considered in the search and discounts irrelevant edges.

This enabled a dramatic increase in speed of retrieval, which in turn enabled real-time queries. The pruning took retrieval time down from circa 2.5 seconds to 0.2 seconds, a speed-up of roughly 12 to 15 times.
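To make the pruning idea concrete, here is a minimal Python sketch of a random walk with restart over a citation graph which has first been pruned to the seed papers and their direct neighbours. It is an illustration under my own assumptions, not the code behind ‘TheAdvisor’: the function names, the restart probability of 0.15, the undirected treatment of citations and the toy paper identifiers are all hypothetical.

```python
from collections import defaultdict


def build_neighbours(citations):
    """Treat each (citing, cited) pair as an undirected edge between two papers."""
    neighbours = defaultdict(set)
    for citing, cited in citations:
        neighbours[citing].add(cited)
        neighbours[cited].add(citing)
    return neighbours


def prune(neighbours, seeds):
    """Keep only the seed papers and the papers directly connected to a seed."""
    keep = set(seeds)
    for seed in seeds:
        keep |= neighbours.get(seed, set())
    return {paper: neighbours.get(paper, set()) & keep for paper in keep}


def random_walk_with_restart(neighbours, seeds, restart=0.15, iterations=50):
    """Score papers by how often a walker that keeps restarting at the seeds visits them.

    Assumes every seed is a key of `neighbours`, which `prune` guarantees.
    """
    seeds = list(seeds)
    base = {paper: (1.0 / len(seeds) if paper in seeds else 0.0) for paper in neighbours}
    scores = dict(base)
    for _ in range(iterations):
        updated = {paper: restart * base[paper] for paper in neighbours}
        for paper, linked in neighbours.items():
            if not linked:
                # A paper with no remaining edges sends its weight back to the seeds.
                for seed in seeds:
                    updated[seed] += (1 - restart) * scores[paper] / len(seeds)
                continue
            share = (1 - restart) * scores[paper] / len(linked)
            for other in linked:
                updated[other] += share
        scores = updated
    return scores


def recommend(citations, seeds, top_k=5):
    """Rank papers connected to the seeds, best first, excluding the seeds themselves."""
    pruned = prune(build_neighbours(citations), seeds)
    scores = random_walk_with_restart(pruned, seeds)
    candidates = [paper for paper in scores if paper not in seeds]
    return sorted(candidates, key=lambda paper: scores[paper], reverse=True)[:top_k]


# Hypothetical toy citation network: (citing paper, cited paper).
citations = [("A", "B"), ("A", "C"), ("D", "A"), ("B", "C"), ("E", "F"), ("C", "G")]
print(recommend(citations, seeds={"A"}))
```

With paper ‘A’ as the only seed, papers ‘E’, ‘F’ and ‘G’ are never scored at all because they have no direct connection to the seed, which is where the time saving in this sketch comes from.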

Erik and his team plan to bring the system back online with the new real-time retrieval capabilities. They want to work not only on retrieving academic papers but also on organising the papers into some type of relevance clusters rather than lists of titles. They are planning to return around 100 papers, organised by clustering so as to avoid information overload and so that the researcher can see the different directions or approaches they could take within the topical area of study, given the ordering of in-topic themes.

Erik is also hoping the system will become used by three types of people:

  1. Researchers who are trying to discover new papers
  2. Editors looking for experts to review papers
  3. Academic paper reviewers who wish to do a cursory check to ensure they have not missed any crucial or important existing work which they should consider in their paper review process

The full paper is available here: https://link.springer.com/chapter/10.1007/978-3-319-76941-7_73

There were many more papers delivered and these are listed further on in the post. All in all, there was a huge amount to learn, and it was clear that deep learning and semantic understanding using co-occurrence and word embeddings, as well as the verification of facts and authenticity in news, were a strong focus. This was evident in the many papers in the news IR section looking to cross-reference and gain valuable double-sided or multi-sided perspectives, and in the social media areas explored.

I also met with researchers from the University of Glasgow, who were presenting a range of talks and whose work is behind an interesting project around the use of information retrieval and datasets to explain and mitigate election violence. As part of their work they mapped tweets from one language to another via state-of-the-art convolutional neural networks (CNN) without having to build additional training datasets. Their work aims to build a view of what the build-up to election violence may look like in social media, with a view to assisting with mitigation in the future.

Their work can be explored further here: http://www.electoralviolenceproject.com

Some of their papers presented at ECIR are:

On Refining Twitter Lists as Ground-Truth Data for Multi-Community User Classification (Ting Su, Anjie Fang, Richard McCreadie, Craig Macdonald and Iadh Ounis, 2018) – Social Media, Deep Learning, Natural Language Processing

On the Reproducibility and Generalisation of the Linear Transformation of Word Embeddings (Xiao Yang, Iadh Ounis, Richard Mccreadie, Craig Macdonald and Anjie Fang, 2018) – Natural Language Processing

Active Learning Strategies for Technology Assisted Sensitivity Review (Graham McDonald, Craig Macdonald and Iadh Ounis, 2018)

Dinner Above The City of Grenoble

It was not all formal paper presentations, however.  There was time for some socialising and networking, with a city tour of Grenoble (fun fact: Grenoble has 60,000 students in residence from an overall population of 160,000), and dinner high up on the mountain – accessed via this cable car.

Not great for me, given my fear of heights (and even escalators).  I closed my eyes throughout the ascent, but it was well worth the trip: a lovely dinner with the most interesting of company was the reward.

Here are a few of the tweets from the conference attendees at the dinner above the mountains.

2019 ECIR

The location for next year’s ECIR was announced and is confirmed as Cologne, Germany, in April 2019.  I hope to attend this, as well as CHIIR, which will be in Glasgow next year, and SIGIR, which will be in Paris.

Other Presented Papers & Resources ECIR 2018

Here are some of the many papers and images shared via social media (I have added some labels to give some guidance as to the topical nature of the papers, for ease when perusing):

Deep Learning for Detecting Cyberbullying Across Multiple Social Media Platforms (Agrawal & Awekar, 2017) – Deep Learning & Social Media

Paper Abstract – “This paper looked at a method for overcoming data bottlenecks in identifying cyberbullying using deep learning across multiple social media platforms simultaneously.” (Agrawal & Awekar, 2017)

To Cite, or Not to Cite? Detecting Citation Contexts in Text (Färber, M., Thiemann, A. and Jatowt, A., 2018) – Natural Language Processing, Deep Learning

Affective Neural Response Generation (Nabiha Asghar, Pascal Poupart, Jesse Hoey, Xin Jiang and Lili Mou, 2018) – Deep Learning, Natural Language Processing

Attention-based Neural Text Segmentation (Pinkesh Badjatiya, Litton J Kurisinkel, Manish Gupta and Vasudeva Varma, 2018) – Deep Learning, Natural Language Processing, User Behaviour

Predicting Topics in Scholarly Papers (Seyed Ali Bahrainian, Ida Mele and Fabio Crestani, 2018) – Deep Learning, Natural Language Processing

Cross-lingual Document Retrieval using Regularized Wasserstein Distance (Georgios Balikas, Charlotte Laclau, Ievgen Redko and Massih-Reza Amini, 2018)

Learning to Leverage Microblog for QA Retrieval (Jose Miguel Herrera, Barbara Poblete and Denis Parra, 2018) – Natural Language Processing, Deep Learning

Employing Document Embeddings to Solve the “New Catalog” Problem in User Targeting, and Provide Explanations to the Users (Ludovico Boratto, Salvatore Carta, Gianni Fenu and Luca Piras, 2018) – Deep Learning, Natural Language Processing

Spatial Statistics of Term Co-occurrences for Location Prediction of Tweets (Özer Özdikiş, Heri Ramampiaro and Kjetil Nørvåg, 2018) – Natural Language Processing, Social Media, Deep Learning

Towards Maximising Openness in Digital Sensitivity Review using Reviewing Time Predictions (Graham McDonald, Craig Macdonald and Iadh Ounis, 2018)

Inverted List Caching for Topical Index Shards (Zhuyun Dai and Jamie Callan, 2018) – Topic Modelling

A Comparative Study of Native and Non-Native Information Seeking Behaviours (David Brazier and Morgan Harvey, 2018) – User Behaviour

Indiscriminateness in representation spaces of terms and documents (Vincent Claveau, 2018)

A Hybrid Embedding Approach to Noisy Answer Passage Retrieval (Daniel Cohen and W. Bruce Croft, 2018) – Deep Learning, Natural Language Processing

A Neural Passage Model for Ad-hoc Document Retrieval (Ai, Q., O’Connor, B. and Croft, W.B., 2018) – Deep Learning, Natural Language Processing

A Text Feature Based Automatic Keyword Extraction Method for Single Documents (Campos, R., Mangaravite, V., Pasquali, A., Jorge, A.M., Nunes, C. and Jatowt, A., 2018) – Text Mining & Information Extraction

Concept Embedding for Information Retrieval (Abdulahhad, K., 2018) – Deep Learning, Natural Language Processing

Topical Stance Detection for Twitter: A Two-Phase LSTM Model Using Attention (Dey, K., Shrivastava, R. and Kaushik, S., 2018) – Topic Modelling, Deep Learning and Social Media

Generating High-Quality Query Suggestion Candidates for Task-Based Search (Ding, H., Zhang, S., Garigliotti, D. and Balog, K., 2018) – Task-Based Search, User-Behaviour & User-modelling

Stopword Detection for Streaming Content (Fani, H., Bashari, M., Zarrinkalam, F., Bagheri, E. and Al-Obeidat, F., 2018) – Social Media, Natural Language Processing, Deep Learning

Topic Lifecycle on Social Networks: Analyzing the Effects of Semantic Continuity and Social Communities (Kuntal Dey, Saroj Kaushik, Kritika Garg and Ritvik Shrivastava, 2018) – Social Media, Natural Language Processing

Reproducing a Neural Question Answering Architecture applied to the SQuAD Benchmark Dataset: Challenges and Lessons Learned (Alexander Dür, Andreas Rauber and Peter Filzmoser, 2018) – Deep Learning, Natural Language Processing

Modelling Randomness in Relevance Judgments and Evaluation Measures (Marco Ferrante, Nicola Ferro and Silvia Pontarollo, 2018) – Results Evaluation, Topic Modelling

Explicit Modelling of the Implicit Short Term User Preferences for Music Recommendation (Kartik Gupta, Noveen Sachdeva and Vikram Pudi, 2018) – User Behaviour, Recommender Systems, Personalization

Multi-Task Learning for Extraction of Adverse Drug Reaction Mentions from Tweets (Shashank Gupta, Manish Gupta, Vasudeva Varma, Sachin Pawar, Nitin Ramrakhiyani and Girish Keshav Palshikar, 2018) – Health IR, Social Media, Natural Language Processing, Deep Learning

Efficient Context-Aware K-Nearest Neighbor Search (Mostafa Haghir Chehreghani and Morteza Haghir Chehreghani, 2018) – Deep Learning, Natural Language Processing

Biomedical Question Answering via Weighted Neural Network Passage Retrieval (Ferenc Galkó and Carsten Eickhoff, 2018) – Health IR, Neural Networks

Towards an Understanding of Entity-Oriented Search Intents (Dario Garigliotti and Krisztian Balog, 2018) – Entities, User Intent

Proposing Contextually Relevant Quotes for Images (Shivali Goel, Rishi Madhok and Shweta Garg, 2018) – Image Search

Co-training for Extraction of Adverse Drug Reaction Mentions from Tweets (Shashank Gupta, Manish Gupta, Vasudeva Varma, Sachin Pawar, Nitin Ramrakhiyani and Girish Keshav Palshikar, 2018) – Social Media, Health IR

Neural Multi-Step Reasoning for Question Answering on Semi-Structured Tables (Till Haug, Octavian-Eugen Ganea and Paulina Grnarova, 2018) – Neural Networks

Medical Forum Question Classification Using Deep Learning (Raksha Jalan, Manish Gupta and Vasudeva Varma, 2018) – Health IR, Deep Learning, Natural Language Processing

Choices in Knowledge-Base Retrieval for Consumer Health Search (Jimmy Jimmy, Guido Zuccon and Bevan Koopman, 2018) – Health IR, Entities, Knowledge Graph

Investigating Result Usefulness in Mobile Search (Jiaxin Mao, Yiqun Liu, Noriko Kando, Cheng Luo, Min Zhang and Shaoping Ma, 2018) – Mobile Search, Results Evaluation

Bringing Back Structure to Free Text Email Conversations with Recurrent Neural Networks (Tim Repke and Ralf Krestel, 2018) – Deep Learning, Natural Language Processing, Email Text Analysis

An Optimization Approach for Sub-event Detection and Summarization in Twitter (Giannis Nikolentzos, Christos Ksipolopoulos, Polykarpos Meladianos and Michalis Vazirgiannis, 2018) – Social Media, Deep Learning, Natural Language Processing

Time-aware novelty metrics for recommender systems (Pablo Sanchez and Alejandro Bellogin, 2018) – Recommender Systems

Benefits of using Symmetric Loss in Recommender Systems (Gaurav Singh and Sandra Mitrovic, 2018) – Recommender Systems

Topic-Association Mining for User Interest Detection (Trikha, A.K., Zarrinkalam, F. and Bagheri, E., 2018) – Recommender Systems, Topic Modelling

Document Ranking Applied to Second Language Learning (Wilkens, R., Zilio, L. and Fairon, C., 2018)

Discriminative Path-based Knowledge Graph Embedding for Precise Link Prediction (Maoyuan Zhang, Qi Wang, Wukui Xu, Wei Li and Shuyuan Sun, 2018) – Entities, Deep Learning

Aggregating Neural Word Embeddings for Document Representation (Ruqing Zhang, Jiafeng Guo, Yanyan Lan, Jun Xu and Xueqi Cheng, 2018) – Deep Learning, Natural Language Processing

Spherical Paragraph Model (Ruqing Zhang, Jiafeng Guo, Yanyan Lan, Jun Xu and Xueqi Cheng, 2018) – Natural Language Processing

Unsupervised Sentiment Analysis of Twitter Posts Using Density Matrix Representation (Yazhou Zhang, Dawei Song, Xiang Li and Peng Zhang, 2018) – Natural Language Processing, Social Media

Collection-Document Summaries (Witt, N., Granitzer, M. and Seifert, C., 2018) – Natural Language Processing


Written By
Dawn is an SEO and digital marketing consultant, Managing Director of Move It Marketing and a lecturer in digital marketing strategy at Manchester Metropolitan University