
Research & Development

Previous projects

AnnoMarket, Cloud-based Text Annotation Marketplace

A two-year European project funded by the European Commission through the Seventh Research Framework Program (FP7-SME), under Project No 296322. The project started in June 2012.

AnnoMarket aims to revolutionize the text annotation market by delivering an affordable, open marketplace for pay-as-you-go, cloud-based extraction resources and services in multiple languages.

Project website… Our contribution:

  • Large-scale web crawls and focused web crawls.
  • Providing multilingual web corpus resources.

DOPA, Data Supply Chains for Pools, Services and Analytics in Economics and Finance

A two-year European project funded by the European Commission through the Seventh Research Framework Program (FP7-SME), under Project No 296448. The project started in May 2012.

DOPA achieves breakthroughs in large-scale, high-quality information sourcing and processing on a distributed platform: it brings together related data from disparate sources through automated entity linkage, and helps make sense of this wealth of data through visualization tools.

Project website… Our contribution:

  • Creating large-scale multilingual time series of economic and financial information from the Web and online social networks.
  • Strictly respecting the legal framework, intellectual property, and privacy rights.
  • Selecting active sources (RSS feeds, news, forums, blogs, etc.) that focus on aspects relevant to this domain (e-reputation, customer opinion, stock trading, company news, etc.).
  • Achieving large-scale coverage without sacrificing data quality.

TrendMiner, Large-scale, Cross-lingual Trend Mining and Summarization of Real-time Media Streams

A three-year European project funded by the European Commission through the Seventh Research Framework Program (FP7-ICT), under Project No 287863. The project started in November 2011.

The goal of this project is to deliver innovative, portable open-source real-time methods for cross-lingual mining and summarization of large-scale stream media. This is achieved through an inter-disciplinary approach, combining deep linguistic methods from text processing, knowledge-based reasoning from web science, machine learning, economics, and political science. Scalability and affordability will be addressed through a cloud-based infrastructure for real-time text mining from stream media.

Project website… Our contribution:

  • Providing a scalable infrastructure to partners, with support for integration and experimentation.
  • Designing and developing an application-aware crawling mechanism for social media.

Rethink Big, Roadmap for European Technologies in Hardware Networking for Big Data

A two-year European project funded by the European Commission through the Seventh Research Framework Program (FP7-ICT / Support Actions), under Project No 619788. The project started in March 2014.

The objective of the Rethink Big project is to bring together the key European system architects and the consumers of Big Data systems, to describe a coherent vision that highlights the business and technical challenges within the EU, and to address the EU's needs for processing and analyzing Big Data over the next 10 years.

Project website… Our contribution:

  • Large-scale data management.
  • Controlling and optimizing the processing chain from hardware to software, in line with the datacenter-as-a-computer approach, with the goal of minimizing costs as far as is currently possible.
  • Experimenting with a cooling-free datacenter and co-designing unconventional server configurations with No Rack, to achieve higher performance per euro spent on both compute and storage of heterogeneous unstructured data from the web.