The BigDataEurope consortium would like to thank all stakeholders for following the project, attending our activities and contributing to the results. Your endorsement and participation contributed to a successful outcome! 

Summary of Major Results

One can summarise the results of the project in three categories:

Strengthening the Societal Communities: the large number of project activities provided a platform where stakeholders interested in the seven societal challenges could come together, network, and identify common issues and common solutions; the latter, at least those of a technical nature, were largely supported by the project. We expect that a large part of the communities strengthened by the project will continue to work closely together.

BigDataIntegrator Platform: perhaps the most concrete project result, the BDI is a flexible, open-source platform that can be easily deployed and customised to build Big Data pipelines that address open-ended challenges. Built on Docker virtualization, the base platform is enriched with a layer of services that support the setup, creation and maintenance of workflows. Supported by a simple graphical UI, it offers basic building blocks (e.g. Apache Spark, Hadoop HDFS, Apache Flink) to get started with common Big Data technologies. The BDI continues to be maintained (on GitHub) beyond the project and is being used in various external projects and initiatives. Given its impact in the big data technical area, it is also being proposed as an Apache Incubator project.

Pilot Demonstrations: the value of the BDI has been demonstrated in 7 separate pipelines, each targeting a selected big data problem from one of the 7 societal challenges. These pipelines extend the predefined BDI pipelines designed for each of the societal challenges, based on the requirements identified for each domain. The pilots have been useful in demonstrating the flexibility of the BDI platform and its ability to target big data challenges of a varying nature from any domain. Some of the final pilots have taken on a life of their own and are being extended in follow-up projects and initiatives.

In addition to the above, the large amount of valuable material generated by the project is available for posterity on the project profiles:

Youtube Channel (50 videos): The recorded technical webinars (e.g. final BDI launch, relevance in the context of activities like the Big Data Value Association) and societal hangouts; demonstrations of BDI and its pilots, and interviews with the developers. 
SlideShare (237 slides): All project presentations, from a wide variety of physical and online activities (including all workshops, webinars, hangouts and external events).
Flickr: Pictures from the physical events (workshops, conferences and other events).

Follow-up Projects

Below is a list of H2020 and other projects that build on the BigDataEurope results, most notably by adapting a BDI pipeline for their needs. The list is valid until early 2018 and might not be updated at a later date.

| Project | Timeframe | BigDataEurope take-up |
|---|---|---|
| BigDataOcean | 2017 – 20 | Re-use and customisation of the BDE Platform. |
| SPECIAL | 2017 – 19 | Re-use and customisation of the BDE Platform (in particular SANSA and Ontario). |
| NextGEOSS | 2016 – 20 | The SatCen pilot builds on results of H2020 projects, including the BDE SC7 pilot. |
| BETTER | 2017 – 20 | A share of the BETTER big data pipelines, targeting a total of up to 36 challenges (12 per year), will exploit the project results, particularly the BDI and the SC7 pilot instance. |
| BigDataGrapes | 2018 – 20 | Re-use and customisation of the BDE Platform, specifically the extension of the software stack that has been produced in the SC2 pilot instance. |
| DARE | 2018 – 20 | Re-use and customisation of the BDE Platform as a basis for the DARE platform. |
| SUMA-I-BDA | 2018 – 21 | Smarter use of mobility assets through innovative big data analytics, using the know-how gained in the SC4 pilot. |
| Precision Medicine | 2018 – 19 | Re-use and customisation of the BDE Platform as a basis for the new Greek national precision medicine infrastructure. These efforts are also related to the IASIS project. |
| LAMBDA | 2018 – 21 | This CSA shall re-use results from the BDE project, in particular the platform. |


BETTER (Big-data Earth observation Technology and Tools Enhancing Research and development) is an EU H2020 research and innovation project that runs from November 2017 to the end of October 2020.

The project’s main objective is to implement Big Data solutions (called Data Pipelines) based on large volumes of heterogeneous Earth Observation datasets. This should help address key Societal Challenges, so that users can focus on analysing and extracting the potential knowledge within the data rather than on processing the data itself.

To achieve that, BETTER is improving the way Big Data service developers interact with end-users. After defining the challenges, the promoters validate the pipeline requirements and co-design the solution with a dedicated development team in a workshop. During implementation, promoters can continuously test and validate the pipelines. Later, the implemented pipelines will be used by the public in hackathons, enabling the use of specific solutions in other areas and the collection of additional user feedback. www.ec-better.eu


We are happy to announce SANSA 0.4 – the fourth release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing via Apache Spark and Flink in order to allow scalable machine learning, inference and querying capabilities for large knowledge graphs.

You can find the FAQ and usage examples at http://sansa-stack.net/faq/.

The following features are currently supported by SANSA:

  • Reading and writing RDF files in N-Triples, Turtle, RDF/XML and N-Quads formats
  • Reading OWL files in various standard formats
  • Support for multiple data partitioning techniques
  • SPARQL querying via Sparqlify
  • Graph-parallel querying of RDF using SPARQL (1.0) via GraphX traversals (experimental)
  • RDFS, RDFS Simple, OWL-Horst, EL (experimental) forward chaining inference
  • Automatic inference plan creation (experimental)
  • RDF graph clustering with different algorithms
  • Terminological decision trees (experimental)
  • Anomaly detection (beta)
  • Knowledge graph embedding approaches: TransE (beta), DistMult (beta)
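
To make the last item more concrete, here is a minimal, single-machine sketch of the TransE scoring idea: a triple (head, relation, tail) is considered plausible when head + relation lands close to tail in the embedding space. The toy vectors and the helper names `transe_score` and `rank_tails` are invented for illustration; this is not SANSA's API and it leaves out training and distribution entirely.

```python
import math


def transe_score(head, relation, tail):
    """Return the TransE plausibility score -||head + relation - tail||_2.

    Higher (closer to zero) means the triple is considered more plausible.
    """
    diff = [h + r - t for h, r, t in zip(head, relation, tail)]
    return -math.sqrt(sum(d * d for d in diff))


# Toy 2-dimensional embeddings, invented purely for illustration.
entities = {
    "Berlin": [1.0, 0.0],
    "Germany": [1.0, 1.0],
    "Paris": [3.0, 0.0],
}
relations = {"capitalOf": [0.0, 1.0]}


def rank_tails(head, relation):
    """Rank every known entity as a candidate tail for (head, relation, ?)."""
    h, r = entities[head], relations[relation]
    scored = [(name, transe_score(h, r, vec)) for name, vec in entities.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_tails("Berlin", "capitalOf")
print(ranking[0][0])  # best-scoring candidate tail
```

In a real system the embeddings are learned from the knowledge graph rather than written by hand, and ranking candidate tails in this way is how link prediction is typically evaluated.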

Noteworthy changes or updates since the previous release are:

  • Parser performance has been improved significantly e.g. DBpedia 2016-10 can be loaded in
  • Support for a wider range of data partitioning strategies
  • A better unified API across data representations (RDD, DataFrame, DataSet, Graph) for triple operations
  • Improved unit test coverage
  • Improved distributed statistics calculation (see ISWC paper)
  • Initial scalability tests on 6 billion triple Ethereum blockchain data on a 100 node cluster
  • New SPARQL-to-GraphX rewriter aiming at providing better performance for queries exploiting graph locality
  • Numeric outlier detection tested on DBpedia (en)
  • Improved clustering tested on 20 GB RDF data sets
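
The release notes do not say which partitioning strategies are meant. As one widely used example, the sketch below shows vertical partitioning, where triples are grouped into one (subject, object) table per predicate. The functions `parse_ntriple` and `vertical_partition` are hypothetical names, the parser is deliberately naive, and everything runs on a single machine rather than on Spark or Flink.

```python
from collections import defaultdict


def parse_ntriple(line):
    """Split one simple N-Triples line into (subject, predicate, object).

    Simplification: assumes whitespace-separated terms and a trailing '.';
    a real parser must also handle escapes, blank nodes and comments.
    """
    body = line.strip()
    if body.endswith("."):
        body = body[:-1].rstrip()
    subject, predicate, obj = body.split(None, 2)
    return subject, predicate, obj


def vertical_partition(lines):
    """Group triples into one list of (subject, object) pairs per predicate."""
    partitions = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        s, p, o = parse_ntriple(line)
        partitions[p].append((s, o))
    return dict(partitions)


data = [
    '<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .',
    '<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice Smith" .',
    '<http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .',
]
tables = vertical_partition(data)
for predicate, rows in tables.items():
    print(predicate, len(rows))
```

Grouping by predicate means a query that touches only a few predicates needs to scan only the matching tables, which is the usual motivation for this layout.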

Deployment and getting started:

  • There are template projects for SBT and Maven for Apache Spark as well as for Apache Flink available to get started.
  • The SANSA jar files are in Maven Central i.e. in most IDEs you can just search for “sansa” to include the dependencies in Maven projects.
  • Example code is available for various tasks.
  • We provide interactive notebooks for running and testing code via Docker.

We want to thank everyone who helped to create this release, in particular the projects Big Data Europe, HOBBIT, SAKE, Big Data Ocean, SLIPO, QROWD, BETTER, BOOST and SPECIAL.

Spread the word by retweeting our release announcement on Twitter. For more updates, please view our Twitter feed and consider following us.

Greetings from the SANSA Development Team


On 2nd February 2018, the BigDataEurope (BDE) societal challenge on transport (SC4) hosted its third and final hangout. Entitled “The Way Forward for Big Data in Transport”, the hangout presented the results of the work done in the BDE project and discussed the advancements of big data in the transport sector.

Maxime Flament (ERTICO – ITS Europe) opened the webinar and welcomed participants. He went through the agenda and opened the first poll of the webinar to identify the background of the participants. The poll showed that software engineers, researchers and transport experts were present in equal numbers, along with some public and local authorities.

Josep Maria Salanova Grau (CERTH) then introduced the BDE project and explained how it sought to harness big data for transport. His presentation is available below.

Josep Maria Salanova Grau (CERTH) – BDE Project Introduction

This presentation was followed by the second poll, which asked participants about their hesitations in using big data. For most participants, the financial aspect of acquiring data sets seemed to be the primary worry, followed by a lack of expertise in data science or related fields, and privacy issues.

Then Maxime Flament (ERTICO – ITS Europe) gave a brief overview of the workshop that SC4 organised in September. It was the third, and final, workshop for BDE SC4. In contrast to the first workshop, which aimed to identify user needs and requirements, and the second workshop, which presented the offerings of the BDE platform and explained how it can apply to the transport pilot site in Thessaloniki, the third workshop presented the results of BDE. When the BDE project started out, integrating data into the transport sector was still a novel idea. Now it has become a game-changer with ever-increasing importance. The workshop also highlighted the need to have the right data at the right time and at the right place. More information is available in the presentation below.

Maxime Flament – BDE SC4 Third Workshop Report

Another interesting poll was launched after this presentation, showing that vehicle information is by far the most used data set amongst participants.

After this poll, Luigi Selmi (Fraunhofer IAIS) presented the Big Data Integrator Platform (BDI). The Platform is the key outcome of the BDE project. It provides an integrated stack of tools that allow large-scale data resources to be processed, analysed and published without the need for native installations of additional tools for data processing at scale. As an ecosystem of specifications and reference implementations, the BDI has tremendous potential for deployment across sectors and for a range of players in the data value chain.

Luigi Selmi – Big Data Integrator Platform

The final presentation in the webinar was given by Rajendra Akerkar (Vestlandsforsking) of the LeMO project, with which BDE has already established a good collaboration. The LeMO project participated in the BDE SC4 workshop in September and has contributed a blog entry on the future of big data in transport to the BDE website. The presentation focused on the objectives of the LeMO project. LeMO will carry out a set of case studies and, based on their results, develop recommendations and a roadmap that policy makers will be able to use to make better-informed decisions. You can find out more about the LeMO project in the presentation below.

Rajendra Akerkar – LeMO Project

After the LeMO presentation, webinar participants were asked to cast their vote in the last poll, this time concerning the LeMO case studies. As could have been expected, most of the BDE followers present at the webinar would like to learn more about the open data and transport case study that LeMO will undertake.

Finally, the webinar also left room for a question and answer session. Participants wanted to know how they could get involved in BDE and get help from BDE for their big data projects. Since the BDE project officially concluded in December 2017, this is a very pertinent question. Interested participants can contact the BDE consortium, which will continue to be available to assist in training people to use the framework.

Another interesting question was raised about whether or not EU funded projects such as BDE can have a real impact and compete with technology giants like Google or Apple. According to the BDE consortium, Google and Apple will always be competitors and, with the coming General Data Protection Regulation (GDPR), there will be more limitations. Nonetheless, it is important to continue competing and pushing for advancements and results, even though the competition might be fierce.


On 13 December, 2017 we hosted the third workshop in the domain of Societal Challenge 1 (Health, Demographic Change and Wellbeing). This was also the final workshop of the Big Data Europe project, and our aim was to look back over everything we’d achieved not only in SC1, but in the BDE project as a whole.

16 people from a variety of backgrounds attended the workshop, including many participants from the European Commission, and three invited speakers who presented other ongoing and upcoming projects related to big data in the domain of health.

Following an introduction to Big Data Europe from Simon Scerri, Jonathan Langens presented a live demo of the BDE project’s Big Data Integrator. He showed participants the fundamentals of how the BDI makes it easier for big data users to build, set up, deploy and monitor projects, using the Stack Builder, Workflow Builder, and Swarm UI. Many resources such as manuals, webinars, screencasts, and template containers are also available to help reduce barriers to entry in big data across all domains and Societal Challenges.

Kiera McNeice then presented the SC1 pilot, which has replicated the functionality of the Open PHACTS Discovery Platform with open components in the BDE infrastructure. The Open PHACTS Discovery Platform provides an API for users to query multiple linked life science data sources, and answer real research questions more efficiently and cost-effectively, as well as making entirely new kinds of queries possible.

Our invited speakers then demonstrated three other European big data projects in the health domain.

Michaela Black presented MIDAS (Meaningful Integration of Data Analytics and Services), which aims to connect big data in health and present it to public health policy makers in accessible and useful ways; this will help inform better policy-making decisions at all levels across Europe. (slides)

Supriyo Chatterjea presented BigMedilytics, a broad consortium which aims to take a holistic view of healthcare in order to improve productivity in the healthcare sector by 20% by addressing the major themes of chronic disease, oncology, and industrialisation of healthcare. (slides)

Guillermo Palma presented iASiS, a project focussing on personalised medicine. The project aims to develop pilots in lung cancer and Alzheimer’s disease by connecting electronic health records, genomic data, bibliographic data, and public pharmacological databases, and is based on the BDE framework. (slides)

In the discussion that followed, several key questions and challenges were raised regarding big data in healthcare. In particular, issues of privacy, ethics and consent wherever patients’ personal data are concerned came up repeatedly in the context of improving and personalising healthcare. Linking data across different domains will also be crucial to improving health outcomes, as health depends on many factors beyond the individual, across many different Societal Challenges: for example infrastructure, food, environment, and more.

Building up trust among all stakeholders was thought to be crucial to addressing these concerns, as was finding win-win scenarios where all parties benefit from sharing data with each other. The need for “future-proofing” was also discussed – that is, ensuring users can be certain of ongoing support for any big data infrastructure.

As the Big Data Europe infrastructure and BDI are fully open and available for anyone to use, we hope that big data communities both in the health sector and across all the Societal Challenges will continue to use and adapt it to their own needs, building on the success of the BDE project. Big Data Europe has now come to a close, but we encourage everyone to learn more about the platform and start planning your big data project today!


The BigDataEurope (BDE) project came to a close at the end of 2017 with a number of technical results to show after three years of development and pilot testing. A key outcome of the project was the Big Data Integrator Platform (BDI), which provides an integrated stack of tools that allow large-scale data resources to be processed, analysed and published without the need for native installations of additional tools for data processing at scale.

As an ecosystem of specifications and reference implementations, BDI has tremendous potential for deployment across sectors and for a range of players in the data value chain.

During the course of the project, BDE’s Thessaloniki-based transport pilot, which represents the EU’s fourth societal challenge (SC4), carried out important development and testing of the BDI Platform.

In our final webinar entitled The Way Forward for Big Data in Transport we will present:

  • results from the transport pilot’s use of the BDI Platform
  • conclusions from BDE’s SC4 workshop: BigDataEurope and the Societal Challenge on Transport. (The workshop, held in September 2017, collected feedback from stakeholders requiring transport solutions on how the platform can be extended to increase its value and facilitate further applications.)
  • next steps for big data in transport (in cooperation with LeMO – another EU project focussing on big data in the transport sector)

Agenda: The Way Forward for Big Data in Transport

  • Introduction to the BigDataEurope project
    • Josep Maria Salanova –  CERTH (10:00 – 10:05)
  • Summary and conclusions from BDE SC4 Workshop: BigDataEurope and the Societal Challenge on Transport
    • Maxime Flament – ERTICO-ITS Europe (10:05 – 10:15)
  • Final Report on the Transport Pilot Using the Big Data Europe Integrator Platform
    • Luigi Selmi – Fraunhofer IAIS (10:15 – 10:25)
  • Big Data in Transport: Next Steps with LeMO project
    • Rajendra Akerkar – Vestlandsforsking (10:25 – 10:35)
  • Conclusions and Q&A (10:35 – 11:00)

We look forward to your participation!

  • WHAT: Webinar: The Way Forward for Big Data in Transport
  • WHEN: 2 February 2018, 10:00 – 11:00
  • WHO: Josep Maria Salanova (CERTH), Maxime Flament (ERTICO-ITS Europe), Luigi Selmi (Fraunhofer IAIS), Rajendra Akerkar (Vestlandsforsking)
  • HOW: Register here
  • WHY: Find out about the value of the Big Data Integrator Platform for the EU’s transport societal challenge and the way forward for big data in transport.

     To register for the webinar, please click here.

Sign up to our newsletter to find out more on how you can get involved.


On Wednesday 13 December 2017 (15.00-16.00 CET), the EU SatCen organised a Hangout session on “Big Data for Secure Societies” in the framework of the BigDataEurope project and with regard to the “Secure Societies” Horizon 2020 Societal Challenge. The Hangout was the fifth in a scheduled series in the BigDataEurope project for the Secure Societies domain and was attended by participants from the EC, EU entities, ESA, private companies, universities and other public entities from a variety of domains in Secure Societies.

The opening presentation was delivered by Sergio Albani (EU SatCen, Secure Societies Domain Leader). It recapped the BigDataEurope project, highlighting the role of the EU SatCen, and reviewed the related community-building activities addressing the “Secure Societies” challenge (the series of “Big Data in Secure Societies” workshops, the “Big Data for Secure Societies” webinars and hangouts, and the participation in international events).

The second presentation was delivered by Giorgos Argyriou (University of Athens, Secure Societies Technical Leader), who gave an overview of the implementation of the Secure Societies pilot, reviewing the architecture, data sources and implemented workflows, with a focus on the Change Detection Workflow. The presentation ended with a demo of the Secure Societies pilot.

The Hangout continued with a presentation by George Giannakopoulos and Nikiforos Pittaras (NCSR “Demokritos”, Secure Societies Technical Support), who described the Event Detection workflow: its components, the algorithm and its scalability, and how the Location Extraction, Entity Extraction and Image Extractor have been implemented.

The Hangout concluded with a presentation by Michele Lazzarini (EU SatCen, Project Officer), who presented other H2020 projects in which SatCen is involved (e.g. EVER-EST, NextGEOSS and BETTER) and possible scenarios in which the activities started with BigDataEurope will continue in the future.


We are happy to announce SANSA 0.3 – the third release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing via Apache Spark and Flink in order to allow scalable machine learning, inference and querying capabilities for large knowledge graphs.

You can find the FAQ and usage examples at http://sansa-stack.net/faq/.

The following features are currently supported by SANSA:

  • Reading and writing RDF files in N-Triples, Turtle, RDF/XML and N-Quads formats
  • Reading OWL files in various standard formats
  • Support for multiple data partitioning techniques
  • SPARQL querying via Sparqlify (with some known limitations until the next Spark 2.3.* release)
  • SPARQL querying via conversion to Gremlin path traversals (experimental)
  • RDFS, RDFS Simple, OWL-Horst (all in beta status), EL (experimental) forward chaining inference
  • Automatic inference plan creation (experimental)
  • RDF graph clustering with different algorithms
  • Rule mining from RDF graphs based on AMIE+
  • Terminological decision trees (experimental)
  • Anomaly detection (beta)
  • Distributed knowledge graph embedding approaches: TransE (beta), DistMult (beta), several further algorithms planned
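
As a rough illustration of what forward chaining inference means in the list above, the sketch below applies two standard RDFS entailment rules (rdfs9, type inheritance through rdfs:subClassOf, and rdfs11, transitivity of rdfs:subClassOf) until no new triples appear. It is a single-machine toy with invented example data and a hypothetical `forward_chain` function, not SANSA's distributed implementation or API.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def forward_chain(triples):
    """Apply rdfs9 and rdfs11 repeatedly until no new triple is derived."""
    inferred = set(triples)
    while True:
        new = set()
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        for a, b in subclass_pairs:
            # rdfs11: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
            for b2, c in subclass_pairs:
                if b == b2:
                    new.add((a, SUBCLASS, c))
            # rdfs9: (x type a), (a subClassOf b) => (x type b)
            for s, p, o in inferred:
                if p == TYPE and o == a:
                    new.add((s, TYPE, b))
        if new <= inferred:
            return inferred  # fixpoint reached: nothing new was derived
        inferred |= new


# Invented example data.
facts = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Dog"),
}
closure = forward_chain(facts)
for triple in sorted(closure - facts):
    print(triple)  # only the newly derived triples
```

The loop recomputes every rule on the full triple set in each round, which is simple but wasteful; practical reasoners usually only join newly derived triples against the existing ones.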

Deployment and getting started:

  • There are template projects for SBT and Maven for Apache Spark as well as for Apache Flink available to get started.
  • The SANSA jar files are in Maven Central i.e. in most IDEs you can just search for “sansa” to include the dependencies in Maven projects.
  • There is example code for various tasks available.
  • We provide interactive notebooks for running and testing code via Docker.

We want to thank everyone who helped to create this release, in particular the projects Big Data Europe, HOBBIT, SAKE, Big Data Ocean, SLIPO, QROWD and BETTER.

Greetings from the SANSA Development Team


[BigDataEurope (BDE) SC4 and the LeMO project have established a cooperation, as both focus on big data in the transport sector. LeMO project representatives Arnaud Burgess and Maria Rodrigues (Panteia) already participated in the workshop, organised by BDE SC4, in September to present their project. Now, Tharsis Teoh (Panteia) has kindly contributed a blog entry.]

By Tharsis Teoh (written on 15th December 2017)

At a recent convention on urban logistics, I had the opportunity to meet some of the most innovative start-ups in the field. The products on offer ranged from an ultra-fast laundry delivery service to clothing e-retailers. One consistent theme in each of their presentations was the obsessive need for collecting data about their customers, their partners, the traffic system, their fleet, the weather, and about everything else they could get their hands on. Were they talking about big data or just a lot of data? In some cases, the line was blurred. In most cases, regardless of whether they currently used it, they definitely spoke as if they needed BIG data. 

Similar discussions are had all across the transport industry: from providers of transport services, integrators and transport infrastructure providers up to the regulators. On the one hand, it is clear that big data promises significant benefits; the enthusiasm and running business models of the start-ups made that clear. One clear example of how big data enhances business models (or in this case, cooperative business models) is the sharing of public transport fare revenue among different transport operators for subscription holders. Previous models had to be recalibrated through extensive surveys, but this can now be done via big data techniques and boatloads of transit card data. On the other hand, it is still too early to say how significant we can expect these benefits to be. The sceptics always ask the same question: “Is the hype real?” Regardless of our opinions on this, the industry is moving in this direction, and entrepreneurs and businesses need to keep up.

The interest of regulators and public bodies in the promise of big data overlaps somewhat with that of the industry, i.e. to achieve sustainability objectives by using the technology. But, more importantly, regulators need to protect the privacy of their citizens (see the soon-to-be-formally binding General Data Protection Regulation (GDPR) or similar regulations at the national level), and guard their institutions, systems, and businesses from cyber-attacks or manipulation (a bigly theme in 2016 and 2017). Ethical questions also hover over the regulation (that is, the question of jurisdiction) and ownership of the data (by the harvester, collector, or the source). In other words, big questions still remain on this big data thing.

With that in mind, we are very excited to announce our recently launched research project Leveraging Big Data to Manage Transport Operations (LeMO), which aims to address these issues (among many others). One of the main outputs that we will ultimately develop is a research and policy roadmap for the European Union. The aim is to provide a sustainable, ethical, efficient, and effective approach to tackling the difficulties that lie in the use of big data. The main project partners are leaders and major stakeholders in the field: Western Norway Research Institute, Goethe University of Frankfurt, Confederation of Organisations in Road Transport Enforcement, Bird & Bird, and PANTEIA. Our key focus will be the analysis of state-of-the-art applications in the transport industry itself (see our list of case studies), where we examine the institutional, regulatory, and technical barriers and facilitators of using big data, as experienced by the industry.

In many ways, this takes over where BigDataEurope leaves off. While the Transport Pilots focused on developing Big Data techniques and current applications, the LeMO project looks across the horizon to build a strategic roadmap. Many of the insights into the opportunities and challenges will feed into the development of the right research and policy roadmaps. Certainly, BigDataEurope’s mark will be present in defining the big data transport strategy of Europe in the coming decades.

We’ll continue to update our project page with our findings during the coming months, so please follow our updates via Twitter and get in touch with us.


The 3rd edition of the Big Data from Space Conference (jointly organised by ESA, SatCen and JRC, and hosted by CNES) brought together in Toulouse around 500 participants interested in exploiting massive spatio-temporal Earth and Space observation data.

The Conference was co-chaired by Sergio Albani (SatCen), Jean-Pierre Gleyzes (CNES), Pier Giorgio Marchetti (ESA) and Pierre Soille (JRC).

The main challenges and issues in the Space and Security domain were introduced in one of the opening talks, given by SatCen Deputy Director Giuseppe D’Amico, who pointed out the added value coming from Big Data and related technologies as well as the need for operational solutions tailored to user needs.

The key role that cooperation activities and participation in international projects play in the Space and Security domain, and the efforts made by the SatCen RTDI Unit to increase R&I capacities in support of operations, were highlighted by Sergio Albani, Responsible for the RTDI Unit and BDE SC7 “Secure Societies” Domain Leader. The presentation emphasized the internal and external initiatives that are taking place (e.g. the participation in H2020 projects such as BigDataEurope) to collect user requirements from key stakeholders in the Space and Security domain and evaluate new trending technologies.

Conference materials are available on the conference website: www.bigdatafromspace2017.org.
