Bio Stefan Plantikow
Stefan is a language standards engineer and product manager for the Cypher implementation at Neo4j and has a research background in distributed fault-tolerant systems and transaction processing. Stefan is passionate about computer language design and how languages as a medium enable access to new technology. As a principal designer of the Cypher language at Neo4j, he built the first cost-based query planner for property graph databases, designed the architecture of Cypher for Apache Spark (the first big data implementation of Cypher), and helped introduce labels to the property graph world. He is currently working towards the creation of an International Standard for a GQL Graph Query Language. Stefan has a degree in computer science from Humboldt University, Berlin. He is still based in Berlin and works on expanding the scope, ease of use, and applicability of graph technology for users.
Session: Commit2Data: Learning from Data
Cypher for Apache Spark – The future of graph querying meets big data
Graph pattern matching is one of the most interesting and challenging operations in graph analytics. Industry query languages like Cypher, implemented in systems like Neo4j, SAP HANA Graph and Redis Graph, and developed by the openCypher project, allow the intuitive definition of graph patterns including structural and semantic predicates.
Today’s graph query languages are most prominent in graph database management systems such as Neo4j. However, graph querying is also a valuable addition to systems for the distributed processing of complex analytical workloads over large volumes of data, like Apache Spark.
To bring the benefits of Cypher from the graph database realm into the world of Big Data, we at Neo4j started developing Cypher for Apache Spark (CAPS). CAPS is primarily focused on graph-powered data integration and graph analytical query workloads within the Spark ecosystem. In addition, CAPS serves as our testbed for next-generation language extensions like multiple graphs, graph transformations and construction, and query composition.
The talk will give an overview of graph querying, the CAPS system, and the recently decided adoption of CAPS into the Apache Spark project.
Abstract Peter Boncz (CWI)
Querying changing graph data in Spark
This talk is about querying changing graph data in Spark (IndexedDataframe: interactive queries in Apache Spark on changing data). My PhD student Dean submitted a poster abstract on the topic of Packed Memory Arrays and their use for storing and querying changeable graphs.
Bio Frank Blaauw
Dr. Frank Blaauw is a postdoctoral researcher at the University of Groningen, where he is a member of the Distributed Systems group of the Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence and the Groningen Digital Business Center. His research focuses on data science, machine learning, time series analysis, and scalable computing, with particular emphasis on bringing research on these topics to production.
ECiDA – Evolutionary Changes in Data Analysis
Modern data analysis platforms offer virtually no support for dynamic reconfiguration. Instead, modifying such systems involves restarting, which is typically expensive and may even be hazardous. We present the philosophy of the ECiDA platform, which is designed to support dynamic reconfiguration in a safe and transparent fashion. We’ll discuss how ECiDA can offer a data processing solution for rapidly developing fields such as data science and smart industry.
Bio Simon Dalmolen
Simon Dalmolen is a scientist and consultant within the TNO Data Science department. Simon is a Ph.D. candidate in the Industrial Engineering & Business Information Systems (IEBIS) department at the University of Twente. His Ph.D. research is about developing Cross Chain Control Centers. He holds an MSc in Computing Science (Distributed Systems & Software Engineering). He is an expert in the field of supply chain collaboration and has international experience. During his employment at a software integrator, Simon worked as a senior business consultant in various roles, e.g. Enterprise Architect, Solution Architect, Management Consultant and Data Scientist. In addition, Simon is active in the academic world, where he has published several articles including “Serious Gaming” and “Orchestration as a Service”. He has also been involved in setting up horizontal collaboration between competing logistic service providers.
‘IDS – Demonstration, Applicability for Industry and Logistics and Next Steps’
The Smart Industry Fieldlab ‘Smart Connected Suppliers Network’ works together with a large number of involved players, users and IT suppliers to achieve standardization. Suppliers in a high-tech, low-volume, high-complexity chain exchange order and product information with many parties. This information must be exchanged faster and more reliably. To achieve this, various ICT systems for ERP and CAD/CAM are linked together. This standardization must in time comply with international standards such as those of International Data Spaces. This session will discuss the requirements and needs, what is already possible today, and how IDS is developing.
Bio Lars Nagel
Lars Nagel is Managing Director of the International Data Spaces Association. He holds a Diploma in Mechanical Engineering (Dipl.-Ing.) from Technical University of Dortmund. He worked as a research fellow at the Chair of Materials Handling and Warehousing at Technical University of Dortmund. His fields of experience were material flow systems and organisation of large-scale networks. For six years he was Head of Strategic Development at the Fraunhofer Institute for Material Flow and Logistics IML. In parallel, he was a member of the executive board of EffizienzCluster Management GmbH from May 2010 until August 2013, managing one of 15 leading-edge clusters of the BMBF. From 2013 until 2016 he was CEO of GlobalGate GmbH, a consulting company for corporate learning and knowledge transfer.
‘International Data Spaces (IDS) – Ambition, Scope and Relevance for the Data Economy’
Digital responsibility is evolving from a hygiene factor to a key differentiator and source of competitive advantage. Future data platforms and markets will be built on design principles that go beyond the traditional understanding of cybersecurity and privacy. This presentation describes the ambition of the International Data Spaces (IDS) initiative. Based on strong data ethics principles, the IDS Reference Architecture Model puts the user at its center to ensure trustworthiness in ecosystems and sovereignty over data in the digital age as its key value proposition.