
Zaloni presents the Big Data Think Tank

The Big Data Think Tank video series highlights insights and innovations from some of today’s leading experts in big data and analytics. We are happy to share their knowledge with you. Go deep into technologies such as Julia, Apache Spark, and Apache Kafka, then pan out to broader topics: holistic data lake management, metadata management, and big data use cases that touch everyday lives.

Insights About the Latest Technologies

The world of big data isn’t slowing down anytime soon. New systems debut constantly, so it’s important to keep a keen eye on the ones that are forging the way:

  • Apache Kafka – a horizontally scalable messaging system with a unique design that provides a unified, high-throughput, low-latency platform for handling real-time data feeds.
  • Apache Spark – provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Project Tungsten is meant to bring Spark closer to bare metal for most computations through native memory management and runtime code generation.
  • Data Lake Architecture – data lakes are becoming increasingly central to enterprise data strategies – they address today’s data realities: much greater data volumes and varieties, higher expectations from users, and the rapid globalization of economies.
  • Ground – a system for managing metadata in the big data ecosystem. Regardless of which services you’re using, you will use metadata to combine information types in different ways over time. From business to our daily lives, big data is changing the way we live.
  • BI-on-Hadoop – Hadoop is no longer relegated to a platform for data science or batch processing – companies are now pushing traditional Business Intelligence workloads to Hadoop as well.
  • Julia – a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy and an extensive mathematical function library.

Apache Kafka and the Stream Data Platform

Jay Kreps discusses Kafka, a horizontally scalable messaging system. He explains in detail how this software acts as a replacement for traditional enterprise message brokers and how it syncs data between different systems and databases, allowing for real-time analytics.


Apache Spark 2.0

In this video, Matei Zaharia talks about the computations he believes people care about and about Project Tungsten. Project Tungsten is meant to bring Spark closer to bare metal for most of those computations through native memory management and runtime code generation. It focuses on substantially improving the memory and CPU efficiency of Spark applications.


Managing the Data Lake

Not sure what a data lake is or what a well-managed data lake is capable of? Renowned speaker Ben Sharma covers the fundamentals of a data lake in this video. He explains the process and purpose of a data lake and the importance of proper management.


Ground: Managing Metadata in the Big Data Ecosystem

Today’s data has changed drastically, and we look at it very differently. Depending on who you are and what services you’re providing, you may combine pieces of data in different ways. In this video, Vikram Sreekanti explains how data is used today and how it assists us in our common daily tasks.


Performance Benchmark for Business Intelligence on Hadoop

In this video, Josh Klahr talks about the “everybody wins” scenario for business intelligence on Hadoop from a recent AtScale benchmark study. Bottom line: the results are great news for any company looking to analyze its big data in Hadoop, because you can now do so faster, on more data, and for more users than ever before.


What’s New In Julia?

This video features Jeff Bezanson talking about the new features in Julia, a high-level, high-performance dynamic programming language for technical computing with syntax that is familiar to users of other technical computing environments. He goes into detail about how the language has been designed to be amenable to various kinds of static type analysis to get better performance.
