Blog posts tagged "apache spark"

7 posts


robgibbon
15 October 2024

Apache Spark 4.0 beta release – try it now

Article Data Platform

Apache Spark is a popular framework for developing distributed, parallel data processing applications. Our solution for Apache Spark on Kubernetes has made significant progress in the past year since we launched, adding support for Apache Iceberg, a new GPU accelerated image using the NVIDIA Spark-RAPIDS plugin, and...


robgibbon
23 May 2024

Can it play Doom? Running an AI LAN party on a Spark cluster with ViZDoom

Article AI

It’s all about AI these days, so I decided to try to answer the important question: can you make a Spark cluster run AI agents that play a game of Doom in a multiplayer LAN party? Although I’m no data scientist, I was able to get this to work and I’ll show you how so


robgibbon
17 October 2023

Why we built a Spark solution for Kubernetes

Article Data Platform

We’re super excited to announce that we have shipped the first release of our solution for big data – Charmed Spark. Charmed Spark packages a supported distribution of Apache Spark and optimises it for deployment to Kubernetes, which is where most of the industry is moving these days. Reimagining how to work with big data


Canonical
17 October 2023

Canonical announces supported solution for Apache Spark® on Kubernetes

Article Canonical announcements

Today, Canonical announced the release of Charmed Spark – an advanced solution for Apache Spark® that provides everything users need to run Apache Spark on Kubernetes. Apache Spark is suitable for use in diverse data processing applications including predictive analytics, data warehousing, machine...


robgibbon
3 July 2023

Charmed Spark beta release is out – try it today

Article AI

The Canonical Data Fabric team is pleased to announce the first beta release of Charmed Spark, our solution for Apache Spark. Apache Spark is a free, open source software framework for developing distributed, parallel processing jobs. It’s popular with data engineers and data scientists alike when building data...


robgibbon
3 May 2023

Big data security foundations in five steps

Article Data Platform

We’ve all read the headlines about spectacular data breaches and other security incidents, and the impact that they have had on the victim organisations. And in some ways there’s no place more vulnerable to attack than a big data environment like a data lake.


Tim McNamara
8 January 2020

Data Ops at petabyte scale

Article Cloud and server

Should you deploy Apache Spark to Kubernetes? Learn how model-driven operations have enabled one data engineering team to evaluate several options and come to an ideal solution.
