Data Pipelines with Apache Beam

Preface

Many data pipeline frameworks offer very similar functionality. With this in mind, Google developed a unified data pipeline framework under the name Cloud Dataflow SDK. The framework was later donated to the Apache Software Foundation, where it was named Apache Beam. The following figure helps to understand Apache Beam better.

Source: https://cloud.google.com/blog/products/gcp/dataflow-and-open-source-proposal-to-join-the-apache-incubator

We create a single pipeline, which can then be used for either batch processing or stream processing. The pipeline is executed by a runner of our choice: we can test our code locally with the Direct Runner or submit it to a cloud service such as Google Cloud Dataflow. This makes Apache Beam a very interesting framework to try. ...

OpenJDK's Project Loom: User-Level Threads in Java

Preface

It is widely known that, in very simplified terms, a Java thread corresponds to a Kernel-Level Thread. Project Loom aims to introduce User-Level Threads, which it calls Virtual Threads, to the Java ecosystem. Virtual Threads are supposed to perform better and use fewer resources than Kernel-Level Threads alone. The plan is for Virtual Threads to become a drop-in replacement, which would amount to a free performance boost for any multithreaded Java application. This is particularly interesting because many big data frameworks, such as Hadoop or Spark, run on the JVM. ...
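To make the "drop-in replacement" idea concrete, here is a minimal sketch, assuming a JDK in which Virtual Threads are available as a standard feature (JDK 21 or later). The class `VirtualThreadSketch` and the method `runTasks` are hypothetical names chosen for illustration. The same task-submitting code runs unchanged on a pool of platform threads and on an executor that starts one Virtual Thread per task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; requires JDK 21+ for virtual threads.
class VirtualThreadSketch {

    // Submits `taskCount` small blocking tasks to the given executor
    // and returns the sum of their results (0 + 1 + ... + taskCount - 1).
    static int runTasks(ExecutorService executor, int taskCount) throws Exception {
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            futures.add(executor.submit(() -> {
                Thread.sleep(10); // stands in for blocking I/O
                return id;
            }));
        }
        int sum = 0;
        for (Future<Integer> future : futures) {
            sum += future.get(); // waits for each task to finish
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        int taskCount = 200;

        // Classic approach: a bounded pool of platform (kernel-level) threads.
        try (ExecutorService platform = Executors.newFixedThreadPool(8)) {
            System.out.println("platform threads: " + runTasks(platform, taskCount));
        }

        // Same calling code, but every task gets its own virtual thread.
        try (ExecutorService virtual = Executors.newVirtualThreadPerTaskExecutor()) {
            System.out.println("virtual threads:  " + runTasks(virtual, taskCount));
        }
    }
}
```

Only the line that creates the executor differs between the two variants. Whether this yields a speedup depends on the workload: the sketch uses blocking sleeps, the kind of task where cheap threads tend to help, and it makes no performance claim of its own.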