Category Big Data

Install Hadoop on Ubuntu 16.04 LTS (Standalone Mode)

1. Overview. This tutorial shows how to install Hadoop on Ubuntu 16.04 so that you can perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS). A Hadoop cluster can be deployed in one of… Continue Reading →

Apache Flume Kafka Source And HDFS Sink Tutorial

Continuing the series of Apache Flume tutorials, I’d like to share an example of the Apache Flume Kafka Source and HDFS Sink. One popular use case today is to collect data from various sources, send it to Apache… Continue Reading →

Apache Flume HDFS Sink Tutorial

Apache Flume is a distributed tool for collecting and moving large amounts of data from different sources to a centralized data store. This tutorial covers 2 basic Apache Flume concepts. The first one is… Continue Reading →

Apache Kafka Connect Example

1. Introduction to Apache Kafka Connect. Apache Kafka, a publish/subscribe messaging system, is gaining a lot of traction today. We can see many use cases where Apache Kafka works alongside Apache Spark and Apache Storm in Big Data… Continue Reading →

Apache Kafka Command Line Interface (CLI)

Here are some commands often used when working with the Apache Kafka command line interface (CLI). 1. Start the Kafka server. This takes 2 steps: 1.1. Start ZooKeeper

1.2. Start the Kafka server

2. List all… Continue Reading →

How To Write A Custom Serializer in Apache Kafka

Continuing the series about Apache Kafka, I’d like to share how to write a custom serializer in Apache Kafka. 1. Why do we need a custom serializer in Apache Kafka? Apache Kafka allows us to send messages with different… Continue Reading →

Write An Apache Kafka Custom Partitioner

Continuing the series about Apache Kafka, in this post I’d like to share some knowledge about Apache Kafka topic partitions and how to write an Apache Kafka custom partitioner. 1. Basics of Apache Kafka topic partitions. There are many reasons… Continue Reading →

Create Multi-threaded Apache Kafka Consumer

In previous posts, I introduced how to get started with Apache Kafka by installing it and using the Java client API 0.9. In this post, I’d like to share how to create a multi-threaded Apache Kafka consumer. You can take… Continue Reading →

Apache Kafka Java Client API Example

In this article, I’d like to show you how to create a producer and a consumer using the Apache Kafka Java client API. 1. Overview. Apache Kafka has some built-in client tools to produce and consume messages against an Apache Kafka broker…. Continue Reading →

Getting Started With Apache Kafka 1.0

In this article, I’d like to share some basic information about Apache Kafka, and how to install it and use the basic client tools that ship with Kafka to create a topic and to produce and consume messages. The version used is Apache Kafka 1.0.0. 1. Concept… Continue Reading →

© 2025 HowToProgram