Learn Apache Flume
Apache Flume Course Introduction
Apache Flume is an open-source data collection tool used for moving data from a source to a destination. In this Apache Flume tutorial, we will study how Flume helps in streaming data from various sources and why Flume became so popular.
Every company runs many servers and applications, and those applications produce a large amount of data in the form of logs. To process those logs, we need a scalable, manageable, extensible, and reliable data collection tool that can move the data from where it is produced to the location where it will be processed (such as HDFS).
Apache Flume is useful for moving large amounts of streaming data into the Hadoop Distributed File System (HDFS), and it is highly fault-tolerant and robust. Flume can collect data in both batch and streaming modes.
Where Can Flume Help?
Problem: We need to analyze a large volume of server logs using HDFS/Hadoop, but how can we send those logs to HDFS?
Apache Flume addresses this problem: it is a scalable, manageable, extensible, and reliable data collection tool for systematically collecting and moving large amounts of streaming data into HDFS.
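As a rough illustration of what this looks like in practice, here is a minimal sketch of a Flume agent configuration that tails a server log file and writes the events to HDFS. The agent name (a1), the component names (r1, c1, k1), the log file path, and the NameNode address are all placeholders assumed for this example and do not come from the course; adjust them to your environment.

```properties
# Hypothetical agent "a1" with one source, one channel, and one sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: follow a log file (the path is a placeholder)
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app/server.log
a1.sources.r1.channels = c1

# Channel: buffer events in memory between the source and the sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: write events to HDFS (the NameNode host and port are placeholders)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/logs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
# Use the agent's local time to resolve %Y-%m-%d in the path
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1
```

Assuming this is saved as example.conf, an agent could be started with something like: flume-ng agent --conf conf --conf-file example.conf --name a1. Note that the exec source with tail -F is convenient for a demonstration but does not guarantee delivery if the agent fails, and a memory channel loses buffered events on a crash, so a production setup would likely choose different source and channel types.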