Apache Flume is a distributed, reliable, and available system for efficiently
collecting, aggregating, and moving large amounts of log data from many different
sources to a centralized data store. Flume is not restricted to log data
aggregation: it can also transport large quantities of event data, including
network traffic data, social media data, and email messages.
Prerequisites
Java
Hadoop
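Before continuing, it may help to confirm that both prerequisites are reachable from the shell. The following is a minimal sketch, assuming the two tools are exposed on PATH under the command names java and hadoop; the helper name missing_commands is made up for this illustration.

```shell
# Report which of the named commands are missing from PATH.
# Prints one missing command name per line; returns non-zero if any are missing.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return $rc
}

# Assumption: the prerequisites are installed as the commands "java" and "hadoop".
if missing_commands java hadoop; then
    echo "All prerequisites found"
else
    echo "Install the commands listed above before continuing" >&2
fi
```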
Download flume tar file
Download Flume 1.4.0 into the /usr/local directory using the command below.
$ sudo wget -P /usr/local http://archive.apache.org/dist/flume/1.4.0/apache-flume-1.4.0-bin.tar.gz
Unpack the apache-flume-1.4.0-bin.tar.gz file
$ cd /usr/local
$ sudo tar -xvf apache-flume-1.4.0-bin.tar.gz
Rename the apache-flume-1.4.0-bin folder to flume
$ sudo mv apache-flume-1.4.0-bin flume
Setting up environment for flume
Edit the ~/.bashrc file to set up the Flume environment by appending the following lines.
$ nano ~/.bashrc
export FLUME_HOME=/usr/local/flume
export PATH=$FLUME_HOME/bin:$PATH
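As an alternative to editing the file by hand, the two export lines can be appended from a script in a way that is safe to run more than once. This is only a sketch: the helper name append_once and the TARGET_RC variable are hypothetical, and the example writes to a temporary file unless TARGET_RC is set.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical target: a temporary file by default, so nothing real is modified.
target="${TARGET_RC:-$(mktemp)}"
append_once "$target" 'export FLUME_HOME=/usr/local/flume'
append_once "$target" 'export PATH=$FLUME_HOME/bin:$PATH'
append_once "$target" 'export FLUME_HOME=/usr/local/flume'   # no-op on repeat
cat "$target"
```

Running it with TARGET_RC="$HOME/.bashrc" would apply the same lines to the real file.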
Reload the ~/.bashrc configuration file with the following command.
$ source ~/.bashrc
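After reloading, a quick check can confirm that the Flume bin directory is actually on PATH. The sketch below assumes the install location used in this guide (/usr/local/flume) when FLUME_HOME is unset; path_contains is a helper written for this example.

```shell
# Succeeds when directory $1 is listed in the colon-separated string $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: fall back to the install location used in this guide.
flume_home="${FLUME_HOME:-/usr/local/flume}"
if path_contains "$flume_home/bin" "$PATH"; then
    echo "PATH includes $flume_home/bin"
else
    echo "PATH does not include $flume_home/bin; check ~/.bashrc" >&2
fi
```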
Copy the flume-env.sh.template file to flume-env.sh.
$ cd /usr/local/flume/conf
$ sudo cp flume-env.sh.template flume-env.sh
Edit the flume-env.sh file
$ sudo nano flume-env.sh
Set the JAVA_HOME and JAVA_OPTS environment variables. JAVA_HOME must point to the Java installation directory. (By default these
variables are commented out; uncomment them and set the values as shown below.)
JAVA_HOME=/usr/opt/jdk
JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"
Note: If you are going to use memory channels when setting up Flume agents, it is
preferable to increase the memory limits in the JAVA_OPTS variable. By default, the
minimum and maximum heap sizes are 100 MB and 200 MB respectively. It is better to
raise these limits to 500 MB and 1000 MB respectively.
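The JAVA_OPTS string is easy to mistype, for example by pasting a typographic dash in front of Xmx or by swapping the two limits. One possible safeguard, sketched here with a hypothetical helper named build_java_opts, is to generate the value from the two heap sizes in MB and reject a minimum larger than the maximum.

```shell
# Build a JAVA_OPTS value from minimum and maximum heap sizes given in MB.
# Fails if the minimum is larger than the maximum.
build_java_opts() {
    min_mb=$1
    max_mb=$2
    if [ "$min_mb" -gt "$max_mb" ]; then
        echo "minimum heap ($min_mb MB) exceeds maximum ($max_mb MB)" >&2
        return 1
    fi
    echo "-Xms${min_mb}m -Xmx${max_mb}m -Dcom.sun.management.jmxremote"
}

# The 500 MB and 1000 MB limits suggested in the note above.
JAVA_OPTS="$(build_java_opts 500 1000)"
echo "$JAVA_OPTS"
```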
Change the ownership and permissions of the /usr/local/flume directory.
$ sudo chown -R hdfs:hdfs /usr/local/flume
$ sudo chmod -R 755 /usr/local/flume
Check version
Check the version of Flume.
$ flume-ng version
Flume 1.4.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 756924e96ace470289472a3bdb4d87e273ca74ef
Compiled by mpercy on Mon Jun 24 18:22:14 PDT 2013
From source with checksum f7db4bb30c2114d0d4fde482f183d4fe
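If the version needs to be checked from a script, the number can be pulled out of the first line of that output. The following is a minimal sketch, assuming the output keeps the "Flume <version>" first-line format shown above; parse_flume_version is a helper written for this example, and the sample input is copied from the output above rather than produced by a live flume-ng call.

```shell
# Read `flume-ng version` output on stdin and print just the version number.
parse_flume_version() {
    awk '$1 == "Flume" && NF == 2 { print $2; exit }'
}

# Sample input copied from the version output shown in this guide.
printf 'Flume 1.4.0\nRevision: 756924e96ace470289472a3bdb4d87e273ca74ef\n' | parse_flume_version
```

On a machine with Flume installed, the same helper could be used as flume-ng version | parse_flume_version.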