Apache Pig is a high-level platform for analyzing large data sets,
which it represents as data flows. Pig programs run on Apache Hadoop. Pig's language layer
currently consists of a textual language called Pig Latin.
Prerequisites
Java
Hadoop
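Before continuing, it can help to confirm that both prerequisites are on the PATH. The following is a minimal sketch, not an official check: `check_cmd` is a hypothetical helper name, and it only tests that the `java` and `hadoop` commands can be found, not that their versions are compatible with Pig.

```shell
#!/bin/sh
# check_cmd NAME: report whether NAME is available on PATH.
# check_cmd is a hypothetical helper, not part of Pig or Hadoop.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Check both prerequisites and print a hint for any that is missing.
for tool in java hadoop; do
    check_cmd "$tool" || echo "Install $tool before continuing."
done
```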
Download Pig
Download the Pig 0.14.0 archive into the /usr/local directory using the command below.
$ sudo wget -P /usr/local http://archive.apache.org/dist/pig/pig-0.14.0/pig-0.14.0.tar.gz
$ cd /usr/local
$ sudo tar -xvf pig-0.14.0.tar.gz
$ sudo mv pig-0.14.0 pig
Setting up environment for Pig
Edit the ~/.bashrc file to set up the Pig environment by appending the following lines.
$ nano ~/.bashrc
export PIG_HOME=/usr/local/pig
export PATH=$PIG_HOME/bin:$PATH
$ source ~/.bashrc
$ sudo chown -R hdfs:hdfs /usr/local/pig
$ sudo chmod -R 755 /usr/local/pig
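If you prefer not to edit ~/.bashrc by hand, the two export lines from this section can be appended from a script. The sketch below is one possible approach under stated assumptions: `append_once` and `setup_pig_env` are hypothetical helper names, and the install path /usr/local/pig is the one used in this guide. Each line is added only if it is not already present, so running the script twice does not duplicate entries.

```shell
#!/bin/sh
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
# append_once and setup_pig_env are hypothetical helpers for illustration.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# setup_pig_env [FILE]: add the Pig variables to FILE (default: ~/.bashrc).
setup_pig_env() {
    profile=${1:-$HOME/.bashrc}
    append_once "$profile" 'export PIG_HOME=/usr/local/pig'
    # Single quotes keep $PIG_HOME and $PATH unexpanded in the file.
    append_once "$profile" 'export PATH=$PIG_HOME/bin:$PATH'
}
```

After sourcing this file, run `setup_pig_env` to update ~/.bashrc, then `source ~/.bashrc` as shown earlier in this section.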
Check the version
Confirm that Pig is installed by printing its version.
$ pig --version
Apache Pig version 0.14.0 (r1640057)
compiled Nov 16 2014, 18:02:05
Apache Pig Execution Modes
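You can run Apache Pig in two modes: Local Mode and MapReduce Mode.
Local Mode
Local mode runs Pig on a single machine against the local file system.
$ pig -x local
grunt>
MapReduce Mode
MapReduce mode runs Pig on a Hadoop cluster against HDFS. It is the default mode.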
$ pig -x mapreduce
grunt>
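Besides the interactive grunt shell, either mode can also run a Pig Latin script file passed on the command line. The following sketch writes a small word-count script and runs it in local mode when `pig` is on the PATH. The file names, the sample data, and the relation names (`words`, `grouped`, `counts`) are made up for illustration.

```shell
#!/bin/sh
# Assumption: file names and sample data below are illustrative only.
workdir=$(mktemp -d)

# Sample input: one word per line.
printf 'pig\nhadoop\npig\n' > "$workdir/words.txt"

# A small Pig Latin script: load the words, group identical ones,
# and count the members of each group.
cat > "$workdir/wordcount.pig" <<EOF
words   = LOAD '$workdir/words.txt' AS (word:chararray);
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group, COUNT(words);
DUMP counts;
EOF

# Run the script in local mode only if pig is installed.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$workdir/wordcount.pig"
else
    echo "pig not found on PATH; script written to $workdir/wordcount.pig"
fi
```

To run the same script on a cluster, replace `-x local` with `-x mapreduce` and point `LOAD` at a path in HDFS.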