How to use TeraLab

From BIGDATA

1. How to install SF4H

SF4H (Scub Foundation For Hadoop) is a customized Eclipse distribution that includes options for deploying and launching a Maven project on TeraLab's cluster.

This platform was developed by the Square Predict team (http://square-predict.net/). [Documentation: https://drive.google.com/open?id=1zVt1TB3p2az8yOmS7nKVWB_j5O7Bh2VGJWYYjKAaUmE]

To install SF4H on 64-bit Ubuntu (13.04 or later) using APT:

   $ sudo su
   # echo "deb http://ns209168.ovh.net/scub-foundation-for-hadoop-deb-depot ./" >> /etc/apt/sources.list.d/sffh.list
   # echo "deb http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.17/repos/ubuntu12 HDP-UTILS main" >> /etc/apt/sources.list.d/sffh.list
   # apt-get update
   # apt-get install scub-foundation-for-hadoop

(Re-running the last two commands later will also update an existing SF4H installation.)
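Since the two repository lines above are appended by hand, it is easy to end up with a malformed sources file. The sketch below rebuilds the same file in a temporary location so it can be inspected without root; the real path is /etc/apt/sources.list.d/sffh.list.

```shell
# Recreate the APT sources file the commands above produce, in a temp file
# so it can be checked without root privileges.
list="$(mktemp)"
echo "deb http://ns209168.ovh.net/scub-foundation-for-hadoop-deb-depot ./" >> "$list"
echo "deb http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.17/repos/ubuntu12 HDP-UTILS main" >> "$list"
# Count the repository entries; both lines should be present.
entries=$(grep -c '^deb ' "$list")
echo "repository entries: $entries"
rm -f "$list"
```

If the count is not 2, re-check the echo commands before running apt-get update.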

You can also install SF4H using the Ubuntu Software Center by searching for "scub".

A virtualized (VirtualBox) version is also available for macOS and Windows.

2. How to configure your project

The project is configured through the file "pom.xml".

Below is an example of how to configure your project and pass parameters through "pom.xml":

   <plugin>
       <groupId>scub-foundation-for-hadoop.plugin</groupId>
       <artifactId>scub-foundation-for-hadoop-plugin-deploy</artifactId>
       <configuration>
            <type>spark</type>
            <mainClass>org.lipn.clustering.gstream.GStreamExample</mainClass>
            <args>
                  <param>yarn-client</param>
                  <param>/user/share/jobs-data/gstream/streams</param> 
                  <param>/user/share/jobs-result/${project.artifactId}</param> 
                  <param>,</param> 
                  <param>5</param> 
            </args>
            <sshHost>10.32.2.153</sshHost>
            <sshUser>sffh</sshUser>
            <hdfsJobPath>/user/share/jobs</hdfsJobPath>
            <hdfsLocalDataDist>conf/test/resources</hdfsLocalDataDist>
            <hdfsLocalData>conf/test/resources</hdfsLocalData>
            <hdfsDataPath>/user/share/jobs-data/gstream/streams</hdfsDataPath>
            <hdfsResultPath>/user/share/jobs-result/${project.artifactId}</hdfsResultPath>	
       </configuration>
   </plugin>
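The <param> entries are handed, in order, to the main method of the class named in <mainClass>. The sketch below illustrates that positional mapping; the argument names are assumptions for illustration only (the plugin documentation linked above is authoritative), and "gstream-example" stands in for ${project.artifactId}.

```shell
# Positional arguments as the configured main class would receive them.
# The variable names are guesses for illustration, not from the plugin docs.
set -- "yarn-client" \
       "/user/share/jobs-data/gstream/streams" \
       "/user/share/jobs-result/gstream-example" \
       "," \
       "5"
master="$1"       # first <param>: Spark master
input_path="$2"   # HDFS input directory
output_path="$3"  # HDFS output directory
separator="$4"    # field separator in the input data (assumption)
last_param="$5"   # application-specific numeric parameter (assumption)
echo "$master reads $input_path, writes $output_path"
```

Changing a job's parameters therefore only requires editing the <args> list and redeploying; no code change is needed.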


3. Run the program on TeraLab

From the installed SF4H, run the "Maven deploy HDP Dist Goal" to deploy the project and launch it on the cluster.
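Once the goal completes, the job's output should appear under the hdfsResultPath configured in the pom. The sketch below composes the listing command; the artifact id "gstream-example" is a hypothetical stand-in for ${project.artifactId}, and hdfs dfs -ls is the standard HDFS listing command, run here through the same SSH host and user as in the configuration above.

```shell
# Build the command that lists the job's result directory on the cluster.
# "gstream-example" is a hypothetical artifact id; the real one comes from the pom.
artifact_id="gstream-example"
result_path="/user/share/jobs-result/${artifact_id}"
check_cmd="ssh sffh@10.32.2.153 hdfs dfs -ls ${result_path}"
echo "$check_cmd"
```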