How to use Spark on Grid5000
Version of 29 January 2016 at 13:52
Welcome to Spark on Grid5000
1 : Install hadoop_g5k https://github.com/mliroz/hadoop_g5k/wiki
Create the file /home/yourUserName/.bash_profile if it does not exist, then add the following lines:
PATH="/home/yourUserName/.local/bin:$PATH"
export PATH
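The PATH change can be tried out in a throwaway HOME before touching the real .bash_profile; a minimal sketch, assuming a POSIX shell (the temporary directory stands in for /home/yourUserName and is made up for illustration):

```shell
#!/bin/sh
# Sketch: verify the .bash_profile PATH addition in a throwaway HOME.
TMP_HOME=$(mktemp -d)

# Write the two lines from the instructions above (quoted heredoc,
# so $HOME and $PATH are expanded only when the file is sourced).
cat > "$TMP_HOME/.bash_profile" <<'EOF'
PATH="$HOME/.local/bin:$PATH"
export PATH
EOF

# Source it with HOME pointing at the throwaway directory,
# then confirm .local/bin now leads the PATH.
HOME="$TMP_HOME" . "$TMP_HOME/.bash_profile"
case "$PATH" in
  "$TMP_HOME/.local/bin:"*) echo "PATH ok" ;;
  *) echo "PATH not updated" ;;
esac

rm -rf "$TMP_HOME"
```

On the real frontend the same two lines simply go at the end of ~/.bash_profile, which is read at login.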
Initialize cluster
Reserve nodes
https://www.grid5000.fr/mediawiki/index.php/Getting_Started
Some examples
oarsub -t allow_classic_ssh -l nodes=10,walltime=2 -r '2015-06-14 19:30:00'
oarsub -p "cluster='paranoia'" -t allow_classic_ssh -l nodes=8,walltime=12 -r '2015-07-09 21:14:01'
oarsub -I -p "cluster='paranoia'" -t allow_classic_ssh -l nodes=8,walltime=12
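The start time given to -r must be an absolute 'YYYY-MM-DD HH:MM:SS' timestamp, as in the first example above. One way to build it programmatically, assuming GNU date is available on the frontend, is:

```shell
# Build a reservation start time one hour from now (GNU date -d syntax).
START=$(date -d '+1 hour' '+%Y-%m-%d %H:%M:%S')
echo "$START"

# It can then be passed to oarsub, e.g.:
#   oarsub -t allow_classic_ssh -l nodes=10,walltime=2 -r "$START"
```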
Take a reservation
oarsub -C job_ID
Take nodes directly
oarsub -I -t allow_classic_ssh -l nodes=6,walltime=2
Cluster initialization
hg5k --create $OAR_NODEFILE --version 2
hg5k --bootstrap /home/gbeck/public/hadoop-2.6.0.tar.gz
hg5k --initialize feeling_lucky --start
spark_g5k --create YARN --hid 1
spark_g5k --bootstrap /home/gbeck/public/spark-1.6.0-bin-hadoop2.6.tgz
spark_g5k --initialize feeling_lucky --start
Put the files into HDFS
hg5k --putindfs 900k.csv /ds900.csv
Run the jar
spark_g5k --scala_job mean-shift_2.10-0.1.jar
Or, with explicit executor parameters:
spark_g5k --scala_job --exec_params executor-memory=1g driver-memory=1g num-executors=2 executor-cores=3 mean-shift_2.10-0.1.jar
Find the files in HDFS
hg5k --state files
Retrieve the result res
hg5k --getfromdfs res /home/gbeck/reims
- List the nodes of your reservation
uniq $OAR_NODEFILE
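$OAR_NODEFILE lists one line per reserved core, so each host appears several times and uniq collapses the adjacent duplicates. A self-contained sketch with a fabricated node file (the hostnames are made up for illustration):

```shell
# Simulate an OAR node file: one line per reserved core, so each
# host appears as many times as it has cores (grouped by host,
# which is why plain uniq is enough).
NODEFILE=$(mktemp)
printf 'node-1\nnode-1\nnode-2\nnode-2\nnode-3\nnode-3\n' > "$NODEFILE"

# One line per distinct host, as with: uniq $OAR_NODEFILE
uniq "$NODEFILE"

# Count the distinct nodes.
NODES=$(uniq "$NODEFILE" | wc -l)
echo "nodes: $NODES"

rm -f "$NODEFILE"
```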
- For Spark 1.4.0, create the event log directory first
mkdir -p /tmp/spark/logs/events
Finish (delete the clusters)
spark_g5k --delete
hg5k --delete