Data Science or not Data Science?
Revision as of 2 February 2016, 16:22
Welcome to LIPN Wiki on Big Data
With more and more data produced every day, we need to pay special attention to the technologies used to analyze large amounts of data. Big Data is often characterized by the four Vs (Volume, Variety, Velocity, Veracity), each of which poses a challenge for the required tools.
Machine learning aims to extract knowledge from data. In short, it is a family of algorithms that transform data into a model or description in order to predict or categorize data. In this field we also use analytics tools that present information in a more readable way, as in the Square Predict (http://square-predict.net/) project.
This wiki draws on our experience with the Grid5000 and CIRRUS testbeds in studying the Software, Platform, Infrastructure and Network layers that push the Data Science field forward, following an experimental scientific method.
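As a minimal sketch of the "data into model, model into prediction" idea described above, the following Python fragment fits a straight line to a handful of points by ordinary least squares and then uses it to predict a new value. The function names (`fit_line`, `predict`) and the toy data are hypothetical and chosen only for illustration; they are not taken from any of the projects mentioned on this wiki.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (the 'model')."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply the fitted model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical toy data, generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)      # data -> model
print(predict(model, 5.0))    # model -> prediction for an unseen input
```

At the scale discussed on this wiki the same two steps would typically be delegated to a distributed framework rather than written by hand; the sketch only shows the shape of the workflow.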
General discussion on Systems for Big-Data
Infrastructure, programming models, frameworks
Our experience is with the following tools:
Apache Spark: http://spark.apache.org/
Apache Flink: https://flink.apache.org/
TensorFlow: https://www.tensorflow.org/
Wendelin: http://www.nexedi.com/NXD-Document.Blog.Wendelin.Release.0.4.alpha
SlapOS: http://www.slapos.org
Spark-notebook: http://spark-notebook.io/
Testbeds we use in conjunction with our experimental method:
Grid5000: https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home
Cirrus: http://cirrus.uspc.fr
Teralab: https://www.teralab-datascience.fr/fr/
Amazon: https://aws.amazon.com/fr/
Apache Spark
Some Apache Spark implementations (since 2011/2012)
How to use Spark on Grid5000
SlapOS cloud
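General information on SlapOS: http://community.slapos.org/wiki
BOINC as a Service for the SlapOS Cloud: Tools and Methods: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6650981&tag=1
Déploiement de la plate-forme SlapOS dans l'environnement Grid'5000: https://hal.archives-ouvertes.fr/hal-00958012/file/SlapOS.pdf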
TeraLab
General information on TeraLab
How to use TeraLab
TeraLab and SlapOS