
Spark on Hadoop experiment

# Last amended: 13th Sep, 2024

Data File is on Linux File System: /cdata/qpaper/ucidata.csv

# To View Data on Linux
$ ls /cdata/qpaper

# Sample Hadoop Commands to Process Data

# To Dump Data in Hadoop
$ hdfs dfs -put /cdata/qpaper/ucidata.csv /user/ashok/

# To List Data in Hadoop
$ hdfs dfs -ls /user/ashok

# To View Data in Hadoop
$ hdfs dfs -cat /user/ashok/ucidata.csv

Workflow annotations:
Working Dir: /user/ashok (on hadoop)
Host: localhost
Port: 9000
Working Dir: Current Workflow data area /home/ashok
No configuration needed here
File: /user/ashok/ucidata.csv on hadoop
Cluster Analysis
Attacking

Workflow nodes:
HDFS Connector
Create Local Big Data Environment
Table to Spark
CSV Reader
Spark Normalizer
Spark Partitioning
Spark Decision Tree Learner (MLlib)
Spark k-Means
Spark Scorer
Spark Normalizer
Spark Random Forests Learner (MLlib)
Spark Predictor (MLlib)
Spark Scorer
Spark MLlib to PMML
Entropy Scorer
Compiled Model Predictor
PMML Compiler
Spark to Table
Spark Cluster Assigner
Spark Statistics
Spark Row Filter
Spark Statistics
Spark Row Filter
Spark Statistics
Spark Row Filter
Spark Statistics
Spark Row Filter
Spark Row Filter
Spark Statistics
Spark Predictor (MLlib)
Spark Statistics
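For readers who want to try the classification branch outside KNIME, here is a rough PySpark sketch of what the node names suggest: read the CSV from HDFS, partition it, normalize, train a decision tree, predict, and score. It is not the workflow's actual configuration. The host, port, and file path come from the annotations above. Everything else is an assumption: the helper names hdfs_uri and train_decision_tree, the label column name passed in by the caller, the 80/20 split, min-max normalization, and the premise that every non-label column is numeric.

```python
def hdfs_uri(host: str, port: int, path: str) -> str:
    """Build an hdfs:// URI from the connector settings (host, port, file path)."""
    return f"hdfs://{host}:{port}/{path.lstrip('/')}"


def train_decision_tree(uri: str, label_col: str, seed: int = 42) -> float:
    """Read a CSV from HDFS, train a decision tree, and return test accuracy.

    label_col is hypothetical: the source does not name the columns of ucidata.csv.
    """
    # Imported here so hdfs_uri stays usable on machines without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import DecisionTreeClassifier
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.ml.feature import MinMaxScaler, StringIndexer, VectorAssembler

    spark = SparkSession.builder.appName("spark-on-hadoop-expt").getOrCreate()

    # Roughly: CSV Reader + Table to Spark
    df = spark.read.csv(uri, header=True, inferSchema=True)

    # Roughly: Spark Partitioning (the 80/20 ratio is an assumption)
    train, test = df.randomSplit([0.8, 0.2], seed=seed)

    # Assumption: every column other than the label is a numeric feature.
    feature_cols = [c for c in df.columns if c != label_col]

    stages = [
        # Map the label to numeric indices; keep unseen labels instead of failing.
        StringIndexer(inputCol=label_col, outputCol="label_idx", handleInvalid="keep"),
        VectorAssembler(inputCols=feature_cols, outputCol="raw_features"),
        # Roughly: Spark Normalizer (min-max scaling is an assumption)
        MinMaxScaler(inputCol="raw_features", outputCol="features"),
        # Roughly: Spark Decision Tree Learner (MLlib)
        DecisionTreeClassifier(featuresCol="features", labelCol="label_idx"),
    ]
    model = Pipeline(stages=stages).fit(train)

    # Roughly: Spark Predictor (MLlib) followed by Spark Scorer
    predictions = model.transform(test)
    evaluator = MulticlassClassificationEvaluator(
        labelCol="label_idx", predictionCol="prediction", metricName="accuracy"
    )
    return evaluator.evaluate(predictions)
```

With Spark installed and HDFS running, a call might look like train_decision_tree(hdfs_uri("localhost", 9000, "/user/ashok/ucidata.csv"), label_col="class"), where "class" is a placeholder for the real target column in ucidata.csv.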


