s_401_spark_label_encoder

Spark Label Encoding - prepare the data in a local Big Data environment

s_401 - prepare label encoding with spark
prepare the data preparation in a big data environment:
- label encode string variables
- transform numbers into Double format (Spark ML likes that)
- remove highly correlated data
- remove NaN variables
- remove continuous variables
- optional: normalize the data
(a sketch of the resulting Spark SQL follows below)
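To illustrate what that means in practice, here is a minimal, hand-written sketch (not the code the workflow actually generates) of the kind of Spark SQL statement such rules boil down to. The columns workclass and age come from the Census Income data, the numeric codes are invented for illustration, and #table# is the placeholder the KNIME Spark SQL Query node uses for the incoming DataFrame:

-- label encode one string variable and cast one numeric variable to Double
SELECT
  CASE WHEN workclass = 'Private'          THEN 1.0
       WHEN workclass = 'Self-emp-not-inc' THEN 2.0
       ELSE 0.0
  END AS workclass,
  CAST(age AS DOUBLE) AS age
FROM #table#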

Remove correlated variables - adapt this step if you want to keep them.

The data used is a cleaned and updated version of the Census Income dataset:
https://archive.ics.uci.edu/ml/datasets/census+income

=> Adapt the handling of numeric variables to your needs!

Additional things you could do to your data:
- reduce dimensions with PCA
- balance the targets to 50/50
- normalize or log() transform your data
- replace missing values with a more sophisticated approach

Please remember: we want to do it all with Spark and on a Big Data cluster. And make sure you reproduce all the transformations (with the exception of balancing/SMOTE) on your real-life and test data.

List of output data and tables from s_401:
- nvl_numeric.txt = CAST SQL command for all selected numeric variables
- nvl_numeric_sql.table = the CAST SQL command as a KNIME table
- spark_label_encode_sql.csv = SQL command with CASE WHEN to label encode string variables
- spark_label_encode_sql.table = the same CASE WHEN SQL command as a KNIME table
- spark_label_encoder_regex_filter.table = regex to filter the string variables you want to handle with the above rules
- spark_missing.pmml = general rules for dealing with missing values
- spark_normalizer.pmml = normalization rules

s_401 - Sparkling predictions and encoded labels: use Big Data technologies like Spark to get a robust and scalable data preparation, then use the latest AutoML technology like H2O.ai AutoML to create a robust model and deploy it in a Big Data environment (like Cloudera):
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_bigdata_h2o_automl_spark

Notes from the workflow annotations:

Input and label encoding: census_income_train.parquet is the training data. Manually filter the columns into string values and numeric values. The Spark SQL Query node is where the magic of label encoding happens - the generated CASE WHEN and CAST commands are injected as flow variables:
SELECT 1 AS dummy_const_var $${Sspark_lable_encoder}$$ , $${Snvl_numeric}$$ FROM #table#
The workflow also extracts specifications for how to handle the numeric variables with CAST commands. !!! Careful: in this example negative values are handled as 0 (zero).
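A hedged sketch of one such generated fragment, assuming a Census Income column capital_gain (the real fragments live in nvl_numeric.txt). Note how NULL and negative values both end up as 0:

-- one numeric CAST fragment, appended to the SELECT list above
CAST(CASE WHEN capital_gain IS NULL OR capital_gain < 0
          THEN 0
          ELSE capital_gain
     END AS DOUBLE) AS capital_gain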
NaN and continuous variables: use Spark Statistics to see if there are NaN variables, filter numeric variables with NaN, and filter continuous numeric variables. The resulting column selections are stored as reference tables (d_reference_spark_405_all.table, d_reference_spark_405_numeric.table, d_reference_spark_exclude_500.table, d_reference_spark_include_500.table), the extracted numeric specifications in numeric_cols and nvl_numeric_sql.table, and the missing value rules in spark_missing.pmml. A RegEx is created from the final list of variables.

Partitioned table: the DB Table Creator builds data_train; set a partitioning command on the tab "Additional Options": PARTITIONED BY (education STRING). Drop education from the regular column list with the regex ^(?!education$).* - a sketch of the resulting DDL follows below.
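Roughly the DDL this setup amounts to (a sketch with a shortened, assumed column list). In Hive the partition column must not appear in the regular column list, which is why education is removed with the regex above:

-- simplified Hive DDL for the partitioned training table
CREATE TABLE default.data_train (
  Target    DOUBLE,
  age       DOUBLE,
  workclass DOUBLE
)
PARTITIONED BY (education STRING)
STORED AS PARQUET;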
Housekeeping: the data_train table is dropped again (DB Table Remover), and Destroy Spark Context deletes the whole local big data folder ../data/local_big_data. If you encounter any problems, close KNIME and delete all data from the folder /data/local_big_data/. Alternative: an SQL Executor with DROP TABLE IF EXISTS default.data_all; => user action.

Normalization: the Spark Normalizer applies decimal scaling - adapt to your needs. Exclude the Target with the regex ^(?!Target$).*; the rules are written to spark_normalizer.pmml.
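The workflow does this with the KNIME Spark Normalizer node; purely as an illustration of what decimal scaling means, the same idea in Spark SQL (column name assumed) divides each value by the smallest power of 10 that pushes the maximum absolute value below 1:

-- decimal scaling sketch: e.g. max |age| = 90 gives a divisor of 100
SELECT
  age / POW(10, FLOOR(LOG10(MAX(ABS(age)) OVER ())) + 1) AS age
FROM #table#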

Nodes

Parquet Reader, Table Reader, Spark Column Filter, Persist Spark DataFrame/RDD, Spark SQL Query, Spark Label Encoder, Spark Row Sampling, Spark to Table, Spark Statistics, Spark Missing Value, Spark Normalizer, Hive to Spark, Destroy Spark Context (local spark context create), DB Table Creator, DB Loader, DB Table Remover, Column Filter, Reference Column Filter, Column Rename, Table Row to Variable, Merge Variables (deprecated), PMML Writer, Table Writer
