
kn_problem_h2o_python

Problem on Windows when transferring data from Pandas to H2O (solved)
When transferring data from a pandas data frame to an H2O cluster within the Python Script node on Windows, the conversion fails with an error. The same commands work in the enclosed Jupyter notebooks (in the subfolder /script/) but not when started from within KNIME. Although the error message comes from H2O, the problem seems to be connected to KNIME.

Setting UTF-8 encoding in the knime.ini file did not help:

    -Dfile.encoding=UTF-8

The problem was solved by introducing "h2o.no_progress()" to switch off the progress bars inside the wrapper:
https://forum.knime.com/t/python-script-and-h2o-data-frames-error-under-windows/21099/4?u=mlauber71

This workflow also demonstrates various ways to load parquet and other files within Python.

Input tables: train.table, test.table

kn_problem_import_01: via KNIME import

    # import the df data into H2O data system
    train = h2o.H2OFrame(input_table_1.copy())
    valid = h2o.H2OFrame(input_table_2.copy())

kn_problem_import_02: Pyarrow import to pandas df -> H2O

    input_table_1 = pq.read_table(v_data_path + "train.parquet").to_pandas()
    input_table_2 = pq.read_table(v_data_path + "test.parquet").to_pandas()
    # import the df data into H2O data system
    train = h2o.H2OFrame(input_table_1.copy())
    valid = h2o.H2OFrame(input_table_2.copy())

=> refer to Jupyter notebook /script/kn_problem_import_02.ipynb

kn_problem_import_03: Pandas read_parquet with engine pyarrow

    input_table_1 = pd.read_parquet(v_data_path + "train.parquet", engine='pyarrow')
    input_table_2 = pd.read_parquet(v_data_path + "test.parquet", engine='pyarrow')

=> refer to Jupyter notebook /script/kn_problem_import_03.ipynb

kn_problem_import_04: parquet -> directly to H2O

    # load parquet directly into H2O
    v_input_1 = v_data_path + "train.parquet"
    train = h2o.import_file(path=v_input_1)
    v_input_2 = v_data_path + "test.parquet"
    valid = h2o.import_file(path=v_input_2)

=> refer to Jupyter notebook /script/kn_problem_import_04.ipynb

kn_problem_import_05: try to call Jupyter notebook from KNIME to import parquet file

Other annotations in the workflow:
- train, test
- check versions of H2O and Python
- Direct export with pickles
- import H2O dataframes with pickles => gives error
- see if this can help

Workflow nodes: Table Reader, collect meta data, Table Reader, Python Script (2⇒2) (four times), Prepare Data, Python Source (twice), Python Script (1⇒2), Missing Value, Missing Value (Apply), Python Script (2⇒2)
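As a minimal sketch of the fix described above: the progress bars are switched off with h2o.no_progress() before the pandas data frame is handed to h2o.H2OFrame(). The helper name to_h2o_frame and its h2o_module parameter are hypothetical and only serve the illustration; the sketch assumes the h2o package is installed and that h2o.init() has already been called in the node.

```python
import importlib


def to_h2o_frame(df, h2o_module=None):
    """Convert a pandas data frame to an H2OFrame with progress bars off.

    `to_h2o_frame` and `h2o_module` are illustrative names, not part of
    the original workflow.
    """
    # Import h2o lazily so the helper can be defined before h2o is loaded;
    # a module object can also be passed in explicitly.
    h2o = h2o_module if h2o_module is not None else importlib.import_module("h2o")
    # Switch off the progress bars; according to the workflow description,
    # this is what resolved the error under KNIME on Windows.
    h2o.no_progress()
    # Convert a copy of the data, as the workflow's own snippets do.
    return h2o.H2OFrame(df.copy())
```

Inside a Python Script node this might be used as train = to_h2o_frame(input_table_1) and valid = to_h2o_frame(input_table_2). Calling h2o.no_progress() once right after h2o.init() should have the same effect; the wrapper only makes the ordering explicit.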
