kn_example_python_import_module

Use functions and Python code from an external .PY file or a Jupyter notebook in KNIME's Python Source node

This workflow demonstrates how modules in (external) Python scripts (.py files) as well as Jupyter notebooks can live and work together with KNIME. As always: with KNIME as a platform you don't have to choose - you can have it all.
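To run the Jupyter notebooks alongside KNIME you can install JupyterLab and the classic Notebook into your Conda environment (here: py3_knime) and start it from there: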

conda install -c conda-forge jupyterlab
conda install -c conda-forge notebook

conda activate py3_knime
jupyter notebook

Import custom module from .PY file (Python Script node):

import knime.scripting.io as knio
import sys
from pandas import DataFrame

v_sep = knio.flow_variables['path.separator.system']
v_script_path = v_sep + "script" + v_sep + "modules" + v_sep
v_path_to_script = knio.flow_variables['context.workflow.absolute-path'] + v_script_path

# Add the workflow's script/modules/ directory to sys.path. Alternatively, use importlib.util.
sys.path.append(v_path_to_script)

# Loads module my_module.py which is located in that directory.
import my_module

# Call some function.
my_script_output_1 = my_module.my_function_1()

# Output to KNIME
output_table = DataFrame({"Output": [my_script_output_1]})
knio.output_tables[0] = knio.Table.from_pandas(output_table)

The Python nodes work with the bundled Python version of KNIME, although a Conda Environment Propagation node and a YML file to create a dedicated Python environment (py3_knime_knimepy) are included.

Import custom module from your Jupyter notebook (script/py_module_in_notebook.ipynb):

import knime.scripting.io as knio
import knime.scripting.jupyter as knime_jupyter

nb = knime_jupyter.load_notebook(
    notebook_directory="knime://knime.workflow/script",
    notebook_name="py_module_in_notebook.ipynb",
    only_include_tag='use_in_knime')

import sys
from pandas import DataFrame

# Call some function.
my_script_output_2 = nb.my_function_2()

# Output to KNIME
output_table = DataFrame({"Output": [my_script_output_2]})
knio.output_tables[0] = knio.Table.from_pandas(output_table)

See the results of a run in "script/run_module_from_jupyter.ipynb" - this is run in the Jupyter notebook, so you see the results there.

In the sub-folder script/ there is a folder modules/ (this folder gets added to sys.path so it can be found):

    __init__.py   # contains the initialization and will be loaded once my_module gets imported
                  # from .my_module import my_function_1  -- the .py file my_module is addressed with a 'relative' path
    my_module.py  # contains the function "def my_function_1(): ..."

Import a Parquet file written from KNIME into Python and a Jupyter notebook and export it back (Python Script node):

'''Example: import a Parquet file written from KNIME into Python and Jupyter notebook and export it back'''

# Import knime.scripting.io to access node inputs and outputs.
import knime.scripting.io as knio
import knime.scripting.jupyter as knime_jupyter
import numpy as np    # linear algebra
import os             # accessing directory structure
import pandas as pd   # data processing, CSV file I/O (e.g. pd.read_csv)
import pyarrow as pq
# print("pyarrow version: ", pq.__version__)

# current dir
cwd = os.getcwd()
print(cwd)

# the ../data/ path from the KNIME flow variables - in the new KNIME knio style
v_path_data = knio.flow_variables['context.workflow.data-path']

# the name of the Parquet file from KNIME, including the path
v_path_parquet_file = knio.flow_variables['v_path_parquet_file']

# this file will contain the variables exported from the KNIME Python code
v_var_from_knime = v_path_data + "df_var_from_knime.parquet"

# this Parquet file will be re-exported from the Jupyter notebook
v_path_parquet_file_from_jupyter = v_path_data + "test_data_all_types_export_from_jupyter.parquet"

# this Parquet file will be re-exported from the KNIME node
v_path_parquet_file_from_knime = v_path_data + "test_data_all_types_export_from_knime.parquet"

# collect all the variable information in a dictionary
var_from_knime = {'v_path_data': [v_path_data],
                  'v_path_parquet_file': [v_path_parquet_file],
                  'v_path_parquet_file_from_jupyter': [v_path_parquet_file_from_jupyter],
                  'v_path_parquet_file_from_knime': [v_path_parquet_file_from_knime],
                  'v_var_from_knime': [v_var_from_knime]}

# pass column names in the columns parameter
df_from_knime = pd.DataFrame.from_dict(var_from_knime)

# export all the variables to a Parquet file
df_from_knime.to_parquet(v_var_from_knime, compression='gzip')

# run the Jupyter notebook (only the cells tagged "use_in_knime")
nb = knime_jupyter.load_notebook(
    notebook_directory="knime://knime.workflow/script",
    notebook_name="test_data_all_types.ipynb",
    only_include_tag='use_in_knime')

# https://stackoverflow.com/questions/32249960/in-python-pandas-start-row-index-from-1-instead-of-zero-without-creating-additi

# import the local Parquet file into Python
df0 = pq.parquet.read_table(v_path_parquet_file).to_pandas()

# add an indicator that the file has been exported from within the KNIME node
df0['source'] = 'exported from within KNIME node'
df0['new_index'] = np.arange(1, len(df0) + 1)
df0.set_index('new_index', drop=True, append=False, inplace=True, verify_integrity=True)
# print(type(df0))

# export the originally imported Parquet file back to /data/ from within the KNIME node
df0.to_parquet(v_path_parquet_file_from_knime, compression='gzip')

# import the data read into the Jupyter notebook; you could pass other
# results from the Jupyter notebook back in the same way
df1 = nb.df
df1['new_index'] = np.arange(1, len(df1) + 1)
df1.set_index('new_index', drop=True, append=False, inplace=True, verify_integrity=True)

# import the Parquet file re-exported from the Jupyter notebook
df2 = pq.parquet.read_table(v_path_parquet_file_from_jupyter).to_pandas()
df2['new_index'] = np.arange(1, len(df2) + 1)
df2.set_index('new_index', drop=True, append=False, inplace=True, verify_integrity=True)

# Pass the transformed tables to the data ports of the Python node
knio.output_tables[0] = knio.Table.from_pandas(df0)
knio.output_tables[1] = knio.Table.from_pandas(df1)
knio.output_tables[2] = knio.Table.from_pandas(df2)
knio.output_tables[3] = knio.Table.from_pandas(df_from_knime)

Alternative ways to communicate between KNIME, Python and Jupyter notebooks (via Parquet files):
https://forum.knime.com/t/unicodeencodeerror-charmap-codec-cant-encode-characters/39199/3?u=mlauber71
see also: https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_python_read_parquet_file_2021~G1kX4pbOlCeq56cH

The Jupyter notebook /script/test_data_all_types.ipynb reads the Parquet file and exports it back to /data/ (this used to crash under Apple Silicon M1 - not anymore). Two Collect Local Metadata nodes locate and create the /data/ folder with absolute paths.

The Conda environment py3_knime_knimepy is used to run the Jupyter notebooks; the YAML file can be found in the sub-folder /script/py3_knime_knimepy.yml and in the nodes' descriptions.
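For orientation, a minimal sketch of what the two files in script/modules/ might contain - the function body below is a placeholder; only the names my_module and my_function_1 come from the workflow:

# script/modules/__init__.py
# loaded when the folder is imported as a package; the relative import shows
# how my_module would be addressed with a 'relative' path
# from .my_module import my_function_1

# script/modules/my_module.py
def my_function_1():
    # placeholder body - the actual implementation ships with the workflow
    return "Hello from my_module.py"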
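The same idea applies to the notebook script/py_module_in_notebook.ipynb: only cells tagged 'use_in_knime' are loaded by knime_jupyter.load_notebook. A minimal sketch of such a cell (again a placeholder body; only the name my_function_2 comes from the node code above):

# code cell in script/py_module_in_notebook.ipynb, tagged 'use_in_knime'
def my_function_2():
    # placeholder body - anything defined here becomes available as nb.my_function_2
    return "Hello from the Jupyter notebook"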
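And a sketch of what the tagged cell in script/test_data_all_types.ipynb could look like, assuming it picks up the paths that the KNIME node exported to df_var_from_knime.parquet - the actual cell contents are in the workflow's script/ folder; this is only an illustration:

# code cell in script/test_data_all_types.ipynb, tagged 'use_in_knime'
import pandas as pd

# read the variables the KNIME Python node exported
# (path relative to the notebook; adjust if the working directory differs)
var_from_knime = pd.read_parquet('../data/df_var_from_knime.parquet')

# read the Parquet file written by KNIME; 'df' is what the
# Python Script node later accesses as nb.df
df = pd.read_parquet(var_from_knime['v_path_parquet_file'][0])

# re-export the data from the Jupyter notebook back to /data/
df.to_parquet(var_from_knime['v_path_parquet_file_from_jupyter'][0], compression='gzip')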

Nodes

Python Script
Java Edit Variable (simple)
String to Path (Variable)
String to Path
Table Row to Variable
Parquet Reader
Parquet Writer
Test Data Generator
Column Rename
Cache
Collect Local Metadata
Conda Environment Propagation
prepare_data

Extensions

Links

Workflow on the KNIME Hub: https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_python_import_module
Read Parquet file example: https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_python_read_parquet_file_2021~G1kX4pbOlCeq56cH
Forum thread on communicating via Parquet files: https://forum.knime.com/t/unicodeencodeerror-charmap-codec-cant-encode-characters/39199/3?u=mlauber71