t_001_sqlite_knime_40_db_nodes

KNIME and SQLite - simple example (4.0 New DB nodes)

This workflow demonstrates the use of a local SQLite database where you can store data, keep the variable types, and perform database operations.

The data is also read with R and written back into SQLite from R. The benefit is that data types are preserved better than with CSV.
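The type-preservation claim can be sketched with plain pandas and Python's built-in sqlite3 module (the demo frame with a zero-padded ID column is a hypothetical example, not data from this workflow):

```python
import io
import sqlite3

import pandas as pd

# hypothetical demo frame: an ID column that must stay a string
df = pd.DataFrame({"id": ["0001", "0002"], "value": [1.5, 2.5]})

# CSV round trip: read_csv re-infers types from text, so the leading zeros are lost
csv_text = df.to_csv(index=False)
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv["id"].tolist())      # the IDs come back as integers

# SQLite round trip: the column is stored as TEXT and comes back as strings
con = sqlite3.connect(":memory:")
df.to_sql("test", con, index=False, if_exists="replace")
df_sqlite = pd.read_sql_query("SELECT * FROM test;", con)
con.close()
print(df_sqlite["id"].tolist())   # the IDs survive as '0001', '0002'
```

The same effect applies to the workflow's SQLite file on disk; `:memory:` is used here only to keep the sketch self-contained.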


Hint: you can later use the data from the SQLite database in Python or R.

R Source (Table) node script:

```r
# v_script_path <- dirname(rstudioapi::getSourceEditorContext()$path)
# determine the script path
# v_pos_script <- regexpr("\\/script", v_script_path)
# v_script_basic_path <- substr(v_script_path, start=1, stop=v_pos_script)
v_data_path <- paste0(knime.flow.in[["context.workflow.parent-path"]], "data",
                      knime.flow.in[["path.separator.system"]])
v_path_sqlite_table <- knime.flow.in[["dbfile"]]

# import the SQLite package into R
library("RSQLite")

con <- dbConnect(drv = RSQLite::SQLite(), dbname = v_path_sqlite_table)
tables <- dbListTables(con)
lDataFrames <- vector("list", length = length(tables))

# load the first table from the list
# mytable <- dbReadTable(con, tables[1])

# you can also name the table you want to access explicitly
mytable_test <- dbReadTable(con, "test")

# write the R table back to SQLite while the connection is active
dbWriteTable(con, "test_from_r", mytable_test, overwrite = TRUE)
dbDisconnect(con)

knime.out <- mytable_test  # assign your data frame here
```

Python Script node:

```python
'''Example: import a Parquet file written from KNIME into Python
and export it back to SQLite and also to another Parquet file'''
# Import knime.scripting.io to access node inputs and outputs.
import knime.scripting.io as knio
import numpy as np  # linear algebra
import os  # accessing directory structure
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import pyarrow.parquet as pq
import sqlite3

# the ../data/ path from the KNIME flow variables - in the new KNIME knio style
var_path_data = (knio.flow_variables['context.workflow.parent-path'] + "data"
                 + knio.flow_variables['path.separator.system'])

# import the local Parquet file into Python
df = knio.input_tables[0].to_pandas()

# ----- SQLite ----------------------------------------
# the location of the SQLite database
var_loc_sqlite = knio.flow_variables['dbfile']

# define a function that connects you to the SQLite DB
# by default the schema used is - well - default
def sqlite_connectme(sqlite_database="default"):
    try:
        # first define the connection to the physical SQLite file on the file system
        global sqlite_connection
        sqlite_connection = sqlite3.connect(sqlite_database)
        # then define a cursor
        global sqlite_cursor
        sqlite_cursor = sqlite_connection.cursor()
    except Exception:
        print("failed to connect " + sqlite_database)

# establish a connection to the DB
sqlite_connectme(sqlite_database=var_loc_sqlite)

# define a function that drops the SQLite table you are about to insert
def sqlite_cleanup(sqlite_table_name="default"):
    try:
        sqlite_connection.execute("DROP TABLE IF EXISTS " + sqlite_table_name + ";")
        sqlite_connection.commit()
    except Exception:
        print("failed to clean SQLite DB")

# the name the new table should have
v_sqlite_table_name = "test_file_from_python"

# make sure the table is not already there
sqlite_cleanup(sqlite_table_name=v_sqlite_table_name)

# insert the df data frame into the SQLite table
df.to_sql(name=v_sqlite_table_name, con=sqlite_connection, if_exists="replace")
sqlite_connection.commit()

# this will not have an immediate effect under KNIME,
# but you could work with these functions if you need them:
# list all tables in the SQLite file we accessed earlier
sqlite_table_list = sqlite_cursor.execute(
    "SELECT name FROM sqlite_master WHERE type='table';").fetchall()
print(sqlite_table_list)

# retrieve column information of our table;
# every column is represented by a tuple with the attributes
# (id, name, type, notnull, default_value, primary_key)
# http://sebastianraschka.com/Articles/2014_sqlite_in_python_tutorial.html#querying-the-database---selecting-rows
sqlite_cursor.execute('PRAGMA TABLE_INFO({})'.format(v_sqlite_table_name))
sqlite_table_variables = [tup[1] for tup in sqlite_cursor.fetchall()]
print(sqlite_table_variables)

# fetch data from the SQLite table we named earlier
# hint: the SQLite DB stores the variable types
df_sqlite = pd.read_sql_query("SELECT * FROM " + v_sqlite_table_name + ";",
                              sqlite_connection)

# see what types of variables are there
df_sqlite_types = df_sqlite.dtypes
# print(df_sqlite_types)

# close the SQLite database at the end
sqlite_connection.close()

# set the index to an existing column
df_sqlite.set_index('<RowID>', inplace=True)

# pass the transformed table to the first output port of the node
knio.output_tables[0] = knio.Table.from_pandas(df_sqlite)
# output_table = df_sqlite.copy()
```

Workflow annotations from the canvas:

- locate and create /data/ folder with absolute paths
- create dummy data
- read Parquet file and export to KNIME and SQLite, and again to Parquet
- create variable with path <dbfile>
- create db.sqlite
- group by Cluster
- result from test_grp; write "test_grp"
- read "test" from db.sqlite in R and write to "test_from_r"; read "test_from_r"
- DROP TABLE IF EXISTS `default`.`test`;
- insert data into DB "test"
- select an overview of all tables: SELECT name FROM sqlite_master WHERE type='table'
- create structure of table "test"
- read "test_file_from_python"

Nodes

Collect Local Metadata, Data Generator, Python Script, Java Edit Variable (simple), SQLite Connector, DB GroupBy, DB Query Reader, DB Connection Table Writer, DB Reader, R Source (Table), DB Table Selector, DB SQL Executor, DB Writer, DB Table Creator
