
m_020_db_access_local_bigdata_tables

An overview of KNIME-based functions to access big data systems, using KNIME's local big data environment. Use SQL with Hive/Impala and Spark (and also PySpark) to access and manipulate data on a big data system. The example data comes from the classic Microsoft "Northwind" database.
The workflow is organized in numbered steps:

10 - Use KNIME's database framework and its nodes.
11 - Direct query, with the results returned to KNIME.
20 - You can work with 'open' SQL code (if you know what you are doing :-)) or instead use simple KNIME nodes:

    DROP TABLE IF EXISTS `$${Sv_big_data_schema}$$`.`tmp_db_order_customer_product_02`;
    CREATE TABLE `$${Sv_big_data_schema}$$`.`tmp_db_order_customer_product_02` AS
    SELECT *
    FROM `$${Sv_big_data_schema}$$`.`tmp_db_order_customer` t1
    LEFT JOIN `$${Sv_big_data_schema}$$`.`tmp_product` t2
      ON `t1`.`odetail_ProductId` = `t2`.`prod_Id`;

25 - KNIME nodes & SQL query - Group By and Having (sketch below).
30 - KNIME nodes & SQL query - Rename & Column Filter (sketch below).
35 - KNIME nodes & SQL query - Order By (sketch below).
40 - KNIME nodes & SQL query - Row Filter & Where (sketch below).
45 - KNIME nodes & SQL query - Case When (sketch below).
50 - KNIME nodes & SQL query - Datetimes in SQL (sketch below).
100 - Bring your (big) data to the Spark framework.
105 - Keep the result in memory once it has been run, for further analysis.
110 - Bring data back from Spark to Hive/Impala.
120 - Spark supports SQL too - whether in open code or via KNIME's nodes (or a combination of both):

    SELECT `prod_productname`, COUNT(DISTINCT `order_id`) AS no_distinct_orders
    FROM #table# t1
    GROUP BY `prod_productname`

On the workflow canvas, the Northwind tables Customer, Order, OrderDetail and Product are prepared step by step: customer information and order details are joined into tmp_db_order_customer (the table is removed first if it already exists), and Product is then combined with the customer information into tmp_db_order_customer_product_01 and, via open SQL code, tmp_db_order_customer_product_02. The query steps group by ShipRegion (also with Having), rename order_id to bestell_id, filter columns, filter rows with ShipRegion LIKE '%Europe%', sort by the renamed order id, derive Case When buckets from ShipRegion, have some fun with dates and datetimes in Hive, and write tmp_db_order_customer_03 (without product) in both overwrite and append mode.

The schema name is passed in as the flow variable v_big_data_schema (String Input and Transpose). Product and Customer are moved to Spark (Hive to Spark) and joined there; the result is persisted in memory once it has been computed (Persist Spark DataFrame/RDD), run through a categorical encoder (Spark Category To Number and Spark Transformations Applier), and written back (Spark to Hive). A column filter uses the regex ^(?!no_distinct_orders).*$ to drop the no_distinct_orders column, the reader nodes are set to fetch *all* lines, the local big data context stores its files under ../big_data, and as housekeeping the Spark context is destroyed at the end.
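A minimal sketch of the Group By/Having step (item 25), written against KNIME's #table# placeholder as used in the workflow's query nodes; the columns ShipRegion and order_id come from the workflow, while the threshold of 10 orders is an assumed example value:

    -- count distinct orders per ship region and keep only the busier regions
    SELECT `ShipRegion`,
           COUNT(DISTINCT `order_id`) AS no_orders
    FROM #table# t1
    GROUP BY `ShipRegion`
    HAVING COUNT(DISTINCT `order_id`) > 10  -- assumed threshold, for illustration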
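The rename and column-filter step (item 30) can be written as open SQL as well; this sketch mirrors the workflow's order_id => bestell_id rename and assumes only the two listed columns are kept:

    -- rename order_id to bestell_id and keep only the listed columns
    SELECT `order_id` AS `bestell_id`,
           `ShipRegion`
    FROM #table# t1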
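The Order By step (item 35), done with the DB Sorter node in the workflow, corresponds to a plain ORDER BY in open SQL; a minimal sketch:

    -- sort by the order id (ascending is the default)
    SELECT *
    FROM #table# t1
    ORDER BY `order_id`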
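The row-filter step (item 40) uses the condition from the canvas label, ShipRegion like '%Europe%'; in open SQL this is a simple WHERE clause:

    -- keep only rows whose ship region mentions Europe
    SELECT *
    FROM #table# t1
    WHERE `ShipRegion` LIKE '%Europe%'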
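The Case When step (item 45) derives a new column from ShipRegion; the bucket labels and the output column name region_group are assumptions for illustration:

    -- bucket ship regions into a coarse region group
    SELECT t1.*,
           CASE WHEN `ShipRegion` LIKE '%Europe%' THEN 'Europe'
                ELSE 'Other'
           END AS `region_group`  -- assumed column name
    FROM #table# t1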
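For the datetime step (item 50), long-standing Hive date functions can be combined; this sketch is not necessarily the exact set of functions used in the workflow:

    -- have some fun with dates and datetimes in Hive
    SELECT `order_id`,
           FROM_UNIXTIME(UNIX_TIMESTAMP())                         AS now_ts,        -- current timestamp as string
           TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP()))                AS today,         -- date part only
           DATE_ADD(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP())), 30)  AS plus_30_days,  -- 30 days from now
           YEAR(FROM_UNIXTIME(UNIX_TIMESTAMP()))                   AS this_year
    FROM #table# t1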

Nodes

Create Local Big Data Environment
DB Column Filter
DB Column Rename
DB Connection Table Writer
DB GroupBy
DB Joiner
DB Query
DB Query Reader
DB Reader
DB Row Filter
DB Sorter
DB SQL Executor
DB Table Remover
DB Table Selector
Destroy Spark Context
Hive to Spark
Persist Spark DataFrame/RDD
Spark Category To Number
Spark Column Filter
Spark Joiner
Spark SQL Query
Spark to Hive
Spark to Table
Spark Transformations Applier
String Input
Transpose
