
Parquet - Split Data into several Files

<p>Export data to Parquet files, split into several parts, inside one folder</p>

URL: use R library(arrow) to read parquet file into KNIME - KNIME Forum (20020) https://forum.knime.com/t/parquet-file-reader-taking-a-long-time/20020/17?u=mlauber71
URL: (65232) forum entry - large files and R https://forum.knime.com/t/knime-assign-failed-request-status-data-overflow-incoming-data-too-big/65232/2?u=mlauber71
URL: MEDIUM: KNIME and R — installation across operating systems — some remarks https://medium.com/p/6494a2a498cc
URL: Export Data to Parquet Files split in several parts into a folder - KNIME Forum (90022) https://forum.knime.com/t/append-method-in-the-write-table-node/90022/5?u=mlauber71
URL: KNIME Forum (35899) - Append Files - Parquet https://forum.knime.com/t/table-writer-append-if-file-exists/35899/6?u=mlauber71
URL: Import split Parquet files back into KNIME https://forum.knime.com/t/problem-reading-writing-big-file-on-hub-space/78177/6?u=mlauber71
URL: Medium: Collect and Restore — or how to handle many large files and resume loops https://medium.com/low-code-for-advanced-data-science/knime-snippets-1-collect-and-restore-or-how-to-handle-many-large-files-and-resume-loops-c57795b65d7e#643c


https://hub.knime.com/-/spaces/-/~4-Nz2crmY1OvrH_M/current-state/
Create Dummy Data
Concatenate
Sorter
import all Parquet files from the subfolder /data/test_folder_parquet/....*.parquet
Parquet Reader
Counter Generation
Concatenate
import all Parquet files from the subfolder /data/test_folder_parquet/....*.parquet
Parquet Reader
Column Filter
/data/test_folder_parquet/...
Delete Files/Folders
the last counter plus 1
Math Formula
GroupBy
Counter Generation
Table Row to Variable
/data/test_folder_parquet/... split large data into parts => note: in a real-world scenario you should set the split values to the larger defaults of 1024 MB and 128 MB
Parquet Writer
test_folder_parquet/part_append.parquet
Parquet Writer
Try (Variable Ports)
Catch Errors (Var Ports)
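
The "last counter plus 1" step above can be illustrated outside KNIME with a minimal Python sketch. It is an analogy, not the workflow's own implementation. The file naming scheme `part_<counter>.parquet` and the helper names `last_counter` and `next_part_path` are assumptions made for this example. The sketch only creates empty placeholder files where a real Parquet write would happen.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical naming scheme: part_<counter>.parquet (not taken from the workflow).
PART_PATTERN = re.compile(r"^part_(\d+)\.parquet$")


def last_counter(folder):
    """Return the highest counter among existing part files, or 0 if none exist."""
    counters = []
    for path in folder.glob("*.parquet"):
        match = PART_PATTERN.match(path.name)
        if match:  # files that do not follow the naming scheme are ignored
            counters.append(int(match.group(1)))
    return max(counters, default=0)


def next_part_path(folder):
    """Build the path for the next part file: the last counter plus 1."""
    return folder / f"part_{last_counter(folder) + 1:05d}.parquet"


def demo():
    """Add three parts to an empty folder and return the file names used."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        names = []
        for _ in range(3):
            target = next_part_path(folder)
            target.touch()  # placeholder for the real Parquet write
            names.append(target.name)
        return names


if __name__ == "__main__":
    print(demo())
```

Deriving the counter from the files already in the folder means a new part never overwrites an existing one, so the folder can later be read back as a single table.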
