JKISeason3-8_tomljh

Filtering Redundant Folder References

Level: Medium

Description: You are reorganizing a data warehouse in your company, working with a filesystem that creates parent folders when you give it a reference to a child folder. For example, if you ask the filesystem to create “folder1/folder2” and neither folder1 nor folder2 exists, it will create both, with folder2 inside folder1, without raising an error. Given a list of folders, you want to keep only the longest unique child folders and, for efficiency, filter out references to parent folders that will be created anyway.

Here's an example of an initial list of folders:

- folder1/folder3
- folder1/folder3/folder22
- folder1/folder3/folder22/folder47

After executing your workflow, the list above should contain only a reference to folder1/folder3/folder22/folder47.
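The filtering rule in this example can be sketched in plain Python. This is a minimal illustration, not the official solution: the function name `filter_redundant_folders` is hypothetical, and it assumes that paths use "/" as the separator and that leading or trailing slashes can be ignored.

```python
def filter_redundant_folders(paths):
    """Keep only the paths that are not a parent of another path in the list."""
    # Assumption: "/" is the separator; surrounding whitespace and slashes are ignored.
    normalized = (p.strip().strip("/") for p in paths)
    # Drop empty entries and duplicates while preserving the input order.
    cleaned = list(dict.fromkeys(p for p in normalized if p))

    # Collect every proper prefix (parent folder) of every path.
    parents = set()
    for path in cleaned:
        parts = path.split("/")
        for i in range(1, len(parts)):
            parents.add("/".join(parts[:i]))

    # A path is redundant if the filesystem would create it as a parent anyway.
    return [p for p in cleaned if p not in parents]


folders = [
    "folder1/folder3",
    "folder1/folder3/folder22",
    "folder1/folder3/folder22/folder47",
]
result = filter_redundant_folders(folders)
```

Comparing whole path components rather than raw string prefixes avoids treating a folder such as "a/b" as a parent of "a/b-c".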

Author: Emilio Silvestri

Datasets: Folder Data in the KNIME Community Hub

Remember to upload your solution with tag JKISeason3-8 to your public space on KNIME Community Hub. To increase the visibility of your solution, also post it to this challenge thread on KNIME Forum.

We will post our solution to this challenge here next Tuesday.

Explanation:
1. The official pre-installed Python environment can be used, as only the "pandas" package is used. For example, "org_knime_pythonscript" can be used.
2. The code was generated by an LLM and tested manually.

Overview of the code's functionality: This code defines a TreeNode class for creating a tree structure based on folder paths provided in a KNIME input table. It then constructs this tree by adding child nodes for each folder path, retrieves all paths from the root to the leaf nodes, filters out any empty strings that might occur due to split operations, and finally outputs these paths in a DataFrame format compatible with KNIME.

Workflow nodes:

- Table Reader: Read data (folders.table)
- Python Script: Get the paths from the root node to all leaf nodes
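The tree-based approach described above might look roughly like the following. This is a sketch under stated assumptions, not the code from the workflow itself: the method and function names (`add_path`, `leaf_paths`, `longest_folder_paths`) are hypothetical, and the KNIME table input and output are only indicated in comments so that the logic runs as plain Python.

```python
class TreeNode:
    """One folder in the tree; children are keyed by folder name."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # child folder name -> TreeNode

    def add_path(self, parts):
        """Insert a path given as a list of folder names, reusing existing nodes."""
        node = self
        for part in parts:
            if part not in node.children:
                node.children[part] = TreeNode(part)
            node = node.children[part]

    def leaf_paths(self, prefix=()):
        """Return the full path of every leaf below this node."""
        if not self.children:
            # A leaf: emit its path (the bare root has no path of its own).
            return ["/".join(prefix)] if prefix else []
        paths = []
        for name, child in self.children.items():
            paths.extend(child.leaf_paths(prefix + (name,)))
        return paths


def longest_folder_paths(folder_paths):
    root = TreeNode("")
    for path in folder_paths:
        # Filter out empty strings produced by split (e.g. from a trailing "/").
        parts = [p for p in path.split("/") if p]
        root.add_path(parts)
    return root.leaf_paths()


# In a KNIME Python Script node, the input list would typically come from
# the input table converted to a pandas DataFrame, and the result would be
# wrapped in a DataFrame for output (the column name is an assumption).
leaves = longest_folder_paths([
    "folder1/folder3",
    "folder1/folder3/folder22",
    "folder1/folder3/folder22/folder47",
    "folder2",
])
```

Because parent folders become inner nodes of the tree, only the root-to-leaf paths remain, which are exactly the longest unique child folders.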
