Icon

kn_​example_​python_​excel_​meta_​information_​files

Extract meta information from Excel file (with the help of Python)

Extract meta information from Excel file (with the help of Python)
https://forum.knime.com/t/extract-date-modified-from-an-excel-file/46332/3?u=mlauber71

Extract meta information from Excel file (with the help of Python)https://forum.knime.com/t/extract-date-modified-from-an-excel-file/46332/3?u=mlauber71 # Importing necessary librariesimport knime.scripting.io as knio # KNIME scripting for handling flow variablesimport numpy as np # NumPy for numerical operations (not used in this snippet)import pandas as pd # Pandas for data manipulationfrom openpyxl import load_workbook # openpyxl for Excel file operations# Retrieving Excel file path and file name from KNIME flow variablesvar_excel_file_path = knio.flow_variables['File path']var_excel_file = knio.flow_variables['File name'] + "." + knio.flow_variables['File extension']# Loading the workbook from the specified file pathwb = load_workbook(var_excel_file_path)# Collecting workbook propertiescollection_wb_properties = wb.properties# Creating a DataFrame with various properties of the Excel filedf = pd.DataFrame({'var_excel_file': [var_excel_file],'var_excel_file_path': [var_excel_file_path],'creator': [str(collection_wb_properties.creator)],'title': [str(collection_wb_properties.title)], 'description': [str(collection_wb_properties.description)], 'subject': [str(collection_wb_properties.subject)], 'identifier': [str(collection_wb_properties.identifier)], 'language': [str(collection_wb_properties.language)], 'created': [str(collection_wb_properties.created)], 'created_ts': [collection_wb_properties.created], 'modified': [str(collection_wb_properties.modified)], 'modified_ts': [collection_wb_properties.modified],'lastModifiedBy': [str(collection_wb_properties.lastModifiedBy)], 'category': [str(collection_wb_properties.category)], 'contentStatus': [str(collection_wb_properties.contentStatus)], 'version': [str(collection_wb_properties.version)], 'revision': [str(collection_wb_properties.revision)], 'keywords': [str(collection_wb_properties.keywords)], 'lastPrinted': [str(collection_wb_properties.lastPrinted)]})# Outputting the DataFrame to the first output table in KNIMEknio.output_tables[0] = knio.Table.from_pandas(df)# Closing the workbookwb.close() /data/ subfolderlist XLSX files and give back URLsextract meta informations fromfilesextract string frompath column"Location"extract file name andextension fromURLextract Excel file propertieswith OpenPyxlmodifiedyyyy-MM-dd HH:mm:ss[.SSSSSS]createdyyyy-MM-dd HH:mm:ss[.SSS]excel_file_python_properties.tableexcel_file_properties.table GenerateExcel Files List Files/Folders Files/FoldersMeta Info Path to String URL to File Path Python Script Table Row ToVariable Loop Start Loop End String to Date&Time String to Date&Time Table Writer Table Writer Extract meta information from Excel file (with the help of Python)https://forum.knime.com/t/extract-date-modified-from-an-excel-file/46332/3?u=mlauber71 # Importing necessary librariesimport knime.scripting.io as knio # KNIME scripting for handling flow variablesimport numpy as np # NumPy for numerical operations (not used in this snippet)import pandas as pd # Pandas for data manipulationfrom openpyxl import load_workbook # openpyxl for Excel file operations# Retrieving Excel file path and file name from KNIME flow variablesvar_excel_file_path = knio.flow_variables['File path']var_excel_file = knio.flow_variables['File name'] + "." + knio.flow_variables['File extension']# Loading the workbook from the specified file pathwb = load_workbook(var_excel_file_path)# Collecting workbook propertiescollection_wb_properties = wb.properties# Creating a DataFrame with various properties of the Excel filedf = pd.DataFrame({'var_excel_file': [var_excel_file],'var_excel_file_path': [var_excel_file_path],'creator': [str(collection_wb_properties.creator)],'title': [str(collection_wb_properties.title)], 'description': [str(collection_wb_properties.description)], 'subject': [str(collection_wb_properties.subject)], 'identifier': [str(collection_wb_properties.identifier)], 'language': [str(collection_wb_properties.language)], 'created': [str(collection_wb_properties.created)], 'created_ts': [collection_wb_properties.created], 'modified': [str(collection_wb_properties.modified)], 'modified_ts': [collection_wb_properties.modified],'lastModifiedBy': [str(collection_wb_properties.lastModifiedBy)], 'category': [str(collection_wb_properties.category)], 'contentStatus': [str(collection_wb_properties.contentStatus)], 'version': [str(collection_wb_properties.version)], 'revision': [str(collection_wb_properties.revision)], 'keywords': [str(collection_wb_properties.keywords)], 'lastPrinted': [str(collection_wb_properties.lastPrinted)]})# Outputting the DataFrame to the first output table in KNIMEknio.output_tables[0] = knio.Table.from_pandas(df)# Closing the workbookwb.close() /data/ subfolderlist XLSX files and give back URLsextract meta informations fromfilesextract string frompath column"Location"extract file name andextension fromURLextract Excel file propertieswith OpenPyxlmodifiedyyyy-MM-dd HH:mm:ss[.SSSSSS]createdyyyy-MM-dd HH:mm:ss[.SSS]excel_file_python_properties.tableexcel_file_properties.tableGenerateExcel Files List Files/Folders Files/FoldersMeta Info Path to String URL to File Path Python Script Table Row ToVariable Loop Start Loop End String to Date&Time String to Date&Time Table Writer Table Writer

Nodes

Extensions

Links