Icon

kn_​example_​python_​excel_​read_​comments

use KNIME and Python OpenPyxl to read the the comments fro all cells in an Excel file and save them as table

use KNIME and Python OpenPyxl to read the the comments fro all cells in an Excel file and save them as table

KNIME and Python — Setting up and managing Conda environments
https://medium.com/p/2ac217792539

################################################################################ https://openpyxl.readthedocs.io/en/stable/worksheet_tables.html# import the KNIME moduleimport knime.scripting.io as knioimport numpy as npimport pandas as pd# Import the necessary librariesfrom openpyxl import Workbookfrom openpyxl import load_workbookfrom openpyxl.comments import Comment# Step 1: Create an Excel file with OpenPyxlwb = Workbook()# Add some comments to the cellssheets = ["Sheet 1", "Sheet 2"]cells_with_comments = [("A1", "Hello world!", "Content 1"), ("B2", "Another comment", "Content 2")]for sheet_name in sheets: ws = wb.create_sheet(sheet_name) for cell, text, cell_content in cells_with_comments: ws[cell] = cell_content ws[cell].comment = Comment(text, "Author")# Save the workbook to a fileexcel_filename = knio.flow_variables['var_excel_name']wb.save(excel_filename) use KNIME and Python OpenPyxl to read the the comments fro all cells in an Excel file and save them as table ################################################################################ https://openpyxl.readthedocs.io/en/stable/worksheet_tables.html# import the KNIME moduleimport knime.scripting.io as knioimport numpy as npimport pandas as pd# Import the necessary librariesfrom openpyxl import Workbookfrom openpyxl import load_workbookfrom openpyxl.comments import Commentfrom typing import List, Tuple, Dictexcel_filename = knio.flow_variables['var_excel_name']parquet_filename = knio.flow_variables['var_parquet_name']# Step 2: Extract the comments and store them in a dataframedef extract_comments(filename: str) -> List[Dict[str, str]]: wb = load_workbook(filename) data = [] for ws in wb.worksheets: for row in ws.iter_rows(): for cell in row: if cell.comment: data.append({"file": filename,"sheet": ws.title,"cell": cell.coordinate,"comment": cell.comment.text,"content": cell.value }) return datadata = extract_comments(excel_filename)# Create a pandas DataFrame from the datadf = pd.DataFrame(data)# Step 3: Store the dataframe in a Parquet filedf.to_parquet(parquet_filename)knio.output_tables[0] = knio.Table.from_pandas(df) locate and create/data/ folderwith absolute pathsvar_excel_namereturn $${Scontext.workflow.data-path}$$ + "my_excel.xlsx";var_parquet_namereturn $${Scontext.workflow.data-path}$$ + "my_excel_comments.parquet";var_*my_excel.xlsxread all cells with commentsmy_excel.xlsxcreate sample filemy_excel_comments.parquet Collect LocalMetadata Java EditVariable (simple) Java EditVariable (simple) String to Path(Variable) Python Script Python Script Parquet Reader ################################################################################ https://openpyxl.readthedocs.io/en/stable/worksheet_tables.html# import the KNIME moduleimport knime.scripting.io as knioimport numpy as npimport pandas as pd# Import the necessary librariesfrom openpyxl import Workbookfrom openpyxl import load_workbookfrom openpyxl.comments import Comment# Step 1: Create an Excel file with OpenPyxlwb = Workbook()# Add some comments to the cellssheets = ["Sheet 1", "Sheet 2"]cells_with_comments = [("A1", "Hello world!", "Content 1"), ("B2", "Another comment", "Content 2")]for sheet_name in sheets: ws = wb.create_sheet(sheet_name) for cell, text, cell_content in cells_with_comments: ws[cell] = cell_content ws[cell].comment = Comment(text, "Author")# Save the workbook to a fileexcel_filename = knio.flow_variables['var_excel_name']wb.save(excel_filename) use KNIME and Python OpenPyxl to read the the comments fro all cells in an Excel file and save them as table ################################################################################ https://openpyxl.readthedocs.io/en/stable/worksheet_tables.html# import the KNIME moduleimport knime.scripting.io as knioimport numpy as npimport pandas as pd# Import the necessary librariesfrom openpyxl import Workbookfrom openpyxl import load_workbookfrom openpyxl.comments import Commentfrom typing import List, Tuple, Dictexcel_filename = knio.flow_variables['var_excel_name']parquet_filename = knio.flow_variables['var_parquet_name']# Step 2: Extract the comments and store them in a dataframedef extract_comments(filename: str) -> List[Dict[str, str]]: wb = load_workbook(filename) data = [] for ws in wb.worksheets: for row in ws.iter_rows(): for cell in row: if cell.comment: data.append({"file": filename,"sheet": ws.title,"cell": cell.coordinate,"comment": cell.comment.text,"content": cell.value }) return datadata = extract_comments(excel_filename)# Create a pandas DataFrame from the datadf = pd.DataFrame(data)# Step 3: Store the dataframe in a Parquet filedf.to_parquet(parquet_filename)knio.output_tables[0] = knio.Table.from_pandas(df) locate and create/data/ folderwith absolute pathsvar_excel_namereturn $${Scontext.workflow.data-path}$$ + "my_excel.xlsx";var_parquet_namereturn $${Scontext.workflow.data-path}$$ + "my_excel_comments.parquet";var_*my_excel.xlsxread all cells with commentsmy_excel.xlsxcreate sample filemy_excel_comments.parquetCollect LocalMetadata Java EditVariable (simple) Java EditVariable (simple) String to Path(Variable) Python Script Python Script Parquet Reader

Nodes

Extensions

Links