
Ollama - Chat with your PDF and Llama3

Ollama - Chat with your PDF or Log Files - create and use a local vector store

To keep up with the fast pace of local LLMs, this workflow uses generic KNIME nodes and Python code to access Ollama and Llama3 - it will run with KNIME 4.7.
The Chroma vector store will be persisted in a local SQLite3 database.

To get this to work, you will have to install Ollama and a Python environment with the necessary packages (py3_knime_llama), and download the Llama3 model as well as an embedding model (https://ollama.com/blog/embedding-models).
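
Here is a minimal Python sketch of what building such a persisted Chroma store from a folder of PDFs can look like - assuming the py3_knime_llama packages (langchain, langchain-community, chromadb, pypdf) and a running Ollama that has already pulled "mxbai-embed-large". The folder, chunk sizes and collection name are illustrative, not the workflow's exact code:

# build a local Chroma vector store from PDFs (sketch - adapt paths and names)
from pathlib import Path
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

pdf_dir = Path("../documents/pdf")    # illustrative folder with your PDFs
persist_dir = "../data/vectorstore"   # Chroma keeps a SQLite3 file in here

# load every PDF page as one document
docs = []
for pdf in pdf_dir.glob("*.pdf"):
    docs.extend(PyPDFLoader(str(pdf)).load())

# split into overlapping chunks so retrieval stays focused
splits = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# embed with the (faster) dedicated embedding model and persist locally
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=OllamaEmbeddings(model="mxbai-embed-large"),
    collection_name="vectorstore_pdf",
    persist_directory=persist_dir,
)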

---
Medium: Llama3 and KNIME - Build your local Vector Store from PDFs and other Documents
https://medium.com/p/237eda761c1c

Medium: Chat with local Llama3 Model via Ollama in KNIME Analytics Platform - Also extract Logs into structured JSON Files
https://medium.com/p/aca61e4a690a

---
You can find more examples of how to work with your documents in this Python code, which you can then adapt:
https://github.com/ml-score/
- Chat with your Logs
- Chat with your PDF
- Chat with your Unstructured CSVs
- Chat with your Unstructured Log Files
- Chat with your Unstructured Text Files

P.S.: yes, I am aware of the large empty white space, but I have no idea how to remove it in KNIME 4 and have already contacted KNIME support.



---
Interactive Chat with your PDFs and local Llama3 (via Ollama running in the background).

Interactive Chat with your LOGs and local Llama3 (via Ollama running in the background) - instead of log files you could also adapt other unstructured text files. You can bulk load text files (in this case "LOG" files with log information) and turn them into vector stores; whether that makes sense will depend on your data.

Interactive Chat with your CSVs and local Llama3 (via Ollama running in the background) - in a CSV, each line will become one document. You can bulk load CSV files (in this case the customer database "Northwind") and turn them into a vector store. You can download the files here: https://github.com/ml-score/ollama/tree/main/documents/csv

Use loops to answer several questions at once, using the vector store you have just created, and escape the text in the prompts so that it fits into a JSON file - see the sketch below.
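
A minimal sketch of that escaping step: json.dumps takes care of quotes, newlines and backslashes. The variable names mirror the workflow's flow variables (Instruction, Prompt), but the snippet itself is illustrative:

# escape free text so it fits into a JSON payload (sketch)
import json

instruction = 'Answer briefly.\nUse "quotes" where needed.'
prompt = 'What does the error\n"timeout after 30s" mean?'

# json.dumps returns a quoted, escaped string; [1:-1] strips the outer quotes
escaped_instruction = json.dumps(instruction)[1:-1]
escaped_prompt = json.dumps(prompt)[1:-1]

payload = f'{{"instruction": "{escaped_instruction}", "prompt": "{escaped_prompt}"}}'
json.loads(payload)  # round-trips without errors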
---
Run this in a Terminal window to start Ollama. You can also try other models (https://ollama.com), and you can pull a model without running it:

ollama pull llama3:instruct
ollama run llama3:instruct
ollama pull mxbai-embed-large

If you are behind a proxy server, make sure the environment variables for it are set - Ollama will use them to download the models initially (you can later even switch off the WiFi to be sure nothing leaks). You can set them (per session if you must) in the Terminal window:

set HTTP_PROXY=http://proxy.my-company.com:8080
set HTTPS_PROXY=http://proxy.my-company.com:8080

You might have to close all running Ollama instances, set the proxy variables in the Terminal window and then start Ollama again: "ollama run llama3:instruct"

Consider downloading the whole LLM workflow group in order to get all the folders (https://hub.knime.com/mlauber71/spaces/LLM_Space/~17k4zAECNryrZw1X/).

---
Set up the Conda environment py3_knime_llama, for example:

# create the environment (macOS / Windows)
conda env create -f="/Users/m_lauber/Dropbox/knime-workspace/_hub/LLM_Space/script/py3_knime_llama.yml"
conda env create -f="C:\Users\x1234567\knime-workspace\hub\kn_example_python_graphic_bokeh_json\data\py3_knime_llama.yml"
# activate and maintain the environment
conda activate py3_knime_llama
conda update -n py3_knime_llama --all
conda env update --name py3_knime_llama --file "/Users/m_lauber/Dropbox/knime-workspace/_hub/LLM_Space/script/py3_knime_llama.yml" --prune
# remove the environment; keep conda itself up to date
conda env remove --name py3_knime_llama
conda update -n base -c conda-forge conda

KNIME official Python integration guide:
https://docs.knime.com/latest/python_installation_guide/index.html#_introduction
KNIME and Python - Setting up and managing Conda environments:
https://medium.com/low-code-for-advanced-data-science/knime-and-python-setting-up-and-managing-conda-environments-2ac217792539

file: py3_knime_llama.yml (with some modifications) - THX Carsten Haubold (https://hub.knime.com/carstenhaubold) for hints:

name: py3_knime_llama            # name of the created environment
channels:                        # repositories to search for packages
- conda-forge
- knime                          # https://anaconda.org/knime
                                 # conda search knime-python-base -c knime --info
dependencies:                    # list of packages that should be installed
- python                         # Python (optionally pin, e.g. =3.10)
- knime-python-base<=4.7.0       # dependencies of the KNIME - Python integration
# - knime-python-scripting       # everything you need to also build Python packages for KNIME
- cairo                          # SVG support
- pillow                         # image inputs/outputs
- matplotlib                     # plotting
- IPython                        # notebook support
- nbformat                       # notebook support
- scipy                          # scientific computing
- jpype1                         # a Python-to-Java bridge
- jupyter                        # Jupyter Notebook support
- pypdf                          # read PDF files
# ---- additional packages for LLMs
- transformers
# - openai
# - streamlit                    # a faster way to build and share data apps
- pip
- pip:
  - langchain                    # framework for developing applications powered by LLMs
  - langchain-community
  - chromadb
  - unstructured
  - sentence-transformers
  - ollama
  # - torch
  # - huggingface-hub
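
With this environment in place, here is a rough sketch of what the "Chat with PDF and Ollama" Python Script node does: reopen the persisted store, retrieve the top k chunks per question and send them together with the question to Llama3. The model and store names match the workflow; the prompt and paths are illustrative:

# query the persisted vector store and chat with Llama3 (sketch)
import ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

vectorstore = Chroma(
    collection_name="vectorstore_pdf",
    persist_directory="../data/vectorstore",
    embedding_function=OllamaEmbeddings(model="mxbai-embed-large"),
)

question = "What are the main findings of the report?"
hits = vectorstore.similarity_search(question, k=5)  # k = documents per question
context = "\n\n".join(doc.page_content for doc in hits)

response = ollama.chat(
    model="llama3:instruct",
    messages=[
        {"role": "system", "content": "Answer only from the given context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(response["message"]["content"])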
---
Workflow overview (condensed from the canvas annotations):
- Activate the Conda environment (conda_knime_llama) based on the operating system, Windows or macOS.
- Scan for PDF, LOG (../documents/logs/) or CSV (../documents/csv/) files.
- Create a local vector store under ../data/vectorstore/ ("vectorstore_pdf", "vectorstore_logs", "vectorstore_csv", or the "_llama" variants) - either with "mxbai-embed-large" (faster) or with "llama3:instruct" (takes more time).
- Read a collection of questions you want the LLM to answer from ../script/questions.xlsx (the workflow samples 2 lines - remove that filter if you have more) and set the number of documents to retrieve from the vector store per question.
- Loop over all the questions, construct and escape the prompts, and collect the answers.
- Save each chat as a .table file with a timestamp under ../data/chat/ (pdf_chat*, log_chat*, csv_chat*); the workflow scans for the latest file, so a saved chat can be reloaded and continued.
- Make sure Ollama is running in the background (e.g. "ollama run llama3:instruct" in a Terminal window) and right-click the chat nodes to open the interactive view.
- A separate branch lets you inspect the SQLite vector store - see the sketch below.
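
A small sketch for that inspection - note that the file name chroma.sqlite3 and the internal table layout are Chroma implementation details and may change between versions, so treat the database as read-only:

# peek into the SQLite3 file that Chroma persists (sketch)
import sqlite3

con = sqlite3.connect("../data/vectorstore/chroma.sqlite3")
tables = [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
)]
print(tables)  # e.g. collections, embeddings, embedding_metadata, ...

# list the stored collections, if that table exists in this Chroma version
if "collections" in tables:
    for (name,) in con.execute("SELECT name FROM collections"):
        print("collection:", name)
con.close()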
