Lab 4 GenAI Fraud Detection1

<p>GenAI Learnathon: Enrich Data Analytics with GenAI<br><br>During this hands-on learnathon, you will get familiar with the free low-code tool KNIME Analytics Platform, techniques for fraud detection, and the use of GenAI via the KNIME AI Extension.<br><br>Without any coding experience you will learn to:</p><ul><li><p>Detect fraud in investment contracts visually and using statistical techniques</p></li><li><p>Prompt engineer an LLM via the OpenAI integration</p></li><li><p>Create a vector store from a corpus of different types of investment contracts (the knowledge base)</p></li><li><p>Build and deploy a data app to display email alerts and make it available to authenticated users on a web browser</p></li></ul><p>Other KNIME AI integrations we will discuss include GPT4All, Hugging Face, Chroma, FAISS and more.</p>

URL: KNIME AI Extension https://hub.knime.com/knime/extensions/org.knime.python.features.llm/latest/
URL: KNIME for Generative AI https://hub.knime.com/knime/collections/KNIME%20for%20Generative%20AI~D4ckx2q_J5FPBQXu

C) Enriching Analytics with GenAI: Prompting for Email Alerts

Step 1: Get the Hugging Face credentials.

Drag and drop a Credentials Configuration node and paste your Hugging Face access token in the password field. Connect it to the HF Hub Authenticator and execute the node to verify that the connection is working. 

Step 2: Select an LLM model

Drag and drop an HF Hub LLM Selector and connect it to the HF Hub Authenticator. Select an instruction/chat model (for example: HuggingFaceH4/zephyr-7b-beta). This model will be used to generate the email alerts.

Step 3: Engineer a prompt for email alerts 

Drag and drop a String Manipulation node and connect it to the contract data output. Structure a prompt that asks the model to generate an alert email for each detected contract issue. Connect the output to the LLM Prompter.
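The prompt built in the String Manipulation node is essentially a string template filled in once per row. Below is a minimal Python sketch of that idea. The column names (`contract_id`, `type`, `payments`), the example rows, and the prompt wording are all assumptions made for illustration and are not taken from the lab data.

```python
# Hypothetical rows standing in for the flagged contract table;
# column names and values are illustrative, not from the lab data.
flagged_contracts = [
    {"contract_id": "C-001", "type": "bond", "payments": 125000.0},
    {"contract_id": "C-002", "type": "stock", "payments": 98000.0},
]

# Hypothetical prompt wording; adapt it to your own alert style.
PROMPT_TEMPLATE = (
    "You are a compliance assistant. Write a short alert email to the "
    "fraud team about contract {contract_id} (type: {type}). "
    "The payment amount of {payments:.2f} was flagged as an outlier. "
    "Keep the email under 100 words and end with a request to review."
)


def build_prompt(row: dict) -> str:
    """Fill the template with the values of one table row."""
    return PROMPT_TEMPLATE.format(**row)


# One prompt per flagged contract, like one prompt per row in the node.
prompts = [build_prompt(row) for row in flagged_contracts]
for prompt in prompts:
    print(prompt)
```

Stating the role, the task, and the length limit explicitly in the template tends to make the generated emails more uniform across rows.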

Step 4: Visualize the LLM output

Display the alert emails produced by the LLM Prompter using either a String Format Manager (set 'between words') followed by a Table View (set 'Row Height > Custom > 500'), or the Email Preview component.

B) Analytics Workflow: Outlier Detection for Fraud

Step 1: Perform outlier detection

Search for a Numeric Outliers node and drag it onto the canvas. Use it to automatically detect extreme values in the column 'payments' using the interquartile range.
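The interquartile range (IQR) rule flags a value as an outlier when it falls below Q1 - k * IQR or above Q3 + k * IQR, where IQR = Q3 - Q1 and k is the multiplier you can change in the node settings. The Python sketch below illustrates the rule on made-up payment values. The quartile convention used here (`method="inclusive"` from the standard library) is an assumption and may differ from the one the Numeric Outliers node applies.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) bounds using the IQR rule with multiplier k."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values that fall outside the IQR bounds."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


# Made-up payment amounts; one value is far larger than the rest.
payments = [100, 102, 98, 101, 99, 103, 97, 500]

print(find_outliers(payments, k=1.5))  # default multiplier
print(find_outliers(payments, k=200))  # a very large k flags nothing
```

A larger k widens the accepted range, so fewer values are flagged; a smaller k narrows it and flags more.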

Step 2: Inspect the detected outliers

Search for a Tile View node and drag it onto the canvas to visualize the first output of the Numeric Outliers node.

Are those the same contracts you visually detected in Part A?

What happens if you change k in the Numeric Outliers settings?

Step 3: Save outlier detection model for Part E

Use a Model Writer node to save the last output of the Numeric Outliers node (blue port) in the 'models' folder. This saves the model so that the same technique can be applied to a new contract later.
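Saving the model means the bounds learned on the existing contracts can be reused on a new contract without recomputing them. The Python sketch below shows that save-and-reapply pattern, assuming the model boils down to a lower and an upper bound for one column. The file name, the JSON layout, and the bound values are hypothetical, and a temporary directory stands in for the workflow's 'models' folder.

```python
import json
import tempfile
from pathlib import Path


def save_bounds(path, column, lower, upper):
    """Write the learned outlier bounds for one column to a JSON file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"column": column, "lower": lower, "upper": upper}))


def load_bounds(path):
    """Read previously saved outlier bounds."""
    return json.loads(path.read_text())


def is_outlier(value, bounds):
    """Apply the saved bounds to a new value."""
    return value < bounds["lower"] or value > bounds["upper"]


# A temporary directory stands in for the 'models' folder (assumption).
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "models" / "payments_outlier_bounds.json"
    save_bounds(model_path, "payments", 93.5, 107.5)  # illustrative bounds
    bounds = load_bounds(model_path)

print(is_outlier(500, bounds))  # payment of a new contract
print(is_outlier(100, bounds))
```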

A) Analytics Workflow: Visual Detection for Fraud

Step 1: Parse the contract data

Drag and drop a PDF Parser node and load the provided contracts PDF. Execute the node to extract the raw text from the document.

Step 2: Extract contract information

Use the provided data extraction component (or text processing nodes) to split the parsed text into structured columns.

Step 3: Load the database

Load the database using the given nodes.
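For readers curious about what the database nodes do behind the scenes, the Python sketch below connects to a SQLite database, selects a table, and reads its rows. It uses an in-memory database so it runs anywhere. The table name `contracts`, its columns, and the rows are hypothetical and are not the schema of the lab database.

```python
import sqlite3

# In-memory database as a stand-in for the lab's SQLite file (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, created here only to make the sketch runnable.
conn.execute("CREATE TABLE contracts (contract_id TEXT, type TEXT, payments REAL)")
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [("C-001", "bond", 125000.0), ("C-002", "stock", 98000.0)],
)
conn.commit()

# Select the table and read its rows into Python.
rows = conn.execute(
    "SELECT contract_id, type, payments FROM contracts ORDER BY contract_id"
).fetchall()
conn.close()

for row in rows:
    print(row)
```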

Step 4: Create a dashboard data app with multiple charts

To display all your charts in the same view, you need to create a Component. Select all the view nodes (Ctrl + Left Click), then Right Click > Create Component.

Next, go inside the component (Right Click > Component > Open component) and customize the layout (Open layout editor in top toolbar).

Step 5: Visualize potential fraud patterns

Use the data app to visualize the issues: open the component's composite view. Can you visually detect fraud by investigating odd values in the payment column?

D) Enriching Analytics with GenAI: Providing Context for Email Alerts

Step 1: Segment the knowledge base PDF into sections

Search for a Sentence Extractor node, drag it onto the canvas, and apply it to the output of the PDF Parser that reads the document describing the different types of investments. This segments the document into rows, each containing one sentence from the PDF.
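Sentence segmentation turns one long text into one row per sentence. The Python sketch below is a deliberately naive version that splits after `.`, `!`, or `?` followed by whitespace. It is not how the Sentence Extractor node works internally, and the example text is made up.

```python
import re


def split_sentences(text):
    """Split text after sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


# Made-up text standing in for the parsed knowledge base document.
document = (
    "A bond is a loan from an investor to an issuer. "
    "A stock is a share of ownership in a company! "
    "Is a fund a pooled investment? Yes."
)

sentences = split_sentences(document)
for number, sentence in enumerate(sentences, start=1):
    print(number, sentence)
```

A splitter this simple breaks on abbreviations such as "e.g." or "Inc.", which is one reason to rely on a dedicated node or library for real documents.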

Step 2: Create a vector store from the segmented knowledge base

Search for a FAISS Vector Store Creator node and drag it onto the canvas. It converts the text sections into vectors and stores them in a FAISS vector store. Connect the embeddings model from the OpenAI Embeddings Connector to its model input. Add a Model Writer node to save the vector store in the 'models' folder for Part E.
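Conceptually, a vector store keeps each text section next to a numeric vector that represents it. The Python sketch below illustrates that pairing with a toy embedding: a hashed bag-of-words vector of fixed length. This toy function is only a stand-in for a real embeddings model, the plain list is a stand-in for FAISS, and the sentences and the dimension of 64 are arbitrary assumptions.

```python
import json
import math
import re
import zlib

DIM = 64  # arbitrary vector length for the toy embedding


def embed(text):
    """Toy embedding: hashed bag-of-words, normalized to unit length."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def build_store(sentences):
    """Pair every sentence with its vector."""
    return [{"text": s, "vector": embed(s)} for s in sentences]


# Made-up knowledge base sentences.
sentences = [
    "A bond is a loan from an investor to an issuer.",
    "A stock is a share of ownership in a company.",
    "A fund pools money from many investors.",
]

store = build_store(sentences)
serialized = json.dumps(store)  # what a writer step could persist to disk
print(len(store), "entries,", len(store[0]["vector"]), "dimensions each")
```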

Step 3: Retrieve text describing the contract type of detected fraud

Search for a Vector Store Retriever node and drag it onto the canvas. It searches the vector store for the sentences that best describe the contract types of the frauds detected in Part B. Connect the vector store to the first input and the Tile View output from Part B to the second input, then select 'type' in the settings.
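Retrieval ranks the stored sentences by how similar their vectors are to the vector of the query, here the contract type. The Python sketch below shows the idea with word-count vectors and cosine similarity. The word-count embedding is a toy stand-in for a real embeddings model, and the knowledge base sentences are made up.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (stand-in for a real embeddings model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, top_k=1):
    """Return the top_k stored sentences most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Made-up knowledge base sentences, stored with their vectors.
knowledge_base = [
    "A bond is a loan from an investor to an issuer.",
    "A stock is a share of ownership in a company.",
    "A fund pools money from many investors.",
]
store = [(sentence, embed(sentence)) for sentence in knowledge_base]

# The contract type of a flagged contract acts as the query.
print(retrieve("bond", store))
```

Real embedding models capture meaning beyond exact word overlap, so they can match a query to a sentence that uses different words for the same concept.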

Step 4: Engineer a prompt for email alerts with contract type context

As in Part C, use a String Manipulation node to engineer the prompt that generates the alert emails.
This time, add the context retrieved from the vector store to explain the contract type.
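Compared with the prompt from Part C, the only change is an extra block that carries the retrieved sentences. Below is a minimal Python sketch of such a prompt. The column names, the example row, the context sentence, and the prompt wording are assumptions made for illustration.

```python
def build_rag_prompt(row, context_sentences):
    """Combine retrieved context and row values into one prompt."""
    context = "\n".join(f"- {sentence}" for sentence in context_sentences)
    return (
        "Use only the context below to explain the contract type.\n"
        f"Context:\n{context}\n\n"
        f"Write a short alert email about contract {row['contract_id']} "
        f"(type: {row['type']}), whose payment of {row['payments']:.2f} "
        "was flagged as an outlier."
    )


# Hypothetical flagged contract and the sentence retrieved for its type.
row = {"contract_id": "C-001", "type": "bond", "payments": 125000.0}
retrieved = ["A bond is a loan from an investor to an issuer."]

prompt = build_rag_prompt(row, retrieved)
print(prompt)
```

Placing the context before the instruction and telling the model to rely on it keeps the explanation of the contract type grounded in the knowledge base.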

Step 5: Perform retrieval augmented generation (RAG)

Drag and drop an LLM Prompter, and feed it with the output of the String Manipulation.

Finally, visualize the generated emails via the provided component.

Data Wrangling: Access, Blending, Manipulation

Simple GenAI Adoption for generating an alert email

Advanced GenAI Adoption for generating an alert email

Fraud Detection via Analytics

PDF Parser
prompt engineering
String Manipulation
Numeric Outliers
LLM Prompter
Enter the Hugging Face Hub access token
Credentials Configuration
Sentence Extractor
FAISS Vector Store Creator
PDF Parser
Save outlier detection model
Model Writer
Visual detection of frauds
Data App
String Format Manager
View LLM output
Table View
LLM Prompter
Authenticate with Hugging Face API Key
HF Hub Authenticator
Table View
Save vector store
Model Writer
Value Lookup
List of outliers
Tile View (JavaScript) (legacy)
Visualize & download email alerts
Email Preview
HF Hub LLM Selector
HF Hub Embedding Model Selector
Extract Contract Data
DB Table Selector (deprecated)
Prompt engineering
String Manipulation
SQLite Connector
DB Reader
Vector Store Retriever
