Icon

Solution_​Workflow

Unify each file format to common columns

Ensures:

  • The same column names

  • Consistent datetime processing (the string representations vary)

List of payment files: 01_file_register.xlsx

Prepare file path

Use:

  • Variable expression: Construct the string by concatenating the path to your folder and the filename. Separate paths by slashes.
    e.g. "folder_name/" + $["FilePath"]

  • String to Path (Variable): use the path to the file (which you just created)

    Use the settings in the FileSystem section: "Relative to" and "Current mountpoint", which means that it will start from your KNIME workspace

Add vendor info to transactions

  • Join on VendorID and Entity columns

  • Inner Join and Left Join - i.e. Keep all transactions even when a join is not made.

  • Include:

    • All columns from 'left table'

    • From right side only keep VendorName_Master, VendorRiskClass, PaymentTermsText, ActiveFlag

  • NOTE: Ensure that the vendor master only has one row per VendorID and Entity, as the joiner will take all joins, not just the first match

CSV Reader

(Use file path flow variable)

Excel Reader

(Use file path flow variable)

Excel Reader + Sheet

(Use file-path and sheets flow variables)

Case switch selection

Use Variable Expression node to create a flow variable, which will choose the index 0, 1 or 2 based on the file format (TemplateType).
Hint:
if($$["TemplateType"] = "AP_CSV", 0,

$$["TemplateType"] = "AP_XLSX", 1,

$$["TemplateType"] = "AP_MULTISHEET", 2,

0)

Unify column naming

Convert different string formats into KNIME formats. Convert amount from string to number. Must specify decimal and thousands separator

  • String to Date&Time

  • String to Number

Read 05_vendor_master.xlsx

  • Remove duplicate rows with "Duplicate Row Filter" based on:

    • Entity

    • VendorID

  • If duplicates remain, we may cause incorrectly duplicated records in the next step in the joiner

Read 06_control_thresholds.xlsx

  • The control thresholds table tracks the full set of parameters for each scenario to test

  • By filtering this table, by the ThresholdProfile, we can select the parameters to apply

Configurable Settings: 07_reporting_parameters.xlsx

Change as you see fit

These selections will affect the output, for example, selecting which control thresholds based on a selected profile (to the right)

Add threshold information

  • Join on Entity column

  • Inner Join and Left Join - i.e. Keep all transactions even when a join is not made.

  • Include:

    • All columns from 'left table'

    • From right side only keep HighValueLimit, WeekendPostingFlagAllowed, MaxApprovalLagDays, AgingDaysAmber, AgingDaysRed

  • These threshold values will now be columns on every transaction row, available for comparison in the exception rules.

Read 06b_allowed_currencies.xlsx

Tracks what currencies are permitted per entity.

Add info about permitted currencies

  • Join on Entity and Currency columns

  • Inner Join and Left Join - i.e. Keep all transactions even when a join is not made.

  • Note that the "Entity (Right)" column will be coming from the 'allowed currencies' table, so we'll need to check the presence of this to test whether or not the currency is allowed. i.e. a missing value means it's an invalid currency. Use the Expressions node to create the CurrencyAllowedFlag variable. It should be "N' if it's a missing value or an empty string

  • HINT:

    if(

    ($["Entity (Right)"] = MISSING) or ($["Entity (Right)"] = ""),"
    Y","N"
    )

Convert duration terms to special duration format, check date ranges

  • Parse the PaymentTermsText (is actually a duration) to a KNIME duration format with the String to Duration node

  • Extract fields from the PostingDate column using the Date&Time Part Extractor nodes, as follows:

    • Year - Rename as PostingYear

    • Month (number) - Rename as PostingMonth

    • Day of week (number) - Rename as PostingDayOfWeek

Duplicate check

Use the Duplicate Row Filter node.

Ensure to check the boxes :

  • "Keep Duplicate Rows" - Allows to add a column with info about the duplicate, being unique or not

  • "Add column showing the row status" to ensure that the type of duplicate can be added in a new column

  • Output name DuplicateStatus

Select only these for the uniqueness criteria:

  • Entity

  • InvoiceNo

  • VendorID, Amount

  • InvoiceDate

Other checks/flags

Use an Expression node to create these flags:

PostingMonthLabel — a YYYY-MM text label for grouping and display:

format_date($["PostingDate"], "yyyy-MM")

NegativeAmountFlag:

if($["Amount"] < 0, "Y", "N")

WeekendPostingFlag — use the PostingDayOfWeek column from Step 25. Replace 6 with the actual minimum value for Saturday in your locale if different:

if($["PostingDayOfWeek"] >= 6, "Y", "N")

MissingVendorFlag — catches both a truly missing VendorID and an empty string:

if(($["VendorID"] ?? "") = "", "Y", "N")

VendorMatchedFlag — a null VendorRiskClass means the vendor join in Step 18 found no matching row. "UNMATCHED" is used as the sentinel because it is not a valid risk class value:

if(($["VendorRiskClass"] ?? "UNMATCHED") = "UNMATCHED", "N", "Y")

AbsoluteAmount — used in the high-value check so that large credit notes are also caught:

abs($["Amount"])

Aging calculations

  • Use the Date&Time Difference node to calculate new column "AgingDays" as the difference between DueDate and PostingDate

  • Use another Date&Time Difference node to calculate new column "PostingDelay" as the difference between InvoiceDate and PostingDate

  • For both nodes, select to output 'Number' instead of 'Duration', with granularity of days.

Classify exceptions/issue

Use a combination of rules on top of specific flags to classify what kind of exception/issue the record has

Create an Expression node, create a new column called ExceptionType, and use the following expression:

if($["MissingVendorFlag"] = "Y", "MISSING_VENDOR",
if($["VendorMatchedFlag"] = "N" and $["MissingVendorFlag"] = "N", "UNMATCHED_MASTER",
if($["DuplicateStatus"] = "duplicate", "DUPLICATE_SUSPECTED",
if($["AbsoluteAmount"] > $["HighValueLimit"], "HIGH_VALUE",
if($["AgingDays"] > $["AgingDaysRed"], "OVERDUE",
if($["WeekendPostingFlag"] = "Y" and $["WeekendPostingFlagAllowed"] = "N", "WEEKEND_POSTING",
if($["CurrencyAllowedFlag"] = "N", "CURRENCY_NOT_ALLOWED",
if($["NegativeAmountFlag"] = "Y", "NEGATIVE_AMOUNT_REVIEW",
"OK"))))))))

Separate 'ok' data from flagged data (i.e. detected Exceptions)

  • Top output port: rows where ExceptionType != "OK" — these are your exceptions

  • Bottom output port: rows where ExceptionType = "OK" — these are your clean transactions

Create summaries of data and insights

  • Build exception summaries: Use GroupBy node collapses many rows into summary rows by grouping on one or more columns and applying aggregation functions to the others. Create at least two summary tables:

    • Group by Entity + ExceptionType . Aggregations: count of rows, sum of Amount, max of AgingDays

    • Group by Entity only. Aggregations: count of rows, sum of Amount

  • Build the exception matrix: Use Pivot node to create matrix where each cell contains an aggregated value.

    • Groups (rows): Entity

    • Pivot (columns): ExceptionType

    • Aggregation: count of rows

The output has one row per entity and one column per exception type suitable for inclusion in a report.

Aggregations by Entity and ExceptionType

Create a Group Loop Start node, and iterate through one Entity at a time

Save each output as a new tab in an excel file

HINT: Use the Entity flow variable as the tab name (which is generated by the loop as it's the grouping condition), and you may manually specify the file path

Configurable Parameters
Excel Reader
Table Row to Variable
Read Excel Formatwith sheet namespecified
Excel Reader
01 - File Register
Excel Reader
Keep only rowswhere IsActive is true
Row Filter
Go through each filenameone at a time
Table Row to Variable Loop Start
Create port selectionbased on file format.
Variable Expression
Process filebased on format.Each format willbe normalised bythe respective case
CASE Switch Start
Change format-specificcolumn names to universalcolumn names
Column Renamer
Determine if validcurrency
Expression
Yet year, month and day
Date&Time Part Extractor
Merge transactionswith permitted currency
Joiner
Parse payment termsto duration
String to Duration
Date, dash format
String to Date&Time
Detect duplicates
Duplicate Row Filter
Change format-specificcolumn names to universalcolumn names
Column Renamer
Create flags
Expression
Date AND time format
String to Date&Time
Convert amount fromstring to number. Mustspecify decimal andthousands separator
String to Number
Date, slash fromat
String to Date&Time
AgingDays
Date&Time Difference
PostingDelay
Date&Time Difference
Clean text fields
Expression
GroupBy
Change format-specificcolumn names to universalcolumn names
Column Renamer
Pivot
Outputs flags, or 'OK' as ExceptionType
Expression
'Remove' the time fromdate&time types, so onlythe date remains
Time Modifier
Inspect 'ok' data on top. portInspect flagged on bottom port
Row Splitter
Read CSV Format
CSV Reader
Duplicate Row Filter
Variable Loop End
Convert pathas string to correctpath type
String to Path (Variable)
Create file path(string)
Variable Expression
06_control_thresholds.xlsx
Excel Reader
Vendor master data
Excel Reader
Group Loop Start
Merge transactionswith vendor info
Joiner
Excel Writer
Concatenates rowsfrom all files into one
Loop End
06b_allowed_currencies.xlsx
Excel Reader
Read Excel Format
Excel Reader
Row Filter
Takes only theactive branch(Which is set upin the case switchstart node)
CASE Switch End
Merge transactionswith thresholds
Joiner

Nodes

Extensions

Links