Workflow

1. Data Ingestion, Cleaning and Formatting

2. Rule Based Checks

  • Compare each employee against the average

  • Flag values outside permitted ranges (e.g. too many hours worked)

  • Check for consecutive overtime weeks

  • To-do: Check that the overtime matches the employee role
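The rule-based checks listed above can be sketched outside KNIME as follows. This is a minimal illustration, not the workflow's actual Expression node logic: the function names, the sample data, and the choice of one standard deviation as the "compare against average" threshold are assumptions. The 55-hour limit is the arbitrary value named in the workflow annotation.

```python
from statistics import mean, stdev

MAX_WEEKLY_HOURS = 55  # arbitrary limit taken from the workflow annotation


def flag_employee(weekly_hours, overall_avg, overall_std, max_hours=MAX_WEEKLY_HOURS):
    """Return the three rule-based flags for one employee (hypothetical helper).

    weekly_hours: hours per week, oldest first.
    overall_avg, overall_std: mean and sample std dev of per-employee averages.
    """
    employee_avg = mean(weekly_hours)
    return {
        # Assumed threshold: more than one std dev away from the overall average.
        "outside_std": abs(employee_avg - overall_avg) > overall_std,
        # Any single week above the permitted range.
        "over_limit": any(h > max_hours for h in weekly_hours),
        # Hours increased over each of the last two time steps.
        "rising": len(weekly_hours) >= 3
        and weekly_hours[-3] < weekly_hours[-2] < weekly_hours[-1],
    }


def possible_outliers(records):
    """records: dict mapping employee ID to a list of weekly hours."""
    averages = [mean(hours) for hours in records.values()]
    overall_avg, overall_std = mean(averages), stdev(averages)
    flags = {
        emp: flag_employee(hours, overall_avg, overall_std)
        for emp, hours in records.items()
    }
    # Keep only employees with all three flags set.
    return [emp for emp, f in flags.items() if all(f.values())]


# Made-up sample data for illustration only.
records = {
    "E1": [40, 42, 41],
    "E2": [41, 40, 42],
    "E3": [39, 41, 40],
    "E4": [50, 58, 66],
}
flagged = possible_outliers(records)
```

With this sample data only `E4` carries all three flags. The role-based check from the to-do item is deliberately left out, since the source does not describe it.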

3. Use Machine Learning to Detect Outliers

For a baseline to test the model, labelled data is preferable to data that only carries rule-generated 'flags'.
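To make the baseline idea concrete, here is a minimal sketch of assembling a test set and scoring predictions against it. It assumes the labels are the rule-generated flags (1 = possible outlier, 0 = ok), which is the weaker baseline cautioned against above. The 30% sampling of 'ok' rows mirrors the partitioning annotation in the workflow; the function names and the fixed seed are hypothetical, and the model predictions are supplied by hand rather than by an Isolation Forest.

```python
import random


def build_test_set(ok_rows, outlier_rows, ok_fraction=0.3, seed=0):
    """Combine a random share of 'ok' rows with all 'outlier' rows.

    Returns a list of (row, label) pairs, label 1 for outliers and 0 for ok.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    n_ok = round(len(ok_rows) * ok_fraction)
    sampled_ok = rng.sample(ok_rows, n_ok)
    return [(row, 0) for row in sampled_ok] + [(row, 1) for row in outlier_rows]


def confusion_counts(labels, predictions):
    """Count true/false positives and negatives for binary labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, predicted in zip(labels, predictions):
        if label == 1 and predicted == 1:
            counts["tp"] += 1
        elif label == 0 and predicted == 1:
            counts["fp"] += 1
        elif label == 0 and predicted == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Made-up rows: ten 'ok' records and two flagged records.
test_set = build_test_set(list(range(10)), [100, 101])
labels = [label for _, label in test_set]
```

If genuinely labelled data becomes available, only the `labels` list changes; the scoring step stays the same.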

Secondary region dataset

Next step: Continue unifying with the other dataset's format and perform the same operations

Gather all results for the employee ID

If your dataset is combined at this step, then use the row splitter instead

Results

Detecting Outliers In Overtime Hours

USA Overtime Data (Select sheet based on flow variable)
Excel Reader
South Africa Overtime Data
Excel Reader
Get the average OT rate as variables so we can compare
Table Row to Variable
Apply Isolation Forest to get Mean Length
H2O Isolation Forest Predictor
USA Overtime Data (Sheet Names)
Read Excel Sheet Names
Of the 'ok' data, we want some percent to join with the 'outlier' data to run as a test.
Table Partitioner
Create H2O Frame
Table to H2O
Check if a shift is outside the permitted range (Arbitrary 55hr)
Expression
Allows comparison to prior time periods
Lag Column
Aggregate by employee
GroupBy
Aggregate by employee
GroupBy
Check if the OT has increased over last two timesteps
Expression
Check if OT is outside std deviation
Expression
Select only records with all 3 flags
Missing Value
Joiner
30% of 'ok' data, all of 'anomaly' data
Concatenate
Joiner
Scorer
Classified as possible outlier
Row Filter
Create H2O Frame
Table to H2O
Create PossibleOutlier column = 1
Expression
Inner join: Employees with possible issues
Right unmatched: All other records
Joiner
Create KNIME Table
H2O to Table
Expression
Create PossibleOutlier column = 0
Expression
H2O Isolation Forest Learner
H2O Local Context
Iterate through each tab/sheet of Excel
Table Row to Variable Loop Start
Specify quarter start date
Variable Creator
Gather data from all tabs
Loop End
Date Shifter
- Extract the week number from the tab/sheet name.
- Add the quarter start date
Expression
String to Date&Time
Aggregate by employee ID
GroupBy
Aggregate over full dataset
GroupBy
Turn week-end into week-start
Date Shifter
Change to 'Date'
Column Renamer
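
The date-handling nodes above (extract the week number from the sheet name, add the quarter start date, turn week-end into week-start) can be sketched in a few lines. This is only an illustration under assumptions the source does not confirm: that sheet names contain the week number as digits (for example the hypothetical "Week 3"), and that week N ends 7 × N − 1 days after the quarter start date. The function name is hypothetical.

```python
import re
from datetime import date, timedelta


def week_start_from_sheet(sheet_name, quarter_start):
    """Derive a week-start date from a sheet name and the quarter start date."""
    match = re.search(r"(\d+)", sheet_name)
    if match is None:
        raise ValueError(f"No week number found in sheet name: {sheet_name!r}")
    week_number = int(match.group(1))
    # Assumed convention: week N ends 7*N - 1 days after the quarter start.
    week_end = quarter_start + timedelta(days=7 * week_number - 1)
    # Turn week-end into week-start by shifting back six days.
    return week_end - timedelta(days=6)


example = week_start_from_sheet("Week 3", date(2024, 1, 1))
```

If the sheets use a different naming scheme or the week-end date is stored directly, only the first half of the function needs to change.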
