JKISeason2-24_sryu

Challenge 24: Fraudulent Email Address Detection
Level: Medium

Description: In this challenge you will take the role of a cybersecurity analyst and see whether you can identify emails that try to pass as legitimate when they are in fact malicious. You notice that bad-actor emails try to trick the receiver by mimicking major email domains. For instance, @gnail, @gmial, etc. are trying to pass as @gmail. You then decide to count the emails per domain: the domains with the lowest counts have a higher probability of being fraudulent. You must also check whether those low-count domains are trying to pose as one of the major email domains. Your answer should not mark @unique.com as fraudulent.

Note: Try not to hard-code any values in your workflow; derive thresholds from the data instead, for instance with the mean or median.

Hint: Checking for string similarity might help.
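The counting step described above can be sketched outside KNIME in a few lines of Python. This is only an illustration under stated assumptions: the sample addresses and the function names `domain_counts` and `split_by_median` are hypothetical, and the choice of the median as the cut-off follows the note above rather than any particular solution workflow.

```python
from collections import Counter
from statistics import median


def domain_counts(emails):
    """Count how many addresses use each domain (the text after the last '@')."""
    return Counter(email.rsplit("@", 1)[1].lower() for email in emails)


def split_by_median(counts):
    """Split domains into rare and common using the median count as the cut-off.

    The cut-off is derived from the data, so no threshold is hard-coded.
    """
    cutoff = median(counts.values())
    rare = {domain for domain, n in counts.items() if n < cutoff}
    common = set(counts) - rare
    return rare, common


# Hypothetical sample data, not taken from the challenge dataset.
sample = [
    "a@gmail.com", "b@gmail.com", "c@gmail.com",
    "d@yahoo.com", "e@yahoo.com", "f@yahoo.com",
    "g@gnail.com",
    "h@unique.com",
]

counts = domain_counts(sample)
rare_domains, major_domains = split_by_median(counts)
```

With this sample, both `gnail.com` and `unique.com` end up in `rare_domains`, which is why a second check against the major domains is still needed before anything is flagged. On heavily skewed data the median itself can be 1, in which case the mean may separate the groups better.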

Workflow canvas (three solution variants):
Workflow with Similarity Search node
Workflow with String Similarity node
Workflow without String Similarity / Similarity Search node

Annotations on the canvas:
data
delimiter = @
Count the number of emails per domain
top: count = 1, bottom: count >= 2
Similarity to major domains, Count >= 2
For each email, keep the row with the maximum similarity.
String similarity with n-gram = 2
Create mimicking column(s)
Judgement

Rule Engine rules, using the $Similarity$ column:
$Similarity$ = 1 => "Legitimate"
$Similarity$ >= 0.5 => "Malicious"
TRUE => "Legitimate"

Rule Engine rules, using the $distance$ column:
MISSING $distance$ => "Legitimate"
$distance$ <= 0.5 => "Malicious"
TRUE => "Legitimate"

Rule Engine rules, using the $Email_Arr[1] (#1)$ and $Similarity$ columns:
MISSING $Email_Arr[1] (#1)$ => "Legitimate"
$Similarity$ >= 0.5 => "Malicious"
TRUE => "Legitimate"

Nodes used: CSV Reader, Cell Splitter, GroupBy, Row Splitter, Row Filter, Cross Joiner, Joiner, String Similarity, Similarity Search, Duplicate Row Filter, RowID, Rule Engine, Color Manager, Table View, Metanode
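The similarity-based judgement rules in the annotations can be approximated in Python as follows. This is a sketch under assumptions, not the formula used by the KNIME String Similarity or Similarity Search nodes: it uses a Dice coefficient over sets of character bigrams (n-gram = 2), compares full domain strings, and the function names `bigrams`, `dice_bigram`, and `judge` are hypothetical. The 0.5 threshold and the rule order are taken from the annotations.

```python
def bigrams(text):
    """Return the set of character bigrams of a string."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice_bigram(a, b):
    """Dice coefficient on bigram sets: 1.0 for identical strings, 0.0 for disjoint ones.

    Assumption: this stands in for the n-gram similarity of the workflow;
    the KNIME node may compute its score differently.
    """
    grams_a, grams_b = bigrams(a), bigrams(b)
    if not grams_a or not grams_b:
        # Strings shorter than two characters have no bigrams.
        return 1.0 if a == b else 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def judge(domain, major_domains, threshold=0.5):
    """Apply the rule order from the annotations to one low-count domain."""
    best = max(dice_bigram(domain, major) for major in major_domains)
    if best == 1.0:
        return "Legitimate"  # identical to a major domain
    if best >= threshold:
        return "Malicious"   # close to, but not equal to, a major domain
    return "Legitimate"      # unrelated domain such as unique.com


# Hypothetical major domains for illustration.
majors = ["gmail.com", "yahoo.com"]
verdicts = {d: judge(d, majors) for d in ["gnail.com", "gmial.com", "unique.com"]}
```

Under this measure `gnail.com` scores 0.75 against `gmail.com` and is judged "Malicious", while `unique.com` shares only the `.com` bigrams with the major domains and stays "Legitimate". Because the shared `.com` suffix inflates every score, comparing only the part before the dot would be a reasonable alternative design.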
