
20230907 Pikairos JustKNIMEIt Season 2 Challenge 24 Fraudulent Email Address Detection

In this challenge you will take the role of a cybersecurity analyst and see if you can identify emails that are trying to pass as legitimate when they are in fact malicious. You notice that bad-actor emails try to trick the receiver by mimicking major email domains. For instance, you notice that @gnail, @gmial, etc. are trying to pass as @gmail. You then decide to get a count of all the domains: those that have the lowest count have a higher probability of being fraudulent. You must also check whether those low-count email domains are trying to pose as the major email domains or not. Your answer should not mark @unique.com as fraudulent. Note: Try not to hard-code any variables in your workflow, but instead use the mean or median, for instance. Hint: Checking for string similarity might help.
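The counting idea above can be illustrated outside KNIME. The following is a minimal Python sketch, assuming the addresses are available as a plain list; the sample addresses and the helper name `low_count_domains` are hypothetical and not taken from the challenge data. It uses the median domain count as the cutoff, in the spirit of the note about avoiding hard-coded values.

```python
from collections import Counter
from statistics import median


def low_count_domains(emails):
    """Return (counts, suspects): domain frequencies and the set of
    domains whose count falls below the median domain count."""
    # Treat everything after the last "@" as the domain.
    domains = [e.rsplit("@", 1)[1].lower() for e in emails if "@" in e]
    counts = Counter(domains)
    # Data-driven cutoff instead of a hard-coded number.
    cutoff = median(counts.values())
    suspects = {d for d, c in counts.items() if c < cutoff}
    return counts, suspects


# Hypothetical sample data, for illustration only.
sample = [
    "a@gmail.com", "b@gmail.com", "c@gmail.com", "d@gmail.com",
    "e@yahoo.com", "f@yahoo.com", "g@yahoo.com",
    "h@gnail.com", "i@unique.com",
]
counts, suspects = low_count_domains(sample)
print(sorted(suspects))
```

On this sample both `gnail.com` and `unique.com` come out as low-count candidates, which is why a low count alone is not enough: the similarity check is what separates a look-alike domain from a legitimately rare one.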

Workflow steps (node: annotation):

File Reader: Read Domains Data
Cell Splitter: Split Email with Separator @
String Similarity: Perform Levenshtein Similarity Between Email Domain and Every Other Email Domain
Cross Joiner: Cross Join the Table with Itself
GroupBy: Group by Email and Take First Entry for Each
Sorter: Sort by Descending Similarity
GroupBy: Group By Email Domain and Count Frequency of Occurrence
Joiner: Join the Count Column
Row Filter: Exclude Rows Where Similarity is 1
Rule Engine: Rule: Similarity > 0.7 AND Domain Count < Domain Count of Comparison Email => FRAUDULENT; TRUE => NOT FRAUDULENT
Column Filter: Remove Unwanted Columns
Table View (Labs): View Result
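The workflow's logic can be approximated in plain Python. This is a rough sketch under stated assumptions, not a reproduction of the KNIME nodes: similarity is assumed to be `1 - distance / longer_length`, which may differ from how the String Similarity node normalizes its score; the 0.7 threshold comes from the workflow's rule; the function names and sample addresses are hypothetical.

```python
from collections import Counter


def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def flag_fraudulent(emails, threshold=0.7):
    """Map each domain to True (fraudulent) or False (not fraudulent)."""
    domains = [e.rsplit("@", 1)[1].lower() for e in emails]
    counts = Counter(domains)
    flagged = {}
    for d in counts:
        # Compare with every *other* domain, so self-matches
        # (similarity 1) never enter the comparison.
        others = [(similarity(d, o), o) for o in counts if o != d]
        if not others:
            flagged[d] = False
            continue
        # Keep the most similar domain, like sorting by descending
        # similarity and taking the first entry.
        best_sim, best = max(others)
        # Similar to a more frequent domain => fraudulent.
        flagged[d] = best_sim > threshold and counts[d] < counts[best]
    return flagged


# Hypothetical sample data, for illustration only.
sample = (
    ["u%d@gmail.com" % i for i in range(4)]
    + ["u%d@yahoo.com" % i for i in range(3)]
    + ["x@gnail.com", "y@gmial.com", "z@unique.com"]
)
result = flag_fraudulent(sample)
print(result)
```

Comparing each domain's count with the count of its closest match, rather than with a fixed number, is what keeps `unique.com` from being flagged in this sketch: it is rare, but it does not resemble any more frequent domain.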
