JKISeasor2-24_tomljh_ver1

Challenge 24: Fraudulent Email Address Detection

Level: Medium

Description: In this challenge you will take the role of a cybersecurity analyst and see if you can identify emails that are trying to pass as legitimate when they are in fact malicious. You notice that bad-actor emails try to trick the receiver by mimicking major email domains. For instance, you notice that @gnail, @gmial, etc. are trying to pass as @gmail. You then decide to get a count of all the domains: those with the lowest counts have a higher probability of being fraudulent. You must also check whether those low-count email domains are trying to pose as the major email domains or not. Your answer should not mark @unique.com as fraudulent.

Note: Try not to hard-code any variables in your workflow; use the mean or median instead, for instance.

Hint: Checking for string similarity might help.

Author: Victor Palacios

Step 1: Obtain the low-frequency and high-frequency domains.

Step 2: Obtain the final dictionary table by deleting rows with large distances between the low-frequency and high-frequency domains.

Note: This dictionary table is a list of fraudulent email addresses.

Workflow annotations:
Read domains.csv
Count by Domain
Calculate Mean and Median
Using Levenshtein distance
Get Domain String
Generate thresholds and place them in flow variables
top: Low-frequency domain; bottom: High-frequency domain
Delete rows with too much distance

Nodes used: CSV Reader, GroupBy, GroupBy, Similarity Search, Cell Splitter, Table Row To Variable, Row Splitter, Double To Integer, Numeric Outliers, Value Lookup, Table Manipulator, Table View, Color Manager
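
For readers who want to follow the two steps outside KNIME, here is a minimal Python sketch of the same idea. It is not the workflow itself: the sample domain list and the function names are hypothetical, the mean count is assumed as the low/high frequency threshold, and a simple median cutoff on the Levenshtein distance stands in for the workflow's Numeric Outliers node.

```python
from collections import Counter
from statistics import mean, median


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def find_fraudulent_domains(domains):
    """Map each suspicious low-frequency domain to the major domain it mimics."""
    counts = Counter(domains)

    # Step 1: split domains by frequency. The threshold is derived from the
    # data (mean count) rather than hard-coded; this choice is an assumption.
    threshold = mean(counts.values())
    low = [d for d, c in counts.items() if c < threshold]
    high = [d for d, c in counts.items() if c >= threshold]
    if not low or not high:
        return {}

    # For every low-frequency domain, find the closest high-frequency domain.
    nearest = {}
    for d in low:
        best = min(high, key=lambda h: levenshtein(d, h))
        nearest[d] = (best, levenshtein(d, best))

    # Step 2: drop rows whose distance is too large. The median of the
    # nearest distances is an assumed cutoff, not the workflow's exact rule.
    cutoff = median(dist for _, dist in nearest.values())
    return {d: target for d, (target, dist) in nearest.items() if dist <= cutoff}


# Hypothetical sample data; the real challenge reads domains.csv.
sample_domains = (
    ["gmail.com"] * 50 + ["yahoo.com"] * 40 + ["hotmail.com"] * 30
    + ["gnail.com", "gmial.com", "yaho.com", "hotmial.com", "unique.com"]
)
flagged = find_fraudulent_domains(sample_domains)
for domain, target in sorted(flagged.items()):
    print(f"{domain} looks like {target}")
```

On this sample, unique.com is rare but sits far from every major domain, so it falls above the distance cutoff and is left unflagged. A median cutoff always keeps at least half of the low-frequency domains, so an outlier-based rule may suit real data better.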
