17 September 2022
***************
Thank you for your interest in this component. I recommend switching to the String Emoji and Character Class Filter, which lets you filter emoji and other characters using a list of "class/category" names:
https://hub.knime.com/takbb/spaces/Public/latest/Components/String%20Emoji%20Filter
***************
** EXPERIMENTAL ** You are welcome to use it (at your own risk). Please check back for improvements to the filters.
Filters out emoji characters from a string using a built-in regular expression. This is a proof-of-concept demonstration component. A future version may include the ability to update the regular expression used.
23 April 2021 @takbb Brian Bates
This uses a Java Snippet with a Java regex replaceAll call and the regular expressions below to identify emoji. It is currently experimental, with different "filter types" being used.
Please contact @takbb on the forum if you have suggestions for improving the regex or the techniques used.
FILTER TYPE 1
**************
Filters using the following regular expression; it appears to provide only limited emoji filtering:
[\\p{C}\\p{So}\uFE00-\uFE0F\\x{E0100}-\\x{E01EF}]
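As a rough illustration of how a Java Snippet might apply this expression, the sketch below compiles the Filter Type 1 pattern once and removes every match. The class name EmojiFilterType1 and the method name strip are hypothetical and are not taken from the component. Note that \p{C} also covers control characters, so tabs and line breaks are removed along with the emoji.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not the component's actual code.
class EmojiFilterType1 {
    // Filter Type 1 character class: "other" characters (\p{C}), "other symbols" (\p{So}),
    // and the variation selectors U+FE00-U+FE0F and U+E0100-U+E01EF.
    private static final Pattern EMOJI =
        Pattern.compile("[\\p{C}\\p{So}\uFE00-\uFE0F\\x{E0100}-\\x{E01EF}]");

    // Returns the input with every matching character removed.
    static String strip(String input) {
        return input == null ? null : EMOJI.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Grinning face (a surrogate pair) followed by a heart with a variation selector.
        System.out.println(strip("Great job \uD83D\uDE00\u2764\uFE0F!")); // prints "Great job !"
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the expression for every row, which is what String.replaceAll would do on each call.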
FILTER TYPE 2
**************
Uses the following regular expression, which is currently the most extensive of the filters:
emojiRegex="(?:[\\u2700-\\u27bf]|" +
"(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|" +
"[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?" +
"(?:\\u200d(?:[^\\ud800-\\udfff]|" +
"(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|" +
"[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?)*|" +
"[\\u0023-\\u0039]\\ufe0f?\\u20e3|\\u3299|\\u3297|\\u303d|\\u3030|\\u24c2|[\\ud83c\\udd70-\\ud83c\\udd71]|[\\ud83c\\udd7e-\\ud83c\\udd7f]|\\ud83c\\udd8e|[\\ud83c\\udd91-\\ud83c\\udd9a]|[\\ud83c\\udde6-\\ud83c\\uddff]|[\\ud83c\\ude01-\\ud83c\\ude02]|\\ud83c\\ude1a|\\ud83c\\ude2f|[\\ud83c\\ude32-\\ud83c\\ude3a]|[\\ud83c\\ude50-\\ud83c\\ude51]|\\u203c|\\u2049|[\\u25aa-\\u25ab]|\\u25b6|\\u25c0|[\\u25fb-\\u25fe]|\\u00a9|\\u00ae|\\u2122|\\u2139|\\ud83c\\udc04|[\\u2600-\\u26FF]|\\u2b05|\\u2b06|\\u2b07|\\u2b1b|\\u2b1c|\\u2b50|\\u2b55|\\u231a|\\u231b|\\u2328|\\u23cf|[\\u23e9-\\u23f3]|[\\u23f8-\\u23fa]|\\ud83c\\udccf|\\u2934|\\u2935|[\\u2190-\\u21ff]";
FILTER TYPE 3
**************
This filter does not use regular expressions. Instead, it looks for "surrogate pairs" of characters as a sign that the text is likely an emoji. It filters many common emoji but does not find all of them.
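The component's own surrogate-pair logic is not shown here, so the following is only a minimal sketch of the general idea, under the assumption that any character encoded as a surrogate pair (that is, any code point above U+FFFF) is treated as an emoji. The class name SurrogatePairFilter and the method name strip are hypothetical.

```java
// Hypothetical sketch of surrogate-pair filtering, not the component's actual code.
class SurrogatePairFilter {
    // Keeps only characters that fit in a single UTF-16 unit and drops
    // surrogate pairs as well as stray, unpaired surrogates.
    static String strip(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            int units = Character.charCount(codePoint); // 2 for a surrogate pair, otherwise 1
            if (units == 1 && !Character.isSurrogate((char) codePoint)) {
                out.appendCodePoint(codePoint);
            }
            i += units;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The grinning face (U+1F600) is removed; the sun (U+2600) is not.
        System.out.println(strip("Hi \uD83D\uDE00 there \u2600"));
    }
}
```

This also shows why such a filter misses some emoji: symbols such as U+2600 sit in the Basic Multilingual Plane, are stored as a single character, and therefore pass through unchanged.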