This node implements Presidio's Anonymizer, which anonymizes English text data. It uses pseudonymization, so the Presidio Deanonymizer node can later reinsert the personal information into the anonymized data.
The node anonymizes a specified string column of the input table by replacing all occurrences of the selected PII entity types with abstract placeholders. For entity types that support it, the information can instead be replaced with randomly generated information of the same type. You can choose whether the anonymized data replaces the original data or is appended in a new column.
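The placeholder replacement and its reversal can be illustrated with a minimal sketch in plain Python. It does not use Presidio's API or the node's actual implementation. The function names `pseudonymize` and `deanonymize`, the numbered placeholder format, and the `(start, end, entity_type)` tuples are assumptions made for illustration. The sketch also assumes non-overlapping spans and an exclusive `end` index, as in Python slicing.

```python
def pseudonymize(text, entities):
    """Replace each (start, end, entity_type) span with a numbered placeholder.

    Returns the anonymized text and a mapping from placeholder to original
    value. Assumes non-overlapping spans with an exclusive end index.
    """
    mapping = {}
    counters = {}
    pieces = []
    cursor = 0
    for start, end, entity_type in sorted(entities):
        # Number placeholders per entity type so each one is unique.
        counters[entity_type] = counters.get(entity_type, 0) + 1
        placeholder = f"<{entity_type}_{counters[entity_type]}>"
        mapping[placeholder] = text[start:end]
        pieces.append(text[cursor:start])
        pieces.append(placeholder)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), mapping


def deanonymize(text, mapping):
    """Reinsert the original values by substituting each placeholder."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


text = "Alice called Bob."
entities = [(0, 5, "PERSON"), (13, 16, "PERSON")]  # hypothetical detections
anonymized, mapping = pseudonymize(text, entities)
restored = deanonymize(anonymized, mapping)
```

Because every placeholder is unique, the stored mapping is enough to restore the original text. This is what distinguishes pseudonymization from irreversible anonymization.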
By default, this node detects the PII entities itself before anonymizing them. Because Presidio may mistakenly detect words as PII, you can instead connect a table containing the output columns of the Presidio Analyzer node to the dynamic port. The Presidio Anonymizer then anonymizes only the entities listed in that table.
Warning: Presidio can help identify sensitive/PII data in un/structured text. However, because it is using automated detection mechanisms, there is no guarantee that Presidio will find all sensitive information. Therefore, always evaluate the quality of detections and take appropriate measures if necessary.
Select the string column that contains the data for PII anonymization.
Select the column that contains the types of the entities that will be anonymized.
Select the column that contains the index of the first character of each entity.
Select the column that contains the index of the last character of each entity.
Select the column that contains Presidio's confidence score for each detection.
Select the column that contains the row of the original table in which the PII entity was detected.
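As a rough illustration of how a table with the columns described above (entity type, start index, end index, score, and originating row) could drive the anonymization, the following plain-Python sketch replaces only the listed spans. It is not the node's implementation. The tuple layout, the integer row positions, the function name `anonymize_rows`, and the exclusive end index are assumptions.

```python
from collections import defaultdict

# Hypothetical input column of the original table.
texts = ["Call John at 555-0100.", "No PII here."]

# Hypothetical analyzer rows: (row, entity_type, start, end, score).
# The end index is assumed to be exclusive, as in Python slicing.
detections = [
    (0, "PERSON", 5, 9, 0.85),
    (0, "PHONE_NUMBER", 13, 21, 0.75),
]


def anonymize_rows(texts, detections):
    """Replace only the listed spans in each row with a type placeholder."""
    by_row = defaultdict(list)
    for row, entity_type, start, end, _score in detections:
        by_row[row].append((start, end, entity_type))

    result = []
    for row, text in enumerate(texts):
        # Work from right to left so earlier offsets stay valid
        # after each replacement changes the string length.
        for start, end, entity_type in sorted(by_row[row], reverse=True):
            text = text[:start] + f"<{entity_type}>" + text[end:]
        result.append(text)
    return result


anonymized = anonymize_rows(texts, detections)
```

Rows without any listed detection pass through unchanged. Removing a false positive from the detection table before it reaches the anonymizer therefore leaves that text untouched.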
Select the PII entity types that will be anonymized.
Select whether the anonymizer should use abstract placeholders or randomly generated information to replace PII entities.
Provide the random seed used to generate replacement values.
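The effect of a fixed seed can be sketched as follows. This is a simplified stand-in, not the generator the node actually uses. The `FAKE_VALUES` table, the `make_replacer` helper, and the fallback to an abstract placeholder for types without generated values are assumptions for illustration.

```python
import random

# Hypothetical pools of replacement values per entity type.
FAKE_VALUES = {
    "PERSON": ["Jane Doe", "John Roe", "Max Mustermann"],
    "LOCATION": ["Springfield", "Riverton", "Lakeside"],
}


def make_replacer(seed):
    """Return a function that draws replacement values from a seeded RNG."""
    rng = random.Random(seed)  # independent generator, so runs are reproducible

    def replace(entity_type):
        candidates = FAKE_VALUES.get(entity_type)
        if candidates is None:
            # No generated value available: fall back to a placeholder.
            return f"<{entity_type}>"
        return rng.choice(candidates)

    return replace


first = make_replacer(42)
second = make_replacer(42)
run_a = [first("PERSON"), first("LOCATION"), first("IBAN_CODE")]
run_b = [second("PERSON"), second("LOCATION"), second("IBAN_CODE")]
```

With the same seed and the same sequence of entities, both runs produce identical replacement values, which keeps repeated executions of a workflow reproducible.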
Select whether the anonymized data should replace the original data or be appended to the table in a new column.
Provide the name of the new column containing the anonymized data.
To use this node in KNIME, install the KNIME Python Extension Development (Labs) extension from its update site, following the NodePit Product and Node Installation Guide.