Palladian: Changelog

This log gives an overview of the most prominent changes in each release. Minor fixes, changes “under the hood”, and refactorings are not listed here. While we were rather sloppy with versioning in the past, we have followed the Semantic Versioning scheme and the guidelines from “Keep a Changelog” since 2019.

version-2.3.0 (2020-09-25)

Info
Requires at least KNIME 4.0 (please make sure you’re using an update site URL corresponding to your KNIME version)
Add
Regex Extractor: Add a “Rows or Missing” output mode which appends a row with missing value cells in case of a no-match (see here)
Add
Text Classifier Model Writer: Report progress while writing model
Add
Text Classifier Model Reader: Report progress while reading model
Add
GeoIP2 Extractor, GeoIP2 DB Connector, GeoIP2 WS Connector: New nodes to get information for IP addresses using the MaxMind API or MMDB files
Change
More efficient storage of HttpResult cells
Change
Improved renderer for HttpResult cells showing headers and payload
Change
HTML Parser: Add “Drop input column” setting (see here)
Change
HTML Parser: Allow to input HTML strings
Change
Regex Extractor: Time out presumably endless regexes in dialog after 15 seconds
Change
Regex Extractor: Allow to cancel long running regexes during node execution
Change
String Similarities: Allow to configure name of output column (see here)
Fix
Text Classifier Model Writer: Ensure that model file is always written in GZIP format (see here)
Fix
Text Classifier Model Writer: Ensure that .palladianDictionaryModel extension is appended
Fix
String Similarities: Handle missing value input
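
The “Rows or Missing” output mode of the Regex Extractor can be pictured with a small Python sketch. This is only an illustration of the behaviour described above, not the node's actual implementation; the function name `rows_or_missing` is hypothetical, and `None` stands in for a KNIME missing value cell.

```python
import re
from typing import List, Optional, Tuple


def rows_or_missing(pattern: str, text: str) -> List[Tuple[Optional[str], ...]]:
    """Return one row per match, or a single row of None cells if nothing matches."""
    regex = re.compile(pattern)
    # Each row holds the full match followed by its capture groups.
    rows = [(m.group(0), *m.groups()) for m in regex.finditer(text)]
    if rows:
        return rows
    # No match: emit one placeholder row (None stands in for a missing value cell).
    return [(None,) * (regex.groups + 1)]


print(rows_or_missing(r"(\d+)-(\d+)", "10-20 and 30-40"))
print(rows_or_missing(r"(\d+)-(\d+)", "no numbers here"))
```

The placeholder row keeps every input row represented in the output, which is useful when the result is joined back to the input table later.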

version-2.2.0 (2020-05-15)

Add
Regex Extractor: Add a “Columns” output mode which appends a column for each matched group.
Fix
Google Address Geocoder: Fix pointer to preferences in node documentation – kudos to joan_beneyto
Fix
Location Extractor: Fix pointer to preferences in node documentation
Fix
MapQuest Geocoder: Fix pointer to preferences in node documentation
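
As a rough illustration of the “Columns” output mode, the following Python sketch maps each capture group of the first match to its own column. The function name and the column labels (`Group 1`, `Group 2`, ...) are assumptions made for this example and are not taken from the node itself.

```python
import re
from typing import Dict, Optional


def match_to_columns(pattern: str, text: str) -> Dict[str, Optional[str]]:
    """Map each capture group of the first match to a column; None if there is no match."""
    regex = re.compile(pattern)
    match = regex.search(text)
    columns: Dict[str, Optional[str]] = {}
    for index in range(1, regex.groups + 1):
        # Hypothetical column naming scheme, chosen for illustration only.
        columns[f"Group {index}"] = match.group(index) if match else None
    return columns


print(match_to_columns(r"(\w+)@(\w+)\.com", "mail bob@example.com"))
```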

version-2.1.0 (2020-05-08)

Add
Regex Extractor: Add “Drop Full Match” option (see here)

version-2.0.2 (2020-02-01)

Fix
Regex Extractor: Fix configuration logic which would prevent output when picking a different input column than the first – kudos to Armin Ghassemi Rudd

version-2.0.1 (2020-01-26)

Fix
Date Extractor: Fix execution exception which would happen for some settings combinations – kudos to Armin Ghassemi Rudd
Fix
Improve KNIME server detection, avoid false alarms on “normal” KNIME when “KNIME Executor connector” is installed

version-2.0.0 (2020-01-24)

Info
Requires at least KNIME 3.7 (please make sure you’re using an update site URL corresponding to your KNIME version)
Info
Provide a zipped version of the update site -- simply append .zip to the update site URL, e.g. http://download.nodepit.com/palladian/4.1.zip (see here, here)
Info
Rename “Palladian Nodes for KNIME Workbench” to simply “Palladian for KNIME”, update the license to version 2.1.1 to reflect this change
Info
Update Palladian library to version 1.0
Add
Add “Virtual Earth” tiles to “Map Viewer” node
Add
Add “Stamen” tiles “Toner”, “Terrain”, and “Watercolor” to “Map Viewer” node
Add
Add “Wikimedia” maps to “Map Viewer” node
Add
Add new node “Regex Extractor” -- create your regular expressions as easy as a breeze; build, preview, and test your regexes in real time with your real data
Add
Add new node “Web Page Content Extractor” -- replaces the old “Content Extractor” and outputs the results as plain String and XML cells instead of the proprietary “Document” cell from KNIME Textprocessing, which makes it much more flexible to use
Add
Add new “Hash Calculator” node with additional hashing algorithms (MD2, MD5, SHA, SHA-224, SHA-256, SHA-384, SHA-512), the possibility to hash binary data besides string data, and an option to drop the input column
Change
Date Extractor: Allow to extract dates into collection cells, individual rows, or only extract first date occurrence (see here).
Change
Date Extractor: Allow to append column with input Row ID
Change
Date Extractor: Make use of KNIME’s “Local Date Time” cells
Change
Date Extractor: Allow to remove input column
Change
Date Extractor: Allow to specify output column name
Change
Threshold Analyzer: Update node documentation to mention “Accuracy” measure (see here)
Change
Update “Map Viewer” to new JXMapViewer2 library
Change
Moved nodes which depend on KNIME Textprocessing (“Date Extractor”, “Palladian NER”, “Content Extractor”) and deprecated nodes (“NekoHtmlParser”) to a separate, optional feature; this avoids having to install the heavyweight Textprocessing dependency
Change
Restructure Palladian-related preferences to common entry, allow to enter license key
Deprecate
Date Extractor: Old node is deprecated -- replace with new version for additional functionality
Deprecate
URL Extractor: Mark node as deprecated -- we recommend to use the new “Regex Extractor” instead which has a dedicated “URL” preset
Deprecate
Content Extractor: Mark node as deprecated -- we recommend to use the new “Web Page Content Extractor” from now on
Deprecate
Hash Calculator -- replaced with new version
Remove
Remove MapQuest tiles in “Map Viewer” node; they do not offer direct tile access any longer
Remove
Ranking Services: Remove Facebook ranking source
Fix
Fix missing OSM tiles in “Map Viewer” node
Fix
Google Address Geocoder: Fix link to API key in node documentation
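
To illustrate what hashing both string and binary data with a selectable algorithm looks like, here is a minimal Python sketch using the standard `hashlib` module. It is not the “Hash Calculator” node's code. The function name `hash_value` and the UTF-8 encoding of strings are assumptions, and MD2 is left out because `hashlib` does not guarantee it.

```python
import hashlib
from typing import Union

# Algorithms from the list above that hashlib always provides ("sha1" is
# assumed to correspond to the entry listed as "SHA").
SUPPORTED = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def hash_value(data: Union[str, bytes], algorithm: str = "sha256") -> str:
    """Return the hex digest of a string (encoded as UTF-8) or of raw bytes."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    if isinstance(data, str):
        # Assumption: strings are hashed via their UTF-8 byte representation.
        data = data.encode("utf-8")
    return hashlib.new(algorithm, data).hexdigest()


print(hash_value("abc", "sha256"))
print(hash_value(b"abc", "md5"))
```

Hashing the string `"abc"` and the bytes `b"abc"` gives the same digest here only because of the UTF-8 assumption; a different encoding would change the result for non-ASCII input.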

version-1.8.0 (2019-07-27)

Change
Updated categories structure; move “Palladian” entry to root, organize nodes into sub-categories
Change
Adding categories description for better presentation on NodePit
Change
Changed node labels from “CamelCase” to proper spacing
Change
Added new Palladian logo
Change
Minor typography fixes in node documentation
Change
Added additional content to node documentation

version-1.7.1 (2019-06-29)

Info
KNIME 4 compatibility

version-1.7.0 (2018-07-04)

Change
Rename “ColumnDistanceNode” to “ColumnDistance”
Deprecate
RankingServices: Deprecate node

version-1.7.0 (2018-06-20)

Deprecate
freegeoip: Deprecate node due to changed API

version-1.7.0 (2017-04-18)

Change
TextClassifierLearner: Option to disable listening for memory warnings; see https://forum.knime.com/t/textclassifierlearner-received-memory-warning-at-30-of-dedicated-mem-usage/10766
Change
GoogleAddressGeocoder: Add option to specify API key
Deprecate
MapzenGeocoder, ReverseGeocoder: Deprecate nodes and remove logic due to Mapzen shutdown

version-1.7.0 (2017-12-07)

Change
MultipartEncodedHttpEntityCreator: Make sure empty filenames are transformed to null
Change
MultipartEncodedHttpEntityCreator: Improve usability -- set default entity name when selecting input column in dialog
Change
HttpRetriever: Add content type 'text/plain'
Change
MultipartEncodedHttpEntityCreator: Add validation
Change
MultipartEncodedHttpEntityCreator: Allow StringValue input as well
Change
MultipartEncodedHttpEntityCreator: Allow to specify name of output column
Change
MultipartEncodedHttpEntityCreator: Allow 10 instead of 5 inputs
Change
MultipartEncodedHttpEntityCreator: Documentation

version-1.6.100 (2017-10-12)

Change
HttpRetriever: Add “Fail on network error” setting

version-1.6.100 (2017-05-29)

Add
TfIdfSimilarity: Node to calculate similarity between two strings based on the cosine similarity of their tf-idf vectors
Add
CorpusCreator: Node to create a corpus which contains counts for each unique term within the given texts
Add
NGramExtractor: New node for creating token-/word-n-grams as lightweight alternative to the n-gram creator from the Text Processing plugin which works on simple strings as input and produces string collections as output
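
The ideas behind these three nodes can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Palladian's implementation: tokens are obtained by lowercasing and splitting on whitespace, the corpus is reduced to per-term document counts, and the idf is smoothed so that unseen terms do not cause a division by zero. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def word_ngrams(text: str, n: int) -> List[str]:
    """Return all word n-grams of the text as strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_frequencies(corpus: List[str]) -> Counter:
    """Count in how many texts each unique term occurs."""
    counts: Counter = Counter()
    for document in corpus:
        counts.update(set(document.lower().split()))
    return counts


def tfidf_vector(text: str, df: Counter, num_docs: int) -> Dict[str, float]:
    """Weight raw term frequencies by a smoothed inverse document frequency."""
    tf = Counter(text.lower().split())
    return {
        term: count * (math.log((1 + num_docs) / (1 + df[term])) + 1.0)
        for term, count in tf.items()
    }


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


corpus = ["the cat sat", "the dog sat", "a bird flew"]
df = document_frequencies(corpus)
v1 = tfidf_vector("the cat sat", df, len(corpus))
v2 = tfidf_vector("the dog sat", df, len(corpus))
print(word_ngrams("the quick brown fox", 2))
print(cosine_similarity(v1, v2))
```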

version-1.6.100 (2017-05-07)

Add
HtmlNodeToText: New node to convert HTML documents/nodes to human-readable strings
Change
TextClassifier: Allow setting minimum and maximum term lengths

version-1.6.100 (2017-05-06)

Change
TextClassifier: Improve handling of large dictionary models: Load model data lazily to speed up opening of workflows

version-1.6.100 (2017-04-12)

Remove
WebSearcher: Removing Bing, DuckDuckGo, Social Mention, WebKnox as they are no longer functional

version-1.6.100 (2017-04-06)

Change
TextClassifierPredictor: Enable parallel processing
Fix
TextClassifierPredictor: Fix overriding classification column
Fix
TextClassifierLearner: Fix enabling/disabling of applicable checkboxes in configuration dialog

version-1.6.100 (2017-02-20)

Fix
Fix node description and column guessing in ReverseGeocoder

version-1.6.100 (2016-10-31)

Change
Stricter throttling in GoogleGeocoder to avoid being blocked

version-1.6.100 (2016-09-10)

Change
Setting to retrieve location hierarchy from Geonames, which greatly improves extraction quality in LocationExtractor

version-1.6.100 (2016-07-27)

Remove
Remove obsolete searchers (old Google API, Topsy), add Flickr searcher to WebSearcher node

version-1.6.100 (2016-07-07)

Change
Return image- and video-specific properties from WebSearcher node

version-1.6.100 (2016-06-08)

Add
Reverse geocoder node using MapZen
Add
Added node to create Multipart encoded HTTP entities
Change
Option to fail HttpRetriever node execution in case a non-success HTTP status code is returned (>= 400)
Change
Make FreeGeoIP lookup more robust

version-1.6.100 (2016-03-31)

Change
Skip-gram features for TextClassifierLearner

version-1.6.100 (2016-02-27)

Add
Added node to set Eclipse preferences for testing purposes
Add
Added Mapzen geocoder

version-1.6.100 (2016-02-26)

Fix
Fix synchronization issue with cookie store

version-1.6.100 (2016-02-02)

Change
Accept any kind of NominalValue for InformationGainCalculator

version-1.6.100 (2016-01-15)

Add
Added FreeGeoIP node

version-1.6.100 (2016-01-04)

Change
Added RankingService for Hacker News
Fix
Missing value handling in RankingServices node

version-1.6.100 (2015-12-20)

Add
Added new column-based distance calculation node

version-1.6.100 (2015-12-09)

Fix
Fixed proxy issue in HttpRetriever node

version-1.6.100 (2015-12-01)

Remove
Removed obsolete RankingServices: Friendfeed Stats, Friendfeed Aggregated Stats, Twitter
Remove
Removed obsolete WebSearchers: WebKnox News

version-1.6.100 (2015-11-01)

Add
UrlDomainExtractor node to extract domain from URLs, optionally without subdomains
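
Domain extraction with optional subdomain removal might look like the following Python sketch. It is deliberately naive and does not reflect how the node works internally: it assumes the registrable domain is always the last two host labels, which is wrong for suffixes such as `co.uk`. The function name `extract_domain` is hypothetical.

```python
from urllib.parse import urlsplit


def extract_domain(url: str, strip_subdomains: bool = False) -> str:
    """Return the host part of a URL, optionally reduced to its last two labels."""
    host = urlsplit(url).hostname or ""
    if not strip_subdomains:
        return host
    labels = host.split(".")
    # Naive assumption: the last two labels form the domain. A robust version
    # would consult the Public Suffix List instead.
    return ".".join(labels[-2:])


print(extract_domain("https://blog.example.com/post?id=1"))
print(extract_domain("https://blog.example.com/post?id=1", strip_subdomains=True))
```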

version-1.6.100 (2015-10-09)

Info
Adaption to KNIME 3.0

version-1.6.0 (2015-09-17)

Change
Replaced “accept self-signed certificates” by “accept all certificates” option in HttpRetriever

version-1.6.0 (2015-06-22)

Change
HttpRetriever also accepts StringValues as HTTP entity, HttpRetriever allows to specify an arbitrary content type.

version-1.6.0 (2015-06-22)

Change
Additional preprocessing options for TextClassifierLearner node: stemming, stop word removal for German and English language

version-1.6.0 (2015-06-01)

Change
Setting for HttpRetriever to allow self-signed SSL certificates

version-1.6.0 (2015-05-31)

Fix
Remove temporary debugging code in HtmlParser, which was causing exception with invalid encoding string

version-1.6.0 (2015-05-28)

Change
HtmlParser node additionally accepts binary object cells as input

version-1.6.0 (2015-05-27)

Fix
Improve missing value handling in FeedDiscovery node

version-1.6.0 (2015-05-23)

Add
Cookie support for new HttpRetriever node (optional input and output tables)
Add
Ability to specify HTTP methods in new HttpRetriever node by input column
Add
HttpResultDataExtractor node optionally creates a binary instead of a string cell
Add
New HttpRetrieverNode can send binary data, which can be specified through an optional input column
Add
Added FormEncodedHttpEntityCreator node to convert key-value data to form-encoded input for HttpRetriever
Add
Possibility to input HTTP headers in HttpRetriever requests.
Deprecate
Mark old HttpRetriever node as deprecated

version-1.6.0 (2015-05-01)

Change
Change default file extension for text classifier models from '.gz' to '.palladianDictionaryModel', ability to drop models and create appropriate TextClassifierModelReader dialog

version-1.6.0 (2015-04-30)

Change
Stop training via TextClassifierLearner when memory is getting full (using KNIME's MemoryWarningSystem)

version-1.6.0 (2015-04-20)

Fix
Fix guessing of category column in TextClassifierLearner node

version-1.6.0 (2015-04-17)

Fix
Better handling of missing values in HttpRetriever, return IntCell instead of LongCell for HTTP status codes

version-1.6.0 (2015-04-16)

Fix
Better handling of missing values in ContentExtractor and HtmlParser nodes

version-1.6.0 (2015-04-10)

Add
Added MapQuest cell renderer

version-1.5.0 (2015-04-07)

Add
Added MapQuestGeocoder node

version-1.5.0 (2015-03-28)

Add
Added GoogleAddressGeocoder node

version-1.5.0 (2015-03-26)

Add
Added Jaro–Winkler string distance measure
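
For readers unfamiliar with the measure, here is a compact Python sketch of the textbook Jaro–Winkler similarity. It is not Palladian's code. It assumes the common defaults of a prefix scale of 0.1 and a maximum prefix length of 4, and it returns a similarity in [0, 1]; a distance can be derived as one minus that value.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity of two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```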

version-1.5.0 (2015-02-24)

Add
Added CoordinateToLatitudeLongitude node
Add
Added ReverseLocationLookupNode
Change
Renamed CoordinateParser to LatitudeLongitudeToCoordinate node

version-1.5.0 (2015-02-23)

Add
Adding HttpResultToStringNode

version-1.2.0 (2014-11-20)

Change
Output warning in HtmlParser when processing http URLs (should use HttpRetriever)

version-1.2.0 (2014-11-10)

Fix
Use explicitly given encoding in HtmlParser node when processing HttpResults

version-1.2.0 (2014-08-28)

Change
Possibility for weighted inputs for TextClassifierLearner node

version-1.2.0 (2014-08-27)

Change
Additional preprocessing options for TextClassifierLearner node: case sensitivity, border padding

version-1.2.0 (2014-08-05)

Change
Offer additional languages in WebSearcher node

version-1.2.0 (2014-08-04)

Change
Add SocialMentionSearcher to WebSearcher node
Deprecate
Mark RMSE node as deprecated

version-1.2.0 (2014-07-10)

Change
Provide accuracy values in ThresholdAnalyzer node
Change
Changed pruning capabilities to updated Palladian functionality

version-1.2.0 (2014-06-17)

Change
Automatically trim spaces when entering API keys in preferences

version-1.2.0 (2014-05-14)

Change
WebSearcher node appends column with tags
Change
Provide paging for Twitter searcher in WebSearcher node

version-1.2.0 (2014-05-03)

Change
Ability to switch scoring algorithms in TextClassifierPredictor node (expert mode)

version-1.2.0 (2014-04-30)

Add
InformationGain node

version-1.2.0 (2014-04-25)

Fix
Fixing shifted month in DateParserNode (+ adding test)

version-1.2.0 (2014-04-23)

Change
TextClassifierModelToTable outputs second table with category priors

version-1.2.0 (2014-03-14)

Change
Setting for maximum number of terms for TextClassifierLearner

version-1.2.0 (2014-03-10)

Change
TextClassifierModelToTable provides the term counts as column

version-1.2.0 (2014-02-27)

Change
Cutoff irrelevant parts of graph in ThresholdAnalyzer node (values on the left, which are no different from their successors)

version-1.2.0 (2014-02-25)

Change
Greatly reduce memory consumption when training with TextClassifierLearner node
Fix
Fix NullPointerException in ThresholdAnalyzer node

version-1.2.0 (2014-02-21)

Change
Try to auto-select positive class column in ThresholdAnalyzer node

version-1.2.0 (2014-02-20)

Add
TextClassifierModelToTable node to write a Palladian text classifier dictionary to a KNIME table.
Change
Give statistics about text classifier dictionary on output port's tooltip

version-1.2.0 (2014-01-23)

Change
Output warning to log, in case a deprecated searcher is used.
Change
WebSearcher allows to append column with total number of results available for a given query (in case the specific searcher provides this information)

version-1.2.0 (2014-01-18)

Change
FeedParser now allows input of XML documents

version-1.2.0 (2013-12-26)

Change
DateExtractor now optionally appends a column with the parse pattern used for extracting a specific date.
Change
WebSearcher node adds a column providing GeoCoordinate values (in case this information is provided by the actual search engine; currently, YouTube, Twitter, Instagram, Flickr, Panoramio provide coordinates for some results)
Fix
DateExtractor now handles date/time precision correctly (e.g. only extract date without time in case it is appropriate)

version-1.2.0 (2013-12-25)

Change
Provide additional short version rendering for GeoCoordinate values (beside full precision and DMS)
Change
Log output from the Palladian library is now piped to KNIME's integrated node logger.