02 Reading Text Data

Exercise: Encapsulating strings into a document

1) Read the KNIME-Tweets.table file available in the data folder. It contains a list of tweets about KNIME, the user, the date of the tweet, and the retweets.
2) Convert the date of the tweet from string to Local Date. Append a new column so that you don't lose the time information in the string column.
3) Convert the tweets into documents with
- Tweet column as the title and the full text
- Twitter as the document source
- User column as the author
- Date column as the publication date

Exercise: Reading data from PDF

1) Read the 2020-05-25-l4-tp-5-sessions.pdf file available in the data folder with the Tika Parser node. The file is the agenda of the L4-TP Introduction to Text Processing instructor-led course. Enable the "Extract attachments and embedded files" and "Extract inline images from pdfs" options. You can select any directory.
2) Take a look at the extracted image. What do you see?
3) Convert the text output into a document. Use "2020-05-25" as the publication date.

Nodes in the workflow: Table Reader ("Read KNIME tweets"), String to Date&Time, Strings To Document, Tika Parser ("Read course agenda as pdf"), Document Viewer
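
For readers who think in code, the steps of the first exercise can be sketched in plain Python. This is only an analogy, not the KNIME nodes or their API: the Document class, the function names, the "Local Date" column name, the timestamp format, and the sample row are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Document:
    """Hypothetical stand-in for a text-processing document with metadata."""
    title: str
    text: str
    source: str
    author: str
    publication_date: date


def add_local_date(row: dict, date_format: str = "%Y-%m-%d %H:%M:%S") -> dict:
    """Return a copy of the row with a new 'Local Date' field appended.

    The original 'Date' string is left untouched, so its time
    information is not lost. The default format is an assumption.
    """
    parsed = datetime.strptime(row["Date"], date_format)
    return {**row, "Local Date": parsed.date()}


def to_document(row: dict) -> Document:
    """Wrap one tweet row in a Document, following the exercise's mapping."""
    return Document(
        title=row["Tweet"],                   # Tweet column as the title...
        text=row["Tweet"],                    # ...and as the full text
        source="Twitter",                     # fixed document source
        author=row["User"],                   # User column as the author
        publication_date=row["Local Date"],   # converted date column
    )


# Made-up sample row, not taken from KNIME-Tweets.table.
sample_row = {
    "Tweet": "Trying out text processing in KNIME",
    "User": "example_user",
    "Date": "2020-05-25 10:15:30",
    "Retweets": 3,
}
sample_doc = to_document(add_local_date(sample_row))
print(sample_doc)
```

Appending the converted date as a separate field, rather than overwriting the string, mirrors step 2 of the exercise: the document only needs the date, while the original column keeps the time of day.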
