
02_OCR_meets_SemanticWeb

Will they blend? OCR meets Semantic Web - Natural Selection vs Modern Theory

The challenge here is to blend Semantic Web data and image data (in PNG format) by querying DBpedia with the SPARQL query language and by applying Optical Character Recognition (OCR). The goal is to find differences in content between the theory of natural selection developed by Charles Darwin and the modern evolutionary theories. Darwin's theory of natural selection explains the mechanisms of evolution. It is presented in his book "Origin of Species" which, for this workflow, is available as .png files. The modern evolutionary synthesis, on the other hand, is queried from DBpedia with SPARQL Query nodes. After some text pre-processing, two tag clouds show the terms occurring in each source (the book "Origin of Species" and the DBpedia content related to the modern evolutionary synthesis). Will they blend?
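For readers who want to try the Semantic Web half outside KNIME, the query step might look roughly like the sketch below. It is not the query stored in the workflow's SPARQL Query nodes: the resource name passed in, the choice of the `dbo:abstract` property, and the helper names `build_abstract_query`, `build_request_url` and `fetch_abstracts` are all assumptions made for illustration. Only the endpoint URL comes from the workflow.

```python
import json
import urllib.parse
import urllib.request
from typing import List

# Endpoint named in the workflow annotations.
ENDPOINT = "http://dbpedia.org/sparql"


def build_abstract_query(resource: str, lang: str = "en") -> str:
    """Build a SPARQL query for the abstract of one DBpedia resource.

    Assumption: the text of interest is stored under dbo:abstract.
    """
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}"
    )


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as the 'query' parameter of a SPARQL GET request."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def fetch_abstracts(resource: str) -> List[str]:
    """Send the query and return the abstract strings (needs network access)."""
    url = build_request_url(build_abstract_query(resource))
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [row["abstract"]["value"] for row in data["results"]["bindings"]]
```

Calling `fetch_abstracts("Modern_evolutionary_synthesis")` would be one way to use it; the resource name is a guess based on the page title mentioned in the workflow and may differ from the current DBpedia identifier.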

Blog post available at https://www.knime.org/blog/OCR-meets-SemanticWeb ... and yes! They blend.

What have we learned?

Reading images from the book "Origin of Species" by Darwin
Performing Optical Character Recognition (OCR)
Converting KNIME Image Processing images to KNIME PNG images
Accessing and querying the Semantic Web with SPARQL

Workflow annotations

Optical Character Recognition (OCR) of Xerox Copy: Listing all the images to read; Reading Origin of Species book images - Darwin; OCR in KNIME; to visualize image; Keep only Recap and Conclusion from the book; POS tagging, number filter, punctuation erasure, ..

Querying the Semantic Web: DBPedia: Connect to http://dbpedia.org/sparql; Querying from Modern evolutionary synthesis dbpedia page; Querying from Extended evolutionary synthesis dbpedia page; Querying from Evolutionary developmental biology dbpedia page; POS tagging, number filter, punctuation erasure, ..

Visualization: From the book; From dbpedia; missing "presence" => presence 1 (appears twice); BoW, term frequencies, doc. data extractor, ..

Nodes shown in the workflow: List Files/Folders, Path to String, Image Reader (Table), Tess4J, Image Viewer, ImgPlus to PNG Images, SPARQL Endpoint, SPARQL Query (three times), Constant Value Column (twice), Row Filter, Document Creation and Preprocessing I (twice), Preprocessing II, Renaming & Joining, Concatenate, Rule Engine (twice), Color Manager (twice), Tag Cloud (twice), Reference Row Filter, Sorter, Table View.
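The pre-processing and comparison steps named in the annotations (punctuation erasure, number filter, bag of words, term frequencies) can be approximated in plain Python as follows. This is a simplified sketch rather than a reimplementation of the KNIME text processing nodes: it skips POS tagging, and the function names `preprocess`, `term_frequencies` and `compare_terms` are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Set


def preprocess(text: str) -> List[str]:
    """Lower-case, erase punctuation, and drop tokens that contain digits."""
    text = re.sub(r"[^\w\s]", " ", text.lower())  # punctuation erasure
    return [
        token
        for token in text.split()
        if not any(ch.isdigit() for ch in token)  # number filter
    ]


def term_frequencies(text: str) -> Counter:
    """Bag of words: absolute frequency of every remaining term."""
    return Counter(preprocess(text))


def compare_terms(book_text: str, web_text: str) -> Dict[str, Set[str]]:
    """Split the vocabulary into shared terms and terms unique to each source."""
    book_terms = set(term_frequencies(book_text))
    web_terms = set(term_frequencies(web_text))
    return {
        "shared": book_terms & web_terms,
        "book_only": book_terms - web_terms,
        "web_only": web_terms - book_terms,
    }


if __name__ == "__main__":
    # Toy sentences standing in for the OCR output and the DBpedia text.
    book = "Natural selection acts on variation, 1859."
    web = "Modern synthesis: selection and genetics (1942)."
    for group, terms in compare_terms(book, web).items():
        print(group, sorted(terms))
```

The three sets correspond loosely to what the two tag clouds let a reader see by eye: terms present in both sources versus terms that appear in only one of them.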
