Target Tractability Information Retrieval

This KNIME workflow facilitates comprehensive assessment of biological target tractability by integrating up-to-date data from multiple public resources including UniProt, ChEMBL, PDB, Open Targets, and Human Protein Atlas. Users can input target lists using UniProt accession codes or gene symbols, either via file upload or direct entry.
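The input step can be pictured with a small Python sketch. This is only an illustration, not the workflow's actual implementation, which is built from KNIME nodes. The function names are hypothetical, the 500-target limit comes from the workflow's "max 500 targets" annotation, and the regular expression is the accession pattern published in the UniProt help pages. In the workflow itself the user selects the target representation, so the accession check here is just an optional sanity check.

```python
import re

# UniProt accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)

# Limit taken from the workflow's "max 500 targets" annotation.
MAX_TARGETS = 500


def parse_targets(text: str) -> list[str]:
    """Split pasted text on whitespace, commas or semicolons.

    Duplicates are dropped while the original order is kept.
    Raises ValueError when more than MAX_TARGETS distinct targets remain.
    """
    tokens = [t for t in re.split(r"[\s,;]+", text) if t]
    unique = list(dict.fromkeys(tokens))
    if len(unique) > MAX_TARGETS:
        raise ValueError(
            f"{len(unique)} targets supplied, maximum is {MAX_TARGETS}"
        )
    return unique


def looks_like_uniprot_accession(token: str) -> bool:
    """Return True when the token has the shape of a UniProt accession."""
    return UNIPROT_ACCESSION.match(token.upper()) is not None


targets = parse_targets("P62269, P00533;\nP62269 EGFR")
accessions = [t for t in targets if looks_like_uniprot_accession(t)]
```

A token that passes the pattern check is not guaranteed to exist in UniProt; the check only separates accession-shaped strings from likely gene symbols.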

The workflow retrieves extensive functional and expression information such as protein function, subcellular localization, tissue-specific and disease-relevant expression data, and related disease annotations. This enables evaluation of target relevance in the biological context of interest.

For tractability insights, it collects rich druggability data comprising associated mechanisms of action (MOA), linked approved and investigational drugs with detailed activity profiles, and chemical probes validated for target modulation. It further aggregates compound bioactivity data by assay type, supporting ligand-based drug discovery and QSAR modeling.
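The bioactivity aggregation can be sketched in Python as follows. The filters mirror the workflow's own annotations: only quantitative records (standard_relation = "="), only the activity types AC50, EC50, IC50, XC50, Ki, Kd, potency and inhibition, only target_confidence values of 9, 7 or 5, and "WT" for an empty target_variant. Treating the three confidence values as a single accepted set, the case-insensitive type match, the function name, and the placeholder IDs are assumptions made for illustration. The workflow's additional check of activity units is omitted.

```python
from collections import defaultdict

# Activity types named in the workflow annotations (compared case-insensitively).
RELEVANT_TYPES = {"AC50", "EC50", "IC50", "XC50", "KI", "KD", "POTENCY", "INHIBITION"}
# Confidence values the annotations keep in order to exclude homologous targets.
ACCEPTED_CONFIDENCE = {9, 7, 5}


def count_compounds_per_activity_type(activities):
    """Count distinct compounds per (target, variant, activity type)."""
    compounds = defaultdict(set)
    for row in activities:
        if row["standard_relation"] != "=":  # keep quantitative data only
            continue
        if row["standard_type"].upper() not in RELEVANT_TYPES:
            continue
        if row["target_confidence"] not in ACCEPTED_CONFIDENCE:
            continue
        variant = row.get("target_variant") or "WT"  # empty or missing -> "WT"
        key = (row["target_chembl_id"], variant, row["standard_type"])
        compounds[key].add(row["molecule_chembl_id"])
    return {key: len(ids) for key, ids in compounds.items()}


def _row(mol, std_type, relation="=", confidence=9, variant=None):
    """Build one hypothetical activity record (placeholder IDs, not real data)."""
    return {
        "target_chembl_id": "TARGET_1",
        "target_variant": variant,
        "molecule_chembl_id": mol,
        "standard_type": std_type,
        "standard_relation": relation,
        "target_confidence": confidence,
    }


example = [
    _row("MOL_1", "IC50"),
    _row("MOL_1", "IC50", variant=""),      # same compound, counted once
    _row("MOL_2", "IC50", confidence=7),
    _row("MOL_3", "IC50", relation=">"),    # not quantitative, dropped
    _row("MOL_4", "Tm"),                    # activity type not in the list, dropped
    _row("MOL_5", "IC50", confidence=8),    # confidence not accepted, dropped
    _row("MOL_2", "Ki", variant="MUT_1"),
]
counts = count_compounds_per_activity_type(example)
```

Using a set per key keeps the count at distinct compounds, so repeated measurements of the same molecule do not inflate it.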

Structure-based tractability is informed through retrieval of experimental 3D structures and ligand information from PDB. Additional small-molecule tractability scores and druggable family classifications come from Open Targets, delivering a multi-dimensional view of target feasibility.

Interactive visualizations summarize these data and support filtering and detailed exploration of each target's clinical development status, drug-target interactions, chemical probe availability, compound activity breadth, and structural data quality. Results can be exported for downstream analysis, supporting target prioritization and resource allocation in drug discovery projects.

Workflow annotations (recovered from the workflow canvas)

Targets input. Targets are supplied either (1) from a file or (2) by writing or pasting them in a text editor. The target representation column is selected and renamed target_input_representation. A run mode selector switches between a real run and a pre-loaded example input. The input size is checked against a maximum of 500 targets.

Targets retrieval, revision and selection through UniProt data. Gene symbol and UniProt accession results are combined with a full outer join. If the geneName field is empty but geneSynonym is not, the geneSynonym content is moved into geneName (leaving geneSynonym null). The column "uniProt accession" is renamed "Uniprot accession", because otherwise the automatic column name adjuster treats it as camel case. Human Protein Atlas target expression links are added. Columns with links used only for visualisation in the interactive view pages are filtered out, and the HPA link columns intended for the CSV are split off and re-joined before export.

Ligand data: MOAs, indications, drug activity, bioactivities association and chemical probes through ChEMBL.

Retrieve basic target info from ChEMBL. The target_chembl_id is extracted from the UniProt accession. Rows with a missing target_chembl_id (targets not in ChEMBL) are split off, and the table is validated with missing values inserted for the unavailable columns. Columns that apply to the target as a whole receive the suffix "(general for target)".

Retrieve MOA-indication data. Empty or missing target_variant values are replaced with "WT". Targets present in ChEMBL but without MOA data are reported.

Retrieve activities for approved/investigational drugs (possible only for targets with MOA data). The columns molecule_chembl_id, target_chembl_id, target_variant and action_type are selected and duplicates removed. Activities are joined on (1) target_chembl_id, (2) target_variant, (3) molecule_chembl_id, then sorted by (1) target_chembl_id, (2) target_variant, (3) drug_name. They are grouped by target_chembl_id, target_variant and action_type and aggregated into aggregated_activity. The column molecule_chembl_id is renamed drug_chembl_id.

Retrieve the number of distinct compounds with available activity data for each target. Homologous targets are excluded by selecting target_confidence = 9 (direct single protein target assigned), 7 (direct single protein complex subunits assigned) or 5 (multiple direct protein targets assigned, protein family). Only quantitative data are kept (standard_relation = "="), and only the relevant activity types: AC50, EC50, IC50, XC50, Ki, Kd, potency, inhibition. Only activity standard types associated with the correct activity standard units are retained. The target_variant is defined from the mutation, and results are grouped by target_chembl_id and target_variant and aggregated as compounds_for_activity_type.

Extract the number and name of chemical probes for targets available in ChEMBL.

Target structure data: association through PDB. Structures are retrieved from the PDB web service using the UniProt accession. The available structural data are left-joined on the right and might be empty.

Target tractability data: small molecule tractability from Open Targets. Distinct Ensembl gene IDs can be associated with the same UniProt ID (e.g., P62269). For these cases, we retrieved target tractability separately from Open Targets and then combined the data uniquely under the same UniProt ID. The Ensembl gene columns are split on ";\n", joined by (1) UniProt ID and (2) Ensembl gene ID, then re-grouped by uniprot_id with unique concatenation of the tractability information. The available tractability data are left-joined on the right and might be empty.

UniProt data to export; file output and export. In multi-row columns, ";" is substituted with "". Results are written with CSV Writer nodes and offered for download. The output tables are labeled target_druggability_data and target_functional_expression_data.

Main components: ChEMBL DB Connection; UniProt Accession or Gene to Target Info; ChEMBL UniProt Accession to Target Info; ChEMBL MOA-Indication From Target Retrieval; ChEMBL Activity From Molecule Retrieval; ChEMBL Activity Type Retrieval (Batch); ChEMBL Chemical Probes from Target Retrieval; PDB UniProt Accession to Structure; Open Targets Target Tractability Retrieval; Target Druggability Summary; Max Input Dataset Size Checker; Download Output Files. WebPortal pages include Workflow intro, WF Run Type, Target Input Method Selection, Target Input Representation Selection, UniProt Target Search Selection, Target Type Selection, No Target Selected, No Target Retrieved and Workflow outro.
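The Open Targets step, where several Ensembl gene IDs map to one UniProt ID, can be sketched in Python as a group-by with unique concatenation. This is a minimal illustration of the idea rather than the workflow's KNIME implementation. The function name, the output keys, the "; " separator, and the Ensembl IDs and tractability labels in the example are hypothetical placeholders. Only the accession P62269 comes from the workflow annotation.

```python
def combine_tractability_by_uniprot(rows, separator="; "):
    """Merge per-Ensembl-gene tractability rows under one UniProt ID.

    rows: iterable of (uniprot_id, ensembl_gene_id, tractability_label).
    Values are concatenated uniquely, keeping first-seen order.
    """
    merged = {}
    for uniprot_id, ensembl_gene_id, label in rows:
        entry = merged.setdefault(
            uniprot_id, {"ensembl_gene_ids": [], "tractability": []}
        )
        if ensembl_gene_id not in entry["ensembl_gene_ids"]:
            entry["ensembl_gene_ids"].append(ensembl_gene_id)
        if label not in entry["tractability"]:
            entry["tractability"].append(label)
    return {
        uniprot_id: {key: separator.join(values) for key, values in entry.items()}
        for uniprot_id, entry in merged.items()
    }


# Placeholder Ensembl IDs and labels; not real Open Targets output.
rows = [
    ("P62269", "ENSG_A", "label X"),
    ("P62269", "ENSG_B", "label X"),
    ("P62269", "ENSG_B", "label Y"),
]
combined = combine_tractability_by_uniprot(rows)
```

Each UniProt ID ends up with a single row, so the result can be left-joined onto the target table without duplicating targets.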
