Chemical_data_ML_ChemAxon

Chemical_data_for_machine_learning

Chemical Descriptors & Standardizers for Machine Learning Models
By: Bilal Nizami, ChemAxon, Sep 2020

1. Read hERG patch clamp assay / ChEMBL target 240 data from the local files.
2. Filter out missing pchembl values and average multiple measurements (for the ChEMBL data).
3. Use ChemAxon's Standardizer and Structure Checker to sanitize the data.
4. ChemAxon's ECFP calculation.
5. Basic feature selection (remove low-variance and highly correlated features).
6. Assign two classes in the ChEMBL data and three classes in the patch clamp assay data.
7. Export the patch clamp assay data.
8. Random Forest model for PubChem data.

WARNING: The workflow may take a long time to finish because the dataset is large. Lower-end systems in particular may show a considerable drop in performance.

Workflow sections: READ DATA; Pre-process; Standardize, Check, Feature calculation, and class assignment; EXPORT; MODEL BUILDING

Node annotations:
herg_MLSMR_automated_patch_clamp
Chembl25_herg_bioactivity
missing pchembl
Average pchembl
col rename
Remove unused column
>= 20% inhibitor, <= -20 Activator, the rest are inactive
>= 6.5 pchembl is Active, rest Inactive
Class distribution
80:20 Split
RandomForest model

Nodes used: Excel Reader (XLS), Column Filter, Column Rename, pre-process, PrepareChemicalData, Property calculators, Basic feature selection, Assign Class, Statistics, Histogram, Pie/Donut Chart, CSV Writer, Partitioning, RF, Scorer, ROC Curve (local), LibMCS_misClassified
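Steps 2 and 6, together with the class thresholds given in the node annotations, can be sketched in plain Python. This is only an illustration of the rules, not the workflow's actual KNIME implementation: the function names, the (compound_id, pchembl) record layout, and the class labels "Inhibitor" and "Activator" are assumptions made for the example. Only the cutoffs (6.5 pchembl, 20% and -20) come from the workflow.

```python
from collections import defaultdict
from statistics import mean

PCHEMBL_ACTIVE_CUTOFF = 6.5  # cutoff taken from the workflow annotation
EFFECT_CUTOFF = 20.0         # percent cutoff taken from the workflow annotation


def average_pchembl(records):
    """Drop records with a missing pchembl, then average per compound.

    `records` is an iterable of (compound_id, pchembl) pairs. This layout is
    hypothetical; the workflow reads its data from local Excel files.
    """
    grouped = defaultdict(list)
    for compound_id, pchembl in records:
        if pchembl is None:
            continue  # step 2: filter out missing pchembl values
        grouped[compound_id].append(pchembl)
    # step 2: average multiple measurements of the same compound
    return {cid: mean(values) for cid, values in grouped.items()}


def chembl_class(pchembl):
    """Two-class rule for the ChEMBL data: >= 6.5 is Active, the rest Inactive."""
    return "Active" if pchembl >= PCHEMBL_ACTIVE_CUTOFF else "Inactive"


def patch_clamp_class(percent_effect):
    """Three-class rule for the patch clamp data.

    >= 20 is an inhibitor, <= -20 is an activator, everything else is inactive.
    """
    if percent_effect >= EFFECT_CUTOFF:
        return "Inhibitor"
    if percent_effect <= -EFFECT_CUTOFF:
        return "Activator"
    return "Inactive"


if __name__ == "__main__":
    # Made-up measurements, used only to exercise the functions.
    averaged = average_pchembl([("A", 6.0), ("A", 7.0), ("B", None), ("C", 5.0)])
    for cid, value in averaged.items():
        print(cid, value, chembl_class(value))
    for effect in (35.0, -42.0, 3.5):
        print(effect, patch_clamp_class(effect))
```

Both cutoffs are treated as inclusive, matching the ">=" and "<=" in the annotations, so a compound at exactly 6.5 pchembl is labeled Active.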
