L1-DS Final Assessment Workflow

This workflow contains the final assessment of the L1-DS self-paced course. Solve the workflow and complete the quiz at the end of the course!

Task 1. Read data
The students.sqlite database stores student personal info in 2 tables, GP and MS, corresponding to two different schools. Read the content of the two tables into the workflow. The transcript.csv file contains failures, absences and grades for each student.
Inputs: students.sqlite (SQLite Connector), transcript.csv (CSV Reader)

Task 2. Bring things together
Merge all the data in a single table containing students from both schools and the corresponding transcript data.
Question: What is the total number of students in both schools?
Question: For which of the following students is there no transcript available?

Task 3. Make sense of the data
The columns edu_mother and edu_father represent the education level of a student's parents. Map the indices to the following categories:
0 - none
1 - primary
2 - middle
3 - secondary
4 - higher
Group the data in order to obtain the percentage of students for each combination of mother and father education level.
Question: Which of the following nodes can be used to replace the education index?
Question: What is the percentage of students with both parents having higher education?

Task 4. Linear Regression
Train a linear regression model to fit the grade_final column. Partition 70-30 with random seed 1. Apply the model to the test data and evaluate its performance.
Fill the gaps: The feature with the highest coefficient is ____. Applied on the test data, the linear regression shows a mean absolute error of ___.
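
For readers who think in code rather than nodes, the logic of Tasks 1 to 3 can be sketched in plain Python. This is only an illustration of the steps, not the workflow's solution: the rows are made up, an in-memory database stands in for students.sqlite, a dictionary stands in for transcript.csv, and the student_id join key is a hypothetical column name that the assessment text does not specify.

```python
import sqlite3
from collections import Counter

# Labels for the education indices, as listed in Task 3.
EDU_LABELS = {0: "none", 1: "primary", 2: "middle", 3: "secondary", 4: "higher"}

# Task 1 (stand-in): an in-memory database with tables GP and MS.
# All rows are invented; student_id is a hypothetical key column.
conn = sqlite3.connect(":memory:")
for table in ("GP", "MS"):
    conn.execute(
        f"CREATE TABLE {table} "
        "(student_id TEXT, edu_mother INTEGER, edu_father INTEGER)"
    )
conn.executemany(
    "INSERT INTO GP VALUES (?, ?, ?)",
    [("gp1", 4, 4), ("gp2", 2, 3), ("gp3", 4, 4)],
)
conn.executemany("INSERT INTO MS VALUES (?, ?, ?)", [("ms1", 1, 0)])

# Stand-in for transcript.csv; student "ms1" has no transcript on purpose.
transcript = {
    "gp1": {"failures": 0, "absences": 2, "grade_final": 15},
    "gp2": {"failures": 1, "absences": 6, "grade_final": 10},
    "gp3": {"failures": 0, "absences": 0, "grade_final": 17},
}

# Task 2: stack both school tables, then left-join the transcript data
# so that students without a transcript are kept.
students = []
for school in ("GP", "MS"):
    query = f"SELECT student_id, edu_mother, edu_father FROM {school}"
    for sid, mother, father in conn.execute(query):
        students.append(
            {"school": school, "student_id": sid,
             "edu_mother": mother, "edu_father": father}
        )
merged = [{**s, **transcript.get(s["student_id"], {})} for s in students]
missing = [s["student_id"] for s in students if s["student_id"] not in transcript]

# Task 3: map indices to labels, then compute the percentage of students
# for each (mother, father) education combination.
counts = Counter(
    (EDU_LABELS[s["edu_mother"]], EDU_LABELS[s["edu_father"]]) for s in merged
)
percentages = {combo: 100 * n / len(merged) for combo, n in counts.items()}

print(len(merged), missing, percentages)
```

The left join matters for the transcript question: an inner join would silently drop students without a transcript and also change the total student count.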
