Final Project - Kepler Planets

For my final project I used a dataset of extrasolar planets discovered with the Kepler probe. The dataset is rich: it includes planetary orbits, orbital periods, the mass of the planet, the mass of the star, and the same information for additional planets when more than one was detected. I used the same process inside the Data Meta Node for each step, and that metanode covers several of the project requirements.

Step 1: After cleaning up the data with some filters, I made a pie chart of how each planet was discovered. The most common discovery method was Transit: detecting the dip in the star's light that occurs when the planet passes between the star and our line of sight (similar to an eclipse, but MUCH farther away). I then made a scatter plot of orbital period versus orbital distance to show how the two are tied together, and added a Color Manager to color the points by discovery method. I limited this to the primary planet (the first field for each of the planets).

Step 2: I used a Linear Regression Learner on the planet's orbital period, its speed, and the mass of the star, then used the Regression Predictor to estimate the mass of the planet and compared the result with the calculated mass. I knew this would not be very accurate, because a system can contain multiple planets whose interactions affect the calculation. As expected, the regression did not have enough parameters to determine the mass well.

Step 3: I used a Decision Tree Learner and Predictor to see whether I could predict the discovery method from the orbital period of the primary planet. The accuracy was surprising: it predicted the correct method of discovery 94.9% of the time!

When I found this data among the datasets on Kaggle, I knew I had to come up with a way to use it. I love astronomy, and Kepler finding new planets has been an exciting development in that field over the past couple of years!

Data source: https://www.kaggle.com/muhakabartay/markmarkohkeplerconfirmedplanets

Workflow nodes: Partitioning, GroupBy, Pie chart (local), Scatter Plot, Color Manager, Data Meta Node, Numeric Scorer, Regression Predictor, Linear Regression Learner, Row Filter, Column Filter, Decision Tree Learner, Decision Tree Predictor, Scorer
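The link between orbital period and orbital distance shown in the Step 1 scatter plot follows from Kepler's third law. The sketch below is not part of the KNIME workflow; it is a minimal Python illustration that assumes the planet's mass is negligible next to the star's, and works in solar units (years, AU, solar masses). The function names and the sample distances are made up for illustration and are not taken from the Kaggle dataset.

```python
import math


def orbital_period_years(semi_major_axis_au: float, star_mass_solar: float = 1.0) -> float:
    """Kepler's third law in solar units: P^2 = a^3 / M (planet mass neglected)."""
    if semi_major_axis_au <= 0 or star_mass_solar <= 0:
        raise ValueError("distance and stellar mass must be positive")
    return math.sqrt(semi_major_axis_au ** 3 / star_mass_solar)


def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Illustrative distances in AU (made-up values, NOT rows from the dataset).
distances_au = [0.05, 0.1, 0.4, 1.0, 5.2]
periods_yr = [orbital_period_years(a) for a in distances_au]

# For planets around stars of the same mass, period grows as distance^1.5,
# so the points fall on a straight line of slope 1.5 on log-log axes.
slope = loglog_slope(distances_au, periods_yr)
print(f"log-log slope: {slope:.3f}")
```

On real data the points would scatter around that line because the host stars have different masses. It may also help explain the weak result in Step 2: to a first approximation the period depends on the distance and the star's mass, not on the planet's mass.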
