Challenge 23 - Generating Synthetic Population Attributes

Create a workflow to generate synthetic data for an imaginary population consisting of 1000 people. The data should include attributes such as age, height, and weight.

Generate a unique random ID for each of the 1000 people.
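The challenge asks for a workflow, so the following Python snippet is only an illustrative sketch of the logic, not the intended solution. It assumes six-digit numeric IDs and an arbitrary seed; neither is specified in the challenge.

```python
import numpy as np

N_PEOPLE = 1000

# Seed chosen arbitrarily so the sketch is reproducible (assumption).
rng = np.random.default_rng(seed=23)

# Assumed ID format: six-digit integers. Sampling without replacement
# guarantees that no ID is drawn twice.
candidate_ids = np.arange(100_000, 1_000_000)
person_ids = rng.choice(candidate_ids, size=N_PEOPLE, replace=False)

print(person_ids[:5])
```

Sampling without replacement avoids a separate duplicate check, which would be needed if IDs were drawn independently.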

Generate ages for the 1000 people from a Gaussian distribution with a mean of 40 and a standard deviation of 10, then bin the dataset into four age groups: 'Children', 'Young Adults', 'Adults', and 'Seniors'.
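One possible way to sketch this step in Python is shown below. The challenge does not define the age-group boundaries, so the cut points used here (17, 34, 59) are assumptions, as are the seed and the decision to clip negative draws to zero.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=23)  # arbitrary seed (assumption)

# Gaussian ages: mean 40, standard deviation 10, as stated in the challenge.
ages = rng.normal(loc=40, scale=10, size=1000)
# Clip the rare negative draw to 0 and round to whole years (assumption).
ages = np.clip(ages, 0, None).round().astype(int)

# Assumed boundaries: <=17, 18-34, 35-59, >=60.
age_bins = [-np.inf, 17, 34, 59, np.inf]
age_labels = ["Children", "Young Adults", "Adults", "Seniors"]
age_group = pd.cut(ages, bins=age_bins, labels=age_labels)

print(pd.Series(age_group).value_counts())
```

With a mean of 40 and a standard deviation of 10, most people fall into the two middle groups under these boundaries, so 'Children' and 'Seniors' will be small.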

For each age group, generate heights using a beta distribution. Tune the parameters of the distribution to reflect realistic height ranges for each age group. Categorize heights into three groups: 'less than 160', 'more than 180', and 'rest'.
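A beta distribution lives on the interval [0, 1], so one common approach is to rescale each draw to a height range: height = min + (max - min) * Beta(a, b). The sketch below assumes heights in centimetres; every shape parameter, height range, and age boundary in it is a hypothetical starting point for tuning, not a value given by the challenge.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=23)  # arbitrary seed (assumption)

# Ages and age groups, using assumed bin boundaries.
ages = np.clip(rng.normal(40, 10, 1000), 0, None)
age_group = pd.cut(ages, [-np.inf, 17, 34, 59, np.inf],
                   labels=["Children", "Young Adults", "Adults", "Seniors"])
df = pd.DataFrame({"age": ages, "age_group": age_group})

# Hypothetical parameters per age group: (a, b, min_cm, max_cm).
height_params = {
    "Children": (2.0, 2.0, 100.0, 170.0),
    "Young Adults": (5.0, 4.0, 150.0, 200.0),
    "Adults": (5.0, 5.0, 150.0, 200.0),
    "Seniors": (4.0, 5.0, 145.0, 195.0),
}

df["height_cm"] = np.nan
for group, (a, b, lo, hi) in height_params.items():
    mask = df["age_group"] == group
    # Rescale Beta(a, b) draws from [0, 1] to [lo, hi].
    df.loc[mask, "height_cm"] = lo + (hi - lo) * rng.beta(a, b, size=int(mask.sum()))

# Three height categories, as named in the challenge.
df["height_group"] = np.select(
    [df["height_cm"] < 160, df["height_cm"] > 180],
    ["less than 160", "more than 180"],
    default="rest",
)

print(df["height_group"].value_counts())
```

Because the beta distribution is bounded, no generated height can fall outside the chosen range, which is one reason it suits this task better than an unbounded distribution.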

Based on the binned height information, generate weights using a gamma distribution. Adjust the parameters of the distribution to accurately model weight distributions for each height group.
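As a rough sketch of this step: a gamma distribution with shape k and scale θ has mean kθ, so the parameters can be picked per height group to target a plausible average weight. The helper `sample_weights`, the parameter values, the kilogram unit, and the randomly assigned demo groups below are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical (shape k, scale theta) per height group; mean weight = k * theta.
weight_params = {
    "less than 160": (20.0, 2.75),  # mean 55 kg
    "rest": (25.0, 2.9),            # mean 72.5 kg
    "more than 180": (30.0, 2.9),   # mean 87 kg
}

def sample_weights(height_group: pd.Series, rng: np.random.Generator) -> pd.Series:
    """Draw one gamma-distributed weight per person, by height group."""
    weights = pd.Series(np.nan, index=height_group.index, dtype=float)
    for group, (shape, scale) in weight_params.items():
        mask = height_group == group
        weights[mask] = rng.gamma(shape, scale, size=int(mask.sum()))
    return weights

# Demo input: height groups assigned at random, standing in for the
# binned heights produced by the previous step.
rng = np.random.default_rng(seed=23)  # arbitrary seed (assumption)
demo_groups = pd.Series(rng.choice(list(weight_params), size=1000))
weight_kg = sample_weights(demo_groups, rng)

print(weight_kg.groupby(demo_groups).mean().round(1))
```

The gamma distribution is strictly positive and right-skewed, which matches the general shape of body-weight data reasonably well.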

Utilize a scatter plot matrix to visualize the relationships between age, height, and weight. Identify any patterns or correlations within the synthetic population.
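Outside a workflow tool, a scatter plot matrix can be sketched with `pandas.plotting.scatter_matrix`. The data below is a compact stand-in for the generated population, with assumed parameters throughout; the correlation table is printed alongside the plot to make any linear relationships explicit.

```python
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(seed=23)  # arbitrary seed (assumption)
n = 1000

# Stand-in population with assumed parameters.
age = np.clip(rng.normal(40, 10, n), 0, None)
height = 150 + 50 * rng.beta(5, 5, n)  # heights in [150, 200] cm
# Gamma shape depends on the height group; scale fixed at 2.9.
shape = np.where(height < 160, 20.0, np.where(height > 180, 30.0, 25.0))
weight = rng.gamma(shape, 2.9)

population = pd.DataFrame({"age": age, "height_cm": height, "weight_kg": weight})

# 3x3 grid: histograms on the diagonal, pairwise scatter plots elsewhere.
axes = scatter_matrix(population, alpha=0.3, figsize=(8, 8), diagonal="hist")

# Pearson correlations between the three attributes.
corr = population.corr()
print(corr.round(2))
```

Call `matplotlib.pyplot.show()` or save the figure to view the plot. In this stand-in, weight depends on height only through the height group and age is drawn independently, so any height-weight pattern appears as steps between groups rather than a smooth trend.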
