randomFromDataset(values, kernel, percentageOfFrameTime, bandwidth)
Overview
Generates a random number based on a given dataset using Kernel Density Estimation (KDE).
Parameters
Parameter | Type | Description |
---|---|---|
values |
Table | The dataset from which to generate the random number. |
kernel * |
String | The type of kernel to use for the KDE. Default is Gaussian. |
percentageOfFrameTime |
Number | The percentage of frame time allowed for the function to run. Default is 0.1. |
bandwidth |
Number | The bandwidth to use in the KDE. Calculated by default if not provided. |
* Options for kernel
are: "Gaussian", "Epanechnikov", "Uniform", "Triangular", "Biweight", "Cosine", "Logistic", and "Sigmoid".
Returns
Return | Type | Description |
---|---|---|
xRandom |
Number | A random number generated based on the KDE of the dataset. |
Example
local dataset = {1, 2, 3, 3, 4, 4, 5, 6, 7}
local kernelType = "Gaussian"
local percentageOfFrameTime = 0.1
local randomValue = StatBook.randomFromDataset(dataset, kernelType, percentageOfFrameTime)
print(randomValue) -- Output will vary
Mathematical Background
The function uses Kernel Density Estimation (KDE) to approximate the probability density function of the given dataset. The KDE is computed for multiple points within the data range, and the maximum density \text{maxY} is determined. Random coordinates (x, y) are generated within this computed density. The y -value of the coordinate is compared to the KDE value at x , and if y \leq \text{KDE}(x) , then x is returned as the generated random number.
The kernel density estimation is given by:
where K(u) is the kernel function and n is the number of data points. Different types of kernel functions like Gaussian, Epanechnikov, etc., can be used.