open_cp.kernels¶

kernels¶

For us, a “kernel” is simply a non-normalised probability density function. We use kernels extensively to represent (conditional) intensity functions in point processes.

More formally, a kernel is any Python object which is callable (e.g. a function, or an instance of a class implementing __call__). We follow the convention used by, for example, scipy:

A kernel expecting a one-dimensional input may take a scalar or a one-dimensional numpy array as input. It should return, respectively, a scalar or a one-dimensional array of the same size. For example:
```
import numpy as np

def gaussian(p):
    return np.exp(-p * p)
```
Here we use np.exp to make sure that if p is an array, we handle it correctly.
A kernel expecting a k-dimensional input may take an array of shape (k) to represent a single point, or an array of shape (k,N) to represent N points. It should return, respectively, a scalar or an array of shape (N). We follow this convention to allow, for example, the following:
```
def x_y_sum(p):
    return p[0] + p[1]
```
In the single-point case, p[0] is a scalar representing the x coordinate and p[1] a scalar representing the y coordinate. In the multiple point case, p[0] is an array of all the x coordinates.
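
As a quick check of this convention, the sketch below evaluates the same kernel on a single point and on a batch of points. The sample coordinates are made up for illustration.

```python
import numpy as np

def x_y_sum(p):
    # p[0] holds the x coordinate(s), p[1] the y coordinate(s)
    return p[0] + p[1]

# A single point: array of shape (2,), so the result is a scalar
single = x_y_sum(np.array([1.0, 2.0]))

# Three points: array of shape (2, 3), one column per point
batch = x_y_sum(np.array([[1.0, 2.0, 3.0],
                          [10.0, 20.0, 30.0]]))

print(single)       # 3.0
print(batch.shape)  # (3,)
```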

class open_cp.kernels.GaussianKernel(means, variances, scale=1.0)¶

Bases: open_cp.kernels.Kernel

A variable bandwidth Gaussian kernel. Each input Gaussian is an uncorrelated k-dimensional Gaussian. These are summed to produce the kernel.

Parameters:	means – Array of shape (k,M). The centre of each Gaussian.
	variances – Array of shape (k,M). The variances of each Gaussian.
	scale – The overall normalisation factor; defaults to 1.0.

set_scale(scale)¶
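
To make the definition concrete, here is a minimal NumPy sketch of a sum of M uncorrelated k-dimensional Gaussians. It is not the library's implementation: the function name `sum_of_gaussians` is hypothetical, and averaging over the M components (so that the result integrates to `scale`) is an assumption.

```python
import numpy as np

def sum_of_gaussians(means, variances, scale=1.0):
    """Hypothetical helper: build a kernel from arrays of shape (k, M)."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    k = means.shape[0]

    def kernel(pts):
        pts = np.asarray(pts, dtype=float)
        # A single point is a scalar when k == 1, else an array of shape (k,)
        single = pts.ndim == (0 if k == 1 else 1)
        p = pts.reshape(k, -1)                      # shape (k, N)
        # Broadcast to shape (k, N, M): every point against every Gaussian
        diff2 = (p[:, :, None] - means[:, None, :]) ** 2
        var = variances[:, None, :]
        per_axis = np.exp(-diff2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        # Uncorrelated: multiply over the k axes, then average over M (assumed)
        values = scale * per_axis.prod(axis=0).mean(axis=1)
        return values[0] if single else values

    return kernel

kernel = sum_of_gaussians(means=[[0.0, 1.0], [0.0, 1.0]],
                          variances=[[1.0, 0.5], [1.0, 0.5]])
print(kernel(np.array([0.0, 0.0])))     # a scalar
print(kernel(np.zeros((2, 4))).shape)   # (4,)
```

The intermediate array has shape (k, N, M), so memory grows with the product of the number of query points and the number of Gaussians.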

class open_cp.kernels.KNNG1_NDFactors(k_first=100, k_rest=15)¶

Bases: open_cp.kernels.TimeSpaceFactorsEstimator

A KernelEstimator which applies the KthNearestNeighbourGaussianKDE to the first coordinate with one value of k, then to the remaining coordinates with another value of k, and combines the results.

Parameters:	k_first – Which nearest neighbour to use for the first coordinate; defaults to 100. If N is too small, N-1 is used instead.
	k_rest – Which nearest neighbour to use for the remaining coordinates; defaults to 15. If N is too small, N-1 is used instead.

class open_cp.kernels.Kernel¶

Bases: object

Abstract base class for classes implementing kernels. You are not required to extend this class, but you should implement the interface.

set_scale(scale=1.0)¶: The output kernel should be multiplied by this value before being returned.
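
A class that implements this interface without extending Kernel might look like the following sketch. The class name `ScaledGaussian` is hypothetical; the only assumptions are a `__call__` method following the array convention above and a `set_scale` method.

```python
import numpy as np

class ScaledGaussian:
    """Hypothetical one-dimensional kernel implementing the interface."""

    def __init__(self):
        self.scale = 1.0

    def set_scale(self, scale=1.0):
        # Remember the factor; it is applied on every evaluation
        self.scale = scale

    def __call__(self, p):
        # np.exp handles scalars and arrays alike
        p = np.asarray(p, dtype=float)
        return self.scale * np.exp(-p * p)

kernel = ScaledGaussian()
kernel.set_scale(3.0)
print(kernel(0.0))  # 3.0
```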

class open_cp.kernels.KernelEstimator¶

Bases: object

Abstract base class for classes implementing kernel estimators. You are not required to extend this class, but you should implement the interface.

class open_cp.kernels.KthNearestNeighbourGaussianKDE(k=15)¶

Bases: open_cp.kernels.KernelEstimator

A KernelEstimator which applies the algorithm given by kth_nearest_neighbour_gaussian_kde().

Parameters:	k – Which nearest neighbour to use; defaults to 15. If N is too small, N-1 is used instead.

class open_cp.kernels.ReflectedKernel(delegate, reflected_axis=0)¶

Bases: open_cp.kernels.Kernel

A specialisation of Kernel for the case where, along a given axis, we know that the data is concentrated on the half-line [0, ∞). We wrap an existing Kernel instance, but reflect about 0 any estimated probability mass on the negative reals.

Parameters:	delegate – The `Kernel` instance to delegate to.
	reflected_axis – Which axis to reflect about.

set_scale(value)¶
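
The reflection idea can be sketched for a one-dimensional kernel as follows. This is an illustration rather than the library's code: the function name `reflect_about_zero` is hypothetical, and it assumes the reflected kernel is only evaluated at non-negative points, where it is taken to be k(x) + k(-x).

```python
import numpy as np

def reflect_about_zero(kernel):
    """Hypothetical helper: fold the mass on the negative reals onto [0, inf)."""
    def reflected(x):
        x = np.asarray(x, dtype=float)
        # Mass that the delegate puts at -x is added to the value at x
        return kernel(x) + kernel(-x)
    return reflected

def gaussian_at_one(x):
    # Normalised Gaussian centred at 1 with unit variance
    return np.exp(-(x - 1.0) ** 2 / 2) / np.sqrt(2 * np.pi)

folded = reflect_about_zero(gaussian_at_one)

# On [0, inf) the folded kernel should integrate to approximately one
grid = np.linspace(0.0, 12.0, 12001)
step = grid[1] - grid[0]
print(np.sum(folded(grid)) * step)
```

Without the reflection, the part of the Gaussian lying below 0 would be lost when the kernel is restricted to non-negative inputs.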

class open_cp.kernels.ReflectedKernelEstimator(estimator, reflected_axis=0)¶

Bases: open_cp.kernels.KernelEstimator

Wraps an existing KernelEstimator, but reflects the estimated kernel about 0 in one axis. See ReflectedKernel.

Parameters:	estimator – The `KernelEstimator` to delegate to.
	reflected_axis – Which axis to reflect about.

class open_cp.kernels.TimeSpaceFactorsEstimator(time_estimator, space_estimator)¶

Bases: open_cp.kernels.KernelEstimator

A KernelEstimator which applies a one-dimensional kernel estimator to the first (time) coordinate of the data, and another kernel estimator to the remaining (space) coordinates.

Parameters:	time_estimator – A `KernelEstimator` for the one-dimensional time data.
	space_estimator – A `KernelEstimator` for the remaining coordinates.

class Factors_Kernel(first, rest)¶

Bases: open_cp.kernels.Kernel

set_scale(scale)¶

space_kernel(points)¶: The space component of the overall kernel, scaled appropriately.

time_kernel(points)¶: A one-dimensional, normalised kernel giving the time component of the overall kernel.

TimeSpaceFactorsEstimator.first(coords)¶

Find the kernel estimate for the first coordinate only.

Parameters:	coords – All the coordinates; only the 1st coordinate will be used.
Returns:	A one-dimensional kernel.

TimeSpaceFactorsEstimator.rest(coords)¶

Find the kernel estimate for the remaining (n-1) coordinates only.

Parameters:	coords – All the coordinates; the 1st coordinate will be ignored.
Returns:	An (n-1)-dimensional kernel.
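
The factorisation used by TimeSpaceFactorsEstimator can be sketched as a product of two kernels. This is an illustration, not the library's implementation: `factored_kernel` is a hypothetical helper, and the overall kernel is assumed to be the time kernel evaluated on the first coordinate multiplied by the space kernel evaluated on the remaining coordinates.

```python
import numpy as np

def factored_kernel(time_kernel, space_kernel):
    """Hypothetical helper combining a 1-D kernel and an (n-1)-D kernel."""
    def kernel(points):
        points = np.asarray(points, dtype=float)
        # points[0] is the time coordinate, points[1:] the space coordinates
        return time_kernel(points[0]) * space_kernel(points[1:])
    return kernel

def time_part(t):
    # Made-up exponential decay in time
    return np.exp(-t)

def space_part(p):
    # Made-up Gaussian bump in the two space coordinates
    return np.exp(-(p[0] ** 2 + p[1] ** 2))

combined = factored_kernel(time_part, space_part)

# Two points given as an array of shape (3, 2): rows are t, x, y
pts = np.array([[0.0, 1.0],
                [0.0, 0.0],
                [0.0, 0.0]])
print(combined(pts))
```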

open_cp.kernels.compute_kth_distance(coords, k=15)¶

Find the (Euclidean) distance to the k-th nearest neighbour.

Parameters:	coords – An array of shape (n,N) of N points in n-dimensional space; if n=1 then the input is an array of shape (N).
	k – Which nearest neighbour to use; defaults to 15. If N is too small, N-1 is used instead.
Returns:	An array of shape (N) where the i-th entry is the distance from the i-th point to its k-th nearest neighbour.
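
A brute-force version of this computation can be sketched in plain NumPy. The function name `kth_distance` is hypothetical and this is not the library's implementation; it builds the full N by N distance matrix, so it is only suitable for small N.

```python
import numpy as np

def kth_distance(coords, k=15):
    """Hypothetical brute-force helper; O(N^2) memory, fine for small N."""
    pts = np.asarray(coords, dtype=float)
    if pts.ndim == 1:
        pts = pts[None, :]              # treat shape (N,) as shape (1, N)
    n_points = pts.shape[1]
    k = min(k, n_points - 1)            # fall back to N-1 when N is too small
    # Pairwise differences, shape (n, N, N), then Euclidean distances (N, N)
    diff = pts[:, :, None] - pts[:, None, :]
    dist = np.sqrt((diff ** 2).sum(axis=0))
    # Each sorted row starts with the point's zero distance to itself,
    # so index k is the distance to the k-th nearest neighbour
    return np.sort(dist, axis=1)[:, k]

print(kth_distance(np.array([0.0, 1.0, 3.0, 7.0]), k=2))  # [3. 2. 3. 6.]
```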

open_cp.kernels.compute_normalised_kth_distance(coords, k=15)¶

Find the (Euclidean) distance to the k-th nearest neighbour. The input data is first scaled so that each coordinate (independently) has unit sample variance.

Parameters:	coords – An array of shape (n,N) of N points in n-dimensional space; if n=1 then the input is an array of shape (N).
	k – Which nearest neighbour to use; defaults to 15. If N is too small, N-1 is used instead.
Returns:	An array of shape (N) where the i-th entry is the distance from the i-th point to its k-th nearest neighbour.
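
The scaling step can be sketched as below. The helper name `scale_to_unit_variance` is hypothetical, and the use of `ddof=1` (the unbiased sample variance) is an assumption about what "sample variance" means here.

```python
import numpy as np

def scale_to_unit_variance(coords):
    """Hypothetical helper: rescale each coordinate to unit sample variance."""
    pts = np.asarray(coords, dtype=float)
    # Each coordinate (row) is scaled independently; ddof=1 is assumed
    std = np.std(pts, axis=-1, ddof=1, keepdims=True)
    return pts / std

data = np.array([[1.0, 2.0, 3.0],
                 [10.0, 30.0, 50.0]])
scaled = scale_to_unit_variance(data)
print(np.var(scaled, axis=1, ddof=1))
```

Distances computed on the scaled array no longer depend on the units of each coordinate.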

open_cp.kernels.kth_nearest_neighbour_gaussian_kde(coords, k=15)¶

Estimate a kernel using variable bandwidth with a Gaussian kernel. The input data is scaled (independently in each coordinate) to have unit variance in each coordinate, and then the distance to the k-th nearest neighbour is found. The returned kernel is normalised, and is the sum of Gaussians centred on each data point, where each Gaussian's standard deviation in a given coordinate is the k-th nearest neighbour distance for that point multiplied by the standard deviation of the data in that coordinate.

See the Appendix of: Mohler et al., “Self-Exciting Point Process Modeling of Crime”, Journal of the American Statistical Association, 2011. DOI: 10.1198/jasa.2011.ap09546

Parameters:	coords – An array of shape (n,N) of N points in n-dimensional space; if n=1 then the input is an array of shape (N).
	k – Which nearest neighbour to use; defaults to 15. If N is too small, N-1 is used instead.
Returns:	A kernel object.
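
The steps described above can be sketched end to end in NumPy. This is a hypothetical re-implementation for illustration, not the library's code: the name `knn_gaussian_kde` is made up, it handles only n >= 2, it uses a brute-force neighbour search, it assumes `ddof=1` for the standard deviation, and it assumes the data points are distinct so that no bandwidth is zero.

```python
import numpy as np

def knn_gaussian_kde(coords, k=15):
    """Hypothetical sketch for coords of shape (n, N) with n >= 2."""
    pts = np.asarray(coords, dtype=float)
    n_dims, n_points = pts.shape
    k = min(k, n_points - 1)
    # Per-coordinate sample standard deviation, shape (n, 1)
    sigma = np.std(pts, axis=1, ddof=1, keepdims=True)
    scaled = pts / sigma
    # Brute-force distance to the k-th nearest neighbour in scaled space
    diff = scaled[:, :, None] - scaled[:, None, :]
    dist = np.sqrt((diff ** 2).sum(axis=0))
    kth = np.sort(dist, axis=1)[:, k]            # shape (N,)
    # Bandwidth of point i along axis j is kth[i] * sigma[j], shape (n, N)
    bandwidth = sigma * kth[None, :]

    def kernel(x):
        x = np.asarray(x, dtype=float)
        single = x.ndim == 1
        q = x.reshape(n_dims, -1)                # shape (n, Q)
        bw = bandwidth[:, None, :]
        z = (q[:, :, None] - pts[:, None, :]) / bw
        per_axis = np.exp(-0.5 * z ** 2) / (np.sqrt(2 * np.pi) * bw)
        # Product over axes, then average over data points to stay normalised
        values = per_axis.prod(axis=0).mean(axis=1)
        return values[0] if single else values

    return kernel

rng = np.random.default_rng(0)
data = rng.normal(size=(2, 50))      # made-up sample data
kde = knn_gaussian_kde(data, k=15)
print(kde(np.zeros(2)))
```

Points in sparse regions get a large k-th neighbour distance and therefore a wide Gaussian, while points in dense regions get a narrow one.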

open_cp.kernels.marginal_knng(coords, coord_index=0, k=15)¶

Computes a one-dimensional marginal for the kernel which would be returned by kth_nearest_neighbour_gaussian_kde(). Equivalent to, but much faster than, (numerically) integrating out all but one variable.

Parameters:	coords – An array of shape (n,N) of N points in n-dimensional space; if n=1 then the input is an array of shape (N).
	coord_index – Which coordinate to return the marginal for; defaults to 0, giving the first coordinate.
	k – Which nearest neighbour to use; defaults to 15. If N is too small, N-1 is used instead.
Returns:	A one-dimensional kernel.
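
The reason the marginal is cheap is that each component is a product of one-dimensional Gaussians, so integrating out the other coordinates leaves only the factor for the chosen coordinate. The sketch below illustrates this under stated assumptions: `marginal_of_product_gaussians` is a hypothetical helper that takes precomputed centres and per-axis standard deviations (both of shape (n,N)) rather than raw coordinates, and it averages over the N components.

```python
import numpy as np

def marginal_of_product_gaussians(centres, bandwidths, coord_index=0):
    """Hypothetical helper: centres and bandwidths are arrays of shape (n, N)."""
    mu = np.asarray(centres, dtype=float)[coord_index]      # shape (N,)
    sd = np.asarray(bandwidths, dtype=float)[coord_index]   # shape (N,)

    def kernel(x):
        x = np.asarray(x, dtype=float)
        z = (x[..., None] - mu) / sd
        # Every other axis integrates to one, so only this factor survives
        return (np.exp(-0.5 * z ** 2) / (np.sqrt(2 * np.pi) * sd)).mean(axis=-1)

    return kernel

# Made-up example: two components in two dimensions
centres = np.array([[0.0, 2.0], [5.0, 6.0]])
bandwidths = np.array([[1.0, 0.5], [2.0, 2.0]])
marginal = marginal_of_product_gaussians(centres, bandwidths, coord_index=0)
print(marginal(np.array([0.0, 1.0, 2.0])).shape)  # (3,)
```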