.. _voronoi_sampling:

=========================
Sampling in Voronoi grids
=========================

In RT computations, the need to uniformly sample points in the cells of a grid often arises. A typical example
is that of the evaluation of the average value of a function (e.g., a density function) within a cell.

Whereas this task is relatively simple in regular grids, the situation is more complicated in Voronoi grids.
Cells in :ref:`Voronoi grids <voronoi_grid>` are indeed convex polyhedra with an arbitrary number of faces,
whose shapes and volumes are determined by the intial distribution of sites.

The Voronoi helper class in Hyperion contains support for producing a list of sampling points for each cell.

Simple case: averaging of function
==================================

In the simplest case, we might want to evaluate a function in all cells (for example a function to find the density), and rather than using only the position of the sites, we want to use several random samples in each cell.

The easiest way to do this is to make use of the top-level :class:`~hyperion.grid.VoronoiGrid` class and the :meth:`~hyperion.grid.VoronoiGrid.evaluate_function_average` method. Suppose we have a density function defined as

.. code-block:: python

   def density_func(x,y,z):
      # Density is proportional to the inverse
      # square of the distance from the origin.
      density = 1 / (x*x + y*y + z*z)
      return density

We can now generate some random sites:

>>> import numpy as np
>>> N = 100000
>>> x = np.random.uniform(size=N)
>>> y = np.random.uniform(size=N)
>>> z = np.random.uniform(size=N)

and set up a voronoi grid:

>>> from hyperion.grid import VoronoiGrid
>>> g = VoronoiGrid(x, y, z)

We can then simply call the
:meth:`~hyperion.grid.VoronoiGrid.evaluate_function_average` method to evaluate
the function at ``n_samples`` sites in total, with a minimum of
``min_cell_samples`` samples in each cell::

    >>> dens = g.evaluate_function_average(density_func,
                                           n_samples=1000000,
                                           min_cell_samples=5)

Advanced: accessing the random samples directly
===============================================

In this example, we now show how to manually get the random samples in each cell and show to how to do the same function evaluation as above.

As before, we first initialise a cubic domain with unitary edge, and we fill it with randomly-placed points:

>>> import numpy as np
>>> N = 100000
>>> x = np.random.uniform(size=N)
>>> y = np.random.uniform(size=N)
>>> z = np.random.uniform(size=N)
>>> sites_arr = np.array([x, y, z]).transpose()

Next, we import the ``voronoi_helpers`` module and we compute the Voronoi grid ``g`` generated by the points in ``sites_arr``:

>>> from hyperion.grid import voronoi_helpers as vh
>>> g = vh.voronoi_grid(sites_arr, np.array([[0, 1.], [0, 1], [0, 1]]),n_samples=1000000,min_cell_samples=10)
INFO: Computing the tessellation via voro++ [hyperion.grid.voronoi_helpers]

Here we have passed to ``voronoi_grid`` two extra parameters related to the sampling:

* ``n_samples`` is the (approximate) *total* number of sampling points to be produced. For each cell, the algorithm will produce
  a number of sampling points that is proportional to the cell volume: larger cells will be sampled with more points
  than smaller cells;
* ``min_cell_samples`` is the *minimum* number of sampling points per cell. If a cell is small enough, it could be
  that no sampling points are allotted to it. This parameters forces the minimum number of sampling points to be
  allocated to a cell, regardless of its volume. The default value for this parameter, if omitted, is 10.

When a Voronoi grid is constructed with a positive ``n_samples`` parameters, it will expose two properties:

* ``samples`` is a list of three-dimensional vectors representing all sampling points in the domain;
* ``samples_idx`` is a list of indices describing, in sparse format, to which cell the points in ``samples``
  belong.

For instance, in the example above:

>>> g.samples
array([[ 0.57565603,  0.9219989 ,  0.15469812],
       [ 0.58406352,  0.91473664,  0.15834503],
       [ 0.57642814,  0.93045367,  0.16361907],
       ...,
       [ 0.80025712,  0.18526818,  0.61809793],
       [ 0.78721772,  0.18366617,  0.62582103],
       [ 0.79493898,  0.17735752,  0.62803905]])
>>> g.samples_idx
array([      0,      10,      20, ..., 1147105, 1147115, 1147131], dtype=int32)

This means that the sampling points for the first cell have indices 0 to 10 in ``g.samples``, the sampling
points for the second cell have indices 10 to 20 in ``g.samples``, and so on.

If now we suppose to have a density function defined as

.. code-block:: python

   def density_func(x,y,z):
      # Density is proportional to the inverse
      # square of the distance from the origin.
      density = 1 / (x*x + y*y + z*z)
      return density

where ``x``, ``y`` and ``z`` are 1-D numpy arrays of coordinates, we can then first compute the density at
all sampling points like this:

>>> dens_all = density_func(g.samples[:,0],g.samples[:,1],g.samples[:,2])

We can then compute the average density per cell with:

>>> dens_average = np.add.reduceat(dens_all, g.samples_idx[:-1]) / np.diff(g.samples_idx)

That is, ``dens_average`` will be an array of 1E5 elements each containing the average value of ``density_func()``
for each cell of the grid:

>>> dens_average
array([ 0.8288213 ,  3.24626334,  0.74344873, ...,  2.98673651,
        0.64962755,  0.96117706])
>>> len(dens_average)
100000