Dendrogram (grid_dendro.dendrogram)

Implements the Dendrogram class.

class grid_dendro.dendrogram.Dendrogram(arr, boundary_flag='periodic', verbose=True)

Dendrogram representing hierarchical structure in 3D data

nodes

Maps a node to flat indices of its member cells. {node: cells}

Type

dict

parent

Maps a node to its parent node. {node: node}

Type

dict

children

Maps a node to its child nodes. {node: list of nodes}

Type

dict

ancestor

Maps a node to its ancestor. {node: node}

Type

dict

descendants

Maps a node to its descendant nodes. {node: list of nodes}

Type

dict

leaves

List of nodes that do not have children.

Type

list

trunk

Id of the trunk node.

Type

int

minima

Flat indices at the local potential minima.

Type

set
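
As a rough illustration of how these dictionaries relate to one another, the sketch below builds a toy hierarchy by hand and derives the leaves and descendants from children. All node IDs and cell indices are invented, and `collect_descendants` is a hypothetical helper, not part of grid_dendro; it also assumes that leaves have no entry in children, which the source does not state.

```python
# Toy hierarchy: leaves 2 and 5 merge into node 7; node 7 and leaf 9 merge
# into the trunk, node 12. All IDs and cell indices are invented.
nodes = {2: [2, 1], 5: [5, 4], 9: [9, 8], 7: [7, 3], 12: [12, 11]}
parent = {2: 7, 5: 7, 7: 12, 9: 12}
children = {7: [2, 5], 12: [7, 9]}


def collect_descendants(node, children):
    """Return every node below `node`, depth first (hypothetical helper)."""
    out = []
    for child in children.get(node, []):
        out.append(child)
        out.extend(collect_descendants(child, children))
    return out


# Derived structures, mirroring the attribute descriptions above.
descendants = {n: collect_descendants(n, children) for n in nodes}
leaves = [n for n in nodes if not children.get(n)]
trunk = 12
```

In this toy tree, `descendants[12]` contains every other node, while each leaf maps to an empty list.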

construct()

Construct dendrogram

Constructs dendrogram dictionaries: nodes, parent, children, ancestor, and descendants. Then finds leaf nodes and trunk node.

Notes

int64 is used inside the construction loop, which seems to be faster on 64-bit systems; after construction, the values are cast to int32 or int64, depending on the size of the array, to save memory. Similarly, Python lists are used instead of NumPy arrays for efficient appends, and they are cast to NumPy arrays after construction to save memory. These memory optimizations are applied only to “nodes” and “cells_ordered”, which consume most of the memory.
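
The append-then-cast pattern described in these notes can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: `compact_indices` and `ncells_total` are hypothetical names, and the rule for choosing int32 is an assumption based on the largest possible flat index.

```python
import numpy as np


def compact_indices(values, ncells_total):
    """Cast a list of flat indices to int32 when it fits, else int64 (hypothetical helper)."""
    # The largest flat index in an array of ncells_total cells is ncells_total - 1.
    fits_int32 = ncells_total - 1 <= np.iinfo(np.int32).max
    return np.asarray(values, dtype=np.int32 if fits_int32 else np.int64)


cells = []                  # a list gives cheap appends while the tree is built
for idx in range(10):
    cells.append(idx)

# One cast after the loop; a 512**3 array is small enough for int32 indices.
cells = compact_indices(cells, ncells_total=512**3)
```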

Examples

>>> import pyathena as pa
>>> from grid_dendro import dendrogram
>>> s = pa.LoadSim("/scratch/smoon/M5J2P0N512")
>>> ds = s.load_hdf5(50, load_method='pyathena')
>>> gd = dendrogram.Dendrogram(ds.phigas.to_numpy())
>>> gd.construct()
>>> gd.prune()  # Remove buds

filter_data(dat, nodes, fill_value=nan, drop=False)

Filter data by node, including all descendant nodes.

Get all member cells of the nodes and their descendants and return the masked data.

Parameters
  • dat (xarray.DataArray or numpy.ndarray) – Input array to be filtered.

  • nodes (int or array of ints) – Selected nodes.

  • fill_value (float, optional) – The value to fill outside of the filtered region. Defaults to nan.

  • drop (bool, optional) – If True, return flattened data that include only the filtered cells.

Returns

out – Filtered array matching the input array type

Return type

xarray.DataArray or numpy.ndarray

See also

dendrogram.filter_by_dict

Notes

If dat is already a flattened 1-D array, return a 1-D array that contains only the filtered region. Otherwise, return an array with the original shape and data type unless `drop`=True.
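
The masking behaviour described above can be approximated in plain NumPy. The sketch below covers only the multi-dimensional numpy.ndarray case; it does not handle xarray input or the special treatment of 1-D input. `mask_by_cells` is a hypothetical helper that takes the flat cell indices directly rather than node IDs.

```python
import numpy as np


def mask_by_cells(dat, cells, fill_value=np.nan, drop=False):
    """Keep only the cells listed in `cells` (flat indices); hypothetical helper."""
    flat = np.asarray(dat, dtype=float).ravel()
    cells = np.asarray(cells, dtype=np.int64)
    if drop:
        # Flattened output that contains only the selected cells.
        return flat[cells]
    # Same shape as the input, with fill_value everywhere outside the selection.
    out = np.full(flat.shape, fill_value, dtype=float)
    out[cells] = flat[cells]
    return out.reshape(np.shape(dat))


dat = np.arange(27.0).reshape(3, 3, 3)
masked = mask_by_cells(dat, [0, 13, 26])
selected = mask_by_cells(dat, [0, 13, 26], drop=True)
```

Here `masked` keeps the 3D shape and holds nan outside the three selected cells, while `selected` is a 1-D array with just those three values.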

find_minimum(node)

Find the leaf that sits at the potential minimum of this node

If node is already a leaf, return itself.

Parameters

node (int) – ID of the node.

Returns

leaf – ID of the leaf node.

Return type

int

get_all_descendant_cells(node)

Return all member cells of the node, including descendant nodes

Parameters

node (int) – Id of a selected node.

Returns

cells – Flattened indices of all member cells of this node.

Return type

array
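
One way to picture this operation is as a concatenation of the node's own cells with those of each descendant. The sketch below assumes dictionaries shaped like the nodes and descendants attributes; `all_descendant_cells` is a hypothetical stand-alone function, the toy data are invented, and the ordering of the returned cells is an assumption.

```python
import numpy as np


def all_descendant_cells(node, nodes, descendants):
    """Concatenate the cells of `node` and of every descendant (hypothetical helper)."""
    members = [nodes[node]] + [nodes[d] for d in descendants.get(node, [])]
    return np.concatenate([np.asarray(m, dtype=np.int64) for m in members])


# Invented toy data: node 7 has two descendants, the leaves 2 and 5.
nodes = {2: [2, 1], 5: [5, 4, 6], 7: [7, 3]}
descendants = {7: [2, 5]}
cells = all_descendant_cells(7, nodes, descendants)
```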

len(node)

Returns the number of cells in a node (including descendants)

Parameters

node (int) – ID of a selected node.

Returns

Number of cells in this node

Return type

int

prune(ncells_min=27)

Prune the buds by applying a minimum cell count criterion

Parameters

ncells_min (int, optional) – Minimum number of cells of a leaf node.
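
The selection step implied by this criterion might look like the sketch below. It only identifies candidate buds; how the pruned cells are reassigned is not described here and is not modelled. `find_buds` is a hypothetical helper, the toy data are invented, and the strict `<` comparison is an assumption.

```python
def find_buds(leaves, nodes, ncells_min=27):
    """Return the leaves with fewer than `ncells_min` member cells (hypothetical helper)."""
    return [leaf for leaf in leaves if len(nodes[leaf]) < ncells_min]


# Invented toy data: leaf 2 has 40 cells, leaf 5 has only 5.
nodes = {2: list(range(40)), 5: list(range(40, 45)), 7: list(range(45, 60))}
leaves = [2, 5]
buds = find_buds(leaves, nodes)
```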

reindex(start_indices, target_shape, direction='forward')

Transform global indices to local indices and vice versa.

When only a part of the entire domain is loaded, the indices of the cells differ from the global indices. This function does the required reindexing. It assumes the index ordering is (k, j, i).

When direction='forward', the transform is from global to local. In this case, the node IDs are not transformed; only the values are transformed. The use case is to filter partially loaded data using the full dendrogram.

When direction='backward', the transform is from local to global. This is useful when a dendrogram is constructed using the local data cube. Both the keys and the values (i.e., node IDs and cell indices) are transformed in this case.

Parameters
  • start_indices (array-like) – (ks, js, is) of the local domain

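  • target_shape (array-like) – (nz, ny, nx) of the local domain

  • direction (str, optional) – 'forward' for global to local, or 'backward' for local to global. Default is 'forward'.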
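
The index arithmetic behind the two directions can be sketched with NumPy's flat-index helpers. This is an illustration under assumptions, not the method's implementation: the functions are hypothetical, `global_shape` is an extra parameter introduced for the example, cells outside the local box are simply dropped in the forward direction, and no periodic wrapping is applied.

```python
import numpy as np


def global_to_local(cells, global_shape, start_indices, target_shape):
    """Map flat global indices to flat local indices (hypothetical helper)."""
    start = np.reshape(start_indices, (3, 1))
    kji = np.array(np.unravel_index(np.asarray(cells), global_shape)) - start
    # Assumption: cells that fall outside the local box are discarded.
    inside = np.all((kji >= 0) & (kji < np.reshape(target_shape, (3, 1))), axis=0)
    return np.ravel_multi_index(tuple(kji[:, inside]), target_shape)


def local_to_global(cells, global_shape, start_indices, target_shape):
    """Map flat local indices back to flat global indices (hypothetical helper)."""
    start = np.reshape(start_indices, (3, 1))
    kji = np.array(np.unravel_index(np.asarray(cells), target_shape)) + start
    return np.ravel_multi_index(tuple(kji), global_shape)


# A 2x2x2 local box starting at (ks, js, is) = (1, 1, 1) inside a 4x4x4 domain.
local = global_to_local([0, 21, 42], (4, 4, 4), (1, 1, 1), (2, 2, 2))
back = local_to_global(local, (4, 4, 4), (1, 1, 1), (2, 2, 2))
```

In this example the global cell 0 lies outside the local box, so only two cells survive the forward transform, and the backward transform recovers their global indices.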

sibling(node)

Returns the sibling of a node

Parameters

node (int) – ID of a selected node.

Returns

ID of the sibling node

Return type

int
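
Given dictionaries shaped like the parent and children attributes, a sibling lookup could be sketched as below. `sibling_of` is a hypothetical helper, and it assumes every merger joins exactly two nodes, which the singular return value suggests but the source does not state outright.

```python
def sibling_of(node, parent, children):
    """Return the other child of `node`'s parent (hypothetical helper)."""
    siblings = [c for c in children[parent[node]] if c != node]
    if len(siblings) != 1:
        # Assumption: mergers are binary, so exactly one sibling is expected.
        raise ValueError(f"node {node} does not have exactly one sibling")
    return siblings[0]


# Invented toy data: leaves 2 and 5 share the parent node 7.
parent = {2: 7, 5: 7}
children = {7: [2, 5]}
```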

grid_dendro.dendrogram.filter_by_dict(dat, node_dict=None, cells=None, fill_value=nan, drop=False)

Filter data by node dictionary or array of cell indices.

This is a stand-alone filtering function that offers more flexibility than the Dendrogram.filter_data method.

Parameters
  • dat (xarray.DataArray or numpy.ndarray) – Input array to be filtered.

  • node_dict (dict, optional) – Dictionary that maps node id to flat indices of member cells.

  • cells (array, optional) – Flat indices of selected cells. Overrides node_dict.

  • fill_value (float, optional) – Value to fill outside of the filtered region. Default is np.nan.

  • drop (bool, optional) – If True, return flattened data that include only the selected cells.

Returns

out – Filtered array matching the input array type

Return type

xarray.DataArray or numpy.ndarray