Dendrogram (grid_dendro.dendrogram)
Implements the Dendrogram class.
- class grid_dendro.dendrogram.Dendrogram(arr, boundary_flag='periodic', verbose=True)
Dendrogram representing hierarchical structure in 3D data
- nodes
Maps a node to flat indices of its member cells. {node: cells}
- Type
dict
- parent
Maps a node to its parent node. {node: node}
- Type
dict
- children
Maps a node to its child nodes. {node: list of nodes}
- Type
dict
- ancestor
Maps a node to its ancestor. {node: node}
- Type
dict
- descendants
Maps a node to its descendant nodes. {node: list of nodes}
- Type
dict
- leaves
List of nodes that do not have children.
- Type
list
- trunk
Id of the trunk node.
- Type
int
- minima
Flat indices at the local potential minima.
- Type
set
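As a rough illustration of how these dictionaries relate to one another, the sketch below builds a made-up five-node tree by hand and derives parent, descendants, leaves, and trunk from the children mapping. The node IDs, the cell indices, and the helper collect_descendants are hypothetical; this is not how the library builds them.

```python
import numpy as np

# Hypothetical toy tree (IDs and cell indices are invented for illustration):
#         4 (trunk)
#        / \
#       2   3
#      / \
#     0   1
nodes = {
    0: np.array([0, 1]),      # flat indices of the cells owned by node 0
    1: np.array([5, 6]),
    2: np.array([2, 3, 4]),
    3: np.array([10, 11]),
    4: np.array([7, 8, 9]),
}
children = {0: [], 1: [], 2: [0, 1], 3: [], 4: [2, 3]}

# Invert children to get parent. How the library stores the trunk's
# parent is not stated above, so the trunk is simply left out here.
parent = {child: nd for nd, kids in children.items() for child in kids}


def collect_descendants(node):
    """Return every node below `node`, depth first (hypothetical helper)."""
    out = []
    for child in children[node]:
        out.append(child)
        out.extend(collect_descendants(child))
    return out


descendants = {nd: collect_descendants(nd) for nd in nodes}
leaves = [nd for nd in nodes if not children[nd]]        # nodes without children
trunk = next(nd for nd in nodes if nd not in parent)     # the node with no parent
```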
- construct()
Construct dendrogram
Constructs dendrogram dictionaries: nodes, parent, children, ancestor, and descendants. Then finds leaf nodes and trunk node.
Notes
The construction loop uses int64, which appears to be faster on 64-bit systems, and the result is cast to int32 or int64 depending on the size of the array once construction is done, in order to save memory. Similarly, lists are used instead of numpy arrays for efficient append operations and are cast to numpy arrays after construction to save memory. These memory optimizations are applied only to “nodes” and “cells_ordered”, which consume most of the memory.
Examples
>>> import pyathena as pa
>>> from grid_dendro import dendrogram
>>> s = pa.LoadSim("/scratch/smoon/M5J2P0N512")
>>> ds = s.load_hdf5(50, load_method='pyathena')
>>> gd = dendrogram.Dendrogram(ds.phigas.to_numpy())
>>> gd.construct()
>>> gd.prune()  # Remove buds
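The memory note for construct() describes an append-then-cast pattern. A minimal sketch of that pattern, using only NumPy, might look like the following; the helper smallest_index_dtype, the cube shape, and the stand-in loop are assumptions for illustration and are not taken from the library source.

```python
import numpy as np


def smallest_index_dtype(ncells):
    """Pick int32 when every flat index fits in it, otherwise int64.

    Hypothetical helper; the library's own selection rule may differ.
    """
    return np.int32 if ncells - 1 <= np.iinfo(np.int32).max else np.int64


shape = (4, 4, 4)                  # made-up small data cube
ncells = int(np.prod(shape))

members = []                       # Python list: cheap appends while growing
for idx in range(0, ncells, 3):    # stand-in for the real construction loop
    members.append(idx)

# Cast once, after the loop, to the most compact dtype that holds the indices.
members = np.asarray(members, dtype=smallest_index_dtype(ncells))
```

Appending to a list avoids reallocating a NumPy array on every step, and the single cast at the end keeps the stored indices compact.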
- filter_data(dat, nodes, fill_value=nan, drop=False)
Filter data by node, including all descendant nodes.
Get all member cells of the nodes and their descendants and return the masked data.
- Parameters
dat (xarray.DataArray or numpy.ndarray) – Input array to be filtered.
nodes (int or array of ints) – Selected nodes.
fill_value (float, optional) – The value to fill outside of the filtered region. Defaults to nan.
drop (bool) – If true, return flattened data that includes only the filtered cells.
- Returns
out – Filtered array matching the input array type
- Return type
xarray.DataArray or numpy.ndarray
See also
dendrogram.filter_by_dict
Notes
If dat is already a 1-D flattened array, return a 1-D array that contains only the filtered region. Otherwise, return an array with the original shape and data type unless drop=True.
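The effect of fill_value and drop can be pictured with a small NumPy-only sketch. The function filter_cells below is a hypothetical stand-in, not the library's implementation: it takes flat cell indices directly rather than node IDs, always works in floating point, and does not reproduce the special handling of 1-D input described in the note.

```python
import numpy as np


def filter_cells(dat, cells, fill_value=np.nan, drop=False):
    """Keep values at the flat indices `cells`; fill everything else.

    Hypothetical stand-in for the masking step only.
    """
    flat = np.asarray(dat, dtype=float).ravel()
    if drop:
        # Return only the selected cells as a 1-D array.
        return flat[cells]
    out = np.full(flat.shape, fill_value, dtype=float)
    out[cells] = flat[cells]
    return out.reshape(np.shape(dat))


dat = np.arange(27.0).reshape(3, 3, 3)
cells = np.array([0, 13, 26])                    # made-up member cells

masked = filter_cells(dat, cells)                # same shape, nan outside
dropped = filter_cells(dat, cells, drop=True)    # 1-D, selected cells only
```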
- find_minimum(node)
Find leaf that is at the potential minimum in this node
If node is already a leaf, return itself.
- Parameters
node (int) – ID of the node.
- Returns
leaf – ID of the leaf node.
- Return type
int
- get_all_descendant_cells(node)
Return all member cells of the node, including descendant nodes
- Parameters
node (int) – Id of a selected node.
- Returns
cells – Flattened indices of all member cells of this node.
- Return type
array
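Conceptually, gathering the cells of a node and of everything below it is a concatenation over the descendants mapping. The sketch below shows this on a made-up three-node tree; the function all_descendant_cells and the data are hypothetical and only mirror the behavior described here.

```python
import numpy as np

# Made-up tree: node 2 has two leaf children, 0 and 1.
nodes = {0: np.array([0, 1]), 1: np.array([5, 6]), 2: np.array([2, 3, 4])}
descendants = {0: [], 1: [], 2: [0, 1]}


def all_descendant_cells(node):
    """Concatenate the node's own cells with those of its descendants."""
    parts = [nodes[node]] + [nodes[nd] for nd in descendants[node]]
    return np.concatenate(parts)


cells = all_descendant_cells(2)
ncells = len(cells)    # cell count of the node including its descendants
```

The same count is what a descendant-inclusive node size would report for node 2 in this toy tree.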
- len(node)
Returns the number of cells in a node (including descendants)
- Parameters
node (int) – ID of a selected node.
- Returns
Number of cells in this node
- Return type
int
- prune(ncells_min=27)
Prune the buds by applying a minimum cell-count criterion.
- Parameters
ncells_min (int, optional) – Minimum number of cells of a leaf node.
- reindex(start_indices, target_shape, direction='forward')
Transform global indices to local indices and vice versa.
When only a part of the entire domain is loaded, the indices of the cells differ from the global indices. This function does the required reindexing. It assumes the index ordering is (k, j, i).
When direction='forward', the transform is from global to local. In this case, the node IDs are not transformed and only the values are transformed. The use case is to filter partially loaded data using the full dendrogram.
When direction='backward', the transform is from local to global. This is useful when a dendrogram is constructed using the local data cube. Both the keys and the values (i.e., node IDs and cell indices) are transformed in this case.
- Parameters
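start_indices (array-like) – Starting indices (ks, js, is) of the local domain
target_shape (array-like) – Shape (nz, ny, nx) of the local domain
direction (str, optional) – Either 'forward' (global to local) or 'backward' (local to global). Default is 'forward'.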
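The index arithmetic behind the two directions can be sketched with NumPy's unravel_index and ravel_multi_index. The functions global_to_local and local_to_global and the explicit global_shape argument are hypothetical, introduced only for this illustration; the sketch also assumes the local domain lies entirely inside the global one and ignores periodic wrapping.

```python
import numpy as np


def global_to_local(cells, global_shape, start_indices, target_shape):
    """Convert global flat indices to local ones, with (k, j, i) ordering."""
    k, j, i = np.unravel_index(cells, global_shape)
    ks, js, is_ = start_indices
    # ravel_multi_index raises if a cell falls outside the local domain.
    return np.ravel_multi_index((k - ks, j - js, i - is_), target_shape)


def local_to_global(cells, global_shape, start_indices, target_shape):
    """Convert local flat indices back to global ones."""
    k, j, i = np.unravel_index(cells, target_shape)
    ks, js, is_ = start_indices
    return np.ravel_multi_index((k + ks, j + js, i + is_), global_shape)


global_shape = (4, 4, 4)     # made-up full domain (nz, ny, nx)
start_indices = (1, 1, 1)    # made-up (ks, js, is) of the local cube
target_shape = (2, 2, 2)     # made-up local (nz, ny, nx)

global_cells = np.array([21, 42])    # global (1, 1, 1) and (2, 2, 2)
local_cells = global_to_local(global_cells, global_shape,
                              start_indices, target_shape)
round_trip = local_to_global(local_cells, global_shape,
                             start_indices, target_shape)
```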
- grid_dendro.dendrogram.filter_by_dict(dat, node_dict=None, cells=None, fill_value=nan, drop=False)
Filter data by node dictionary or array of cell indices.
This is a stand-alone filtering function that offers more flexibility than the Dendrogram.filter_data method.
- Parameters
dat (xarray.DataArray or numpy.ndarray) – Input array to be filtered.
node_dict (dict, optional) – Dictionary that maps node id to flat indices of member cells.
cells (array, optional) – Flat indices of selected cells. Overrides node_dict.
fill_value (float, optional) – Value to fill outside of the filtered region. Default is np.nan.
drop (bool, optional) – If true, return flattened data that includes only the selected cells.
- Returns
out – Filtered array matching the input array type
- Return type
xarray.DataArray or numpy.ndarray
See also