Raster Histograms

First, add the required depedencies

using Rasters, NCDatasets, Downloads, CairoMakie

and the RasterHistograms package.

using RasterHistograms

Using this package we can produce Histograms from data that is in a Raster, RasterStack or RasterSeries, which are named N-dimensional arrays, in a similar way that xhistogram works for xarray in python. This example is structured similarly to the xhistogram tutorial.

Randomly generated toy data

First we generate some randomly distributed data and form a Raster.

x, t = range(-2π, 2π; length = 50), range(0, 4π; length = 100)
dimensions = (X(x), Ti(t))
rs = Raster(randn(length(x), length(t)), dimensions; name = :Toy_data)
50×100 Raster{Float64,2} Toy_data with dimensions: 
  X Sampled{Float64} -6.283185307179586:0.2564565431501872:6.283185307179586 ForwardOrdered Regular Points,
  Ti Sampled{Float64} 0.0:0.12693303650867852:12.566370614359172 ForwardOrdered Regular Points
extent: Extent(X = (-6.283185307179586, 6.283185307179586), Ti = (0.0, 12.566370614359172))missingval: missingparent:
            0.0         0.126933  …  12.3125    12.4394     12.5664
 -6.28319  -1.55352     1.64812       2.14722   -0.426242   -1.07511
 -6.02673  -0.695644   -0.283337      1.11782   -0.964371   -1.08192
 -5.77027  -0.0306251  -0.828464      0.301694   0.0992767  -0.160908
 -5.51382  -0.474193   -1.1104        0.110742   2.16118     0.436497
 -5.25736   0.347457   -0.741123  …   0.729264   0.985316    1.31704
  ⋮                               ⋱              ⋮          
  5.0009    1.02661    -1.53722      -0.578545  -0.2662     -0.111284
  5.25736   0.326418    0.233874      0.61121    0.547834   -0.50282
  5.51382  -1.91073    -0.504785      1.18482    0.878692   -0.778682
  5.77027   0.826699   -1.23202   …   1.89936   -1.15832    -1.2568
  6.02673  -2.20647    -1.56199      -0.863645   1.60535    -0.204472
  6.28319   0.159996   -0.408273      0.084944  -0.769416   -0.259385

The we can form a RasterLayerHistogram for the :Toy_data

rs_hist = RasterLayerHistogram(rs)
RasterLayerHistogram for the variable Toy_data
┣━━ Layer dimensions: (:X, :Ti) 
┣━━━━━━━━ Layer size: (50, 100)
┗━━━━━━━━━ Histogram: 1-dimensional

We can then plot the data and the Histogram

fig = Figure(size = (1000, 600))
ax1 = Axis(fig[1, 1];
           title = "Toy data",
           xlabel = "x",
           ylabel = "time")
hm = heatmap!(ax1, x, t, rs.data)
Colorbar(fig[2, 1], hm; vertical = false, flipaxis = false)
ax2 = Axis(fig[1, 2];
          title = "Histogram of Toy data",
          xlabel = "Toy data", ylabel = "Frequency")
plot!(ax2, rs_hist; color = :steelblue)
fig

By default we have a frequency Histogram. We can normalise the Histogram by calling the normalize! function on rs_hist and choosing a mode of normalisation. For more information about the possible modes of normalisation see here.

normalize!(rs_hist; mode = :pdf)

Then replot with the normalised histogram

fig = Figure(size = (900, 600))
ax1 = Axis(fig[1, 1];
           title = "Toy data",
           xlabel = "x",
           ylabel = "time")
hm = heatmap!(ax1, x, t, rs.data)
Colorbar(fig[2, 1], hm; vertical = false, flipaxis = false)
ax2 = Axis(fig[1, 2];
          title = "Histogram (pdf) of Toy data",
          xlabel = "Toy data", ylabel = "density")
plot!(ax2, rs_hist; color = :steelblue)
fig
Info

Plotting using Plots.jl is also possible. See the module documentation for more info.

Real world data example

We now look at temperature and salinity distributions using ECCOv4r4 [1], [2], [3] model output. The function uses try to download from NASA EarthData but this sometimes fails during the docs build.

Info

See the NCDatasets.jl example for information on how to download data from NASA EarthData.

function download_ECCO()

    try
        Downloads.download("https://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%2520Ocean%2520Temperature%2520and%2520Salinity%2520-%2520Daily%2520Mean%25200.5%2520Degree%2520(Version%25204%2520Release%25204)/granules/OCEAN_TEMPERATURE_SALINITY_day_mean_2007-01-01_ECCO_V4r4_latlon_0p50deg.dap.nc4", "ECCO_data.nc")
    catch
        @info "dowloading from drive"
        Downloads.download("https://drive.google.com/uc?id=1MNeThunqpY-nFzsZLZj9BV8sM5BJgnxT&export=download", "ECCO_data.nc")
    end

    return nothing

end
download_ECCO()
[ Info: dowloading from drive

This example shows how the package works for 2-dimensional Histograms though it can be generalised to N dimensions depending on the number of variables (i.e. layers in the RasterStack) one is looking at.

Forming the RasterStack

We form a RasterStack with only the :SALT (practical salinity) and :THETA (potential temperature) layers. This means the resulting RasterStackHistogram will be 2 dimensional. Note the order of the variables matters here for plotting purposes. The first layer, in this case :SALT will be the x-axis, and the second layer :THETA will be the y-axis.

stack_TS = RasterStack("ECCO_data.nc", name = (:SALT, :THETA))
edges = (31:0.025:38, -2:0.1:32)
stack_hist = RasterStackHistogram(stack_TS, edges)
RasterStackHistogram for the variables (:SALT, :THETA)
┣━━ Stack dimensions: (:X, :Y, :Z, :Ti)
┣━━ Stack layer size: (720, 360, 50, 1)
┗━━━━━━━━━ Histogram: 2-dimensional

Now we can plot, the histogram and look at the unweighted distribtution of temperature and salinity. By default the empty bins are plotted with the value of zero.

fig = Figure(size = (500, 500))
ax = Axis(fig[1, 1];
          title = "Temperature and salinity joint distribution (unweighted)",
          xlabel = "Practical salinity (psu)",
          ylabel = "Potential temperature (°C)")
hm = heatmap!(ax, stack_hist; colorscale = log10)
Colorbar(fig[1, 2], hm)
fig

Weighting the Histogram

The module also exports simple functions for calculating area and volume weights from the dimensions of the grid which can be used to weight an AbstractRasterHistogram. Where weights are available from model data they should be used in favour of the functions.

dV = volume_weights(stack_TS)
weighted_stack_hist = RasterStackHistogram(stack_TS, dV, edges)
fig = Figure(size = (500, 500))
ax = Axis(fig[1, 1];
          title = "Temperature and salinity joint distribution (weighted)",
          xlabel = "Practical salinity (psu)",
          ylabel = "Potential temperature (°C)")
hm = heatmap!(ax, weighted_stack_hist; colorscale = log10)
Colorbar(fig[1, 2], hm)
fig
[1]
I. Fukumori, O. Wang, G. Forget, P. Heimbach, R. M. Ponte and 2022}. {ECCO Version 4 Release 4 Dataset}. (2022).
[2]
I. Fukumori, O. Wang, I. Fenty, G. Forget, P. Heimbach and R. M. Ponte. {Synopsis of the ECCO Central Production Global Ocean and Sea-Ice State Estimate, Version 4 Release 4 (Version 4 Release 4).}. Zenodo 3, 1–17 (2021).
[3]
G. Forget, J. M. Campin, P. Heimbach, C. N. Hill, R. M. Ponte and C. Wunsch. {ECCO version 4: An integrated framework for non-linear inverse modeling and global ocean state estimation}. Geoscientific Model Development 8, 3071–3104 (2015).

This page was generated using Literate.jl.