# Computational Tricks with Turing (Non-Centered Parametrization and QR Decomposition)

There are some computational tricks that we can employ with Turing. Here I will cover two of them:

1. QR Decomposition

2. Non-Centered Parametrization

## QR Decomposition

Back in "Linear Algebra 101" we learned that any matrix (even rectangular ones) can be factored into the product of two matrices:

• $\mathbf{Q}$: an orthogonal matrix (its columns are orthogonal unit vectors, meaning $\mathbf{Q}^T = \mathbf{Q}^{-1}$).

• $\mathbf{R}$: an upper triangular matrix.

This is commonly known as the QR Decomposition:

$\mathbf{A} = \mathbf{Q} \cdot \mathbf{R}$

Let me show you an example with a random matrix $\mathbf{A} \in \mathbb{R}^{3 \times 2}$:

A = rand(3, 2)
3×2 Matrix{Float64}:
0.706461  0.417284
0.695988  0.436497
0.878977  0.196846

Now let's factor A using LinearAlgebra's qr() function:

using LinearAlgebra: qr, I
Q, R = qr(A)
LinearAlgebra.QRCompactWY{Float64, Matrix{Float64}, Matrix{Float64}}
Q factor:
3×3 LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}:
-0.533108  -0.421251  -0.733719
-0.525205  -0.515123   0.677354
-0.663292   0.746455   0.053373
R factor:
2×2 Matrix{Float64}:
-1.32517  -0.582274
0.0      -0.253694

Notice that qr() produced a tuple containing two matrices: Q and R. Q is a 3×3 orthogonal matrix, and R is a 2×2 upper triangular matrix, so that $\mathbf{Q}^T = \mathbf{Q}^{-1}$ (the transpose equals the inverse):

Matrix(Q') ≈ Matrix(Q^-1)
true

Also note that $\mathbf{Q}^T \cdot \mathbf{Q} = \mathbf{I}$ (identity matrix):

Q' * Q ≈ I(3)
true
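
To make this concrete, here is a quick check of my own (not part of the original example): the columns of the thin $\mathbf{Q}$ factor are orthonormal, so their pairwise dot products vanish and each column has unit norm:

```julia
Qm = Matrix(Q)        # thin 3×2 Q factor as a plain matrix
Qm[:, 1]' * Qm[:, 2]  # ≈ 0.0 (columns are orthogonal)
Qm[:, 1]' * Qm[:, 1]  # ≈ 1.0 (columns have unit norm)
```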

This is nice. But what can we do with QR decomposition? It can speed up Turing's sampling by a huge factor while also decorrelating the columns of $\mathbf{X}$, i.e. the independent variables. The orthogonal nature of QR decomposition alters the posterior's topology and makes it easier for HMC or other MCMC samplers to explore it. Let's see how fast we can get with QR decomposition. First, let's go back to the kidiq example in 6. Bayesian Linear Regression:

using Turing
using LinearAlgebra: I
using Statistics: mean, std
using Random: seed!
seed!(123)

@model function linreg(X, y; predictors=size(X, 2))
    #priors
    α ~ Normal(mean(y), 2.5 * std(y))
    β ~ filldist(TDist(3), predictors)
    σ ~ Exponential(1)

    #likelihood
    return y ~ MvNormal(α .+ X * β, σ^2 * I)
end;
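
A brief aside of my own (not from the original): because the covariance matrix $\sigma^2 \mathbf{I}$ is diagonal, the vectorized MvNormal likelihood above is equivalent to giving each observation its own independent Normal likelihood. A minimal sketch of that scalar formulation, for intuition only since the vectorized form is faster:

```julia
using Turing
using Statistics: mean, std
using LinearAlgebra: dot

# hypothetical scalar-likelihood version of linreg, for illustration only
@model function linreg_scalar(X, y; predictors=size(X, 2))
    #priors
    α ~ Normal(mean(y), 2.5 * std(y))
    β ~ filldist(TDist(3), predictors)
    σ ~ Exponential(1)

    #likelihood: one independent Normal per observation
    for i in eachindex(y)
        y[i] ~ Normal(α + dot(X[i, :], β), σ)
    end
end;
```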

using DataFrames
using CSV
using HTTP

url = "https://raw.githubusercontent.com/storopoli/Bayesian-Julia/master/datasets/kidiq.csv"
kidiq = CSV.read(HTTP.get(url).body, DataFrame)
X = Matrix(select(kidiq, Not(:kid_score)))
y = kidiq[:, :kid_score]
model = linreg(X, y)
chain = sample(model, NUTS(), MCMCThreads(), 1_000, 4)
Chains MCMC chain (1000×17×4 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 10.19 seconds
Compute duration  = 16.15 seconds
parameters        = α, β[1], β[2], β[3], σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters      mean       std   naive_se      mcse         ess      rhat   ess_per_sec
Symbol   Float64   Float64    Float64   Float64     Float64   Float64       Float64

   α   21.5919    8.7175     0.1378    0.2040   1753.4216    1.0017      108.5441
β[1]    2.0147    1.8092     0.0286    0.0452   1914.7422    1.0013      118.5305
β[2]    0.5808    0.0584     0.0009    0.0012   2148.1559    1.0006      132.9798
β[3]    0.2437    0.3056     0.0048    0.0070   2166.7021    1.0009      134.1279
   σ   17.8928    0.5896     0.0093    0.0115   2708.8277    1.0001      167.6877

Quantiles
parameters      2.5%     25.0%     50.0%     75.0%     97.5%
Symbol   Float64   Float64   Float64   Float64   Float64

   α    4.3464   15.7517   21.7247   27.2853   38.7344
β[1]   -0.6530    0.7333    1.6501    3.0029    6.3849
β[2]    0.4687    0.5416    0.5794    0.6197    0.6980
β[3]   -0.3336    0.0285    0.2439    0.4560    0.8316
   σ   16.7749   17.4893   17.8613   18.2915   19.0887


See the wall duration in Turing's chain: for me it took around 10 seconds.

Now let us incorporate the QR decomposition into the linear regression model. Here I will use the "thin" instead of the "fat" QR decomposition, and additionally scale the $\mathbf{Q}$ and $\mathbf{R}$ matrices by a factor of $\sqrt{n-1}$, where $n$ is the number of rows of $\mathbf{X}$. In practice, the thin QR decomposition is preferred over the fat one because it is numerically more stable. Mathematically, the scaled thin QR decomposition is:

$$\begin{aligned}
\mathbf{X} &= \mathbf{Q}^* \mathbf{R}^* \\
\mathbf{Q}^* &= \mathbf{Q} \cdot \sqrt{n - 1} \\
\mathbf{R}^* &= \frac{1}{\sqrt{n - 1}} \cdot \mathbf{R} \\
\boldsymbol{\mu} &= \alpha + \mathbf{X} \cdot \boldsymbol{\beta} \\
&= \alpha + \mathbf{Q}^* \cdot \mathbf{R}^* \cdot \boldsymbol{\beta} \\
&= \alpha + \mathbf{Q}^* \cdot (\mathbf{R}^* \cdot \boldsymbol{\beta}) \\
&= \alpha + \mathbf{Q}^* \cdot \widetilde{\boldsymbol{\beta}}
\end{aligned}$$

Then we can recover the original $\boldsymbol{\beta}$ with:

$\boldsymbol{\beta} = \mathbf{R}^{*-1} \cdot \widetilde{\boldsymbol{\beta}}$

In Turing, a model with QR decomposition is the same linreg but with a different X matrix supplied, since it is a data transformation. First, we decompose the model data X into Q and R:

Q, R = qr(X)
Q_ast = Matrix(Q) * sqrt(size(X, 1) - 1)
R_ast = R / sqrt(size(X, 1) - 1);
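
As a quick sanity check (my own addition, not in the original), the $\sqrt{n-1}$ factors cancel, so the scaled factors still reproduce $\mathbf{X}$:

```julia
# Q* R* = (Q √(n-1)) · (R / √(n-1)) = Q R = X
Q_ast * R_ast ≈ X  # should return true
```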

Then, we instantiate a model with Q* instead of X and sample as usual:

model_qr = linreg(Q_ast, y)
chain_qr = sample(model_qr, NUTS(1_000, 0.65), MCMCThreads(), 1_000, 4)
Chains MCMC chain (1000×17×4 Array{Float64, 3}):

Iterations        = 1001:1:2000
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 5.16 seconds
Compute duration  = 8.67 seconds
parameters        = α, β[1], β[2], β[3], σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters       mean       std   naive_se      mcse         ess      rhat   ess_per_sec
Symbol    Float64   Float64    Float64   Float64     Float64   Float64       Float64

   α    32.9480    7.7468     0.1225    0.2364   1152.1382    1.0028      132.9032
β[1]   -49.9599    6.9387     0.1097    0.2156   1136.4348    1.0032      131.0918
β[2]    22.0431    3.5609     0.0563    0.1093   1155.7501    1.0035      133.3199
β[3]     0.2728    0.8819     0.0139    0.0237   1398.8210    1.0020      161.3590
   σ    17.8491    0.6193     0.0098    0.0131   2365.8711    1.0004      272.9116

Quantiles
parameters       2.5%      25.0%      50.0%      75.0%      97.5%
Symbol    Float64    Float64    Float64    Float64    Float64

   α    18.5380    27.7723    32.6057    37.9096    49.3423
β[1]   -62.7857   -54.5494   -50.2855   -45.4741   -35.3443
β[2]    14.7913    19.7396    22.1681    24.4313    28.6965
β[3]    -1.4185    -0.2770     0.2243     0.7787     2.1571
   σ    16.6845    17.4242    17.8210    18.2614    19.0900


See the wall duration in Turing's chain_qr: for me it took around 5 seconds. Much faster than the regular linreg. Now we have to reconstruct our $\boldsymbol{\beta}$s:

betas = mapslices(
    x -> R_ast^-1 * x, chain_qr[:, namesingroup(chain_qr, :β), :].value.data; dims=[2]
)
chain_beta = setrange(
    Chains(betas, ["real_β[$i]" for i in 1:size(Q_ast, 2)]), 1_001:1:2_000
)
chain_qr_reconstructed = hcat(chain_beta, chain_qr)
Chains MCMC chain (1000×20×4 Array{Float64, 3}):

Iterations        = 1001:1:2000
Number of chains  = 4
Samples per chain = 1000
parameters        = real_β[1], real_β[2], real_β[3], α, β[1], β[2], β[3], σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters       mean       std   naive_se      mcse         ess      rhat
Symbol    Float64   Float64    Float64   Float64     Float64   Float64

real_β[1]     6.2660    2.2533     0.0356    0.0315   3970.3435    1.0002
real_β[2]     0.5045    0.0632     0.0010    0.0015   1846.1138    1.0017
real_β[3]    -0.0666    0.2154     0.0034    0.0058   1398.8210    1.0020
        α    32.9480    7.7468     0.1225    0.2364   1152.1382    1.0028
     β[1]   -49.9599    6.9387     0.1097    0.2156   1136.4348    1.0032
     β[2]    22.0431    3.5609     0.0563    0.1093   1155.7501    1.0035
     β[3]     0.2728    0.8819     0.0139    0.0237   1398.8210    1.0020
        σ    17.8491    0.6193     0.0098    0.0131   2365.8711    1.0004

Quantiles
parameters       2.5%      25.0%      50.0%      75.0%      97.5%
Symbol    Float64    Float64    Float64    Float64    Float64

real_β[1]     1.9309     4.7406     6.2556     7.7621    10.6728
real_β[2]     0.3815     0.4635     0.5050     0.5455     0.6267
real_β[3]    -0.5268    -0.1902    -0.0548     0.0676     0.3464
        α    18.5380    27.7723    32.6057    37.9096    49.3423
     β[1]   -62.7857   -54.5494   -50.2855   -45.4741   -35.3443
     β[2]    14.7913    19.7396    22.1681    24.4313    28.6965
     β[3]    -1.4185    -0.2770     0.2243     0.7787     2.1571
        σ    16.6845    17.4242    17.8210    18.2614    19.0900


## Non-Centered Parametrization

Now let us explore Non-Centered Parametrization (NCP). This is useful when the posterior's topology is very difficult to explore, i.e. it has regions where the HMC sampler has to drastically change the number of leapfrog steps $L$ and the step size $\epsilon$. One of the most infamous cases, which I showed in 5. Markov Chain Monte Carlo (MCMC), is Neal's Funnel (Neal, 2003):

using CairoMakie
using Distributions
funnel_y = rand(Normal(0, 3), 10_000)
funnel_x = rand(Normal(), 10_000) .* exp.(funnel_y / 2)

f, ax, s = scatter(
    funnel_x,
    funnel_y;
    color=(:steelblue, 0.3),
    axis=(; xlabel=L"X", ylabel=L"Y", limits=(-100, 100, nothing, nothing)),
)

[Figure: Neal's Funnel]

Here we see that in the upper part of the funnel HMC has to take few steps $L$ and be more liberal with the $\epsilon$ factor. The opposite holds in the lower part of the funnel: way more steps $L$ and a more conservative $\epsilon$ factor.

To see the devil's funnel (as it is known in some Bayesian circles) in action, let's code it in Turing and then sample:

@model function funnel()
    y ~ Normal(0, 3)
    return x ~ Normal(0, exp(y / 2))
end

chain_funnel = sample(funnel(), NUTS(), MCMCThreads(), 1_000, 4)
Chains MCMC chain (1000×14×4 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 5.03 seconds
Compute duration  = 9.14 seconds
parameters        = y, x
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters      mean       std   naive_se      mcse        ess      rhat   ess_per_sec
Symbol   Float64   Float64    Float64   Float64    Float64   Float64       Float64

y    0.9349    2.5471     0.0403    0.1919   132.2382    1.0329       14.4665
x   -0.3946    6.4402     0.1018    0.3474   280.4549    1.0046       30.6810

Quantiles
parameters       2.5%     25.0%     50.0%     75.0%     97.5%
Symbol    Float64   Float64   Float64   Float64   Float64

y    -4.0096   -0.8760    0.7658    2.6657    6.0989
x   -14.0106   -1.1051   -0.0337    0.7439   10.4440


Wow, take a look at those rhat values...
That sucks: the rhat for y is well above 1.01, even with 4 parallel chains of 1,000 iterations! How do we deal with that? We reparametrize! Note that any normal distribution can be written as a scaled and shifted standard normal: $\text{Normal}(\mu, \sigma) = \text{Standard Normal} \cdot \sigma + \mu$, where the standard normal is the normal with mean $\mu = 0$ and standard deviation $\sigma = 1$. This is why it is called Non-Centered Parametrization: we "decouple" the parameters and reconstruct them afterwards.

@model function ncp_funnel()
    x̃ ~ Normal()
    ỹ ~ Normal()
    y = 3.0 * ỹ                # implies y ~ Normal(0, 3)
    return x = exp(y / 2) * x̃  # implies x ~ Normal(0, exp(y / 2))
end

chain_ncp_funnel = sample(ncp_funnel(), NUTS(), MCMCThreads(), 1_000, 4)
Chains MCMC chain (1000×14×4 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 4.06 seconds
Compute duration  = 7.74 seconds
parameters        = x̃, ỹ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters      mean       std   naive_se      mcse         ess      rhat   ess_per_sec
Symbol   Float64   Float64    Float64   Float64     Float64   Float64       Float64

x̃    0.0130    0.9902     0.0157    0.0182   3639.7339    1.0012      470.0677
ỹ   -0.0447    0.9978     0.0158    0.0160   3864.2412    1.0000      499.0625

Quantiles
parameters      2.5%     25.0%     50.0%     75.0%     97.5%
Symbol   Float64   Float64   Float64   Float64   Float64

x̃   -1.9084   -0.6314    0.0152    0.6657    1.9736
ỹ   -2.0144   -0.7034   -0.0420    0.6243    1.9661


Much better now: all the rhats are well below 1.01. How would we implement this in a real-world model in Turing? Let's go back to the cheese random-intercept model in 10. Multilevel Models (a.k.a. Hierarchical Models). Here was the approach that we took, also known as Centered Parametrization (CP):

@model function varying_intercept(
    X, idx, y; n_gr=length(unique(idx)), predictors=size(X, 2)
)
    #priors
    α ~ Normal(mean(y), 2.5 * std(y))       # population-level intercept
    β ~ filldist(Normal(0, 2), predictors)  # population-level coefficients
    σ ~ Exponential(std(y))                 # residual SD
    #prior for variance of random intercepts
    #usually requires thoughtful specification
    τ ~ truncated(Cauchy(0, 2); lower=0)    # group-level SDs intercepts
    αⱼ ~ filldist(Normal(0, τ), n_gr)       # CP group-level intercepts

    #likelihood
    ŷ = α .+ X * β .+ αⱼ[idx]
    return y ~ MvNormal(ŷ, σ^2 * I)
end;

To perform a Non-Centered Parametrization (NCP) in this model we do it as follows:

@model function varying_intercept_ncp(
    X, idx, y; n_gr=length(unique(idx)), predictors=size(X, 2)
)
    #priors
    α ~ Normal(mean(y), 2.5 * std(y))       # population-level intercept
    β ~ filldist(Normal(0, 2), predictors)  # population-level coefficients
    σ ~ Exponential(std(y))                 # residual SD
    #prior for variance of random intercepts
    #usually requires thoughtful specification
    τ ~ truncated(Cauchy(0, 2); lower=0)    # group-level SDs intercepts
    zⱼ ~ filldist(Normal(0, 1), n_gr)       # NCP group-level intercepts

    #likelihood
    ŷ = α .+ X * β .+ zⱼ[idx] .* τ
    return y ~ MvNormal(ŷ, σ^2 * I)
end;

Here we are using an NCP with the zⱼs following a standard normal, and we reconstruct the group-level intercepts by multiplying the zⱼs by τ. Since the original αⱼs had a prior centered on 0 with standard deviation τ, we only have to multiply by τ to get back the αⱼs. Now let's see how the NCP compares to the CP.
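If you want to convince yourself of the location-scale identity that NCP relies on, here is a quick simulation sketch of my own (not part of the original; the values $\mu = 10$ and $\sigma = 3$ are arbitrary):

```julia
using Distributions
using Statistics: mean, std
using Random: seed!

seed!(123)
z = rand(Normal(), 100_000)  # standard normal draws
x = 3.0 .* z .+ 10.0         # Standard Normal · σ + μ with σ = 3, μ = 10
mean(x), std(x)              # ≈ (10.0, 3.0)
```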
First, let's redo our CP hierarchical model:

url = "https://raw.githubusercontent.com/storopoli/Bayesian-Julia/master/datasets/cheese.csv"
cheese = CSV.read(HTTP.get(url).body, DataFrame)

for c in unique(cheese[:, :cheese])
    cheese[:, "cheese_$c"] = ifelse.(cheese[:, :cheese] .== c, 1, 0)
end

cheese[:, :background_int] = map(cheese[:, :background]) do b
    if b == "rural"
        1
    elseif b == "urban"
        2
    else
        missing
    end
end

X = Matrix(select(cheese, Between(:cheese_A, :cheese_D)));
y = cheese[:, :y];
idx = cheese[:, :background_int];

model_cp = varying_intercept(X, idx, y)
chain_cp = sample(model_cp, NUTS(), MCMCThreads(), 1_000, 4)
Chains MCMC chain (1000×21×4 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 10.52 seconds
Compute duration  = 17.18 seconds
parameters        = α, β[1], β[2], β[3], β[4], σ, τ, αⱼ[1], αⱼ[2]
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters       mean       std   naive_se      mcse         ess      rhat   ess_per_sec
Symbol    Float64   Float64    Float64   Float64     Float64   Float64       Float64

    α    70.6428    5.0487     0.0798    0.1737    770.1477    1.0046       44.8360
 β[1]     2.9791    1.3144     0.0208    0.0258   2172.3735    1.0004      126.4699
 β[2]   -10.5427    1.3601     0.0215    0.0303   2084.5327    1.0002      121.3560
 β[3]     6.5850    1.3731     0.0217    0.0292   2098.4036    1.0007      122.1636
 β[4]     1.1433    1.3117     0.0207    0.0299   2069.7945    1.0003      120.4980
    σ     7.3996    0.4542     0.0072    0.0080   2613.2512    1.0008      152.1366
    τ     6.3827    8.5037     0.1345    0.3209    800.8580    1.0184       46.6239
αⱼ[1]    -3.3743    4.9339     0.0780    0.1727    757.3016    1.0050       44.0881
αⱼ[2]     3.6766    4.9634     0.0785    0.1742    748.6351    1.0056       43.5836

Quantiles
parameters       2.5%      25.0%      50.0%     75.0%     97.5%
Symbol    Float64    Float64    Float64   Float64   Float64

    α    60.7302    68.3134    70.7241   73.0491   80.2703
 β[1]     0.3954     2.1190     2.9438    3.8796    5.5242
 β[2]   -13.2151   -11.4684   -10.5399   -9.6328   -7.8694
 β[3]     3.9123     5.6639     6.5755    7.4932    9.2530
 β[4]    -1.4944     0.2537     1.1339    2.0287    3.6481
    σ     6.5570     7.0811     7.3764    7.6938    8.3492
    τ     1.8508     3.2115     4.5971    6.8672   20.0755
αⱼ[1]   -12.7622    -5.6539    -3.3432   -1.1848    6.2812
αⱼ[2]    -5.7783     1.3861     3.5480    5.7955   13.6332


Now let's do the NCP hierarchical model:

model_ncp = varying_intercept_ncp(X, idx, y)
chain_ncp = sample(model_ncp, NUTS(), MCMCThreads(), 1_000, 4)
Chains MCMC chain (1000×21×4 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 15.18 seconds
Compute duration  = 27.18 seconds
parameters        = α, β[1], β[2], β[3], β[4], σ, τ, zⱼ[1], zⱼ[2]
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters       mean       std   naive_se      mcse         ess      rhat   ess_per_sec
Symbol    Float64   Float64    Float64   Float64     Float64   Float64       Float64

    α    70.6381    3.9661     0.0627    0.2270    276.3623    1.0032       10.1686
 β[1]     3.0184    1.3067     0.0207    0.0343   1430.7232    1.0010       52.6427
 β[2]   -10.5484    1.3543     0.0214    0.0345   1795.1054    1.0000       66.0499
 β[3]     6.5794    1.3084     0.0207    0.0366   1155.6678    1.0000       42.5222
 β[4]     1.1558    1.3067     0.0207    0.0347   1472.3222    1.0000       54.1733
    σ     7.3846    0.4668     0.0074    0.0103   2309.3984    1.0003       84.9731
    τ     5.0347    2.4221     0.0383    0.1032    576.5776    1.0026       21.2149
zⱼ[1]    -0.8461    0.8249     0.0130    0.0371    543.0646    1.0028       19.9818
zⱼ[2]     0.8675    0.8002     0.0127    0.0256    989.2219    1.0007       36.3979

Quantiles
parameters       2.5%      25.0%      50.0%     75.0%     97.5%
Symbol    Float64    Float64    Float64   Float64   Float64

    α    61.2831    68.3926    70.8893   73.1427   78.1953
 β[1]     0.5016     2.1144     3.0451    3.9067    5.4755
 β[2]   -13.1368   -11.4724   -10.5733   -9.6324   -7.7688
 β[3]     4.0680     5.6634     6.5795    7.4830    9.0380
 β[4]    -1.4639     0.3336     1.1558    2.0353    3.7254
    σ     6.5451     7.0536     7.3621    7.6882    8.3494
    τ     1.8278     3.2063     4.4428    6.4645   11.0497
zⱼ[1]    -2.5191    -1.3926    -0.8403   -0.2763    0.7263
zⱼ[2]    -0.6093     0.2939     0.8572    1.4003    2.4570


Notice that some models are better off with a standard Centered Parametrization (as is our cheese case here), while others are better off with a Non-Centered Parametrization. But now you know how to apply both parametrizations in Turing. Before we conclude, we need to recover our original αⱼs. We can do this by multiplying the zⱼs by τ:

τ = summarystats(chain_ncp)[:τ, :mean]
αⱼ = mapslices(
    x -> x * τ, chain_ncp[:, namesingroup(chain_ncp, :zⱼ), :].value.data; dims=[2]
)
chain_ncp_reconstructed = hcat(
    MCMCChains.resetrange(chain_ncp), Chains(αⱼ, ["αⱼ[$i]" for i in 1:length(unique(idx))])
)
Chains MCMC chain (1000×23×4 Array{Float64, 3}):

Iterations        = 1:1000
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 15.18 seconds
Compute duration  = 27.18 seconds
parameters        = α, β[1], β[2], β[3], β[4], σ, τ, zⱼ[1], zⱼ[2], αⱼ[1], αⱼ[2]
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
parameters       mean       std   naive_se      mcse         ess      rhat   ess_per_sec
Symbol    Float64   Float64    Float64   Float64     Float64   Float64       Float64

    α    70.6381    3.9661     0.0627    0.2270    276.3623    1.0032       10.1686
 β[1]     3.0184    1.3067     0.0207    0.0343   1430.7232    1.0010       52.6427
 β[2]   -10.5484    1.3543     0.0214    0.0345   1795.1054    1.0000       66.0499
 β[3]     6.5794    1.3084     0.0207    0.0366   1155.6678    1.0000       42.5222
 β[4]     1.1558    1.3067     0.0207    0.0347   1472.3222    1.0000       54.1733
    σ     7.3846    0.4668     0.0074    0.0103   2309.3984    1.0003       84.9731
    τ     5.0347    2.4221     0.0383    0.1032    576.5776    1.0026       21.2149
zⱼ[1]    -0.8461    0.8249     0.0130    0.0371    543.0646    1.0028       19.9818
zⱼ[2]     0.8675    0.8002     0.0127    0.0256    989.2219    1.0007       36.3979
αⱼ[1]    -4.2599    4.1534     0.0657    0.1869    543.0646    1.0028       19.9818
αⱼ[2]     4.3677    4.0290     0.0637    0.1291    989.2219    1.0007       36.3979

Quantiles
parameters       2.5%      25.0%      50.0%     75.0%     97.5%
Symbol    Float64    Float64    Float64   Float64   Float64

    α    61.2831    68.3926    70.8893   73.1427   78.1953
 β[1]     0.5016     2.1144     3.0451    3.9067    5.4755
 β[2]   -13.1368   -11.4724   -10.5733   -9.6324   -7.7688
 β[3]     4.0680     5.6634     6.5795    7.4830    9.0380
 β[4]    -1.4639     0.3336     1.1558    2.0353    3.7254
    σ     6.5451     7.0536     7.3621    7.6882    8.3494
    τ     1.8278     3.2063     4.4428    6.4645   11.0497
zⱼ[1]    -2.5191    -1.3926    -0.8403   -0.2763    0.7263
zⱼ[2]    -0.6093     0.2939     0.8572    1.4003    2.4570
αⱼ[1]   -12.6830    -7.0114    -4.2305   -1.3910    3.6565
αⱼ[2]    -3.0677     1.4797     4.3158    7.0499   12.3705
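
One caveat of my own, not from the original: the reconstruction above scales every draw of zⱼ by a single point estimate, the posterior mean of τ, which understates the uncertainty in the αⱼs. A sketch of a per-draw reconstruction, assuming the same chain layout as above, could look like this:

```julia
# per-draw reconstruction: scale each zⱼ draw by the matching τ draw
τ_draws = chain_ncp[:, [:τ], :].value.data                            # 1000×1×4
zⱼ_draws = chain_ncp[:, namesingroup(chain_ncp, :zⱼ), :].value.data   # 1000×2×4
αⱼ_draws = zⱼ_draws .* τ_draws  # broadcasts the τ column over both zⱼs
```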