1. Introduction
Natural resource management is about balancing harvesting against its ecological implications. It is important to harvest in such a way that a species remains sustainable and does not become endangered or go extinct. For instance, according to the Food and Agriculture Organization of the United Nations, three quarters of the world's fish stocks are fully exploited or over-exploited, and the proportion of stocks that are too intensively exploited is growing. These statistics show that natural resources need to be managed with an effective and carefully defined objective in order to prevent over-harvesting and to allow depleted stocks to replenish. As a consequence, scarce resource management increasingly involves restoration and conservation objectives, along with the more conventional ecological and economic objectives, namely the identification of desirable levels of the natural resource and of profitability from harvesting. Recent examples are the restoration plans discussed and/or adopted by the European Commission for several collapsed stocks in E.U. waters, or the international commitment made by the countries present at the 2002 Johannesburg World Summit on Sustainable Development to return fisheries to levels allowing their maximum sustainable yield by 2015. For example, the objective of the precautionary approach promoted by the International Council for the Exploration of the Sea (ICES) is to maintain the spawning stock above a limit reference point Blim, while keeping fishing mortality below a limit Flim. A criticism of this approach is that its viewpoint is too ichthyocentric, as it focuses on the conservation of fish populations and stocks only. Social and economic considerations are not included, and the question of an acceptable income for the manager is not addressed.
The key idea of this paper is to maximize harvest in a sustainable manner: we want the greatest catch to supply the demand for the natural resource, but we do not want to deplete the population, so as to preserve the diversity of resources and allow harvesting in the future. The concept of sustainable harvesting, which refers to methods designed not to over-exploit resources, leads to the definition of measures and rules, including fines imposed by authorities to deter over-harvesting. The theoretical problem is, therefore, to determine a cost rule for fines that allows the conservation of natural resources exploited by humans from a sustainable perspective. In this respect, the amount of the fines and/or the prohibition on harvesting must ensure that the natural resource population does not fall below a certain threshold guaranteeing its natural renewal. But it must also allow the manager to make profits to prevent him/her from going bankrupt.
Efficiency in managing the exploitation of natural resources has been widely analyzed in the resource literature. In general terms, these studies use deterministic models that consider that an efficient policy consists of maintaining the exploitation levels of the harvesting ground at steady-state values. Clark and Reed [Reference Clark and Reed3,Reference Reed and Clark15] introduced price and growth uncertainty in a forest harvest model, modeling the price process as a geometric Brownian motion and assuming stock growth to be age or size dependent. Recently, some papers consider the case when the growth of a fish stock is stochastic, for example Danielsson [Reference Danielsson5], Weitzman [Reference Weitzman20] and Nostbakken [Reference Nostbakken12], while others consider that the price is stochastic, for example Murillas and Chamorro [Reference Murillas and Chamorro11] and Nostbakken [Reference Nostbakken12]. In these cases, optimal control theory has proven to be a suitable technique to design optimal harvesting strategies (see, e.g., [Reference Conrad and Clark4,Reference Kharroubi, Lim and Ly Vath8,Reference Reed and Heras16]).
The purpose of this study is to analyze how uncertainty in stock growth and price, the prohibition on harvesting when the quantity of natural resource available falls below a given level, and taxation influence the optimal harvest of the natural resource. In our modeling, managers are inspected at fixed dates, while harvesting is continuous and depends on the quantity of natural resource available in the harvesting region (the natural resource population evolves according to a logistic stochastic differential equation). Managers are prohibited from harvesting if the quantity of natural resource is lower than a given level at an inspection date, and they must pay a fine at the maturity of the problem if their harvesting quota is exceeded. They are therefore seeking to maximize their profit, that is, to choose the quantity of natural resource to harvest given the prohibition constraint and the fine to be paid if their quota is exceeded.
Unlike other articles dealing with this issue (see, e.g., [Reference Nostbakken, Thébaud and Sorensen13]), the selling price of the natural resource is not constant; it seems reasonable to assume that the price depends on the quantity of natural resource remaining in the harvesting region. Consequently, it is endogenous to the problem of sustainable harvesting. This is justified by the basic economic principle that prices respond to scarcity. Given that, we will show how the resolution of this problem allows us, on the one hand, to explain the behavior of the manager according to the amount of the fines and, on the other hand, to fix a pricing rule for the fines that guarantees sustainable harvesting.
The remainder of this article is structured as follows. Section 2 presents the problem formulation: the manager's expected profit and the two associated value functions. Section 3 characterizes the value functions by a verification result involving Hamilton–Jacobi–Bellman (HJB for short) equations. Section 4 provides numerical results and interpretations, which allow us to understand how the manager adapts his/her strategy w.r.t. the fines; this allows us to fix a level of fines that ensures the sustainability of the resource. Concluding remarks are offered in Section 5.
2. Problem formulation
2.1. The model
Let $(\Omega, {\mathcal {F}}, \mathbb {P})$ be a complete probability space. We assume that this space is equipped with two one-dimensional standard Brownian motions $B$ and $W$. We denote by $\mathbb {F} := ({\mathcal F} _t)_{0\leq t \leq T}$ the right continuous complete filtration generated by these two Brownian motions, where $T$ is a positive constant which corresponds to the maturity of the problem. We assume that the correlation between the two Brownian motions is given by $\langle B, W \rangle _t = \rho \; t$.
In the sequel, we consider a manager who can harvest in a harvesting area, and we denote by $X_t$ the quantity of natural resource available in this area at time $t$. In the past, several articles, see for example Schaefer [Reference Schaefer18] or Pella and Tomlinson [Reference Pella and Tomlinson14], proposed a logistic model to represent the growth of the natural resource in the absence of harvesting; this model is given by
where $\eta$ and $\lambda$ are positive constants, $\eta \lambda$ corresponds to the intrinsic rate of population growth and $1/ \lambda$ is the carrying capacity of the environment. The model is interesting since it is well known that, for natural resource stocks, the growth rate is inversely related to the stock level because of natural constraints, and this feature is well captured by the logistic growth model (see, e.g., [Reference Conrad and Clark4]). However, since the evolution of the natural resource depends on perturbations due to environmental and other factors, we add a term driven by a Brownian motion to model these perturbations, which leads to the classical logistic stochastic differential equation (see, e.g., [Reference Sarkar17]) given by
where $\eta$, $\lambda$, and $\gamma$ are three positive constants. It is well known that this SDE admits a unique strong solution that reaches neither zero nor infinity in finite time. Furthermore, it has a closed-form solution (see, e.g., [Reference Skiadas19]). The product $\eta \lambda$ corresponds to the intrinsic rate of population growth and $1/ \lambda$ is the carrying capacity of the environment. This model is, for example, used in Nostbakken [Reference Nostbakken12] or Kvamsdal et al. [Reference Kvamsdal, Poudel and Sandal10]. We assume that the manager can harvest this resource, and we denote by $\alpha _t$ the harvest rate at time $t$. For a given strategy $\alpha =(\alpha _t)_{0 \leq t \leq T}$, $X^{\alpha }_t$ denotes the associated quantity of natural resource available at time $t$, which then follows the stochastic differential equation
The manager sells the harvest on the market at time $t$ at the unit price $P_t$, where the price $P$ evolves according to the following stochastic differential equation
where $\sigma$ is a positive constant and $\mu$ is a map from $\mathbb{R} _+$ to $\mathbb{R} _+$ which corresponds to the drift of the price. Some authors in the literature choose to model the price by a geometric Brownian motion (see, e.g., [Reference Murillas and Chamorro11] or [Reference Nostbakken12]), which corresponds, in our setting, to a constant map $\mu$. We propose to make $\mu$ depend on the quantity of the resource, since one can observe, in fish or wood markets for instance, that when the quantity of the resource is low the price is high. In Kvamsdal et al. [Reference Kvamsdal, Poudel and Sandal10], the authors assume that the price is mean-reverting and depends on the harvest rate; more precisely, a higher harvest makes the price lower. Here, we model the management of a scarce resource, where the price depends more on the available quantity of resource than on the harvest.
$(\mathbf {H}\mu )$ $\mu : \mathbb{R} _+\rightarrow \mathbb{R} _+$ is nonincreasing and Lipschitz continuous: there exists a positive constant $L$ such that
for all $x, x' \in \mathbb{R} _+$.
In Assumption $(\mathbf {H}\mu )$, the monotonicity condition means that the greater the quantity of natural resource available, the lower the price of the natural resource.
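To illustrate the dynamics, the following minimal sketch simulates the stock $X$ and the price $P$ with an Euler–Maruyama discretization, assuming correlated Brownian increments with correlation $\rho$. The drift function mu, the feedback rule harvest and all parameter values are illustrative placeholders (chosen to mirror the numerical section), not quantities prescribed by the model.

```python
import numpy as np

def simulate_paths(x0=0.7, p0=0.5, T=1.0, n_steps=100,
                   eta=0.7, lam=0.5, gamma=0.2, sigma=0.1, rho=0.01,
                   mu=lambda x: 0.1 + 0.5 * np.exp(-0.2 * x),
                   harvest=lambda t, x, p: 0.0, seed=0):
    """Euler-Maruyama sketch of the stock/price dynamics.

    harvest(t, x, p) is a hypothetical feedback harvest rate; with the default
    harvest == 0 we recover the uncontrolled logistic diffusion for X."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X, P = np.empty(n_steps + 1), np.empty(n_steps + 1)
    X[0], P[0] = x0, p0
    for k in range(n_steps):
        # correlated Brownian increments dB and dW with correlation rho
        dB = rng.normal(0.0, np.sqrt(dt))
        dW = rho * dB + np.sqrt(1.0 - rho**2) * rng.normal(0.0, np.sqrt(dt))
        a = harvest(k * dt, X[k], P[k])
        # logistic growth minus harvest, multiplicative noise on the stock
        X[k + 1] = max(X[k] + eta * X[k] * (lam - X[k]) * dt - a * dt
                       + gamma * X[k] * dB, 0.0)
        # price drift mu(X) decreases with the remaining quantity of resource
        P[k + 1] = P[k] + mu(X[k]) * P[k] * dt + sigma * P[k] * dW
    return X, P
```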
We consider a positive increasing sequence $(T_i)_{1 \leq i \leq N}$, where each $T_i$ represents a time at which the regulatory body checks the quantity of natural resource available $X^{\alpha }$, with $T_N=T$. We assume that $T_i$ is a constant for any $i \in \{1, \ldots, N\}$. If $X^{\alpha }_{T_i} \gt \Gamma$, then the manager can continue to harvest; if $X^{\alpha }_{T_i} \leq \Gamma$, then the manager can no longer harvest until the next checking time. In that case, the first time the manager is permitted to resume harvesting can be represented mathematically as follows: $\tau ^{\alpha }_i := \inf \{T_k, \; k \geq i : X^{\alpha }_{T_k} \gt \Gamma \}$.
We define the set ${\mathcal {A}}$ of admissible controls as the set of strategies $\alpha$ that are $\mathbb {F}$-adapted processes valued in $[0, \bar a]$, such that $X^{\alpha }$ is nonnegative and $\alpha$ is null on $[T_i, \tau ^{\alpha }_i)$ for any $1 \leq i \leq N$. The harvest rate is bounded above by the constant $\bar a$. This last assumption is natural since the manager faces technical constraints and cannot harvest more than a given quantity, which depends, for example, on the fishing boat's capacity or the truck's dump body volume when harvesting trees.
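As an illustration of the admissibility constraint, the sketch below simulates the stock under a candidate feedback rule while enforcing the prohibition at the checking dates: harvesting is switched off from any $T_i$ at which $X_{T_i}\leq \Gamma$ until the next checking date at which the stock exceeds $\Gamma$. The price is omitted for brevity; the function names and parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_with_regulation(feedback, x0=0.7, T=1.0, n_steps=100, n_checks=10,
                             Gamma=0.2308, a_bar=0.5,
                             eta=0.7, lam=0.5, gamma=0.2, seed=0):
    """Stock dynamics under a feedback harvest rule and the prohibition rule (sketch)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    check_idx = {round(j * n_steps / n_checks) for j in range(1, n_checks)}
    X = np.empty(n_steps + 1)
    X[0] = x0
    alpha = np.zeros(n_steps)
    allowed = True                      # plays the role of the permission indicator
    for k in range(n_steps):
        if k in check_idx:              # regulator checks the stock at T_i
            allowed = X[k] > Gamma
        a = min(max(feedback(k * dt, X[k]), 0.0), a_bar) if allowed else 0.0
        alpha[k] = a                    # admissible rate: in [0, a_bar], zero when prohibited
        dB = rng.normal(0.0, np.sqrt(dt))
        X[k + 1] = max(X[k] + eta * X[k] * (lam - X[k]) * dt - a * dt
                       + gamma * X[k] * dB, 0.0)
    return X, alpha
```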
We make the standard assumption that the agent seeks to maximize the expected present value of net revenues from harvesting on the interval $[0,T]$, subject to the dynamic constraints. Thus, the objective of the manager is given by
and finding an optimal strategy $\alpha ^{*} \in {\mathcal {A}}$ such that
where $\beta$ is a positive constant corresponding to the discount rate, $(\cdot )^{+}$ denotes the positive part, $C$ is a positive increasing convex function representing the cost of harvesting, and $f$ is a map from $\mathbb{R} _+\times \mathbb{R} _+$ to $\mathbb{R} _+$ which corresponds to a tax that the manager must pay if at time $T$ the quantity of natural resource available $X^{\alpha }_T$ is lower than the level $\Gamma$. This tax depends on the quantity of natural resource available and also on the natural resource price at time $T$. Indeed, if the tax did not depend on the natural resource price, then the manager would be willing to pay it when the natural resource price is high, since the earnings from selling the harvest would cover the tax. On the contrary, if the natural resource price is low, then these earnings do not cover the tax, and the manager would not be willing to pay it.
$(\mathbf {H}f)$ $f: \mathbb{R} _+\times \mathbb{R} _+ \rightarrow \mathbb{R} _+$ is a nondecreasing and Lipschitz function w.r.t. both of its arguments: there exists a positive constant $L$ such that
for all $(x,y), (x',y') \in \mathbb{R} _+\times \mathbb{R} _+$, and $f(0,y)=0$ for any $y \in \mathbb{R} _+$.
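For a simulated path, the manager's discounted profit can be evaluated as in the following sketch: running revenue $P_t\alpha_t$ minus the harvesting cost $C(\alpha_t)$, discounted at rate $\beta$, plus a terminal tax $f((\Gamma-X_T)^{+},P_T)$ paid only if the final stock is below $\Gamma$. The cost $C(a)=a^{2}$ and the tax $f(x,p)=\kappa p x$ are the illustrative choices used later in the numerical section and are assumptions here.

```python
import numpy as np

def discounted_profit(alpha, X, P, dt, beta=0.1, Gamma=0.2308,
                      cost=lambda a: a**2,
                      tax=lambda x, p: 5.0 * p * x):
    """Discounted profit of one harvest path (sketch).

    alpha : harvest rates on the time grid (length n)
    X, P  : stock and price on the same grid (length n + 1)"""
    alpha = np.asarray(alpha, dtype=float)
    times = np.arange(len(alpha)) * dt
    running = np.sum(np.exp(-beta * times) * (P[:-1] * alpha - cost(alpha)) * dt)
    T = len(alpha) * dt
    shortfall = max(Gamma - X[-1], 0.0)          # (Gamma - X_T)^+
    return running - np.exp(-beta * T) * tax(shortfall, P[-1])
```

Combined with the simulation sketches above, averaging this quantity over many paths gives a Monte Carlo estimate of the objective for a given strategy.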
2.1.1. An example of explicit solution
We describe in this paragraph a case where an explicit solution to (2.1) can be computed.
Take $f\equiv 0$, $\gamma \equiv 0$, and $C(a)=-a^{2}/2$ for $a\in [0,\bar a]$. Suppose that $X^{a}$ and $P$ are given by
for $t\in [0,T]$, where $\mu$ and $\eta$ are constants and $\lambda$ and $\sigma$ are positive constants. We recall that the process $B$ is a standard one-dimensional Brownian motion. The value function is then given by
Proposition 2.1 Suppose that
and
where
Then, an optimal strategy is given by
and we have
where $F$ is the cumulative distribution function of ${{\mathcal {N}}(0,1)}$.
Proof. We proceed in four steps.
Step 1. We first notice that the value function can be rewritten as
where $\bar {\mathcal {A}}$ is the set of strategies $\alpha$ which are $\mathbb {F}$-adapted processes valued in $[0, \bar a]$ such that $X^{\alpha }$ is nonnegative.
Step 2. Denote by $X^{\bar a}$ the solution to the SDE
Then $X^{\bar a}$ is uniquely defined as the solution to a locally Lipschitz ordinary differential equation, and by using classical results on Riccati equations, we get
From (2.3), we get that $X^{\bar a}$ is nondecreasing. Then, from (2.4), we get
for all $t\in [0,T]$.
Step 3. We next have
for any adapted process $\alpha$ valued in $[0,\bar a]$. Indeed, denote by $\delta X$ the process $X^{\alpha }-X^{\bar a}$. This process is the solution to
with $\Delta _t= \eta (\lambda - X^{\alpha }_t-X^{\bar a}_t)$. Therefore, we get
for all $t\in [0,T]$.
Step 4. We deduce from (2.5), Step 2 and Step 3 that
for any $t\in [0,T]$ and any adapted process $\alpha$ valued in $[0,\bar a]$. Therefore, $\bar {\mathcal {A}}$ is the set of adapted processes $\alpha$ valued in $[0,\bar a]$ and we have
Maximizing the term inside the integral we get from the first-order condition
and we have
Following a computation similar to that of the call and put prices in the Black–Scholes model, we have
and
which allows us to obtain the final expression for $V_0(x,p)$.
In the previous example, we are able to derive a computable representation of the value function and give the associated optimal strategy. This model remains relevant as it takes into account the harvesting effort via the quadratic term $\alpha _t^{2}$. Moreover, the explicit computation of the optimal strategy can be done since the conditions on the parameters ensure that the constraint related to $\Gamma$ is satisfied. We notice that this optimal strategy is nondecreasing in the resource price $P_t$ and in the maximal effort $\bar a$. This behavior is quite natural: for a higher price, the manager should harvest more to increase the gain.
Unfortunately, we cannot always compute explicit solutions for our optimization problem due to the complexity of the state space. We therefore provide in the sequel a PDE characterization of the value function.
2.2. The value function
In order to provide an analytic characterization of the value function $V_0$ defined by (2.1), we need to extend the definition of this control problem to general initial conditions.
Unfortunately, the considered controlled system is not Markovian. Indeed, the control process $\alpha$ is subject to a constraint that is fixed at each time $T_k$ but holds over $[T_k,T_{k+1})$. Thus, we need to keep track of this constraint, and we therefore consider two cases (harvesting allowed or not) and two value functions. This approach is inspired by Bruder and Pham [Reference Bruder and Pham1], who consider a delayed controlled system and enlarge the controlled system to make it Markovian. Similarly, we enlarge our system by adding a parameter which indicates whether the agent is allowed to harvest or not on the considered period $[T_k,T_{k+1})$. However, we notice that our resulting partial differential equation (PDE for short) is different from theirs, since we get a coupled system whereas they get a recursive one.
For any $t \in [0,T]$, $x \geq 0$ and $i \in \{0,1\}$, we denote ${\mathcal A} _{t,i}(x)$ the set
where $q(t) := \sup \{j , T_j \leq t\}$.
Let ${\mathcal {Z}}:= \mathbb{R} _+ \times (0,+\infty ) \times \{0,1\}$. For $z=(x,p,i) \in {\mathcal {Z}}$ and $\alpha \in {\mathcal A} _{t,i}(x)$, we denote by $Z^{t,z,\alpha } := (X^{t,x,\alpha }, P^{t,z,\alpha },I^{t,z,\alpha })$ the triple of processes defined by
For any $t \in [0,T]$ and $z\in {\mathcal {Z}}$, we consider the value function $v$ defined by
We also consider the two value functions $v_0$ and $v_1$ defined on $[0,T] \times \mathbb{R} _+ \times (0,+\infty )$ by
The value function $v_0$ corresponds to the case where at time $t$ the manager cannot harvest until the next checking time, while the value function $v_1$ corresponds to the case where at time $t$ the manager can harvest.
3. HJB characterization
We use HJB equations to characterize the value functions $v_0$ and $v_1$. The HJB equations related to $v_0$ and $v_1$ are, for any $x \in \mathbb{R} _+$ and $p \in (0,+\infty )$,
and
where ${\mathcal {L}}^{a}$ is the operator associated with the diffusions
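For concreteness, and as a sketch only, assuming that ${\mathcal {L}}^{a}$ denotes the infinitesimal generator of the controlled pair $(X^{\alpha },P)$ of Section 2.1 combined with the discounting at rate $\beta$, this operator reads
\begin{equation*}
{\mathcal{L}}^{a} \varphi(t,x,p) = \bigl(\eta x(\lambda - x) - a\bigr)\,\partial_x \varphi + \mu(x)\, p\, \partial_p \varphi + \tfrac{1}{2}\gamma^{2} x^{2}\, \partial^{2}_{xx}\varphi + \rho\,\gamma\,\sigma\, x p\, \partial^{2}_{xp}\varphi + \tfrac{1}{2}\sigma^{2} p^{2}\, \partial^{2}_{pp}\varphi - \beta\, \varphi .
\end{equation*}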
Let $C^{0}$ be the set of continuous functions and $C^{1,2}$ be the set of functions that are once continuously differentiable in their first (time) argument and twice continuously differentiable in their last two (space) arguments. We have the following verification result.
Theorem 3.1 Let $w^{0}$ and $w^{1}$ be two functions in $C^{1,2}([T_j, T_{j+1}) \times \mathbb{R} _+ \times (0,+\infty )) \;\cap \; C^{0}([T_j,T_{j+1}] \times \mathbb{R} _+ \times (0,+\infty ))$ for any $j \in \{0, \ldots, N-1\}$, with $T_0=0$, and satisfying a quadratic growth condition, that is, there exists a positive constant $C$ such that
(i) Suppose that for any $x \in \mathbb{R} _+$ and $p \in (0,+\infty )$, we have
(3.3)
\begin{equation}
\left\{\begin{array}{l}
- \partial_t w_0 (t,x,p) - {\mathcal{L}}^{0} w_0(t,x,p) \geq 0, \quad t \in [0,T] - \{T_j\}_{1 \leq j \leq N} \\
w_0(T^{-}_j,x,p) \geq w_0(T_j,x,p) {\mathbb 1}_{x \leq \Gamma} + w_1(T_j,x,p) {\mathbb 1}_{x \gt \Gamma}, \quad j \in \{1, \ldots, N-1\} \\
w_0(T^{-},x,p) \geq{-}f((\Gamma-x)^{+},p)
\end{array}\right.
\end{equation}
and
(3.4)
\begin{equation}
\left\{\begin{array}{l}
- \partial_t w_1(t,x,p) -\displaystyle\sup_{0 \leq a \leq \bar a}\{ {\mathcal{L}}^{a} w_1(t,x,p) + pa-C(a)\} \geq 0, \quad t \in [0,T] - \{T_j\}_{1 \leq j \leq N} \\
w_1(T^{-}_j,x,p) \geq w_0(T_j,x,p){\mathbb 1}_{x\leq \Gamma} + w_1(T_j,x,p){\mathbb 1}_{x \gt \Gamma}, \quad j \in \{1, \ldots, N-1\} \\
w_1(T^{-},x,p) \geq{-}f((\Gamma-x)^{+},p).
\end{array}\right.
\end{equation}
Then, the function $w$ defined by $w(t,z) := w_0(t,x,p){\mathbb 1}_{i=0} + w_1(t,x,p){\mathbb 1}_{i=1}$ satisfies $w(t,z) \geq v(t,z)$ on $[0,T] \times {\mathcal {Z}}$.
(ii) Suppose further that for any $z \in {\mathcal {Z}}$, there exists a measurable function $\hat \alpha (t,z)$ valued in $[0, \bar a]$ such that if $i =0$, we have
$$-\partial_t w(t,z) - {\mathcal{L}}^{0} w(t,z) = 0$$
and if $i=1$, we have
\begin{align*}
& \partial_t w(t,z) + \sup_{a \in [0, \bar a]} [ {\mathcal{L}}^{a} w(t,z) + pa -C(a) ]\\
& \quad =\partial_t w(t,z) + {\mathcal{L}}^{\hat \alpha(t,z)} w(t,z) + p\hat \alpha(t,z)-C(\hat \alpha(t,z)) =0
\end{align*}
with
$$w(T^{-}_j,z) = w(T_j,x,p,0) {\mathbb 1}_{x \leq \Gamma} + w(T_j,x,p,1) {\mathbb 1}_{x \gt \Gamma}, \quad (j,z) \in \{1, \ldots, N-1\} \times {\mathcal{Z}}$$
and
$$w(T^{-},z) ={-}f((\Gamma-x)^{+},p),$$
and that the stochastic differential equations
\begin{align*}
X^{t,x,\hat{\alpha}}_s & = x + \int_t^{s} \eta X^{t,x,\hat{\alpha}}_u ( \lambda - X^{t,x,\hat{\alpha}}_u) du + \int_t^{s} \gamma X^{t,x,\hat{\alpha}}_u dB_u - \int_t^{s} \hat{\alpha}_u du\\
P^{t,z,\hat{\alpha}}_s & = p + \int_t^{s} \mu(X^{t,x,\hat{\alpha}}_u) P^{t,z,\hat{\alpha}}_u du + \int_t^{s} \sigma P^{t,z,\hat{\alpha}}_u dW_u \\
I^{t,z, \hat \alpha}_s & = i {\mathbb 1}_{t \leq s \lt T_{q(t) + 1}} + \sum_{k=q(t)+1}^{N-1} {\mathbb 1}_{X^{t,x,\hat{\alpha}}_{T_k} \gt \Gamma} {\mathbb 1}_{T_k \leq s \lt T_{k + 1}}
\end{align*}
admit a unique solution, denoted by $\hat Z^{t,z}_s$, given the initial condition $Z_t=z$, and that the process $\{\hat {\alpha }(s, \hat Z^{t,z}_s), \; t \leq s \leq T\}$ lives in ${\mathcal A} _{t,i}(x)$. Then,
$$w=v \quad \text{on } [0,T] \times {\mathcal{Z}}$$
and $\hat \alpha$ is an optimal Markovian control.
Proof. In the proof, to simplify the notation, we introduce $K^{t,k,\alpha }_s := (X^{t,x,\alpha }_s, P^{t,z,\alpha }_s)$ and $k :=(x,p)$ for any $z \in {\mathcal {Z}}$ and $\alpha \in {\mathcal A} _{t,i}(x)$.
(i) We prove by induction that $w \geq v$ on $[T_j, T_{j+1}]$ for any $j \in \{0, \ldots, N-1\}$.
We first consider the case $j=N-1$ and $i=0$, which means the manager cannot harvest on $[T_{N-1},T_N]$; thus $v(t,z)=\mathbb {E}[-e^{-\beta (T-t)} f( ( \Gamma - X^{t,x,0}_T)^{+},P^{t,p,0}_T)]$.
Since $w^{0}$ is $C^{1,2}([T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{N-1},T_N] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,0}(x)$, $s \in [t,T_{N})$, and any stopping time $\tau$ valued in $[t,T]$, by Itô's formula
We choose $\tau = \tau _n := \inf \{s \geq t : \int _t^{s} (|\partial _x w_0(u, K^{t,k,\alpha }_u) X^{t,x,\alpha }_u|^{2} + |\partial _p w_0(u, K^{t,k,\alpha }_u) P^{t,z,\alpha }_u|^{2}) du \geq n\} \wedge T$ and we remark that $(\tau _n)_{n \geq 1}$ is an increasing sequence going to $T$ as $n$ goes to $\infty$. By taking the expectation, we get
Since $w_0$ satisfies (3.3), we have
By the quadratic growth condition on $w_0$ and the integrability condition on $K^{t,k,\alpha }$, we may apply the dominated convergence theorem and send $n$ to infinity
By sending $s$ to $T_N$, we obtain by the dominated convergence theorem
which implies
We now consider the case $j=N-1$ and $i=1$. Since $w^{1}$ is $C^{1,2}([T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{N-1},T_N] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,1}(x)$, $s \in [t,T_N)$, and any stopping time $\tau$ valued in $[t,T]$, by Itô's formula
We choose $\tau = \tau _n := \inf \{s \geq t : \int _t^{s} (|\partial _x w_1(u, K^{t,k,\alpha }_u) X^{t,x, \alpha }_u|^{2} + |\partial _p w_1(u, K^{t,k,\alpha }_u) P^{t,z,\alpha }_u|^{2}) du \geq n\} \wedge T$ and we remark that $(\tau _n)_{n \geq 1}$ is an increasing sequence going to $T$ as $n$ goes to infinity. This stopping time ensures that the integrands appearing in the stochastic integrals are bounded, so these integrals are martingales. By taking the expectation, we get
By using (3.4), we get
By sending $n$ to infinity, we obtain by the dominated convergence theorem
By sending $s$ to $T^{-}_N$, we obtain by the dominated convergence theorem
which implies, for any $\alpha \in {\mathcal A} _{t,1}(x)$,
Thus, $v(t,z) \leq w(t,z)$ for any $(t,z) \in [T_{N-1},T_N] \times {\mathcal {Z}}$.
We now suppose the result holds on $[T_{j},T_{j+1}]$ for some $j \in \{1, \ldots,N-1\}$. We first consider the case $i=0$. Since $w^{0}$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,0}(x)$, $s \in [t,T_{j})$, and any stopping time $\tau$ valued in $[t,T_j]$, by Itô's formula
By using the same techniques as previously, we get
By using the condition at time $T^{-}_j$ for $w_0$, we get
We now consider the case $i=1$. Since $w^{1}$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,1}(x)$, $s \in [t,T_{j})$, and any stopping time $\tau$ valued in $[t,T_j]$, by Itô's formula
By using the previous arguments, we obtain
By using the condition at time $T^{-}_j$ for $w_1$, we get
Then for any $\bar \alpha \in {\mathcal A} _{T_j,I^{t,i}_{T_j}}(X^{t,x,\alpha }_{T_j})$, we get
which implies for any $\alpha \in {\mathcal A} _{t,i}(x)$ we get
Thus, $w_1(t,x,p) \geq v(t,z)$.
(ii) We prove by induction that $w=v$ on $[T_j,T_{j+1}]$ for any $j \in \{0, \ldots, N-1\}$.
We first consider the case $j=N-1$ and $i=0$. We apply Itô's formula to $e^{-\beta u}w(u, \hat {Z}^{t,z}_u)$ between $t \in [T_{N-1},T_N)$ and $s \in [t,T)$ (after a localization for removing the stochastic integral term in the expectation)
Thus, we get
We now consider the case $j=N-1$ and $i=1$. We apply Itô's formula to $e^{-\beta u}w(u, \hat {Z}^{t,z}_u)$ between $t \in [T_{N-1},T_N)$ and $T_N$ (after a localization for removing the stochastic integral term in the expectation)
which implies
Thus, $w(t,z)=J(t,z,\hat \alpha )=v(t,z)$ on $[T_{N-1},T_N] \times \mathbb{R} _+ \times (0,+\infty )$ with $i=1$.
We now suppose the result holds on $[T_j, T_{j+1}]$ for some $j \in \{1, \ldots, N-1\}$. We first consider the case $i=0$. Since $w$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$, by using the previous techniques,
By using the condition at time $T^{-}_j$ for $w$, we get
We now consider the case $i=1$. Since $w$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,1}(x)$, by Itô's formula
By using the condition at time $T^{-}_j$ for $w$, we get
4. Numerical results
4.1. The discrete problem
In this section, we introduce the numerical tools that we use to solve the HJB equations associated with $v_0$ and $v_1$ and with the underlying stochastic control problem. We use a finite difference scheme combined with an iterative procedure, which leads to the resolution of a Controlled Markov Chain problem. This class of problems has been intensively studied by Kushner and Dupuis [Reference Kushner and Dupuis9]. The convergence of the solution of the numerical scheme toward the solution of the HJB equation, as the time-space step goes to zero, can be shown using the standard local consistency argument, that is, the first and second moments of the approximating Markov chain converge to those of the continuous process $(X,P)$. We refer to [Reference Budhiraja and Ross2,Reference Hindy, Huang and Zhu6,Reference Jin, Yin and Zhu7] for numerical schemes involving a Controlled Markov Chain control problem.
We begin by localizing the problem on the bounded domain $[0,T]\times [0,x_{{\rm max}}]\times [p_{{\rm min}},p_{{\rm max}}]$, where $x_{{\rm max}}$, $p_{{\rm min}}$, and $p_{{\rm max}}$ are nonnegative constants. Then, we assume the following Neumann boundary conditions on the localized boundary
Let $\delta$, $h$, and $k$ be the discretization steps along the directions $t$, $x$, and $p$ respectively. For $(t,x,p)$ in the time-space grid
where $n_t=T/\delta +1$, $n_x=x_{{\rm max}}/h+1$ and $n_p=(p_{{\rm max}}-p_{{\rm min}})/k+1$.
We consider approximations of the following form
Let us introduce the following quantities which are used to approximate the value functions $v_0$ and $v_1$
In Table 1, we define the Markov chain states and the associated transition probabilities that we obtain when we apply the finite difference approach.
Thus, using the above notations and discretizing the space of controls as follows
where $n_a\in \mathbb {N}^{*}$, we approximate the HJB equations associated with the functions $v_0$ and $v_1$, for any $(x,p) \in [0,x_{{\rm max}}]\times [p_{{\rm min}},p_{{\rm max}}]$, by the following iterative scheme, starting with $v_0^{\delta,0}\equiv 0$ and $v_1^{\delta,0}\equiv 0$
for $t\in [0,T]- \{T_j\}_{1 \leq j \leq N}$,
for $j\in \{1,\ldots,N-1\}$,
and
for $t\in [0,T]- \{T_j\}_{1 \leq j \leq N}$,
for $j\in \{1,\ldots,N-1\}$,
For any $(x,p) \in [0,x_{{\rm max}}]\times [p_{{\rm min}},p_{{\rm max}}]$, the above iterative scheme combined with the boundary conditions is explicit and fully implementable on the enlarged grid
with a given stopping criterion $\varepsilon$, which means that the iterative scheme is stopped when the relative error is less than $\varepsilon$.
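As an illustration, the sketch below implements a simplified explicit time-stepping version of such a scheme for the coupled system satisfied by $v_0$ and $v_1$: a backward loop in time, a maximization over the discretized controls, the switching condition at the checking dates and the terminal tax. It is not the exact Controlled Markov Chain scheme of Table 1 (in particular, the centred differences and the explicit step require small enough steps to be stable); all names and parameter values are illustrative, taken from the data set of Section 4.2.

```python
import numpy as np

# parameters mirroring Section 4.2 (illustrative)
eta, lam, gamma, sigma, rho, beta = 0.7, 0.5, 0.2, 0.1, 0.01, 0.1
T, N_checks, Gamma, a_bar, kappa = 1.0, 10, 0.2308, 0.5, 5.0
mu = lambda x: 0.1 + 0.5 * np.exp(-0.2 * x)
C = lambda a: a**2
f = lambda x, p: kappa * p * x

n_x, n_p, n_t, n_a = 41, 41, 101, 11
x = np.linspace(0.0, 1.0, n_x)
p = np.linspace(0.1, 1.1, n_p)
dt, dx, dp = T / (n_t - 1), x[1] - x[0], p[1] - p[0]
controls = np.linspace(0.0, a_bar, n_a)
check_idx = {round(j * (n_t - 1) / N_checks) for j in range(1, N_checks)}
X, P = np.meshgrid(x, p, indexing="ij")

def generator(v, a):
    """Discretized L^a v with centred differences (one-sided at the boundary)."""
    vx = np.gradient(v, dx, axis=0)
    vp = np.gradient(v, dp, axis=1)
    vxx = np.gradient(vx, dx, axis=0)
    vpp = np.gradient(vp, dp, axis=1)
    vxp = np.gradient(vx, dp, axis=1)
    return ((eta * X * (lam - X) - a) * vx + mu(X) * P * vp
            + 0.5 * gamma**2 * X**2 * vxx + 0.5 * sigma**2 * P**2 * vpp
            + rho * gamma * sigma * X * P * vxp - beta * v)

# terminal condition: pay the tax if the final stock is below Gamma
v0 = -f(np.maximum(Gamma - X, 0.0), P)
v1 = v0.copy()

for k in range(n_t - 2, -1, -1):             # backward in time
    if (k + 1) in check_idx:                 # switching condition at a checking date T_j
        switched = np.where(X > Gamma, v1, v0)
        v0, v1 = switched.copy(), switched.copy()
    new_v0 = v0 + dt * generator(v0, 0.0)                       # harvesting prohibited
    new_v1 = np.max(np.stack([v1 + dt * (generator(v1, a) + P * a - C(a))
                              for a in controls]), axis=0)      # maximize over controls
    v0, v1 = new_v0, new_v1
```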
Remark 4.1 The first and second moments of the Markov chain defined in Table 1 converge to those of the continuous process $(X,P)$ as the time and space steps go to zero. Hence, the convergence of our scheme may be obtained using the same analysis as that developed in [Reference Kushner and Dupuis9].
4.2. Numerical interpretations
The numerical computations are done using the following set of data.
• Dynamics values
◦ $\eta =0.7$, $\lambda =0.5$, $\gamma =0.2$, $\mu =0.1$, $\sigma =0.1$, $\rho =0.01$.
◦ $T=1,\quad \beta =0.1$.
◦ Drift function: $\mu (x)=\mu + 0.5 \times \exp (-0.2 x)$.
◦ Penalty function: $f(x,p)=\kappa px$ with $\kappa =5$.
◦ Cost function: $C(x)=x^{2}$.
◦ Regulation parameters: $N=10\text { (number of checks)}, \quad \Gamma =0.2308$.
• Grid values
◦ Localization: $x_{{\rm max}}=1,\quad p_{{\rm min}}=0.1, \quad p_{{\rm max}}=1.1, \quad \bar {a}=0.5$.
◦ Discretization: $n_x=40$, $n_p=40, \quad n_t=100, \quad n_a=10$.
◦ Stopping criterion: $\varepsilon =0.01$.
Remark 4.2 The choice of the parameters values is arbitrary since we study a general natural resource exploitation model. Nevertheless, our numerical algorithm could be easily adapted to any specific model, for instance fishery or forest management, and parameterized by estimated values from real samples.
We plot the shape of the value functions $v_0$ and $v_1$ sliced in the plane $(x, p)$ for a fixed date $t$. We can see, as expected, that $v_1\geq v_0$; indeed, line three of Figure 1 shows that the spread $v_1 - v_0$ is always positive. Obviously, if the manager can harvest, the payoff is greater. In the first graph of the second line of Figure 1, where we fix $(t,p)$, we can see that the two functions are nondecreasing w.r.t. $x$, which is natural: the greater the size of the natural resource, the more the manager can harvest and the less he/she is penalized at terminal time $T$. On the other hand, in the second graph of the second line of Figure 1, where we fix $(t,x)$ and take $x\gt\Gamma$, the two functions are nondecreasing in $p$, because the higher the price, the wealthier the manager becomes when he/she harvests and sells. Conversely, in the first line of Figure 1, where $x\lt\Gamma$, we can see that the value functions $v_0$ and $v_1$ are nonincreasing in $p$, which is due to the penalty function $f$ being nondecreasing w.r.t. $p$ (i.e., the higher the price, the more the manager is penalized by the regulator).
We plot the shape of the value functions $v_0$ and $v_1$ sliced in the plane $(t, x)$ for a fixed price $p$. In Figure 2, as expected, $v_0$ and $v_1$ are decreasing w.r.t. $t$ when $x$ is large: the manager has more time to harvest the further he/she is from the terminal date $T$. If $x$ is small, the monotonicity depends on the parameters $\mu$ and $\eta$. On the one hand, when $x$ is small, we can see that $v_0$ and $v_1$ are increasing w.r.t. $t$, because the manager knows a priori that he/she is going to pay the tax, since the quantity of natural resource will likely be smaller than $\Gamma$ at the terminal date $T$. On the other hand, if $t$ is small, the quantity of natural resource increases with time, but so does the price, so the monotonicity of the value functions is not obvious. In fact, because the price drift $\mu$ dominates the growth rate driven by $\eta$, the price in this case increases faster than the quantity of natural resource, so a long maturity is not beneficial for the manager because the tax will be larger.
In Figure 3, we choose a smaller drift for the price. We remark that the monotonicity of the value functions $v_0$ and $v_1$ is the same as in the previous figure when $x$ is large, but now the value functions $v_0$ and $v_1$ are decreasing when $x$ is small. This is because, unlike in Figure 2, the drift of the price is small, so the price increases more slowly than the available quantity of natural resource. Thus, a long maturity is advantageous for the manager: the tax will be smaller since the quantity of natural resource available has had enough time to grow.
We plot the shape of the optimal harvest strategy $\alpha ^{*}(t,x,p)$ sliced in the plane $(x,t)$ for a fixed price $p$ (Figure 4). We can see two main regions: a harvest region (with different harvest rates) and a no-harvest region (dark blue). When we are far from the maturity $T$, it is not optimal to harvest when $X$ is below $\lambda$: below this quantity the resource increases naturally, so we let this happen in order to reach the maturity with $X$ greater than $\Gamma$ and thus avoid the penalization. As we get closer to $T$, it is best to harvest when $X\lt\lambda$ with different rates $a$, allowing us to reach $T$ without being penalized (i.e., $X_T\gt\Gamma$) and thus optimizing the profit generated by selling the harvest. The rates increase as the natural resource population grows, which is due to the cost function $C$ (the cost of harvesting).
In the following, we introduce a profit and loss measure, denoted P&L, defined as follows
where $\mathcal {T}^{i}:=\{T_i,T_i+\delta,\ldots,T_{i+1}\}$ with the convention $T_0=0$. In fact, this measure represents the payoff of the agent when adopting a given strategy $\alpha$. This measure will allow us to compare the effectiveness of the optimal control $\alpha ^{*}$ against a given naive strategy.
After computing the optimal strategy $\alpha ^{*}$ and the value function via the iterative procedure, we simulate the correlated Brownian motions $B$ and $W$ on the horizon $[0,T]$ and we adjust the dynamics of $X$ and $P$ according to the optimal control computed previously. Figure 5 represents a mean over 3,000 simulated paths of $X$ and $P$ controlled by the optimal strategy $\alpha ^{*}$ and two other naive strategies $\alpha ^{1}$ and $\alpha ^{2}$. The first naive strategy $\alpha ^{1}$ consists in harvesting the maximum $\bar {a}$ at all times when harvesting is authorized, up to $T$ (in red); the second one, $\alpha ^{2}$, consists in waiting until a certain time $t_0$ chosen by the manager (in Figure 5 we take $t_0=0.5$) and then harvesting the maximum $\bar {a}$ whenever authorized until time $T$ (in green). In Figure 5, the starting point is $X_0=0.7$ and $P_0=0.5$ and the P&Ls of the three strategies are respectively (with $95\%$ confidence level bounds): $\text {P & L}(\alpha ^{*})= 0.0873$ $(\pm 0.0002)$, $\text {P & L}(\alpha ^{1})= -0.0182$ $(\pm 0.0024)$, and $\text {P & L}(\alpha ^{2})=0.0315$ $(\pm 0.0005)$. We can see that our computed strategy is better than the two others. Indeed, with our strategy, the manager begins to harvest continuously at a rate $a$ smaller than the maximum $\bar a$, which allows him/her to reach the terminal date $T$ with a natural resource population above $\Gamma$, avoiding the penalization that occurs if $X_T\lt\Gamma$. On the one hand, adopting the strategy $\alpha ^{1}$, the manager harvests more but is penalized because the quantity of resource is below $\Gamma$ at the terminal time $T$. On the other hand, using the strategy $\alpha ^{2}$, he/she is not penalized because the quantity of resource at time $T$ is above $\Gamma$, but he/she harvests for a shorter period of time (starting the harvesting at $t_0=0.5$), hence resulting in lower revenue.
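A minimal Monte Carlo sketch of this comparison is given below for the two naive strategies (the optimal feedback $\alpha^{*}$ would come from the numerical scheme of Section 4.1 and is therefore not reproduced here). The function names, the discretization and the parameter values are illustrative assumptions matching the data set above.

```python
import numpy as np

def mean_pnl(strategy, x0=0.7, p0=0.5, n_paths=3000, n_steps=100, T=1.0,
             eta=0.7, lam=0.5, gamma=0.2, sigma=0.1, rho=0.01, beta=0.1,
             Gamma=0.2308, a_bar=0.5, n_checks=10, kappa=5.0, seed=0):
    """Average discounted P&L of a feedback strategy(t, x, p), enforcing the
    prohibition at the checking dates and the terminal tax (sketch)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    mu = lambda x: 0.1 + 0.5 * np.exp(-0.2 * x)
    check_idx = {round(j * n_steps / n_checks) for j in range(1, n_checks)}
    total = 0.0
    for _ in range(n_paths):
        x, p, allowed, pnl = x0, p0, True, 0.0
        for k in range(n_steps):
            if k in check_idx:
                allowed = x > Gamma
            a = min(max(strategy(k * dt, x, p), 0.0), a_bar) if allowed else 0.0
            pnl += np.exp(-beta * k * dt) * (p * a - a**2) * dt   # revenue minus cost C(a)=a^2
            dB = rng.normal(0.0, np.sqrt(dt))
            dW = rho * dB + np.sqrt(1.0 - rho**2) * rng.normal(0.0, np.sqrt(dt))
            x = max(x + eta * x * (lam - x) * dt - a * dt + gamma * x * dB, 0.0)
            p += mu(x) * p * dt + sigma * p * dW
        pnl -= np.exp(-beta * T) * kappa * p * max(Gamma - x, 0.0)  # terminal tax
        total += pnl
    return total / n_paths

alpha1 = lambda t, x, p: 0.5                       # harvest the maximum whenever allowed
alpha2 = lambda t, x, p: 0.5 if t >= 0.5 else 0.0  # wait until t_0 = 0.5, then harvest the maximum
print(mean_pnl(alpha1), mean_pnl(alpha2))
```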
As in Figure 5, Figure 6 represents a mean over 3,000 simulated paths of $X$ and $P$ for the optimal control and the same two naive strategies, but this time we choose to start with $X_0=0.3$ and $P_0=0.5$. The P&Ls of the three strategies are respectively (with $95\%$ confidence level bounds): $\text {P & L}(\alpha ^{*})=0.0302$ $(\pm 0.0008)$, $\text {P & L}(\alpha ^{1})= -0.1032$ $(\pm 0.0027)$, and $\text {P & L}(\alpha ^{2})=-0.0331$ $(\pm 0.0022)$. We can see that our strategy is still better than the two others. On the one hand, starting at time $t=0$ from a position below the threshold $\lambda$, the natural resource population tends to increase (mean-reverting effect); hence, as we can see in Figure 6, our optimal strategy is to wait until $X$ reaches a certain level above $\Gamma$ before starting to harvest (around time $t=0.2$). Doing so allows the manager to avoid the risk of being below the penalization barrier $\Gamma$ at time $T$. On the other hand, using the first naive strategy $\alpha ^{1}$, the natural resource population quickly falls below $\Gamma$ and at time $t=0.2$ the regulator no longer allows the manager to harvest; moreover, the manager is penalized as the natural resource population does not exceed $\Gamma$ at time $T$. On the contrary, if we wait up to time $t_0=0.75$ before starting to harvest at the maximum rate $\bar {a}$ (the naive strategy $\alpha ^{2}$), the natural resource population grows (since $X_t\lt\lambda$) until time $t_0=0.75$ and then decreases (because harvesting starts at this date) but remains above $\Gamma$ at time $T$. Hence, we are not penalized, but our optimal strategy $\alpha ^{*}$ still outperforms $\alpha ^{2}$ in terms of P&L.
In Table 2, we choose to compute our optimal strategy and the corresponding simulations for $\Gamma =0.4872\gt\lambda =0.3$. With this configuration, we know that if the natural resource population drops below $\Gamma$, it is likely to stay below $\Gamma$. Therefore, the manager is obligated to keep the natural resource population $X$ above $\Gamma$ at all times in order to avoid the penalization at the maturity $T$. Except for $\Gamma$ and $\lambda$, we use the same values of the parameters defined at the beginning of this numerical part, and we report in Table 2 the value of the P&L and the natural resource population $X$ at time $T$, starting with $X_0=0.7$ and $P_0=0.5$. These quantities were computed using a mean over 3,000 trajectories under the optimal control $\alpha ^{*}$ for different values of the penalty constant $\kappa$ (with $95\%$ confidence level bounds). We can see that the P&L is a decreasing function of $\kappa$, which is natural: the less the manager is penalized, the more risk he/she takes and the richer he/she becomes. However, we remark that for $\kappa =1$ the natural resource population at time $T$ is below $\Gamma$ because the penalization is not severe enough; hence, the manager prefers to be penalized and harvests a little more, which makes him/her richer. Therefore, we conclude that, for our set of data, a suitable choice for the penalty constant is $\kappa =2$, which creates a fair balance between the biological requirements and the maximization of the profit induced by harvesting. This amount of fines ensures the double objective of sustainable harvesting: the natural resource population does not fall below a certain threshold that guarantees its natural renewal, and the manager makes profits, preventing him/her from going bankrupt.
5. Conclusion
In this paper, we have investigated the problem of sustainable harvesting of a natural resource. We built a model where harvesting is continuous and depends on the quantity of resource available in the harvesting region. The manager tries to maximize his/her profit under the constraint of fines when the quota is exceeded. We have also introduced the fact that the selling price of the natural resource depends on the quantity (stock) of resource remaining in the harvesting region.
We have shown some interesting results. First, we derived an optimal strategy from a verification theorem, and we observed numerically that this optimal strategy provides a better gain than naive ones. Second, we delimited a level of fines which ensures the double objective of sustainable harvesting: on the one hand, the natural resource population stays above a certain threshold ensuring its natural renewal; on the other hand, the manager is free to attain a certain level of harvesting allowing acceptable profits. These results give a better understanding of the manager's behavior according to the amount of the fines, and of how to define a pricing rule for the fines that guarantees sustainable harvesting.