Optimal dynamic regulation with asymmetric information in continuous time: The case of electric power

In this article, we analyze the design of a mechanism that remunerates a natural monopoly for reducing its energy losses, using a dynamic principal-agent model in continuous time. The objective of this research is to characterize the optimal regulation that induces reductions in electrical energy losses. In our methodology, we use a differential equation, its HJB representation, and an exponential utility function. The results suggest that the optimal contract is based on the agent's continuation value as a state variable. The article contributes to the analysis of control problems under incomplete information and incorporates information asymmetries and incentives in regulation. Future lines of research include the application of the model to specific energy markets and the empirical evaluation of the effectiveness of the proposed regulation.


Introduction
The dynamics of electric power distribution companies are crucial to understanding how regulators motivate these companies to operate efficiently through economic incentives. In this research, we focus on characterizing the optimal regulation that reduces the electric power losses of a natural monopoly. This is a control problem under incomplete information, in the tradition of the mechanism design literature, where the regulator's goal is to minimize the firm's energy losses while incentivizing the permanent provision of the service. When the regulation of markets that require a single firm for reasons of technological efficiency emphasizes service sales prices, it ignores the incentives of the companies and the inability of the regulator to obtain complete information about the firm (Laffont, 1994). Fortunately, there is already a substantial body of theoretical work that incorporates information asymmetries and incentives in the analysis of regulation (e.g., Alasseur et al., 2019; Martimort et al., 2020; Hiriart and Martimort, 2012), which we can use to advance our understanding of the energy market and its policy implications.
Unlike Alasseur et al. (2019), our principal is not the aggregation of energy consumers or, if preferred, the representative consumer who directly remunerates the agent. In Colombia, for example, under Law 143 of 1994, the prices charged to regulated final users are defined on the basis of energy distribution costs, which include levels of energy and power losses. The value of sales plus the transfers received by the company must be sufficient to recover the value of its investments and costs (see article 44), while the shortfall in paying for the total consumption of residential users, who pay a lower price than the cost of service provision¹, is covered with resources from the national budget (see article 47). Consequently, it is not consumers but the regulator who directly compensates the company in full for its energy losses. For that reason, our principal is defined in terms of expected social welfare, which incorporates both the expected utility of the agent and the utility of the representative consumer, as in Laffont (1994). Unlike Laffont (1994), our model is not static but dynamic, with an infinite horizon.
Methodologically, we follow Sannikov (2008). Although Sannikov's (2008) model does not incorporate the utility of aggregate users, distortionary taxes, or firm sales as we do in this research, it is useful for presenting the solution to our dynamic model, in which the product (distributed energy) is described by a standard Brownian motion and the company's effort. One of the main aspects of Sannikov's model is the role of the continuation value of the firm which is typically neglected in static modelling. In this respect, Grochulski and Zhang (2023) and Wang and Yang (2022) show that temporary suspension of the agent may work better than termination as an incentive device in the canonical dynamic principal-agent moral hazard model. We use the martingale representation theorem as expressed in Sannikov (2008) and used in Wu et al. (2022) and Sung (2022) to deal with the path of the continuation value.
This article contributes to the literature in several respects. First, we derive a regulation, or contract, from a theoretical perspective that makes sense in practice, by clearly including the informational limits of regulators regarding the regulated firm, as well as the firm's reaction to the new regulation. Second, our analysis makes explicit that the contractual environment, or tariff regulation, is dynamic, which has implications for the incentives of the firm, which weighs its short- and long-term scenarios. In a dynamic scenario, the firm's decisions about the effort it exerts to reduce energy losses depend on its value, which is also very intuitive.

The model
The quantity of energy distributed, $X_t$, is observable by both the principal (the regulator) and the agent (the distribution company). The principal does not observe the agent's effort $A_t$, but instead uses the realizations of $X_t$ to compensate the agent for the cost of exerting effort. The principal offers the agent a contract that specifies a non-negative transfer flow $T_t(X_s;\ 0 \le s \le t) \in [0, \infty)$, based on the observed energy deliveries. The agent derives a utility $u(\cdot)$ from both the transfers and its income from energy distribution. We assume that the utility $u(\cdot) : [0, \infty) \to [0, \infty)$ is an increasing and concave $C^2$ function such that $u(0) = 0$ and $u'(x) \to 0$ as $x \to \infty$. For a level of effort $A_t$, the agent's expected utility is where the discount rate $r$ is, for simplicity, constant over time; $R(\cdot)$ is the income obtained from energy distribution; $A_t$ is the effort; and $T_t$ is a monetary transfer from the regulator to the firm that is financed by taxes. The process of the quantity of electricity delivered, $X_t$, is described by a constant $\sigma$ and a Brownian motion $Z = \{Z_t, \mathcal{F}_t;\ 0 \le t < \infty\}$ on a probability space with sample space $\Omega$ and $\sigma$-algebra $\mathcal{F}$, such that up to time $t$ the energy distribution follows the dynamic
$$dX_t = A_t\,dt + \sigma\,dZ_t.$$
The agent's effort is a stochastic process $A = \{A_t \in \mathcal{A},\ 0 \le t < \infty\}$, progressively measurable with respect to $\mathcal{F}_t$, where the set of feasible efforts, $\mathcal{A}$, is compact with minimum element 0. We also assume that there is a $\gamma_0 > 0$ such that $h(a) \le \gamma_0 a$ for all $a \in \mathcal{A}$, where $h$ denotes the cost of effort.
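To make the role of effort and noise in the delivered-energy process concrete, the following is a minimal simulation sketch. It assumes the Sannikov (2008) form $dX_t = A_t\,dt + \sigma\,dZ_t$ and discretizes it with an Euler-Maruyama scheme. The function name `simulate_energy_path`, the effort rule passed as a callable, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_energy_path(effort, sigma, horizon, n_steps, seed=0):
    """Euler-Maruyama path of dX_t = A_t dt + sigma dZ_t (assumed form).

    `effort` is a hypothetical callable (t, x) -> effort level A_t.
    """
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    times = np.linspace(0.0, horizon, n_steps + 1)
    # Brownian increments are independent N(0, dt) draws.
    dz = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    x = np.zeros(n_steps + 1)
    for i in range(n_steps):
        a = effort(times[i], x[i])
        x[i + 1] = x[i] + a * dt + sigma * dz[i]
    return times, x

# Illustrative call: constant effort 0.5 and sigma = 0.2 (made-up values).
times, x = simulate_energy_path(lambda t, x: 0.5, sigma=0.2,
                                horizon=10.0, n_steps=1000)
```

With `sigma=0` the path reduces to cumulative effort, which is a convenient sanity check. With `sigma>0` the regulator sees only the noisy path, not the effort behind it.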
The utility that aggregate users obtain from consuming energy at time $t$ is $S(X_t)$. We assume that the income users forgo in taxes, of value $T_t$, which the regulator transfers to the firm, has a social cost $\lambda$; that is, we are in the most plausible scenario, where taxes are distortionary. Thus, the differential of the representative consumer's total utility at time $t$ is The expectation of social welfare for the principal is then: As is standard in the literature, we say that an effort process $\{A_t,\ 0 \le t < \infty\}$ is incentive-compatible with respect to $\{T_t,\ 0 \le t < \infty\}$ if it maximizes the agent's expected utility.

The principal-agent problem
The problem of the regulator (the principal) is to offer the firm (the agent) a contract that consists of two parts: a transfer flow $\{T_t,\ 0 \le t < \infty\}$ contingent on the realized energy distribution, and a requested incentive-compatible effort level $\{A_t,\ 0 \le t < \infty\}$. The contract maximizes the expectation of social welfare, subject to the constraint that it delivers to the firm a required value (opportunity cost) of at least $\hat{W}$: In order to characterize optimal regulation in the following sections, we follow Sannikov (2008) and denote by $W_t$ a state variable defined as the total utility obtained by the firm after time $t$. In the optimal contract, the variable $W_t$, which is observable by the regulator, changes with the amount of energy distributed and determines both the transfers received by the agent at each $t$ and the effort that is requested.

Characterization of firm value
The firm's continuation value, represented by $W_t$, is related to the firm's total compensation as described by the following expression: where $T_t = T(W_t)$ and $A_t = A(W_t)$. By the martingale representation theorem (Karatzas and Shreve, 1991), the above equation can be expressed as: We now rewrite equation (6) as the following process: and, equating (8) with the derivative of (7), we obtain: where $rY(W_t)$ is the sensitivity of the firm's continuation value with respect to the energy distributed. In the optimal contract, $Y(W_t)$ takes the minimum value that induces an effort level $A(W_t)$.
In this article, we use a utility function of the form which is standard in the contract theory literature (see, e.g., Williams, 2015, and Li and Williams, 2015). For ease of exposition, we write the utility function as follows: Where Note that the cost of effort, $h: \mathcal{A} \to \mathbb{R}$, is continuous, increasing, convex, and normalized such that $h(0) = 0$. To find the minimum value of $Y(W_t)$ that induces an effort level $a(W_t)$ maximizing the difference between the expected change in $W_t$ and the firm's effort cost $h(A)$, we present the following proposition:
Proposition 1: For a strategy $A$, let $Y_t$ be the process representing $W(T, R, A)$ mentioned above. $A$ is optimal if and only if
Proof: Consider an arbitrary alternative strategy $A^*$. Define as the time-$t$ expectation of the agent's payoff when he plans to follow strategy $A$ after time $t$, having incurred the cost of $A^*$ before $t$. The derivatives are Notice that Thus, multiplying by $rY_t$,
$$rY_t\,\sigma\,dZ_t = rY_t\,\sigma\,dZ_t^* + rY_t\,(A_t^* - A_t)\,dt.$$
We can now write $dV_t$ as Using our utility function (10) We want a value $A^*$ that maximizes It can be shown that $V_t$ is a supermartingale under the probability measure induced by $A^*$ and that the continuation value is bounded from below.² Therefore, the strategy $A$ is at least as good as any alternative $A^*$. □
From Proposition 1, it follows that the minimum volatility of the continuation value necessary to induce an effort $a \in \mathcal{A}$ is given by $r\gamma(a)\sigma$, where γ: That is, $\gamma(a)$ is the value of $Y(W_t)$ when the firm maximizes the difference between the expected change in $W_t$ and the cost of effort $h(a)$. Note that the first order condition for

Welfare characterization
At the same time, the maximum social welfare, denoted as $F(W_t)$, that is obtained when the regulator delivers a value $W_t$ to the firm is related to the optimal choices of $A(W_t)$ and $T(W_t)$ through a Hamilton-Jacobi-Bellman (HJB) equation. Given that the $W_t$ process is Markovian, it is possible to use dynamic programming and the verification theorem applied to the dynamics of diffusion processes with Poisson jumps (see, e.g., Oksendal and Sulem, 2009; Fleming and Soner, 2006; and Hanson, 2007). Thus, the optimal value function of this problem is (we omit the time sub-indices for simplicity):
² The proof of Proposition 2 is in Sannikov (2008).

where $f(T, A) = S(X) - R(X) - (1 + \lambda)T + u(T + R(X), A)$ and the optimal $R(X)$ is chosen by the firm given the optimal transfers and effort. The verification theorem allows us to write this function as the following non-linear second-order Hamilton-Jacobi-Bellman (HJB) integro-differential equation: On the other hand, for each $W \in \mathbb{R}$, the maximum in the HJB equation satisfies.
The verification theorem states that F = F guarantees the existence of an optimal policy rule $T^* = t(W)$ and $A^* = a(W)$, which solve the HJB equation. The characterization of the maximized HJB equation is given by: That is, the maximum social welfare for a given $Y$, denoted as $F(W_t)$, which is obtained when the principal delivers a value $W_t$ to the agent, is related to the optimal choices of $a(W_t)$ and $t(W_t)$ through the HJB equation. Expression (23) can be written as:

The optimal regulation
An optimal contract specifies the transfers received by the firm, $\{T_t,\ t \le \tau\}$; an incentive-compatible effort requested in the contract, $\{A_t,\ t \le \tau\}$; and a stopping time $\tau$ at which the firm receives the value $W_\tau$ that generates the maximum social welfare. As is usual in the economic literature on contracts, the optimal contract corresponds to the arguments that maximize the principal's objective function: where $F_0(W_\tau)$ is the maximum social welfare³ associated with the continuation value $W_\tau$ of the firm at time $\tau$. The optimal contract must also satisfy the condition that the transfers given to the firm represent an initial benefit $W_0 \ge \tilde{W}$, no smaller than the gain $\tilde{W}$ that would be received elsewhere (the outside option), i.e. the participation constraint is Denoting as $t(W_t)$ and $\tilde{a}(W_t)$ the optimizers of the HJB equation associated with the function $F \ge F_0$, in the optimal contract the firm's value starts at $W_0$ and changes according to where $\gamma$ is described by expression (15) and $T_t = t(W_t)$, $A_t = \tilde{a}(W_t)$.

Analytical solution
In this section, we review the implications of the first-order conditions for the problem presented in (23).
Recall that the regulator's objective function includes both the utility of the regulated firm and the consumer surplus. To analyze the model's solution, we make explicit the condition that the consumer surplus is an increasing function of the effort made by the firm to provide energy and to reduce energy losses. Thus, our expression is as follows: where S(a) is increasing with respect to a.

Solution to the firm's problem
From the first order condition with respect to a and using expression (

Numerical solution
In this section we show the numerical solution to the HJB equation at its maximum, given by: To obtain a numerical solution, we first derive some restrictions. Consider $F'(W)$ independently of $t$. The first order condition for $t$ gives: where $t = a^2/2 + C$, i.e., implicitly there exists R that is a function of W and R(W). From the second order condition for $t$ we have $(1 - F'(w))\,u_{tt} < 0$ and $(1 - F'(w))\left(-\eta^2 e^{[-\eta R + C]}\right) < 0$, from which $(1 - F'(w)) > 0$. This implies $F'(w) < 1$. From both first and second order and $F'(W)$, using $t = a^2/2 + C$, we obtain S aa − (1 + λ)(− 2 ) + F ′′ (w)rσ 2 < 0 and, replacing the previous expression for $F''(w)$, we get the condition S aa − S a a + (1 + λ)(1 − a 2 ) < 0. Again, from the first order condition with respect to $a$,
$$F''(w) = \frac{(1 + \lambda)a - S_a}{r\sigma^2 a}, \qquad r \neq 0;\ a \neq 0;\ \sigma^2 \neq 0.$$
Since $F''(w)$ does not depend on $a$, and so that it does not equal zero, we impose a linear dependency of $S_a$ on $a$ with an intercept equal to zero, i.e. $S_a = ma$. That gives the following expressions.

The numerical method
In this section, we describe the numerical method we use to obtain the solution. Specifically, we use the Runge-Kutta method to solve the second-order maximized HJB equation (23). To do this, we first describe the method for a first-order differential equation, then its generalization to the case of k stages, and finally the method for the second-order equation that concerns us. We follow Dormand and Prince (1980), Sanz (1991), Peña (2009) and León Camejo et al. (2015).
First, consider the method that provides the solution to the integral equation associated with the initial value problem, which is
$$y'(t) = f(t; y(t)), \qquad y(t_0) = y_0.$$
Noticing that $y(t; t_0, y_0)$ is a solution, we write the solution in its integral form:
$$y(t) = y_0 + \int_{t_0}^{t} f(s; y(s))\,ds.$$

We use a numerical integration method to approximate the integral $\int_{t_0}^{t} f(s; y(s))\,ds$. As an example, consider an approximation using the trapezoidal rule, where
$$\int_{t_0}^{t_f} f(s; y(s))\,ds \approx \frac{h}{2}\left[f(t_0; y(t_0)) + f(t_f; y(t_f))\right],$$
with $h = t_f - t_0$, $f(t_0; y(t_0)) = f(t_0; y_0)$ known, and $f(t_f; y(t_f))$ unknown, because $y(t_f)$ is itself the quantity to be approximated. Predicting it with an Euler step of size $h$ gives
$$y_i^* = y_{i-1} + h f(t_{i-1}, y_{i-1}), \qquad y_i = y_{i-1} + \frac{h}{2}\left[f(t_{i-1}, y_{i-1}) + f(t_i, y_i^*)\right].$$
The reader should note that we have carried out a two-stage process: in the first step, $y_i^*$ is calculated, and in the second step, the desired approximation is obtained. This process is known as the two-stage Runge-Kutta method, which can be written as
$$\begin{cases} g_1 = h f(t_{i-1}, y_{i-1}) \\ g_2 = h f(t_{i-1} + c_2 h,\ y_{i-1} + a_{21} g_1) \\ y_i = y_{i-1} + b_1 g_1 + b_2 g_2 \end{cases}$$
The Runge-Kutta method can be extended to the $m$-stage case, where
$$y_i = y_{i-1} + \sum_{j=1}^{m} b_j g_j$$
and
$$\begin{cases} g_1 = h f(t_{i-1}, y_{i-1}) \\ g_2 = h f(t_{i-1} + c_2 h,\ y_{i-1} + a_{21} g_1) \\ g_3 = h f(t_{i-1} + c_3 h,\ y_{i-1} + a_{31} g_1 + a_{32} g_2) \\ \quad\vdots \\ g_m = h f(t_{i-1} + c_m h,\ y_{i-1} + a_{m1} g_1 + a_{m2} g_2 + \cdots + a_{m,m-1} g_{m-1}) \end{cases}$$
with $c_j$, $j = 2, \ldots, m$; $b_j$, $j = 1, \ldots, m$; and $a_{jk}$, $j = 1, \ldots, m$, $k = 1, \ldots, j - 1$, as coefficients.
These can be grouped in the following matrix form.
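As an illustration of how the coefficients $a_{jk}$, $b_j$ and $c_j$ drive the method, the sketch below implements a generic explicit $m$-stage Runge-Kutta step from a coefficient tableau. The function names `rk_step` and `rk_solve` are hypothetical. The two tableaus shown (Heun's two-stage method and the classical four-stage method) are standard textbook choices and are not claimed to be the coefficients used in this article. The test equation $y' = -y$ is likewise only an example.

```python
import math

def rk_step(f, t, y, h, a, b, c):
    """One explicit m-stage Runge-Kutta step.

    Stage j computes g_j = h * f(t + c_j*h, y + sum_{k<j} a_jk * g_k),
    and the step returns y + sum_j b_j * g_j.
    """
    g = []
    for j in range(len(b)):
        y_stage = y + sum(a[j][k] * g[k] for k in range(j))
        g.append(h * f(t + c[j] * h, y_stage))
    return y + sum(b[j] * g[j] for j in range(len(b)))

def rk_solve(f, t0, y0, t_end, n_steps, a, b, c):
    """Integrate y' = f(t, y) from t0 to t_end with n_steps equal steps."""
    h = (t_end - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = rk_step(f, t, y, h, a, b, c)
        t += h
    return y

# Heun's method: Euler predictor followed by a trapezoidal corrector.
HEUN = dict(a=[[0.0, 0.0], [1.0, 0.0]], b=[0.5, 0.5], c=[0.0, 1.0])
# Classical four-stage Runge-Kutta method.
RK4 = dict(a=[[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]],
           b=[1 / 6, 1 / 3, 1 / 3, 1 / 6], c=[0, 0.5, 0.5, 1])

# Example problem (illustrative only): y' = -y, y(0) = 1, exact y(1) = exp(-1).
y_heun = rk_solve(lambda t, y: -y, 0.0, 1.0, 1.0, 20, **HEUN)
y_rk4 = rk_solve(lambda t, y: -y, 0.0, 1.0, 1.0, 20, **RK4)
print(abs(y_heun - math.exp(-1.0)), abs(y_rk4 - math.exp(-1.0)))
```

Keeping the tableau as data means the same stepping routine serves any explicit scheme, so changing the number of stages only requires a different set of coefficients.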
The second order equation of our regulation problem is rγ(a) 2 σ 2 /2 The algorithm that solves the equation consists of the following steps: 1.

3. For each W, calculate a, t, R, S, u

7. The number of steps on the variable W is 20.
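A second-order equation is usually handled by a Runge-Kutta method after rewriting it as a first-order system in $(F, F')$. The sketch below shows that reduction on a 20-step grid, matching the number of steps stated above. Because the full HJB equation and its calibration are not reproduced here, the right-hand side used is a stand-in test equation, $F'' = -F$, and the names `second_order_as_system` and `rk4_system` are hypothetical.

```python
import numpy as np

def second_order_as_system(second_derivative):
    """Rewrite F'' = second_derivative(W, F, F') as y' = rhs(W, y), y = (F, F')."""
    def rhs(w, y):
        return np.array([y[1], second_derivative(w, y[0], y[1])])
    return rhs

def rk4_system(rhs, w0, y0, w_end, n_steps):
    """Classical four-stage Runge-Kutta on a vector-valued first-order system."""
    h = (w_end - w0) / n_steps
    w = w0
    y = np.asarray(y0, dtype=float)
    path = [(w, y.copy())]
    for _ in range(n_steps):
        k1 = rhs(w, y)
        k2 = rhs(w + h / 2, y + h / 2 * k1)
        k3 = rhs(w + h / 2, y + h / 2 * k2)
        k4 = rhs(w + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        w = w + h
        path.append((w, y.copy()))
    return path

# Stand-in equation (not the article's HJB): F'' = -F, F(0) = 0, F'(0) = 1,
# whose exact solution is F(W) = sin(W).
rhs = second_order_as_system(lambda w, F, dF: -F)
path = rk4_system(rhs, 0.0, [0.0, 1.0], 1.0, 20)
w_last, y_last = path[-1]
print(w_last, y_last[0], np.sin(1.0))
```

For the regulation problem, the lambda would be replaced by the expression for $F''(W)$ implied by the maximized HJB equation, evaluated at the optimal effort and transfer for each $W$.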
The calibration of the parameters is presented in the following table.

Results
From Figure 1a, we can see that the social welfare function at incentive-compatible points of the numerical solution follows a concave trajectory, which corresponds to a very good approximation of our analytical functions. This opens the possibility for our model to be implemented in a wide range of studies related to information asymmetries in continuous time.

Empirical analysis
The Mining and Energy Planning Unit (UPME), under the Ministry of Mines and Energy, summarizes the information on the production, transformation and consumption of energy in Colombia, in what they call the Colombian Energy Balance (BECO), which contains the record of energy flows, from its extraction or production at their different sources, to its consumption and disappearance. With the BECO, we can identify the relevant energy trends for this article.
In particular, the National Interconnected System Electric Energy Balance contains the account "Losses", expressed in gigawatt hours (GWh). Losses are the value of annual final consumption minus useful energy, where useful energy equals final consumption multiplied by the efficiency percentage. In other words, losses are energy dissipated in events such as transportation.
We observe the quarterly stock prices of ISA as well as its market capitalization (MKTCap), the aggregate final consumption of electric power, and the financial statements of ISA from 2006 to 2020, and analyze the data using several autoregressive models. First, we use an AR(1) model with MKTCap as the dependent variable, using the following equation:
$$y_t = \phi\, y_{t-1} + e_t,$$
where $y_t$ is the observed time series of MKTCap, $\phi$ is the autoregressive coefficient, and $e_t$ is the error term. We also explored an AR(1) model including the second difference in Sales, the first difference in Costs, and the second difference in Transfers. Second, we use an ARIMA(2,0,2) model:
$$(1 - B)^2 (1 + \phi_1 B + \phi_2 B^2)\, y_t = (1 + \theta_1 B + \theta_2 B^2)\, e_t,$$
where $y_t$ is the observed time series of MKTCap, $B$ is the lag operator, $e_t$ is the error term, $\phi_1$ and $\phi_2$ are the autoregressive coefficients, and $\theta_1$ and $\theta_2$ are the moving average coefficients. We also included the second differences of Sales, Costs, and Transfers. Third, we use an ARIMA(1,2,1) model for Sales:
$$(1 - B)^2 (1 - B^{12})(1 + \phi_1 B)\, y_t = (1 + \theta_1 B)\, e_t,$$
where $y_t$ are the observations of Sales, $B$ is the lag operator, $e_t$ is the error term, $\phi_1$ is the autoregressive coefficient, and $\theta_1$ is the moving average coefficient. We also estimated the model including MKTCap and Costs as regressors. Under these models, the signs of the estimates are not consistent with our theoretical model. We obtain the same results with a VAR model under several specifications.
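For readers who want to see the mechanics of the simplest of these specifications, the sketch below fits an AR(1) model without intercept, $y_t = \phi y_{t-1} + e_t$, by ordinary least squares. It runs on synthetic data only: the series, the value $\phi = 0.6$ and the helper names `simulate_ar1` and `fit_ar1` are assumptions made for illustration, and nothing here reproduces the ISA data or the estimates discussed above.

```python
import numpy as np

def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Generate a synthetic AR(1) series y_t = phi*y_{t-1} + e_t with y_0 = 0."""
    rng = np.random.default_rng(seed)
    e = rng.normal(0.0, sigma, size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + e[t]
    return y

def fit_ar1(y):
    """OLS estimate of phi: regress y_t on y_{t-1} without an intercept."""
    y = np.asarray(y, dtype=float)
    y_lag, y_now = y[:-1], y[1:]
    return float(y_lag @ y_now / (y_lag @ y_lag))

# Synthetic example: the estimate should land close to the true phi = 0.6.
y = simulate_ar1(0.6, 5000)
phi_hat = fit_ar1(y)
print(phi_hat)
```

With only about 60 quarterly observations, as in a 2006 to 2020 sample, the sampling error of such an estimate is far larger than in this 5000-point example, which is worth keeping in mind when reading the signs of estimated coefficients.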
We further analyze our data through a Vector Error-Correction model. We present the main results in the following tables.

Conclusions
In this article, we obtain the dynamics of a tariff regulation for an electric energy company, making explicit that the incentives of the firm involve weighing short- and long-term scenarios. In this dynamic scenario, the firm's effort to reduce energy losses depends on the continuation value. The derivation of incentive-compatible regulation explicitly includes the informational limits of the regulator with respect to the regulated firm, as well as the firm's reaction to the dynamics of the regulation. Our model incorporates social welfare, following the tradition of the literature on regulation, in an infinite-horizon dynamic setting driven by a standard Brownian motion, from which we obtain new insights about optimal contracts.
Interestingly, the maximum social welfare function at the incentive-compatible points of the numerical solution follows a concave trajectory, which corresponds to a very good approximation of our analytical functions. This opens the possibility that our model can be implemented in a wide range of studies related to information asymmetries in continuous time. It is worth noting that this is a theoretical study in which we show a new methodology for the analysis of problems in which the agent's effort is not directly observable by the principal. In particular, we represent continuous-time contracts using a volatility factor in the agent's continuation value.
It is also worth highlighting the importance of numerical methods, which are useful to approach the optimal solution established by the Hamilton-Jacobi-Bellman equation. The numerical solution of the second-order HJB differential equation using the Runge-Kutta method allowed us to characterize some interesting aspects. First, social welfare, denoted as $F(W_t)$, is concave and decreasing in the firm's continuation value $W_t$. Second, the consumer surplus derived from energy consumption, $S(W_t)$, and the firm's income, $R(W_t)$, are increasing in $W_t$, which is economically intuitive. Third, our results are consistent with the idea that the welfare trajectory is higher for lower levels of volatility and lower tax distortions when the firm's continuation value is higher. The flow of transfers $T(W_t)$ and the incentive-compatible effort $A(W_t)$ are convex and decreasing in $W_t$. Finally, the utility function $u(T, R, A)$ is increasing in $W_t$.
Our proposed solutions can be used in diverse problems and are also useful tools to approach theoretical and numerical optimal solutions for problems posed through the HJB equation. Additionally, they can be used to understand realistic scenarios of electricity generating companies subject to regulation by the regulatory authority, analyzing trade-offs between the benefits for the company and for the consumer, according to the regulations in place. As an exploratory exercise using observational data, in this work we compare the analytical and numerical solutions obtained by solving the Hamilton-Jacobi-Bellman equation.