Differences

This shows you the differences between two versions of the page.

--- the_complete_and_consistent_data_base_coco_for_the_national_scale [2020/02/11 15:42] – [COCO1 Estimation procedure] matsz
+++ the_complete_and_consistent_data_base_coco_for_the_national_scale [2020/02/13 09:26] – [COCO2: Estimation procedure] matsz
@@ Line 327: / Line 327: @@
 & + \sum_{i,t} wgt^{up}((max(y_{i,t}^{up},y_{i,t})-y_{i,t}^{up})/abs(y_{i,t}^{up}))^2\\
 & + \sum_{i,t} wgt^{lo}((min(y_{i,t}^{lo},y_{i,t})-y_{i,t}^{lo})/abs(y_{i,t}^{lo}))^2\\
-\text {s.t.}\\
-y_{i,t}^{LO}<y_{i,t}<y_{i,t}^{UP}\\
-\text {Accounting identities defined on} y_{i,t}\\
-\text {Identity of land use from different sources}
 \end{split}
 \end{align}
+\begin{align*}
+\begin{split}
+&\text {s.t.}\\
+&y_{i,t}^{LO}<y_{i,t}<y_{i,t}^{UP}\\
+&\text {Accounting identities defined on} y_{i,t}\\
+&\text {Identity of land use from different sources}
+\end{split}
+\end{align*}
 where //i// represents the index of the elements to estimate (crop production activities or groups, herd sizes etc.), //t// stands for the year, \(wgt^{x}\) are weights attached to the different parts of the objective (\(wgt^{dat} = wgt^{hp} = 10, wgt^{ini} = 1, wgt^{up} = wgt^{lo} = 100\)), and
@@ Line 416: / Line 420: @@
 \(CORF\) =	ratio of on farm content to the standard content
-and CORF is contrained to equal to one except that we permit CORF  1 for FRMI.
+and CORF is constrained to equal one, except that we permit CORF $\neq$ 1 for FRMI.
 Production in dairies and on farm may be added to obtain the total production that enters the market balances:
@@ Line 486: / Line 490: @@
 This procedure has developed as a path-dependent compromise between computation time and presumed quality. It starts with an estimation of land use in combination with the agricultural land balance, including the land transition between LU classes. This determines the utilisable agricultural area (UAA) and non-agricultural land use. Step 2 distributes crop areas within the fixed UAA from step 1 and estimates crop production and yields. Step 3 only tackles the complete animal sector data (activities, markets, EAA). The crop production is taken as given when market balance and EAA are estimated for the crops and derived processed products (step 4). However, with all steps completed some final checks may modify the results (e.g. delete tiny activity levels or estimate another crop area from another crop output value and thus change the UAAR). Furthermore, the crop estimation may have slightly changed the ratio of cropland to productive grassland. Therefore the accounting identities ensured in step 1 are not necessarily fulfilled in a strict sense anymore. Hence a final reconciliation of land use is added for full consistency:
-**Figure 3. Overview on main estimations in for the consolidation of national data in Europe (in coco1.gms)**
+**Figure 3: Overview of the main estimations for the consolidation of national data in Europe (in coco1.gms)**
 {{::figure3.png?600|}}
-Results are not always fully satisfactory (perhaps impossible given some raw data). For example the resulting prices (unit values) are far from a priori expectations for a number of series, in particular less important ones. This is because, apart from some additional security checks, unit values are by and large considered a free balancing variable calculated to preserve the identity between largely fixed EAA values and fixed production (in coco1_estimb). The priority for EAA values has been reduced somewhat in recent years but a more thorough revision would require to estimate production, market balances and EAA simultaneously rather than consecutively (first (a), then (c) for crops). As this is infeasible for all crops at the same time the whole estimation would need to be split up differently in the crop sector, perhaps first for the aggregates and then within those.
+Results are not always fully satisfactory (perhaps impossible given some raw data). For example, the resulting prices (unit values) are far from a priori expectations for a number of series, in particular less important ones. This is because, apart from some additional security checks, unit values are by and large considered a free balancing variable calculated to preserve the identity between largely fixed EAA values and fixed production (in coco1_estimb). The priority for EAA values has been reduced somewhat in recent years, but a more thorough revision would require estimating production, market balances and EAA simultaneously rather than consecutively (first $(a)$, then $(c)$ for crops). As this is infeasible for all crops at the same time, the whole estimation would need to be split up differently in the crop sector, perhaps first for the aggregates and then within those.
 Furthermore it should be mentioned that the main parts of COCO are handled in a program (‘coco1.gms’) looping over MS because there are no direct linkages between them. However, for practical reasons it will be useful to run COCO in country groups that have the same coverage of years. The longest series (as of 1984) can be established for EU15((Belgium and Luxembourg are aggregated in COCO for reasons of data availability.)) countries except Germany. For the New MS it turned out that data before 1989 are often very unreliable and create a considerable burden in the data maintenance. These countries (and Germany) are therefore only completed for years from 1989 onwards. Norway also offers reliable series as of 1984. In the case of the Western Balkan countries it is rather hopeless to provide very recent data as key data are still missing, such that the series can only be completed from 1995 onwards. Furthermore, for the Western Balkan countries it was necessary to transfer certain coefficients and shares from (previously consolidated) neighbouring countries to the Western Balkan, such that a certain sequence is necessary for a reasonable application of COCO1:
@@ Line 498: / Line 502: @@
 ====COCO2: Data Preparation====
+The data consolidation in COCO2 only covers a few special topics:
+  * producer prices of dairy products and vegetable oils
+  * consumer prices
+  * consumer losses and nutrient intake after losses
+  * feed stuff quantities without market balances (by-products, fish meal)
+  * loss rates of fodder for preliminary balancing of animal nutrients
+  * corrections of certain LULUCF coefficients based on UNFCCC
+An overview is given in the following figure.
+**Figure 4: Overview on main elements in the finalisation step for the consolidation of national data in Europe (in coco2.gms)**
+{{::figure_4.png?600|}}
+In spite of only limited subtasks tackled in coco2.gms, the multitude of different data inputs is comparable to that in COCO1.
+**Include file //‘coco2_collect.gms’//**
+Various input files are collected with some adjustments to match CAPRI definitions and with some gap filling. As the consumer prices follow from a top-down expenditure allocation problem, the input data range from macroeconomic information to very detailed prices of food items.
+  * Consolidated data from COCO1
+  * Macroeconomic information from Eurostat and UNSTATS: Exchange rates, population, GDP deflator, private consumption of households in current prices.
+  * Price index information: Aggregate food price index, relative (to EU) food price index, harmonised indices of consumer prices (HICPs) with item weights all from Eurostat
+  * Expenditure by product groups (from Eurostat and national sources)
+  * Auxiliary data for special cases (Prices for some milk products in selected countries, fish meal information etc)
+  * Country Sheets of the Western Balkan and Turkey: Exchange rate, inhabitants, inflation rate, food expenditure shares
+  * Disaggregate absolute consumer prices for selected narrowly defined food items (ILO and Eurostat)
+Where available, producer prices for milk products were already included from Eurostat statistics (Agricultural prices and price indices) in COCO1. Completeness was not achieved in COCO1, however, because processed dairy products are not part of the EAA. Here we complete some gaps using price information for some Member States and (partly assumed) relationships among dairy product prices and their fat and protein contents.
+Data on total consumer expenditures as well as expenditures by food groups are included from various sources as described in Chapter 2.2.2.5, partly extended using general price index information.
+Consumer price index weights and price indices for food aggregates (2005=100) are coming from Eurostat tables on HICP. Supplementary information for Albania, Bosnia and Croatia comes from national agencies. The price index weights are used to extend older series on food expenditure by product groups (say “meat”) which have been discontinued (see below under file coco2_shares.gms).
+Finally we use very narrowly defined absolute consumer prices (e.g. for spaghetti) and price indices. The earlier years (before 2008) had been provided by ILO, which has discontinued this activity. For a subset of those, Eurostat offers matching information as “detailed average prices” (table prc_dapYY), which has been used to extend the ILO series. These prices are mapped to CAPRI regions, products and units (//‘coco2_ilo_addup.gms’//).
+Price indices for food and non-alcoholic beverages from HICP as well as the general food price index are used to complete the disaggregate ILO prices for single typical food items (like “Wheat bread white unsliced not wrapped”) using a Hodrick-Prescott filter and the expectation that their changes should follow the price index information collected.
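+As an illustration only, the plain Hodrick-Prescott trend can be sketched as follows. This is not the gap-filling routine of coco2_collect.gms: the function name ''hp_trend'', the smoothing weight ''lam'' and the price series are assumptions made for this sketch, and the link to the price index changes is left out.

```python
def hp_trend(y, lam=100.0):
    """Hodrick-Prescott trend of a series y (hypothetical helper).

    Minimises sum (y - tau)^2 + lam * sum (second differences of tau)^2,
    whose first-order condition is (I + lam * D'D) tau = y.
    """
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    # Build A = I + lam * D'D, with D the (n-2) x n second-difference operator.
    a = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for i in range(3):
            for j in range(3):
                a[r + i][r + j] += lam * stencil[i] * stencil[j]
    # Solve A tau = y by Gaussian elimination; A is symmetric positive
    # definite, so elimination without pivoting is safe.
    b = [float(v) for v in y]
    for col in range(n):
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            if f != 0.0:
                for c in range(col, n):
                    a[row][c] -= f * a[col][c]
                b[row] -= f * b[col]
    tau = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(a[row][c] * tau[c] for c in range(row + 1, n))
        tau[row] = (b[row] - s) / a[row][row]
    return tau


prices = [1.00, 1.04, 1.10, 1.09, 1.18, 1.21]  # made-up price series
trend = hp_trend(prices, lam=100.0)
```

+A series that is already linear has zero second differences and is therefore returned unchanged, which is a convenient sanity check for such a filter.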
+Finally another HPD estimator is used to adjust the disaggregate prices to be (somewhat) in line with Eurostat information on relative food price levels across Europe.
+**Include file //‘coco2_shares.gms’//**
+Expenditure shares are defined and completed top-down using simple OLS estimates against related statistical expenditure information or, as a last fall back option, based on a trend.
+The food expenditure share completions start with data from COICOP level 3 giving results on food and non-alcoholic beverages. Further disaggregation relies on historical Eurostat data (HIST), on the above mentioned index weights from HICP and partly national data (Germany and Spain).
+A convenient expenditure group is potatoes, as these expenditure shares may be extrapolated based on COCO1 human consumption multiplied by producer price as regressors for OLS.
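+The OLS-based completion of a share series might look roughly like the sketch below. The helper names ''ols_fit'' and ''complete_shares'' and all numbers are hypothetical; the regressor stands in for a related series such as consumption multiplied by producer price, and the trend fallback mentioned above is not covered.

```python
def ols_fit(x, y):
    """Intercept and slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def complete_shares(regressor, share):
    """Fill missing (None) shares with OLS predictions from the regressor.

    Both arguments are dicts keyed by year; observed shares are kept as is.
    """
    observed = [(regressor[t], s) for t, s in share.items() if s is not None]
    intercept, slope = ols_fit([o[0] for o in observed],
                               [o[1] for o in observed])
    return {t: (s if s is not None else intercept + slope * regressor[t])
            for t, s in share.items()}


# Made-up data: the share is missing for 2003 only.
regressor = {2001: 10.0, 2002: 12.0, 2003: 14.0, 2004: 16.0}
share = {2001: 0.020, 2002: 0.024, 2003: None, 2004: 0.032}
filled = complete_shares(regressor, share)
```

+In this toy case the observed shares are exactly proportional to the regressor, so the gap for 2003 is filled with a value on the same line.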
+====COCO2: Estimation procedure====
+**Include file //‘coco2_def.gms’//**
+The approach to determine consumer prices is to distribute food expenditure across groups, with consumption quantities given from COCO1 results, such that endogenous consumer prices link endogenous expenditure with exogenous quantities. Deviations of estimated expenditure and consumer prices from their supports are penalised in an entropy framework. Estimation is done year by year, starting with the most recent year, where hard data are usually available to a greater extent than for the oldest years in the database. Including consumer price changes (always relative to the previously solved year) serves to stabilise the results to some extent, such that the objective has supports not only for the consumer prices but also for their changes. The entropy problem is solved by maximizing:
+\begin{align}
+\begin{split}
+\max_t &- \sum_{m,j,k} CPS_{m,j,2}*HCOM_{m,j,t}/1000/TOFO_{m,t}*\\
+&PE_{m,j,k}*LOG(PE_{m,j,k}/PQ_k)\\
+&-\sum_{m,j,k} CPS_{m,j,2}*HCOM_{m,j,t}/1000/TOFO_{m,t}*\\
+&PED_{m,j,k}*LOG(PED_{m,j,k}/PQ_k)\\
+&-\sum_{m,FOPOS,k} EXS_{m,FOPOS,2}/TOFO_{m,t}*\\
+&PEX_{m,FOPOS,k}*LOG(PEX_{m,FOPOS,k}/PQ_k)\\
+&-\sum_{m,k} PFAC_{m,k}*LOG(PFAC_{m,k}/PQ_k)*1000\\
+\end{split}
+\end{align}
+where //m// represents the region, //j// the food item with consumer price, FOPOS the food group, //t// stands for the current estimation year, t_1 for the year estimated before and k for the number of support points (=3).
+Parameters are
+| \(HCOM_{m,j,t}\) |Human consumption, result from COCO1|
+| \(UVAD_{m,j,t\_1}\)	|Consumer price from last simulation of year t+1|
+|\(CPS_{m,j,k}\)	|Support points for consumer prices |
+|\(DCPS_{m,j,k}\)	|Support points for consumer price changes|
+|\(EXS_{m,FOPOS,k}\)	|Support points for group expenditures|
+|\(TOFACS_{m,k}\)	|Support points for total food expenditure slack|
+|\(PQ_k\)		|A priori probabilities for support points|
+|\(TOFO_{m,t}\)	|Total food expenditure|
+and entropy variables are
+|\(PE_{m,j,k}\)	|Probability of support points for consumer prices|
+|\(PED_{m,j,k}\)	|Probability of support points for consumer price changes|
+|\(CP_{m,j}\)	|Consumer prices|
+|\(DCP_{m,j}\)	|Consumer price changes|
+|\(PEX_{m,FOPOS,k}\)	|Probability of support points for group expenditure|
+|\(PFAC_{m,k}\)	|Probability of support points for food expenditure slack|
+|\(EX_{m,FOPOS}\)	|Group expenditures|
+|\(TOFAC_m\)	|Food expenditure slack|
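+A minimal sketch of how a single term of such an entropy objective behaves, using three support points. The uniform prior, the support values and the function names are assumptions for illustration and are not taken from coco2_def.gms.

```python
import math


def cross_entropy_term(weight, probs, prior):
    """weight * sum_k p_k * log(p_k / q_k); zero probabilities contribute 0."""
    return weight * sum(p * math.log(p / q)
                        for p, q in zip(probs, prior) if p > 0.0)


def expected_value(probs, supports):
    """Estimate implied by support points and their probabilities."""
    return sum(p * s for p, s in zip(probs, supports))


# Made-up numbers for one region and one food item.
prior = [1.0 / 3.0] * 3          # stands in for PQ_k (assumed uniform)
supports = [2.0, 3.0, 4.0]       # stands in for CPS_{m,j,k}, symmetric around 3.0
hcom, tofo = 500.0, 10.0         # stand in for HCOM and TOFO
weight = supports[1] * hcom / 1000.0 / tofo  # expected expenditure share

at_prior = cross_entropy_term(weight, prior, prior)
shifted = [0.2, 0.3, 0.5]
penalty = cross_entropy_term(weight, shifted, prior)
price = expected_value(shifted, supports)
```

+With the probabilities at their a priori values the term is zero and the implied price equals the central support. Moving probability mass towards the upper support raises the implied price but makes the term positive, so the maximised objective (which carries a minus sign) decreases.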
+Constraints are as follows:
+Summing up probabilities for support points
+\begin{equation}
+\sum_{k\forall_{m,j}(CP.L_{m,j}\ge 0\wedge HCOM_{m,j,t}\ge 0)} PE_{m,j,k}=1
+\end{equation}
+\begin{equation}
+\sum_{k\forall_{m,j}(DCPS_{m,j,2}\ge 0\wedge HCOM_{m,j,t}\ge 0)} PED_{m,j,k}=1
+\end{equation}
+\begin{equation}
+\sum_{k\forall_{m,j}(EX.L_{m,FOPOS}\ge 0)} PEX_{m,FOPOS,k}=1
+\end{equation}
+\begin{equation}
+\sum_{k\forall_{m}(TOFAC.LO_m\ge TOFAC.UP_m)} PFAC_{m,k}=1
+\end{equation}
+Define consumer price changes from support points
+\begin{equation}
+DCP_{m,j} = \sum_{k\forall_{m,j}(CP.L_{m,j}\ge 0\wedge HCOM_{m,j,t}\ge 0 \wedge DCPS_{m,j,2}\ge 0)} PED_{m,j,k}*DCPS_{m,j,k}
+\end{equation}
+Of course consumer price changes are also related to the last simulation result (which is for t+1 due to backward looping)
+\begin{equation}
+DCP_{m,j} =UVAD_{m,j,t\_1}-CP_{m,j}
+\end{equation}
+Define consumer prices from support points and probabilities
+\begin{equation}
+CP_{m,j} = \sum_{k\forall_{m,j}(CP.L_{m,j}\ge 0\wedge HCOM_{m,j,t}\ge 0)} PE_{m,j,k}*CPS_{m,j,k}
+\end{equation}
+Define group expenditure from support points and probabilities
+\begin{equation}
+EX_{m,FOPOS} = \sum_{k\forall_{m,j}(EX_{m,FOPOS}\ge 0)} PEX_{m,FOPOS,k}*EXS_{m,FOPOS,k}
+\end{equation}
+Define total expenditure slack from support points and probabilities
+\begin{equation}
+TOFAC_m=\sum_{k\forall_{m}(TOFAC.LO_m\ge TOFAC.UP_m)} PFAC_{m,k}*TOFACS_{m,k}
+\end{equation}
+Exhaustion of food expenditure may be relaxed with a slack factor different from one. However, this “last resort” to achieve feasibility in the expenditure allocation problem is limited to years and countries with precarious data and subject to strong penalties.
+\begin{equation}
+\sum_{FOPOS} EX_{m,FOPOS}=TOFO_{m,t}*TOFAC_m
+\end{equation}
+Consistency of group expenditure
+\begin{equation}
+EX_{m,FOPOS}=\sum_{j\forall_{m,FOPOS}(j\in FOPOS\wedge HCOM_{m,j} \ge 0)}CP_{m,j}*HCOM_{m,j}/1000
+\end{equation}
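+The two accounting constraints above (group consistency and exhaustion of total food expenditure) can be checked on toy numbers as in the following sketch. The item and group codes, prices, quantities and the helper name ''group_expenditure'' are made up for illustration.

```python
def group_expenditure(prices, quantities, members):
    """Sum of price * quantity / 1000 over the items of one food group.

    Items without positive consumption are skipped.
    """
    return sum(prices[j] * quantities[j] / 1000.0
               for j in members if quantities.get(j, 0.0) > 0.0)


# Hypothetical groups and items for one region and year.
groups = {"MEAT": ["BEEF", "PORK"], "DAIRY": ["CHES"]}
prices = {"BEEF": 8.0, "PORK": 5.0, "CHES": 6.0}            # stand in for CP
quantities = {"BEEF": 250.0, "PORK": 400.0, "CHES": 500.0}  # stand in for HCOM

expenditure = {g: group_expenditure(prices, quantities, items)
               for g, items in groups.items()}              # stands in for EX

total_food_expenditure = 7.0                                # stands in for TOFO
# Slack factor implied by the exhaustion constraint sum(EX) = TOFO * TOFAC.
slack = sum(expenditure.values()) / total_food_expenditure
```

+Here the group expenditures add up exactly to the total, so the implied slack factor is one; a value different from one would correspond to the relaxed case described above.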
+For most countries the exhaustion of total expenditure is the only evident hard constraint (and even this is relaxed in problem cases). However, as the penalties for group expenditure are set high, and furthermore as the range of expenditure supports defines additional implicit hard constraints, the problem may turn out infeasible (typically solved by additional leeway). To meet the expenditure constraints the solver would tend to concentrate deviations from supports on the most important expenditure items while setting the less important items close to their supports. A more balanced distribution of deviations from supports was achieved in practice by weighting all contributions to the overall objective (except the last one for the total expenditure slack) with expected expenditure shares. The weights may be interpreted as expected expenditure shares because supports are specified in a symmetric way such that the central, second (of three) supports, which is used in the objective function, is equal to the expectation.
+**Include file //‘coco2_solve.gms’//**
+The initialisation, solving, reporting and storage are organised in the next include files, with a few elements worth mentioning:
+  * The initialisation tries to ensure positive consumer margins by the assignments of expected values and by specifying bounds on estimated consumer prices. The reference point for these margins is an average of EU and national prices that reflects the importance of domestic sales vs. imports.
+  * Bounds and spread of supports around expected consumer prices are set high for items without ILO style prices (say “table olives” TABO) or where the fit of available price information is questionable (e.g. cabbage prices for “OVEG”).
+  * A checking parameter (“p_checks”) permits checking the initialisation in case of infeasibilities. The most frequent case observed in the last years is that lower bounds on oils expenditure become binding, suggesting some systematic mismatch of price and expenditure information for this group.
+====COCO2: Final completions====