Differences

This shows you the differences between two versions of the page.

--- the_regionalised_data_base_capreg [2020/02/07 10:23] – matsz
+++ the_regionalised_data_base_capreg [2020/02/07 12:00] – matsz
@@ Line 17: / Line 17: @@
 The following table shows the availability of the different regional tables as they have been used in the current database (with series completed up to 2014). However, the current coverage concerning time and sub-regions differs dramatically between the tables and within the tables between the Member States. A second problem consists in the relatively high aggregation level especially in the field of crop production. Hence, additional sources, assumptions and econometric procedures must be applied to close data gaps and to break down aggregated data.
+**Table 6 Availability of regional data in the current database after 1983**
+^ Table ^ Official availability^
+| Land use| from 1974 yearly |
+| Crop production (harvested areas, production and yields)| from 1975 yearly |
+| Animal production (livestock numbers) | from 1977 yearly |
+| Agricultural accounts on regional level | from 1980 yearly |
+| Structure of agricultural holdings and labour force | 2000, 2003, 2005, 2007, 2010, 2013 |
+<sup> Source: capri\dat\capreg\regio_data_all.gdx </sup>
+====Methodology applied in the regional data consolidation====
+In the last major update of 2015, the original data were first stored in the TSV format designed by Eurostat:
+  * In a first step, these files were converted by an Excel macro into CSV format, and an overall set with all items, including their long text, was created to prepare further processing.
+  * In a second step, these already GAMS-readable files were stored in GDX format in the folder “dat\capreg” and placed under version control. Metadata are added in the process as well.
+The result of these two steps is a single large table, which comprises time series of all data retrieved from Eurostat for all tables: land use, crop production, animal populations, cow’s milk collection and agricultural accounts.
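+The conversion itself was done with an Excel macro. Purely as an illustration of what such a step involves, the following Python sketch reshapes a TSV file into long-format records. It assumes the layout commonly seen in Eurostat bulk downloads (dimension codes comma-separated in the first column, a header cell ending in ''\time'', '':'' for missing values, optional flags after the number). The dimension names and codes in the sample are hypothetical, and this is not the macro used in CAPRI.

```python
import csv
import io


def parse_eurostat_tsv(text):
    """Reshape Eurostat-style TSV text into a list of long-format records."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    # First header cell looks like "dim1,dim2,geo\time": keep the part before "\".
    dims_part, _, _ = header[0].partition("\\")
    dim_names = [d.strip() for d in dims_part.split(",")]
    periods = [p.strip() for p in header[1:]]

    records = []
    for row in reader:
        if not row:
            continue
        keys = [k.strip() for k in row[0].split(",")]
        for period, cell in zip(periods, row[1:]):
            parts = cell.split()
            if not parts or parts[0] == ":":
                continue  # missing observation
            try:
                value = float(parts[0])
            except ValueError:
                continue  # unparseable cell, skipped in this sketch
            record = dict(zip(dim_names, keys))
            record["period"] = period
            record["value"] = value
            record["flag"] = parts[1] if len(parts) > 1 else ""
            records.append(record)
    return records


# Hypothetical sample: two regions, two years, one flagged and one missing value.
SAMPLE = (
    "crops,geo\\time\t2014 \t2013 \n"
    "C1000,AT11\t12.5 p\t: \n"
    "C1000,AT12\t7 \t6.5 \n"
)
records = parse_eurostat_tsv(SAMPLE)
```

+Each record carries one value per dimension combination and period, which is close to the shape needed before loading into a GDX container.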
+The starting point of the methodological approach is the decision to use the consistent and complete national data base (COCO) as a frame or reference point for any regionalization. In other words, any aggregation of the main data items (areas, herd sizes, gross production and intermediate use, unit value prices and EAA-positions) of the regionalized data over regions must match the national values. This is the general rule with some exceptions.
+Given that starting position, the following approaches are generally applied:
+  * Data as loaded from the regional statistics are subject to some manual consistency checks (in gams\capreg\check_and_cor_regio.gms) as well as checks for regional consistency. The latter applies mainly to animal herd sizes, where data are available at the same or an even more disaggregated level than in COCO.
+  * Gaps in regional data are filled, and data given only at a higher aggregation level than required in CAPRI are broken down using existing national information.
+  * Fallback rules and other assignment rules are structurally and (often) numerically identical for all regional units and groups of activities and inputs/outputs.
+  * Econometric analysis or additional data sources are used to close gaps.
+All the approaches described in the following subsections are intended only as a first crude estimate. Wherever additional data sources are available, their content should be checked, and it is often used to replace the ‘easy to use’ estimates presented here. Examples are (some) data for Norway, Sweden or Luxembourg that have been collected from national sources. The procedures described here can be thought of as a ‘safety net’ that ensures regionalized data are technically available, but not as an adequate substitute for collecting these data from additional sources.
+=== Prices ===
+The agricultural domain of REGIO does not cover regionalized prices. For simplicity, the regional prices are therefore assumed to be identical to sectoral ones((There is no easy way to relax this assumption if no further data sources are available.)):
+\begin{equation}
+UVAG_r=UVAG_s
+\end{equation}
+Young animal prices are a special case since they are not included in the COCO data base (the current methodology of the EAA does not value intermediate use of animals) but are necessary to calculate income indicators for intermediate activities (e.g. raising calves). Only exported or imported live animals are implicitly accounted for by valuing the connected meat imports and exports.
+Young animals are valued based on the ‘meat value’ and assumed relationships between live and carcass weights. Male calves (ICAM, YCAM) are assumed to have a final weight of 55 kg, of which 60 % are valued at veal prices. Female calves (ICAF, YCAF) are assumed to have a final weight of 60 kg, of which 60 % are valued at veal prices. Young heifers (IHEI, YHEI) are assumed to have a final weight of 300 kg, of which 54 % are valued at beef prices. Young bulls (IBUL, YBUL) are assumed to have a final weight of 335 kg, of which 54 % are valued at beef prices. Young cows (ICOW, YCOW) are assumed to have a final weight of 575 kg, of which 54 % are valued at beef prices. For piglets (IPIG, YPIG), price notations were regressed on pig meat prices; piglets are assumed to have a final weight of 20 kg, of which 78 % are valued at pig meat prices. Lambs (ILAM, YLAM) are assumed to weigh 4 kg and are valued at 80 % of sheep and goat meat prices. Chickens (ICHI, YCHI) are assumed to weigh 0.1 kg and are valued at 80 % of poultry prices.
+Sugar beet prices are another special case. They are still determined in a program (//‘sugar\price_est.gms’//) inherited from the 2003 EuroCARE sugar study (Henrichsmeyer et al. 2003). It determines sugar beet prices from sugar prices, levies and partial survey results from the 1990s. The estimation results are then also used to determine the beet price differentiation in subsequent years. It is noteworthy that the same program is applied in CAPREG (via quotasprices.gms) and in CAPMOD (via data_prep.gms) to determine base year beet prices.
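+As a worked illustration of the valuation rules above, the following Python sketch stores the weights and shares from the text in a lookup table. It assumes that the value per head is the product of final weight, valued share and meat price per kg; this reading, the function name and the example price are assumptions for illustration, not taken from the CAPRI code.

```python
# (final weight in kg, share valued at the corresponding meat price), as listed in the text.
YOUNG_ANIMAL_RULES = {
    "YCAM": (55.0, 0.60),   # male calves, veal price
    "YCAF": (60.0, 0.60),   # female calves, veal price
    "YHEI": (300.0, 0.54),  # young heifers, beef price
    "YBUL": (335.0, 0.54),  # young bulls, beef price
    "YCOW": (575.0, 0.54),  # young cows, beef price
    "YPIG": (20.0, 0.78),   # piglets, pig meat price
    "YLAM": (4.0, 0.80),    # lambs, sheep and goat meat price
    "YCHI": (0.1, 0.80),    # chickens, poultry price
}


def young_animal_value(code, meat_price_per_kg):
    """Value per head, assuming value = weight * share * meat price (an assumed reading)."""
    weight_kg, share = YOUNG_ANIMAL_RULES[code]
    return weight_kg * share * meat_price_per_kg


# Hypothetical veal price of 5.0 currency units per kg.
male_calf_value = young_animal_value("YCAM", 5.0)
```

+With the hypothetical veal price of 5.0 per kg, a male calf would be valued at 55 kg × 0.60 × 5.0 = 165 per head.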
+=== Activity Levels ===
+In cases where data on regional activity levels are missing, a linear trend line is estimated for regional and Member State time series in the definition of the regional database. The gap is then filled with a weighted average of two components: the trend line, with a weight of R², and a weighted average of the available observations around the gap, with a weight of 1-R². This formulation has the following properties. When a time series shows a strong trend, the back-casted and forecasted numbers are dominated by the trend because R² is high. As R² decreases, the estimated values are pulled towards the known values.
+Apart from gap filling, another problem is that annual cropland statistics at the regional level only cover a few crop activities (cereals with wheat, barley, grain maize, rice; potatoes, sugar beet, oil seeds with rape and sunflower; tobacco, fodder maize; grassland, permanent crops with vineyards and olive plantations). The COCO data base, however, covers some 30 different crop activities. In order to break these aggregates down to COCO definitions, the national shares of the aggregate are used.
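+The R²-weighted gap filling can be sketched in Python as follows. This is a minimal illustration, not the CAPREG implementation: the text does not specify how the "weighted average of the available observations around the gap" is formed, so the sketch assumes the nearest observation before and after the gap, weighted by inverse distance in years. The function name is hypothetical.

```python
def fill_gaps(years, values):
    """Fill None entries with r2 * trend + (1 - r2) * local average of neighbours."""
    obs = sorted((y, v) for y, v in zip(years, values) if v is not None)
    if len(obs) < 2:
        return list(values)  # not enough data to estimate a trend

    # Ordinary least squares trend line through the available observations.
    n = len(obs)
    mean_y = sum(y for y, _ in obs) / n
    mean_v = sum(v for _, v in obs) / n
    sxx = sum((y - mean_y) ** 2 for y, _ in obs)
    sxy = sum((y - mean_y) * (v - mean_v) for y, v in obs)
    svv = sum((v - mean_v) ** 2 for _, v in obs)
    slope = sxy / sxx
    intercept = mean_v - slope * mean_y
    # R² of the simple regression; a constant series is treated as a perfect fit.
    r2 = (sxy * sxy) / (sxx * svv) if svv > 0 else 1.0

    filled = []
    for y, v in zip(years, values):
        if v is not None:
            filled.append(v)
            continue
        trend = intercept + slope * y
        # Assumption: nearest observation on each side, inverse-distance weights.
        before = [(yy, vv) for yy, vv in obs if yy < y]
        after = [(yy, vv) for yy, vv in obs if yy > y]
        neighbours = before[-1:] + after[:1]
        weights = [1.0 / abs(y - yy) for yy, _ in neighbours]
        local = sum(w * vv for w, (_, vv) in zip(weights, neighbours)) / sum(weights)
        filled.append(r2 * trend + (1.0 - r2) * local)
    return filled


years = [2000, 2001, 2002, 2003, 2004]
trending = fill_gaps(years, [10.0, 12.0, None, 16.0, 18.0])  # R² = 1: trend dominates
flat = fill_gaps(years, [5.0, 9.0, None, 9.0, 5.0])          # R² = 0: neighbours dominate
```

+In the first series the trend fits perfectly, so the gap takes the trend value. In the second series the trend explains nothing, so the gap takes the average of the neighbouring observations instead of the flat trend value.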
+As an example, this approach is explained for cereals. Data on the production activities WHEA (wheat = SWHE+DWHE), BARL (barley), MAIZ (grain maize) and PARI (paddy rice) as found in COCO match directly the level of disaggregation in the regional data. Therefore, the mapped regionalized data are directly set equal to the corresponding values in the regional “raw” data. The difference between the sum of these 4 activities and the aggregate data on cereals in the regional raw data must be equal to the sum of the remaining activities in cereals as shown in COCO, namely RYE (rye and meslin), OATS (oats) and OCER (other cereals). As long as no other regional information is available, this difference from the regional raw data is hence broken down applying national shares.
+The approach is shown for OATS in the following equations, where the suffix r stands for regional data:
+\begin{align}
+\begin{split}
+LEVL_{OATS,r} = {} & (CEREAL_r - WHEAT_r - BARLEY_r - MAIZEGR_r - RICE_r) \\
+& \cdot \frac{LEVL_{OATS,COCO}}{LEVL_{OATS,COCO}+LEVL_{RYE,COCO}+LEVL_{OCER,COCO}}
+\end{split}
+\end{align}
+Similar equations are used to break down other aggregates and residual areas in the regional data ((If no data at all are found, the share on the utilisable agricultural area is used.)). The Farm Structure Survey (FSS) provides crop areas for a larger number of crops, but this survey is usually conducted only every three years. Data from FSS, when available, are also used to approximate crop areas at regional level.
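+The break-down of the cereal residual by national shares, as in the OATS equation, can be sketched in Python as follows. The function name and all numbers are hypothetical and chosen only to make the arithmetic easy to follow. The sketch does not handle negative residuals or the fallback to shares of utilisable agricultural area.

```python
def split_residual(regional_aggregate, regional_known, national_levels):
    """Distribute the regional residual over activities using national shares.

    regional_aggregate: regional level of the aggregate (e.g. all cereals)
    regional_known:     regional levels observed directly (e.g. wheat, barley)
    national_levels:    national (COCO) levels of the remaining activities
    """
    residual = regional_aggregate - sum(regional_known.values())
    national_total = sum(national_levels.values())
    return {
        activity: residual * level / national_total
        for activity, level in national_levels.items()
    }


# Hypothetical region: 100 units of cereals, 70 of which are observed directly.
regional_known = {"WHEA": 40.0, "BARL": 20.0, "MAIZ": 10.0, "PARI": 0.0}
# Hypothetical national levels of the remaining cereal activities.
national_levels = {"RYE": 200.0, "OATS": 300.0, "OCER": 500.0}

levels = split_residual(100.0, regional_known, national_levels)
```

+Here the residual of 30 units is split with national shares of 20 %, 30 % and 50 %, giving 6 units of RYE, 9 of OATS and 15 of OCER, so the regional cereal total is preserved.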