Since the 1960s economists have translated Newton’s universal law of gravitation to economic and social interactions including trade, whereby the flow between two units is directly proportional to the product of their economic sizes and inversely proportional to the distance between them. Briefly: when supply and demand increase in locations A and B, respectively, the flow of goods from A to B increases. Also, as the distance between A and B decreases the flow of traded goods from A to B increases. In this exercise we explore the gravity model of international trade, extend it to account for geopolitical and cultural parameters, and check the effects of distance over time.
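The relationship described above can be written compactly. The notation below (F, M, D, G, and the exponents) is chosen here for illustration and does not come from the dataset:

```latex
F_{ij} = G \, M_i^{\beta_1} \, M_j^{\beta_2} \, D_{ij}^{\beta_3}, \qquad \beta_3 < 0
```

Here F_{ij} is the flow from unit i to unit j, M_i and M_j are their economic sizes, D_{ij} is the distance between them, and G is a constant. Strict proportionality, as in the verbal statement, corresponds to \beta_1 = \beta_2 = 1 and \beta_3 = -1; in empirical work the exponents are estimated from the data.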
First, we build a simple gravity model. To do so, we load data on trade flows, the GDP of the origin and destination countries, and the distance between them, transform the data, and fit a regression. For all your Stata work, you are strongly advised to use a do-file with a sensible name.
Before you start working with data, it is important to know where any output files (including your do-files) will be saved on your computer. Such a folder is called your working directory or current directory.
If you use UU’s SolisWorkspace, you can go to File -> Change working directory, and find your local disk (and a suitable folder on your local disk). Otherwise you’ll only need to find a suitable folder on your local disk. Make sure the folder you choose has been given a sensible name, so you can easily find it at a later point.
. * cd C:\User\Location\FolderCourse\SubfolderCase
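As a rough sketch, the opening lines of a do-file for this exercise might look as follows. The path is the placeholder from the line above and the log file name gravity_log is hypothetical; replace both with your own.

```stata
* Hypothetical opening lines of a do-file (path and file names are placeholders)
clear all
cd "C:\User\Location\FolderCourse\SubfolderCase"   // set the working directory
pwd                                                // confirm where files will be saved
capture log close                                  // close any log left open earlier
log using gravity_log, replace text                // keep a text record of the session
```

Running pwd after cd is a quick way to confirm that Stata is pointing at the folder you intended.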
For this exercise you will need to download the data file from here: https://www.dropbox.com/s/6apelk086izbenc/col_regfile09.dta?dl=0. If you would like to explore an extended and updated dataset you will find one on the CEPII website following the link below: http://www.cepii.fr/cepii/en/bdd_modele/bdd_modele.asp. To import the data into Stata it will be necessary for it to be saved in a location Stata can access. For this reason, download the suitable file to the folder you specified in your working directory above.
. use col_regfile09, clear
Following Head (2003) we estimate the gravity equation in logarithmic form. To do that, we take the natural logarithm of the following variables: trade flow between countries (flow), GDP of the origin (gdp_o) and destination (gdp_d) countries, and distance (dist). To automate the process of applying the same transformation to multiple variables we use a loop. A loop makes Stata execute the commands inside the braces once for each element listed before the braces. In this case, for each variable listed, a new variable is generated that holds the natural logarithm of each observation.
. foreach x in flow gdp_o gdp_d dist {
  2.     gen log`x'=log(`x')
  3. }
(495,098 missing values generated)
(171,252 missing values generated)
(119,822 missing values generated)
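The "missing values generated" messages appear because log() returns a missing value whenever its argument is missing, zero, or negative. A minimal sketch for checking which case applies, assuming the four raw variables are numeric as in col_regfile09:

```stata
* For each raw variable, count the observations whose log is undefined
foreach x in flow gdp_o gdp_d dist {
    quietly count if missing(`x')
    display "`x': " r(N) " missing values"
    quietly count if `x' <= 0          // missing values are not counted here
    display "`x': " r(N) " zero or negative values"
}
```

Observations with a missing value in any of the log variables are dropped automatically when the regression is estimated, which is why the regression sample is smaller than the full dataset.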
Previously, we used simple regression models to illustrate Okun’s law. In this exercise, we extend our methods to multiple regression by regressing the log of flow on more than one explanatory variable (the logs of gdp_o, gdp_d, and dist).
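Written out, the model to be estimated is the log-linear equation below. The subscripts (o for origin, d for destination, t for year) and the symbols for the coefficients and the error term are notation assumed here for illustration:

```latex
\ln(\text{flow}_{odt}) = \beta_0 + \beta_1 \ln(\text{gdp\_o}_{ot}) + \beta_2 \ln(\text{gdp\_d}_{dt}) + \beta_3 \ln(\text{dist}_{od}) + \varepsilon_{odt}
```

Because every variable enters in logs, each slope coefficient can be read as an elasticity: \beta_3, for example, is the approximate percentage change in trade flow associated with a one percent increase in distance, and is expected to be negative.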
. reg log*

      Source │       SS           df       MS      Number of obs   =   624,145
─────────────┼──────────────────────────────────   F(3, 624141)    >  99999.00
       Model │  3855252.63         3  1285084.21   Prob > F        =    0.0000
    Residual │   3713093.1   624,141  5.94912544   R-squared       =    0.5094
─────────────┼──────────────────────────────────   Adj R-squared   =    0.5094
       Total │  7568345.73   624,144  12.1259609   Root MSE        =    2.4391

─────────────┬────────────────────────────────────────────────────────────────
     logflow │      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
─────────────┼────────────────────────────────────────────────────────────────
    loggdp_o │   .8713428   .0013879   627.80   0.000     .8686225    .8740631
    loggdp_d │   .7062829   .0013229   533.88   0.000       .70369    .7088758
     logdist │  -1.198455   .0037274  -321.52   0.000     -1.20576   -1.191149
       _cons │  -4.172614    .035314  -118.16   0.000    -4.241828   -4.103399
─────────────┴────────────────────────────────────────────────────────────────
The asterisk in log* tells Stata to include all variables whose names begin with “log”, in the order in which they appear in the dataset. Because logflow was created first, it is treated as the dependent variable; loggdp_o, loggdp_d, and logdist are the three independent variables.
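Since the wildcard depends on the order of the variables in the dataset, it can be safer to name them explicitly. The sketch below, which assumes the log variables created by the loop above, fits the same model and then converts the distance coefficient into the percentage change in trade associated with doubling distance:

```stata
* Same model as "reg log*", with the dependent variable named first
regress logflow loggdp_o loggdp_d logdist

* Percentage change in trade flow when distance doubles, other things equal
display 100 * (2^(_b[logdist]) - 1)
```

With a distance coefficient of about -1.198, as reported above, doubling the distance between two countries is associated with roughly 56% less trade.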
If you use Stata with SolisWorkspace, you need to install outreg2 (and any other package) every time you open Stata. Otherwise you only need to install each package once.
. ssc install outreg2
checking outreg2 consistency and verifying not already installed...
all files already exist and are up to date.
Export the regression output to the folder you set as your working directory above. The option “word” specifies the file type in which the output is saved (here an .rtf file that Word can open), while “replace” overwrites any existing file with the same name.
. outreg2 using gravity, word replace
gravity.rtf
dir : seeout
Second, we extend the model above by controlling for the presence of a shared border, a common official language, colonial history, regional trade agreements, GATT/WTO membership, and a common currency.
. reg log* contig comlang_off col_hist rta gatt_o gatt_d comcur

      Source │       SS           df       MS      Number of obs   =   624,145
─────────────┼──────────────────────────────────   F(10, 624134)   =  68672.82
       Model │  3964868.57        10  396486.857   Prob > F        =    0.0000
    Residual │  3603477.16   624,134  5.77356331   R-squared       =    0.5239
─────────────┼──────────────────────────────────   Adj R-squared   =    0.5239
       Total │  7568345.73   624,144  12.1259609   Root MSE        =    2.4028

─────────────┬────────────────────────────────────────────────────────────────
     logflow │      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
─────────────┼────────────────────────────────────────────────────────────────
    loggdp_o │   .8739599   .0014697   594.67   0.000     .8710794    .8768404
    loggdp_d │   .7202029   .0014101   510.75   0.000     .7174392    .7229667
     logdist │  -1.017497   .0043098  -236.09   0.000    -1.025945    -1.00905
      contig │   .6663002    .019088    34.91   0.000     .6288885     .703712
 comlang_off │   .4030361   .0085301    47.25   0.000     .3863174    .4197549
    col_hist │   1.784972   .0198994    89.70   0.000      1.74597    1.823974
         rta │   .6136453   .0155947    39.35   0.000     .5830801    .6442105
      gatt_o │  -.0629879   .0071939    -8.76   0.000    -.0770877   -.0488882
      gatt_d │  -.2496916   .0071023   -35.16   0.000    -.2636119   -.2357714
      comcur │   .8100091   .0261462    30.98   0.000     .7587634    .8612548
       _cons │  -5.847367   .0400869  -145.87   0.000    -5.925936   -5.768798
─────────────┴────────────────────────────────────────────────────────────────

. outreg2 using gravity, word
gravity.rtf
dir : seeout
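The controls added in this model are 0/1 indicators, so with a log dependent variable their coefficients are not elasticities. A common way to read them is to compute 100 × (exp(b) − 1), the approximate percentage difference in trade when the indicator equals one. A minimal sketch, to be run immediately after the extended regress command:

```stata
* Percentage effect on trade flow of each 0/1 control variable
foreach v in contig comlang_off col_hist rta gatt_o gatt_d comcur {
    display "`v': " %6.1f (100 * (exp(_b[`v']) - 1)) " percent"
}
```

For contig, for example, the coefficient of about 0.666 implies that countries sharing a border trade roughly 95% more than otherwise comparable pairs.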
Open the gravity.rtf file in Word to see what outreg2 does.
We now regress the log of flow on the logs of origin GDP, destination GDP, and distance for each year separately. This means we need to run as many regressions as there are years in the sample.
statsby runs the command specified after the colon separately for each sub-sample defined in by() and collects the output listed after statsby in a new dataset. The new dataset replaces the data in memory, which the clear option permits. Here we collect _b, the estimated coefficients, which are stored as _b_loggdp_o, _b_loggdp_d, _b_logdist, and _b_cons.
Each row in the new dataset shows the results for a single year.
. statsby _b, clear by(year): reg log*
(running regress on estimation sample)

      command:  regress log*
           by:  year

Statsby groups
────┼─── 1 ───┼─── 2 ───┼─── 3 ───┼─── 4 ───┼─── 5
x.................................................    50
.........
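Because clear replaces the data in memory, the original observations are gone once statsby finishes. One possible alternative, to be run in place of the command above, writes the yearly results to a separate file instead. The file name yearly_coefs is hypothetical, and _se is added here to collect the standard errors as well:

```stata
* Variant that leaves the original data in memory and saves results to a file
* (yearly_coefs is a placeholder name; _se also collects standard errors)
statsby _b _se, by(year) saving(yearly_coefs, replace): ///
    regress logflow loggdp_o loggdp_d logdist

* Load the yearly results when you are ready to work with them
use yearly_coefs, clear
list year _b_logdist _se_logdist in 1/5
```

In the progress display, a dot marks a group for which the command ran and an x marks a group for which it returned an error, so the first year in the sample produces no estimates.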
We can visualise changes in the effect of distance on flow over time using a time series line graph. Stata recognizes data as time-series data if you specify the time dimension using tsset.
. tsset year, yearly
        time variable:  year, 1948 to 2006
                delta:  1 year
Now, we can plot a graph of the effect of distance between origin and destination countries on trade flow over time.
. tsline _b_logdist
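The same graph can be labelled and saved to the working directory. This is a sketch only: the titles and the file name distance_effect.png are illustrative choices, not part of the original exercise.

```stata
* Labelled version of the distance-coefficient plot
tsline _b_logdist, ///
    ytitle("Coefficient on log distance") ///
    xtitle("Year") ///
    title("Effect of distance on trade flow over time")

* Save the graph in the working directory (placeholder file name)
graph export distance_effect.png, replace
```

A coefficient that becomes more negative over time would indicate that distance has become a stronger deterrent to trade, while a move towards zero would indicate the opposite.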
Suggested reading: