Author: Allin Cottrell
Version: 1.2
Date: 2016-01-26
Description: Compute predictions from ordered probit
Computes predictions based on the results of an ordered probit
model. Two functions are offered; each takes two arguments: the
column vector of estimated "cut points" and the series of z-hat
values (the fitted values of the linear index). After estimation,
the cut points are the last m-1 rows of the $coeff vector, where m
is the number of distinct response values, and the z-hat values are
available via the $yhat accessor. See also the example script.
ordered_Pmat: returns an n x m matrix, where n is the number of
observations in the current sample and m is the number of distinct
response values (that is, the number of "cut points" plus one). Row
i holds the estimated probabilities of the m responses at
observation i; if z-hat is missing at that observation the row is
filled with NaN.
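The probabilities described above can be written out as follows. This is a sketch in our own notation, not the package's: z-hat at observation i is written \hat z_i, the cut points are \mu_1 < ... < \mu_{m-1}, \Phi is the standard normal CDF, and the responses are assumed to be coded 0, 1, ..., m-1.

```latex
\begin{aligned}
P(y_i = 0)   &= \Phi(\mu_1 - \hat z_i) \\
P(y_i = j)   &= \Phi(\mu_{j+1} - \hat z_i) - \Phi(\mu_j - \hat z_i),
               \qquad j = 1, \dots, m-2 \\
P(y_i = m-1) &= 1 - \Phi(\mu_{m-1} - \hat z_i)
\end{aligned}
```

The terms telescope, so each row of the matrix sums to one.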
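ordered_pred: returns a series containing the predicted response at
each observation, that is, the response for which the estimated
probability is highest (the lower response in case of a tie). The
responses are taken to be coded 0, 1, ..., m-1, and the prediction
is NA where z-hat is missing.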
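To make the two calculations concrete, here is a minimal standalone sketch for a single observation. The cut points and the z-hat value are made-up numbers chosen purely for illustration, and the variable names (zh, p0, p1, p2, p) are hypothetical; the script does not call the package functions but mirrors the logic of their code.

```hansl
# hypothetical inputs, for illustration only
matrix cut = {-0.5; 0.8}   # two cut points, hence m = 3 responses
scalar zh = 0.2            # z-hat for one observation

# probability of each response (coded 0, 1, 2)
scalar p0 = cdf(N, cut[1] - zh)
scalar p1 = cdf(N, cut[2] - zh) - cdf(N, cut[1] - zh)
scalar p2 = 1 - cdf(N, cut[2] - zh)
matrix p = {p0, p1, p2}
print p

# the probabilities should sum to one
scalar s = sumr(p)
printf "sum of probabilities = %g\n", s

# predicted response: column with the largest probability,
# shifted by one because the coding starts at zero
scalar jmax = imaxr(p)
printf "predicted response = %d\n", jmax - 1
```

With these particular numbers the middle response has the largest probability, so the prediction is 1.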
```
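# compute the estimated probability for each response at
# each observation
# arguments: matrix cut (cut points), series zhat (z-hat values)
scalar n = $nobs
scalar ncut = rows(cut)
scalar m = ncut + 1
matrix Pmat = zeros(n, m)
scalar zhi
loop i=1..n --quiet
    # series index of the i-th observation in the current sample
    zhi = zhat[$t1+i-1]
    if ok(zhi)
        # lowest response: probability mass below the first cut point
        Pmat[i,1] = cdf(N, cut[1]-zhi)
        # interior responses: mass between adjacent cut points
        loop j=2..ncut --quiet
            Pmat[i,j] = cdf(N, cut[j]-zhi) - cdf(N, cut[j-1]-zhi)
        endloop
        # highest response: mass above the last cut point
        Pmat[i,m] = 1 - cdf(N, cut[ncut]-zhi)
    else
        # z-hat is missing: fill the row with NaN
        Pmat[i,] = 0/0
    endif
endloop
return Pmat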
```

```
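# produce a series containing the response values
# with the greatest estimated probability
# arguments: matrix cut (cut points), series zhat (z-hat values)
series pred = NA
scalar n = $nobs
scalar ncut = rows(cut)
scalar prob, probmax, jmax
scalar zhi, t
loop i=1..n --quiet
    # series index of the i-th observation in the current sample
    t = $t1 + i - 1
    zhi = zhat[t]
    if ok(zhi)
        # start with the lowest response as the candidate
        probmax = cdf(N, cut[1]-zhi)
        jmax = 1
        # interior responses
        loop j=2..ncut --quiet
            prob = cdf(N, cut[j]-zhi) - cdf(N, cut[j-1]-zhi)
            if (prob > probmax)
                probmax = prob
                jmax = j
            endif
        endloop
        # highest response
        prob = 1 - cdf(N, cut[ncut]-zhi)
        if (prob > probmax)
            probmax = prob
            jmax = ncut + 1
        endif
        # responses are assumed to be coded 0, 1, ..., ncut
        pred[t] = jmax - 1
    endif
endloop
return pred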
```

/*
Replicate the example in Wooldridge, Econometric
Analysis of Cross Section and Panel Data (MIT Press,
2002), section 15.10
*/
include oprobit_predict.gfn
open pension.gdt
# demographic characteristics of participant
list DEMOG = age educ female black married
# dummies coding for income level
list INCOME = finc25 finc35 finc50 finc75 finc100 finc101
# response variable
series y = pctstck / 50
# estimate ordered probit
probit y choice DEMOG INCOME wealth89 prftshr
# save the z-hat values
series zhat = $yhat
# how many response values are there?
scalar nvals = rows(values(y))
scalar k = rows($coeff)
# save the last (nvals-1) coefficients in matrix cut
matrix cut = $coeff[k-nvals+2:]
# get the predicted responses
series pred = oprobit_pred(cut, zhat)
print y pred -o
# percentage correctly predicted, overall
scalar pcorr = 100*sum(y == pred)/$T
# show predicted vs actual cross-tab
xtab pred y --column