Our measures of dividend policy are derived from aggregations of Compustat data. The observations in the underlying 1962–2000 sample are selected as in Fama and French (2001, pp. 40–41): “The Compustat sample for calendar year t … includes those firms with fiscal year-ends in t that have the following data (Compustat data items in parentheses): total assets (6), stock price (199) and shares outstanding (25) at the end of the fiscal year, income before extraordinary items (18), interest expense (15), [cash] dividends per share by ex date (26), preferred dividends (19), and (a) preferred stock liquidating value (10), (b) preferred stock redemption value (56), or (c) preferred stock carrying value (130). Firms must also have (a) stockholder’s equity (216), (b) liabilities (181), or (c) common equity (60) and preferred stock par value (130). Total assets must be available in years t and t-1. The other items must be available in t. … We exclude firms with book equity below $250,000 or assets below $500,000. To ensure that firms are publicly traded, the Compustat sample includes only firms with CRSP share codes of 10 or 11, and we use only the fiscal years a firm is in the CRSP database at its fiscal year-end. … We exclude utilities (SIC codes 4900–4949) and financial firms (SIC codes 6000–6999).”
Within this sample we count a firm-year observation as a dividend payer if it has positive dividends per share by the ex date, else we count it as a nonpayer. To aggregate this firm-level data into useful time series, two aggregate identities are helpful.
Payers_{t} = New Payers_{t} + Old Payers_{t} + List Payers_{t}, (6)
Old Payers_{t} = Payers_{t-1} - New Nonpayers_{t} - Delist Payers_{t}. (7)
The first identity describes the number of firms in the payers category and the second describes its evolution. Payers is the total number of payers at time t, New Payers is the number of initiators among last year’s nonpayers, Old Payers is the number of payers that also paid last year, List Payers is the number of firms that are payers this year and were not in the sample last year, New Nonpayers is the number of omitters among last year’s payers, and Delist Payers is the number of last year’s payers that are not in the sample this year. Note that analogous identities hold if one switches Payers and Nonpayers everywhere. Also note that lists and delists are with respect to our sample, which involves several screens. Thus new lists include both IPOs that survive the screens in their Compustat debut and established Compustat firms when they first survive the screens. They also include a large number of established NASDAQ firms, appearing in Compustat for the first time in the 1970s. Similarly, delists include both delists from Compustat and firms that simply fall below the screens.
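As a small illustration of how identities (6) and (7) partition firm-years, the following Python sketch classifies firms from two consecutive sample years and tallies each category. The function name `payer_flows` and the dictionary inputs (firm identifier mapped to dividends per share) are hypothetical conveniences, not the original data construction, and the sample screens are assumed to have been applied already.

```python
def payer_flows(prev, curr):
    """Tally payer/nonpayer categories between sample years t-1 and t.

    prev, curr: dict mapping firm id -> dividends per share for firms that
    pass the sample screens in year t-1 and year t (hypothetical layout).
    """
    in_prev, in_curr = set(prev), set(curr)
    payers_prev = {f for f, dps in prev.items() if dps > 0}
    nonpayers_prev = in_prev - payers_prev
    payers_curr = {f for f, dps in curr.items() if dps > 0}
    nonpayers_curr = in_curr - payers_curr
    return {
        "payers": len(payers_curr),
        "payers_prev": len(payers_prev),
        "nonpayers_prev": len(nonpayers_prev),
        "new_payers": len(payers_curr & nonpayers_prev),     # initiators
        "old_payers": len(payers_curr & payers_prev),        # continuing payers
        "list_payers": len(payers_curr - in_prev),           # new to the sample, paying
        "list_nonpayers": len(nonpayers_curr - in_prev),     # new to the sample, not paying
        "new_nonpayers": len(nonpayers_curr & payers_prev),  # omitters
        "delist_payers": len(payers_prev - in_curr),         # last year's payers, now gone
        "delist_nonpayers": len(nonpayers_prev - in_curr),   # last year's nonpayers, now gone
    }


# Toy data: firm E delists, firms F and G enter the sample.
prev = {"A": 1.0, "B": 0.0, "C": 0.5, "D": 0.0, "E": 2.0}
curr = {"A": 1.2, "B": 0.3, "C": 0.0, "D": 0.0, "F": 0.4, "G": 0.0}
c = payer_flows(prev, curr)

# Identity (6): payers split into new, old, and newly listed payers.
assert c["payers"] == c["new_payers"] + c["old_payers"] + c["list_payers"]
# Identity (7): old payers are last year's payers net of omitters and delists.
assert c["old_payers"] == c["payers_prev"] - c["new_nonpayers"] - c["delist_payers"]
```

Because every firm in year t is either an initiator, a continuing payer, a new list, or a nonpayer, the two assertions hold for any input, not just this toy example.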
We use these aggregate totals to define three basic measures of the dynamics of dividend policy, or the propensity to pay (PTP) dividends, among certain subsets of firms.
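PTP New_{t} = New Payers_{t} / (Nonpayers_{t-1} - Delist Nonpayers_{t}), (8)
PTP Old_{t} = Old Payers_{t} / (Payers_{t-1} - Delist Payers_{t}), (9)
PTP List_{t} = List Payers_{t} / (List Payers_{t} + List Nonpayers_{t}). (10)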
In words, the propensity to initiate, PTP New, is the fraction of surviving nonpayers that become new payers. The propensity to continue paying, PTP Old, is the fraction of surviving payers that continue paying. It can also be viewed as one minus the propensity to omit dividends. The propensity to list as a payer, PTP List, is self-explanatory.
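The verbal definitions above can be sketched in Python as simple ratios of the aggregate counts. This is an illustration under stated assumptions: "surviving" nonpayers (payers) are taken to be last year's nonpayers (payers) less those that delist, and the propensity to list is taken to be the share of new lists that pay. The function name `propensities` and its argument names are hypothetical.

```python
def propensities(new_payers, old_payers, list_payers, list_nonpayers,
                 payers_prev, nonpayers_prev, delist_payers, delist_nonpayers):
    """Return (PTP New, PTP Old, PTP List) from aggregate counts for year t."""

    def ratio(num, den):
        # Guard against an empty denominator (e.g., a year with no new lists).
        return num / den if den else float("nan")

    surviving_nonpayers = nonpayers_prev - delist_nonpayers  # assumption: survivors = last year's less delists
    surviving_payers = payers_prev - delist_payers
    ptp_new = ratio(new_payers, surviving_nonpayers)
    ptp_old = ratio(old_payers, surviving_payers)
    ptp_list = ratio(list_payers, list_payers + list_nonpayers)
    return ptp_new, ptp_old, ptp_list


# Made-up counts, for illustration only (not taken from Table 1).
ptp_new, ptp_old, ptp_list = propensities(
    new_payers=30, old_payers=950, list_payers=10, list_nonpayers=90,
    payers_prev=1020, nonpayers_prev=1050, delist_payers=20, delist_nonpayers=50,
)
print(ptp_new, ptp_old, ptp_list)
```

Under these definitions one minus PTP Old is the propensity to omit, since every surviving payer either continues or omits.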
Note that these variables capture the decision whether to pay dividends, not how much to pay. We take this approach for several reasons. First, these are the natural dependent variables in a theory in which investors categorize shares based on whether they pay dividends. (Wings make a “bird,” regardless of their length.) Second, the payout ratio may be determined more by profitability than by explicit policy, whereas the decision to initiate or omit dividends is always a policy decision. Third, Fama and French (2001) document a decline in the number of payers, and no comparable pattern in the payout ratio. Nonetheless, the payout ratio is useful in discriminating among certain alternative interpretations, and we examine it later.
Table 1 lists the aggregate totals and the dividend policy variables. The sample displays similar characteristics to the sample in Fama and French (2001). For our purposes, the most notable feature of the data is the time variation in the dividend policy variables. The propensity to initiate starts out high in the early years of the sample, then drops dramatically in the late 1960s, rebounds in the mid-1970s, drops again in the late 1970s, and remains low through the end of the sample. The propensity to continue paying displays less variation, as expected. The propensity to list as a payer displays the most variation. As Fama and French point out, it has declined steadily in the past few decades.
B. Demand for dividends measures
We relate these dividend policy choices to several stock market measures of the uninformed demand for dividend-paying shares. Conceptually, an ideal measure would be the difference between the market prices of firms that have the same investment policy and different dividend policies. In the frictionless and efficient markets of Miller and Modigliani (1961), of course, this price difference is zero. But uninformed demand combined with limits to arbitrage, as discussed above, can lead to a time-varying price difference.
Our first measure, which we simply call the dividend premium because it is the broadest measure, is motivated by this intuition. It is the difference in the logs of the average market-to-book ratios of payers and nonpayers – that is, the log of the ratio of average market-to-books.^{11} We define market-to-book following Fama and French (2001). Market equity is end of calendar year stock price times shares outstanding (Compustat item 24 times item 25).^{12} Book equity is stockholders’ equity (item 216) [or first available of common equity (60) plus preferred stock par value (130) or book assets (6) minus liabilities (181)] minus preferred stock liquidating value (10) [or first available of redemption value (56) or par value (130)] plus balance sheet deferred taxes and investment tax credit (35) if available and minus post-retirement assets (330) if available. The market-to-book ratio is book assets minus book equity plus market equity all divided by book assets.
We then average the market-to-book ratios across payers and nonpayers in each year. The equal- and value-weighted dividend premium series are the difference of the logs of these averages. These variables are listed by year in Table 2 and the value-weighted series are plotted in Figure 1. The figure shows that the average payer and nonpayer market-to-books diverge significantly at short frequencies. It reveals several interesting patterns. Dividend payers start out at a premium, by this measure, in the first years of the sample. The valuation of nonpayers then spikes up in 1967 and 1968 and falls sharply, in relative terms, through 1972. The dividend premium takes another dip in 1974, and for over two decades now payers have traded at a discount by this measure. The discount widened in 1999 but closed somewhat in 2000.
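To make the construction concrete, here is a minimal Python sketch of the dividend premium for a single year. The market-to-book formula and the log difference of averages follow the description above. The helper names are hypothetical, and the choice of weights for the value-weighted version is left to the caller because this passage does not state it; the example uses book assets purely as an assumption.

```python
import math


def market_to_book(assets, book_equity, market_equity):
    """(Book assets - book equity + market equity) / book assets."""
    return (assets - book_equity + market_equity) / assets


def weighted_mean(values, weights=None):
    """Equal-weighted mean if weights is None, otherwise a weighted mean."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def dividend_premium(payer_mb, nonpayer_mb, payer_w=None, nonpayer_w=None):
    """Log of average payer M/B minus log of average nonpayer M/B."""
    return (math.log(weighted_mean(payer_mb, payer_w))
            - math.log(weighted_mean(nonpayer_mb, nonpayer_w)))


# Toy firms as (assets, book equity, market equity); numbers are made up.
payers = [(100.0, 40.0, 60.0), (200.0, 100.0, 300.0)]
nonpayers = [(50.0, 25.0, 75.0), (100.0, 50.0, 50.0)]
payer_mb = [market_to_book(*f) for f in payers]
nonpayer_mb = [market_to_book(*f) for f in nonpayers]

ew = dividend_premium(payer_mb, nonpayer_mb)
# Weighting by book assets is an assumption made for this example only.
vw = dividend_premium(payer_mb, nonpayer_mb,
                      [f[0] for f in payers], [f[0] for f in nonpayers])
print(ew, vw)
```

A positive value means payers trade at a premium to nonpayers by this measure, and a negative value means they trade at a discount.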
We do not and will not claim to fully understand what moves the dividend premium variable. Some anecdotal remarks from Malkiel (1999) may help to put these patterns in historical perspective. Malkiel describes a crash in growth stocks in the first years of our sample, which may account for the relatively low price of nonpayers by this measure in these years. Malkiel characterizes 1967 and 1968 as a speculative wave and the next few years as a bear market; the bear market may have increased the attractiveness of dividend payers and accounted for the rising dividend premium in this period. This peak also coincides with the implementation of the Nixon dividend controls. The sharp fall in 1974 may be associated with the removal of those controls or have a connection to ConEd’s poorly received dividend omission earlier that year. Another interesting note is that the 1986 Tax Reform Act, which significantly reduced the tax disadvantage to cash dividends, did not reduce the dividend discount. This impression is consistent with the more rigorous analysis of Hubbard and Michaely (1997). Finally, the widening of the discount in 1999 coincides with the last full year of the Internet boom, and its narrowing in 2000 reflects the ensuing crash.
The primary disadvantage of the dividend premium variable is that it may also reflect the relative investment opportunities of payers and nonpayers, as opposed to uninformed demand for dividend-paying shares. We consider this interpretation at length in our discussion of non-catering explanations for the results that follow.
Our second measure is the difference in the prices of Citizens Utilities cash dividend and stock dividend share classes. As noted earlier, between 1956 and 1989 the Citizens Utilities Company had two classes of shares outstanding on which the payouts were to be of equal value, as set down in an amendment to the corporate charter. In practice, the relative payouts were close to a fixed multiple. Long (1978) describes the case in great detail. We measure the CU dividend premium as the difference in the log price of the cash payout share and the log price of the stock payout share. The 1962 through 1972 data were kindly provided by John Long and the 1973 through 1989 data are from Hubbard and Michaely (1997).^{13} Table 3 reports the CU premium year by year.
By its nature, the CU premium does not reflect anything about investment opportunities. This reduces the number of alternative explanations for why it fluctuates, but it also means that arbitraging the CU premium entails no fundamental risk, only noise-trader risk, so the amount of sentiment that it reflects may be muted. Other disadvantages include the fact that CU is just one firm; the stock payout share is more liquid than the cash payout share; there was a one-way, one-for-one convertibility of the stock payout class to the cash payout class, truncating the ability of the price ratio to reveal pro-cash-dividend sentiment; certain sentiment-based mechanisms outlined above involve categorization of firms rather than shares, so a case in which one firm offers two dividend policies may lead to weaker results; and the experiment ended in 1990, when CU switched to stock payouts on both classes.
Our third measure of uninformed demand for dividends is the average announcement effect of recent initiations.^{14} Intuitively, if investors are clamoring for dividends, they may make themselves heard through their reaction to initiations. Asquith and Mullins (1983) find that initiations are greeted with a positive return on average, but they do not study whether this effect varies over time. We define a dividend initiation as the first cash dividend declaration date in CRSP in the twelve months prior to the year in which the firm is identified as a Compustat New Payer. Since Compustat payers are defined using fiscal years while CRSP allows us to use calendar years, the resulting asynchronicity means that the number of initiation announcements identified in CRSP for year t does not equal the number of Compustat New Payers in year t. Another difference arises because the required CRSP data are not always available.
Given an initiation in calendar year t, we calculate the cumulative abnormal return over the three-day window from day –1 to day +1 relative to the CRSP declaration date as the cumulative difference between the firm return and the CRSP value-weighted market index. To control for the differences in volatility across firms and time (see Campbell, Lettau, Malkiel and Xu (2000)), we scale each firm’s three-day excess return by the square root of three times the standard deviation of its daily excess returns. The standard deviation of excess returns is measured from 120 calendar days through five trading days before the declaration date. Averaging these across initiations in year t gives a standardized, cumulative abnormal announcement return A. To determine whether the average return in a given year is statistically significant, we compute a test statistic by multiplying A by the square root of the number of initiations in year t. This statistic is asymptotically standard normal and has more power if the true abnormal return is constant across securities (Brown and Warner (1980) and Campbell, Lo, and MacKinlay (1997)), which is a natural hypothesis in our context. Table 3 reports the average standardized initiation announcement effects year by year.
Our last measure of the demand for dividend-paying shares is the difference between the future returns on value-weighted indexes of payers and nonpayers. Under the rather stark version of catering outlined in the previous section, managers rationally initiate dividends to exploit a market mispricing. If this is literally the case, then a high rate of initiations should forecast low returns on payers relative to nonpayers as the overpricing of payers reverses. The opposite should hold for omissions.
Table 4 reports the correlations among the demand for dividends measures. We correlate the first three measures at year t with the excess real return on payers over nonpayers r_{D} - r_{ND} in year t+1 and the cumulative excess return R_{D} - R_{ND} from years t+1 through t+3. If these variables capture a common factor in uninformed demand for dividends, we expect the dividend premium, the CU premium, and announcement effects to be positively correlated with each other, and negatively correlated with the future excess returns of payers. Table 4 shows that these correlations are as expected, with two exceptions: the CU premium and the initiation effect are negatively correlated, and the initiation effect and one-year-ahead excess returns are positively correlated. The dividend premium is correlated with all of the other variables in the expected direction, however. This suggests that the dividend premium may be the single best reflection of the common factor. In any case, given that each measure has its own advantages and disadvantages, it is reassuring that they correlate roughly as expected.
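The standardization steps above can be sketched in Python as follows. The function names are hypothetical, the inputs are assumed to be daily excess returns (firm return minus market return) that the caller has already aligned to the event and estimation windows, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is used.

```python
import math
import statistics


def standardized_car(event_excess, estimation_excess):
    """Three-day cumulative excess return scaled by sqrt(3) times daily volatility.

    event_excess: excess returns on days -1, 0, +1 around the declaration date.
    estimation_excess: daily excess returns from the pre-event estimation window.
    """
    car = sum(event_excess)
    daily_sd = statistics.stdev(estimation_excess)  # sample s.d. (assumption)
    return car / (math.sqrt(len(event_excess)) * daily_sd)


def announcement_effect(scars):
    """Return (A, test statistic) for one year's standardized announcement returns."""
    a = sum(scars) / len(scars)
    return a, a * math.sqrt(len(scars))  # statistic is asymptotically standard normal


# Toy example: a 3% three-day excess return against 2% three-day volatility.
scar = standardized_car([0.01, 0.02, 0.0], [0.01, -0.01, 0.01, -0.01])
a, z = announcement_effect([scar, 0.5, 1.0, 1.0])
print(scar, a, z)
```

Scaling before averaging means a volatile firm's announcement return counts for less than the same return at a quiet firm, which is the purpose of the volatility control described above.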
C. Dividend policy and demand for dividends
Here we document the basic relationships between the dividend policy and the measures of the demand for dividend-paying shares. Figure 2 plots the propensity to initiate dividends versus the dividend premium. The propensity to initiate is shifted one year so that the figure captures the relationship between this year’s dividend premium and next year’s propensity to initiate. The figure reveals a strong positive relationship, consistent with catering. In the first half of the sample, the dividend premium and subsequent initiations move almost in lockstep. The premium then submerges in the late 1970s, leading the propensity to initiate down once again.
The dividend premium has been negative for over two decades now, and the propensity to initiate has also remained low. The figure gives a visual impression that the relationship has broken down in this period. This is misleading. In the logic of the theory, as long as dividends are discounted, there is little reason to initiate them. Beyond some range, small changes in the size of the discount are unlikely to induce changes in the rate of initiation.
To examine the relationship in the figure more formally, Table 5 regresses the dividend policy measures on the lagged demand for dividends measures:
PTP_{t} = a + b P^{D-ND}_{t-1} + c A_{t-1} + d P^{CU}_{t-1} + u_{t}, (11)
where PTP is the propensity to pay dividends in various subsamples, P^{D-ND} is the market dividend premium (value-weighted or equal-weighted), A is the average initiation announcement effect, and P^{CU} is the Citizens Utilities dividend premium. All independent variables are standardized to have unit variance and all standard errors are robust to heteroskedasticity and serial correlation to four lags using the procedure of Newey and West (1987).
The first column of Panel A performs the regression that is pictured in Figure 2. A one-standard-deviation increase in the value-weighted market dividend premium is associated with a 3.90 percentage point increase in the propensity to initiate in the following year, or roughly three-quarters of the standard deviation of that variable.^{15} It explains a striking 60 percent of the variation in the propensity to initiate dividends. The second column shows that the effect of the equal-weighted dividend premium is essentially the same.^{16} The remaining columns show the effect of other variables, and the results of a multivariate horse race. The lagged initiation announcement effect and the CU premium have significant positive coefficients, as predicted. But they disappear in a multivariate regression that includes the dividend premium. This is consistent with an earlier indication that the dividend premium may best capture the common factor in these variables.
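For readers who want to see the mechanics, the following NumPy sketch runs a univariate version of this regression: a propensity-to-pay series on an intercept and one lagged, unit-variance demand measure, with Newey-West standard errors to four lags. It is an illustration under stated assumptions, not the estimation code behind Table 5. The function name is hypothetical, the Bartlett-kernel weights are the standard Newey and West (1987) choice, and no small-sample adjustment is applied because the text does not mention one.

```python
import numpy as np


def lagged_regression_nw(ptp, demand, lags=4):
    """Regress ptp[t] on an intercept and demand[t-1] scaled to unit variance.

    Returns (coefficients, Newey-West standard errors); index 0 is the intercept.
    """
    y = np.asarray(ptp, dtype=float)[1:]       # PTP in years 2..T
    x = np.asarray(demand, dtype=float)[:-1]   # demand measure in years 1..T-1
    x = x / x.std(ddof=1)                      # unit variance (sample s.d., an assumption)
    X = np.column_stack([np.ones_like(x), x])

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta                           # OLS residuals

    Xu = X * u[:, None]                        # moment contributions x_t * u_t
    S = Xu.T @ Xu                              # lag-0 (heteroskedasticity) term
    for lag in range(1, lags + 1):
        w = 1.0 - lag / (lags + 1.0)           # Bartlett kernel weight
        G = Xu[lag:].T @ Xu[:-lag]             # autocovariance of moments at this lag
        S += w * (G + G.T)

    XtX_inv = np.linalg.inv(X.T @ X)
    cov = XtX_inv @ S @ XtX_inv                # sandwich covariance estimator
    return beta, np.sqrt(np.diag(cov))


# Simulated series, for illustration only.
rng = np.random.default_rng(0)
premium = rng.normal(size=40)
ptp = 5.0 + 2.0 * np.concatenate([[0.0], premium[:-1]]) + rng.normal(scale=0.5, size=40)
coef, se = lagged_regression_nw(ptp, premium)
print(coef, se)
```

Because the regressor is scaled to unit variance, the slope reads directly as the change in the propensity to pay associated with a one-standard-deviation move in the lagged demand measure.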
Panel B reports analogous results for the propensity to continue. The dividend premium effect is again as predicted by catering. One way to phrase the result is that when nonpayers are at a premium, payers are more likely to omit. The coefficient is smaller than the coefficient in Panel A, reflecting the lower variation in the propensity to continue than the propensity to initiate, as suggested by certain versions of the model. Indeed, to the extent that some omissions are forced by profitability circumstances, which we control for in the next section, it may be surprising that the dividend premium has as strong an effect as it does. The other columns of Panel B show that the other measures of demand do not have explanatory power for the propensity to continue, however.
Panel C shows that the propensity to list as a payer is also positively related to the dividend premium. The relatively large coefficient here again reflects the greater variation in the dependent variable. Using a dividend premium variable defined just over recent new lists has at least as much explanatory power. The CU premium also has a strong univariate effect here. But as before, the dividend premium wins the horse race.
Table 6 shows the relationship between dividend policy and our fourth measure of demand, the future excess returns of payers over nonpayers. In Panel A, the dependent variable is the difference between the returns on value-weighted indexes of payers and nonpayers. Panels B and C look separately at the returns on payers and nonpayers, respectively, to examine whether any results for relative returns are indeed coming from the difference in returns, which the theory emphasizes, and not payer or nonpayer returns alone. Each panel examines one-, two-, and three-year-ahead returns, and cumulative three-year returns. The table reports ordinary least-squares coefficients as well as coefficients adjusted for the small-sample bias analyzed by Stambaugh (1999). The p-values reported in the table represent a two-tailed test of the hypothesis of no predictability using a bootstrap technique described in the Appendix.
Panel A indicates that dividend policy does have predictive power for relative returns. A one-standard-deviation increase in the propensity to initiate forecasts a decrease in the relative return of payers of around eight percentage points in the next year, and thirty percentage points over the next three years. This strikes us as a substantial magnitude – a magnitude worth catering to. The predictive power of the standardized propensity to continue is similar. The propensity to list has no predictive power, however, unless a time trend is included, in which case it displays a similar level of predictability to the other dividend policy variables. The bottom panels confirm that the relative return predictability cannot be attributed to just payer or nonpayer predictability. As the theory suggests, it is the relative return that matters.
Tables 5 and 6 present the key empirical results. Firms are more likely to initiate dividends when the stock market premium for dividend-paying shares is high, by each of four measures. By some measures, including the dividend premium variable and future relative stock returns, firms are more likely to omit when demand is low. These results are consistent with the theory’s predictions.
IV. Explanations and discussion
