Experimental Evidence of Tax Framing – Effects on the Work/Leisure Decision



David Gamage

Andrew Hayashi (Correspondence Author) Davis Polk & Wardwell LLP

Brent K. Nakamura, University of California, Berkeley, Jurisprudence & Social Policy Program

Keywords: Experiment, Framing, Labor Supply, Taxation

Abstract

The choice between a set of alternatives often depends on how those alternatives are described, as well as their actual economic costs and benefits. We report results from an experiment designed to evaluate the impact of different descriptions of the after-tax wage on both (1) subjects’ willingness to perform a work task rather than an alternative leisure option, and (2) the amount of work performed by those subjects selecting the work task. Utilizing an experimental design that facilitates both within- and between-subject comparisons, we find that subjects’ willingness to work varies with the framing of the after-tax wage and that, in particular, subjects are much less willing to work when the returns to work are framed as a low wage plus a bonus than when the returns are described as a high wage minus a tax. Along the intensive margin we find suggestive evidence that subjects stop working just before their wage becomes subject to a significantly higher marginal tax rate, but we do not observe similar clustering when gross wages become subject to an equivalent wage decrease that is not described as a tax increase.

1 Introduction

A standard assumption in public economics research is that agents fully optimize with respect to the actual economic tradeoffs that they face, perfectly calculating the after-tax prices of the consumption, investment, and work alternatives available to them and maximizing their utility given those prices (Chetty 2009). A corollary of this assumption is that the formal description of these prices does not affect agents’ choices. At the same time, it is well-known that preferences may be context-dependent and affected by the way that choices are described or “framed.” For example, a number of recent studies suggest that individuals fail to fully incorporate the effects of taxes into their evaluation of market alternatives when those effects are made less salient by the framing of the decision.1 If behavioral responses to prices are dependent on framing, including whether the effects of taxes are more or less salient, then this has profound consequences for tax policy because it implies that the distortionary effects of taxes on individual behavior and the attendant negative welfare consequences are not just a function of the taxes themselves, but also of how they are implemented and described. To this end, research on tax framing and salience may guide the way toward increasing the efficiency of taxation.

However, while there is a growing number of empirical studies of the effects of tax framing on individual behavior, these are mainly restricted to evaluations of the effects on consumer purchasing decisions (Chetty et al. (2009), Finkelstein (2009), Gallagher and Muehlegger (2008), Ott and Andrus (2000)) and, due to data limitations, the effect of framing on labor supply decisions has been largely unexplored.2 In this paper, we report evidence from an experiment designed to test whether variation in the framing of the after-tax wage affects (1) subjects’ willingness to select a “work” task rather than an alternative “leisure” option, and (2) the amount of work subjects are willing to perform, conditional on having chosen to work in the first instance (the “extensive” and “intensive” margins of labor supply, respectively).

We identify the effect of framing on labor supply decisions by presenting subjects with a series of economically identical work/leisure decisions. While the wage schedule faced by subjects was identical across the decisions, the presentation of that schedule differed. Utilizing a within-subject design that generated a rich data-set of individual choices and permits both within and between subject analysis, we find that subjects’ willingness to work varies with the framing of the after-tax wage and that, in particular, subjects are much less willing to work when the returns to work are framed as a low wage plus a bonus than when the returns are described as a high wage minus a tax. Along the intensive margin we find suggestive evidence that subjects stop working just before their wage becomes subject to a significantly higher marginal tax rate, but we do not observe similar clustering when gross wages become subject to an equivalent decrease that is not described as attributable to taxes.

Finally, we find a statistically significant correlation between subjects making different labor/leisure choices across the economically identical tax-framing conditions and subjects making framing errors in answering canonical survey questions on framing effects drawn from Tversky and Kahneman (1986), which is consistent with heterogeneity in subjects’ susceptibility to framing biases. In addition to building toward a deeper understanding of how tax framing affects labor/leisure decisions, our results thus also contribute to the broader literature on framing and other cognitive biases.

The paper proceeds as follows. Section 2 describes our experimental procedures and design. Section 3 presents our results: Section 3.1 details and analyzes our extensive-margin results, and Section 3.2 details and analyzes our intensive-margin results. Section 4 concludes.

2 Material and Methods

2.1 Experimental Design

The experiment was designed to test the effect of changing the description of the return to work on (1) the binary decision to choose work rather than a leisure activity, and (2) the amount of work subjects performed, conditional on having chosen to work. We attempt to identify this effect by utilizing a within and between-subject design, in which each subject is presented with a series of four, economically equivalent, work/leisure decisions that vary only in the description of the return to work. The order of the conditions was randomized by subject.

At the beginning of each round, subjects were presented with the choice to “work” for the next nine minutes or spend that time watching any of a pre-selected set of popular YouTube videos (the “leisure” task).3 Subjects who chose the leisure task received $10 in that round. Subjects who chose to work were presented with a sequence of screens, each of which listed fifteen words in randomized order, and instructed to alphabetize those words by placing a number beside each, corresponding to its alphabetical ordering. Subjects choosing the work task could complete a maximum of ten screens, potentially alphabetizing as many as 150 words. The wage schedule faced by subjects in each condition was identical. Subjects received a wage of $0.60 for each of the first twenty words they correctly alphabetized and $0.09 for each correctly alphabetized word thereafter. Subjects who chose the work task were guaranteed a minimum payment of $5.00 for that round, even if they did not alphabetize any words correctly.
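The payment rule described above can be made concrete with a short sketch. It is illustrative only: the function and constant names are hypothetical, amounts are held in integer cents to avoid rounding, and it assumes the $5.00 guarantee acts as a floor on the round's payment rather than as an add-on.

```python
# Illustrative sketch of the after-tax payment schedule described in the text.
# All names are hypothetical; amounts are integer cents.

HIGH_RATE_CENTS = 60     # per correct word, first twenty words
LOW_RATE_CENTS = 9       # per correct word, every word after the twentieth
THRESHOLD_WORDS = 20
MINIMUM_CENTS = 500      # guaranteed payment for choosing work (assumed to be a floor)
LEISURE_CENTS = 1000     # payment for choosing the leisure task
MAX_WORDS = 150          # ten screens of fifteen words


def work_payment_cents(correct_words: int) -> int:
    """Payment for one work round, given the number of correctly alphabetized words."""
    if not 0 <= correct_words <= MAX_WORDS:
        raise ValueError("correct_words must be between 0 and 150")
    high = HIGH_RATE_CENTS * min(correct_words, THRESHOLD_WORDS)
    low = LOW_RATE_CENTS * max(correct_words - THRESHOLD_WORDS, 0)
    return max(MINIMUM_CENTS, high + low)


def break_even_words() -> int:
    """Smallest number of correct words at which work pays at least the leisure payment."""
    return next(n for n in range(MAX_WORDS + 1)
                if work_payment_cents(n) >= LEISURE_CENTS)
```

Under these assumptions a subject needs 17 correct words for work to match the $10 leisure payment, and a subject who alphabetizes all 150 words correctly earns $23.70.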

The four conditions consisted of flat tax, progressive tax, bonus, and no tax conditions. Each condition presented subjects with the same economic decision, and differed only in how the wage schedule was described. In the flat tax condition, the wage schedule was described as being composed of a declining gross wage and a flat tax of 40%. The schedule in the progressive tax condition was described as a fixed gross wage and an increasing marginal tax rate. The bonus condition described the schedule as a declining gross wage and a declining bonus payment for each word correctly alphabetized. The wage schedule in the no tax condition was described solely in terms of the payment that the subject would actually receive for each word alphabetized. Table 1 summarizes the wage schedule descriptions presented to subjects in the four conditions and the screen shots of the language are provided in Appendix C. Although the after-tax wage schedule was identical in all conditions, the gross wage schedule is uniformly higher (or at least as high) in the progressive tax condition as compared to the flat tax condition, in the flat tax as compared to the no tax condition, and in the no tax as compared to the bonus condition.

table1
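To see why the tax frames are economically equivalent, the gross wages implied by each description can be back-computed from the net wage. The sketch below is an assumption-laden illustration: the net wages ($0.60 and $0.09) and the tax rates (a flat 40%, and marginal rates of 40% and 91% in the progressive frame) come from the text, but the gross wages are inferred here rather than copied from Table 1, and the bonus frame is omitted because its split between wage and bonus is not stated in the text.

```python
from fractions import Fraction

# Net (take-home) wage per correct word in dollars, as stated in the text.
NET = {"first_20": Fraction(60, 100), "after_20": Fraction(9, 100)}

# Tax rates stated in the text; gross wages below are inferred, not quoted.
PROGRESSIVE_RATES = {"first_20": Fraction(40, 100), "after_20": Fraction(91, 100)}
FLAT_RATE = Fraction(40, 100)


def implied_gross(net: Fraction, tax_rate: Fraction) -> Fraction:
    """Gross wage that yields `net` after a proportional tax at `tax_rate`."""
    return net / (1 - tax_rate)


progressive_gross = {k: implied_gross(NET[k], PROGRESSIVE_RATES[k]) for k in NET}
flat_gross = {k: implied_gross(NET[k], FLAT_RATE) for k in NET}
no_tax_gross = dict(NET)  # the no tax frame quotes the net wage directly

for segment in NET:
    print(segment,
          float(progressive_gross[segment]),
          float(flat_gross[segment]),
          float(no_tax_gross[segment]))
```

On these inferred figures the progressive frame quotes a constant gross wage of $1.00, the flat tax frame quotes $1.00 falling to $0.15, and the no tax frame quotes $0.60 falling to $0.09, which matches the ordering of gross wage schedules described in the text.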

The conditions were chosen to correspond to familiar forms of taxation; the U.S. federal tax system contains elements of each. Social Security and Medicare payroll taxes are withheld from employees at a flat rate of 7.65%4 up to a threshold wage level above which only the Medicare tax is imposed; the U.S. federal income tax statutory rates are progressive, increasing with income; and the bonus condition resembles the federal Earned Income Tax Credit (EITC), which is designed to increase earnings among low-income families by providing a wage subsidy in the form of refundable tax credits.5

The experimental design allows us to test two hypotheses, generated by different assumptions about the preferences of our subjects.

Hypothesis 1: if subjects’ work/leisure preferences do not depend on the formal description of the compensation for work, the proportion of subjects working in each condition will be the same.

Hypothesis 2: if subjects’ work/leisure preferences depend on the gross wage schedule, the largest proportion of subjects will work in the progressive tax condition, followed by the flat tax, no tax, and bonus conditions.

In addition to being able to test these two hypotheses, we are able to describe the heterogeneity in decision patterns of individual subjects.


2.2 Procedures

Subjects were recruited from a subject pool maintained by the UC Berkeley Experimental and Social Science Laboratory (the “X-Lab”). The X-Lab handles subject payout and recruitment for UC Berkeley researchers in addition to providing individual computer workstations that isolate subjects from one another. The experiment was conducted in six sessions over two days in May 2009 and October 2009 using the Z-Tree program for ready-made economics experiments (Fischbacher 2007). We collected data from 74 subjects on the first day and 76 subjects on the second day.6 Nearly all of the subjects in the X-Lab’s recruitment pool are undergraduate students at UC Berkeley, which is reflected in our sample; more than 90% of our subjects were undergraduates. 59.73% of the subjects were female.

The procedures used in all sessions were identical. In each session, 25 subjects (on average) were brought into the laboratory and randomly assigned seats. Subjects were separated by partitions and no communication was permitted between them at any point during the experiment. Each subject was seated in front of a laptop computer running the Z-Tree program and was provided with headphones so he or she could watch YouTube videos and perform the work tasks without disturbing the other subjects or being observed by them.

Before beginning the experiment, the experimenter read the instructions (a copy of which was provided to each subject) and subjects were given the opportunity to ask clarifying questions. Subjects were given one minute to look over the YouTube videos available for the leisure task. Once the experiment began, each subject proceeded through the four conditions, the order of which was randomized by subject, before answering a series of demographic questions and survey questions from the canonical framing literature designed to collect additional information about subjects’ susceptibility to framing effects.  At the end of the experiment, one condition was randomly chosen to be paid out and subjects were paid by check.

Appendix A contains all condition-specific and survey language shown to the subjects, as well as further explanation of the rationale behind the non-demographic survey questions. Appendix B contains the list of YouTube videos provided to the subjects. Appendix C contains the actual screens shown to subjects during the course of the experiment.  Each subject participated in the experiment only once.

3 Results

3.1 Extensive Margin Results

We consider first how the wage schedule frames affect the binary decision to work rather than engage in a leisure activity, on the aggregate and individual levels. In Section 3.1.1, we examine the frequency with which subjects chose to work in each condition. In Section 3.1.2, we examine the subject-level data in greater depth and use the results of our survey questions to explore how subjects’ abilities to properly calculate probabilities and expected values, recognize economic and probabilistic equivalence, and maintain consistent risk positions correlate with their behavior in the experiment.

3.1.1 Work-Leisure Decision Making and Ordering Effects

Since all four conditions presented subjects with economically identical work/leisure decisions, and the order of conditions was randomized by subject, we are able to test the null hypothesis of equality in the proportion of subjects working in all four conditions and attribute variation to the wage-schedule description. A simple comparison of these proportions suggests that the description of the wage schedule affected work rates. 87% of our subjects chose the work task in the flat tax condition, 83% chose to work in the no tax condition, 80% chose to work in the progressive tax condition, and only 55% of our sample chose to work in the bonus condition.

Table 2 shows the results from two-tailed equality of proportions tests between the four conditions, illustrating that subjects were less likely to work in the bonus condition than in any of the other conditions. The proportion of subjects working in the bonus condition differs from that in each of the other conditions at the < 1% significance level, and the difference is also large in magnitude. The work rate in the bonus condition was 25 percentage points lower than in the progressive tax condition and 32 percentage points lower than in the flat tax condition. We cannot reject the null hypothesis that subjects were equally likely to work in the other conditions at conventional levels of significance, using two-tailed tests.

table2
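A pairwise comparison of this kind can be sketched with a pooled two-proportion z-test. This is only an approximation of the analysis behind Table 2: it treats the conditions as independent samples even though the same 150 subjects faced all four, it uses the worker counts reported in Table 8, and the function name is hypothetical. The exact test statistic the authors used is not specified beyond "two-tailed equality of proportions tests".

```python
import math
from itertools import combinations

N_SUBJECTS = 150  # 74 subjects on the first day plus 76 on the second
WORKERS = {"flat tax": 130, "no tax": 124, "progressive tax": 120, "bonus": 82}


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled z statistic and two-tailed p-value for H0: p1 == p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal tail area
    return z, p_value


results = {
    (a, b): two_proportion_z_test(WORKERS[a], N_SUBJECTS, WORKERS[b], N_SUBJECTS)
    for a, b in combinations(WORKERS, 2)
}

for (a, b), (z, p) in results.items():
    print(f"{a} vs {b}: z = {z:.2f}, p = {p:.4f}")
```

Under these simplifying assumptions, every comparison involving the bonus condition is significant at the 1% level, while none of the comparisons among the flat tax, no tax, and progressive tax conditions is significant at the 10% level.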

We reject Hypothesis 1 ‒ that the proportion of subjects working in all four conditions is the same and subjects’ work/leisure choices are independent of the description of the wage schedule. The description of the wage schedule in the bonus condition made work much less attractive to subjects than the descriptions in the other wage conditions.

Hypothesis 2 was that successively fewer subjects would work in the progressive tax, flat tax, no tax, and bonus conditions. In fact, although the work rates decreased from the flat tax, to the no tax, to the bonus conditions, the work rate in the progressive tax condition was lower than all but the bonus condition. Furthermore, we cannot reject the hypothesis of equal work rates between the first three conditions.

Our strategy for identifying the effect of framing on subjects’ willingness to work relies on the assumption that there are no differences between the conditions other than the description of the wage schedules. One variable that might affect the willingness of a subject to work in a given condition is the order in which that subject encountered the condition. For example, subjects might be more willing to work earlier in a session if they find the work tiring, or they may be less willing to work earlier in a session if they are more curious about the YouTube videos than the work task. For this reason, we randomized the order of the conditions by subject, so that each condition was as likely to appear first as second, third, or fourth. Table 3 illustrates the result of this randomization process.

TABLE 3 – NUMBER OF SUBJECTS ENCOUNTERING CONDITION, BY ROUND

Condition          First   Second   Third   Fourth
No Tax             34      37       45      34
Flat Tax           42      39       29      40
Progressive Tax    39      38       39      34
Bonus              35      36       37      42
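One way to gauge how well the randomization balanced conditions across rounds is a Pearson chi-square test of independence on the counts in Table 3. The sketch below is not part of the original analysis; it computes the statistic in plain Python and compares it with the standard 5% critical value for nine degrees of freedom (16.919).

```python
# Counts from Table 3: subjects encountering each condition in rounds 1 through 4.
COUNTS = {
    "No Tax": [34, 37, 45, 34],
    "Flat Tax": [42, 39, 29, 40],
    "Progressive Tax": [39, 38, 39, 34],
    "Bonus": [35, 36, 37, 42],
}

# 5% critical value of the chi-square distribution with (4-1)*(4-1) = 9 df.
CRITICAL_VALUE_DF9_5PCT = 16.919


def chi_square_statistic(table: dict[str, list[int]]) -> float:
    """Pearson chi-square statistic for independence of rows and columns."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


statistic = chi_square_statistic(COUNTS)
print(f"chi-square = {statistic:.2f}, reject at 5%: {statistic > CRITICAL_VALUE_DF9_5PCT}")
```

The statistic falls well below the critical value, so this check gives no reason to doubt that conditions were spread evenly across rounds.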

We find evidence that the round in which a condition was encountered affected subjects’ willingness to work. Subjects were less willing to work in later rounds than in earlier rounds. 137 subjects worked in the first round, 115 worked in the second round, 98 in the third round, and 106 in the fourth round. Linear probability and probit models of the effect of the round in which a condition was encountered on the subjects’ decisions to work find that the round has a significant effect.7 Although the randomization process was largely successful in ensuring that individual conditions were uniformly distributed across rounds, the effect of order on work rates suggests that variation at the individual level must be interpreted carefully to avoid confounding the effect of a condition with the effect of condition order. Table 4 reports the estimates of the effect of the various tax conditions on the probability of choosing the work task, controlling for the fixed effect of the round in which the condition was encountered using dummy variables.

table4

Controlling for level effects of the session round on work probability, the effect of the bonus condition persists, and the progressive tax condition becomes statistically significant: framing the wage schedule as composed of a gross wage and either a progressive tax or bonus payments decreases the subjects’ willingness to work, relative to the flat tax condition. This effect is significant at the <10% level for the progressive tax condition and <1% level for the bonus condition.

3.1.2 Patterns of Individual Work-Leisure Decisions

We observe that, in the aggregate, subjects were much less willing to work in the bonus condition than in any other condition, and that they were also less willing to work in the progressive tax condition than in the flat tax condition. Because each subject was subjected to all four conditions, we can also analyze the data at the subject-level.

Nearly all of the subjects worked more than once; 35 worked twice, 54 worked three times, and 55 subjects worked in all four conditions. The number of times subjects chose to work is positively correlated with the average number of words they correctly alphabetized. This appears to reflect a learning, rather than selection, effect. Table 5 depicts the average number of words alphabetized by each group of subjects (those working one, two, three, or four times) by the condition in which they worked. Each group uniformly performed better each successive time they tried the work task and there does not appear to be much difference across groups in how they performed, although the subjects working twice performed slightly worse than the other groups. The fact that subjects performed better later in the experiment also suggests that fatigue was not a significant factor in the subjects’ decision-making, assuming that fatigue affects both the decision to work as well as the subjects’ performance at the work task.

 

table5

Table 6 shows the frequency of each pattern of work/leisure choices in our sample. The table is divided into three groups, illustrating the pattern of work/leisure choices made by subjects who worked in one, two, or three conditions, with the total number of subjects choosing to work in that number of conditions listed in the final row. Each cell, located at (row, column), contains the number of subjects who chose to work in both the condition named by the row and the condition named by the column. For example, of the 35 subjects who chose to work twice, five worked in the No Tax and Progressive Tax conditions.

table6

Table 6 illustrates that among subjects who chose to work in only two or three conditions, the choice of which conditions to work in was not random. If subjects had chosen the conditions they worked in at random, the distribution of patterns would be uniform: each condition would have been chosen by approximately 50% of the subjects choosing to work twice, and by approximately 75% of the subjects choosing to work three times. Instead, only 25% of the subjects working twice chose to work in the bonus condition and only 33% of the subjects working three times worked in the bonus condition.
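The 50% and 75% benchmarks follow from counting patterns: if every set of k conditions is equally likely, a given condition appears in k/4 of them. A minimal sketch that enumerates the patterns directly (the function name is hypothetical):

```python
from fractions import Fraction
from itertools import combinations

CONDITIONS = ("no tax", "flat tax", "progressive tax", "bonus")


def share_including(condition: str, k: int) -> Fraction:
    """Share of equally likely k-condition work patterns that include `condition`."""
    patterns = list(combinations(CONDITIONS, k))
    hits = sum(condition in pattern for pattern in patterns)
    return Fraction(hits, len(patterns))


for k in (2, 3):
    print(k, share_including("bonus", k))
```

With two conditions worked there are six possible patterns, three of which include the bonus condition; with three conditions worked there are four patterns, three of which include it.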

An OLS regression of the number of conditions worked by a subject on a collection of demographic variables and the subject’s performance in the first condition worked reveals that none of these variables has a statistically significant correlation. Since neither demographic variables nor the subjects’ performance in the first condition in which they worked are significantly correlated with the total number of conditions in which the subjects worked, we consider now the possibility that the subjects’ decisions were affected by the variation in the description of the conditions, and that the failure to choose consistently across all four conditions – i.e. to work in either none or all of the conditions – is a result of the failure to appreciate that the economic incentives were identical in each condition.

Under the assumption that the failure to make the same work/leisure choice in each condition is a mistake, we define a new variable (“decision errors”) that is equal to 0 if the subject worked either zero or four times, is equal to 1 if the subject worked once or three times, and is equal to 2 if the subject worked twice. This definition characterizes a subject who worked once or three times as having made one error (by working once, or failing to work once) rather than three errors.
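This definition amounts to counting the smaller of the number of conditions worked and the number not worked, that is, the distance to the nearest fully consistent pattern. A minimal sketch, with a hypothetical function name:

```python
def decision_errors(times_worked: int, n_conditions: int = 4) -> int:
    """Number of choices that deviate from the nearest consistent pattern
    (working in none of the conditions or in all of them)."""
    if not 0 <= times_worked <= n_conditions:
        raise ValueError("times_worked must be between 0 and n_conditions")
    return min(times_worked, n_conditions - times_worked)


print([decision_errors(k) for k in range(5)])
```

For zero through four conditions worked, the variable takes the values 0, 1, 2, 1, and 0.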

We emphasize that subjects who did not make the same work/leisure choice in each condition were not necessarily making mistakes. A subject might rationally choose to work in only some of the conditions. However, the non-randomness of the conditions in which the subjects chose the work/leisure tasks implies that a significant number of the subjects selecting non-uniform work/leisure choices were in fact making decision mistakes. It thus seems worth exploring the connection between these possible decision mistakes (measured by our “decision error” variable) and other measures of the subjects’ proclivity to make framing errors.

At the end of the experiment, we presented subjects with a series of questions drawn from Tversky and Kahneman (1986) to test whether subjects are able to recognize and reveal consistent preferences over economically equivalent choices that differ in how the alternatives are framed.8 Subjects who expressed different preferences in survey questions 4 and 5, or in survey questions 6 and 7, each pair offering subjects economically identical choices, were deemed to make a “survey mistake.” We are interested in the correlation between the decision-error variable and survey mistakes.

TABLE 7 – ESTIMATES OF EFFECTS ON WORK/LEISURE DECISION MISTAKES

Coefficient                   (1) OLS     (2) O.Probit   (3) O.Logit
Survey Mistakes               0.191**     0.289**        0.456**
                              (0.090)     (0.134)        (0.223)
Words Correct on First Try    -0.001      -0.001         -0.001
                              (0.003)     (0.004)        (0.007)
Science Major                 0.208       0.32           0.508
                              (0.162)     (0.241)        (0.404)
Business Major                0.040       0.043          -0.009
                              (0.232)     (0.349)        (0.599)
Social Science Major          0.13        0.205          0.34
                              (0.181)     (0.274)        (0.462)
Male                          0.0155      0.0238         0.0547
                              (0.131)     (0.193)        (0.323)
Freshman                      0.658**     0.997***       1.586**
                              (0.252)     (0.386)        (0.659)
Sophomore                     0.315       0.487          0.735
                              (0.202)     (0.312)        (0.510)
Junior                        0.208       0.326          0.476
                              (0.207)     (0.321)        (0.523)
Senior                        0.198       0.314          0.459
                              (0.195)     (0.304)        (0.495)
Observations                  149         149            149
R2/pseudo R2                  0.07        0.03           0.03
F/Chi2                        1.42        13.17          12.04

Robust standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1

Table 7 reports the results from OLS, ordered probit, and ordered logit regressions of the number of work/leisure decision errors made by subjects on demographic variables, the number of words correctly alphabetized by subjects in the first condition they worked, and the number of mistakes made by subjects in the survey questions. We find that across all three models, the number of framing mistakes made by subjects in the survey is positively correlated with the number of work/leisure decision errors.

Freshman students were also more likely to make work/leisure decision errors than students in other classes or non-student subjects. The other demographic variables, as well as the number of words subjects correctly alphabetized on their first try, are not statistically significantly correlated with the number of decision errors. We do not draw strong conclusions from these results; however, they are consistent with heterogeneity among subjects in their ability to integrate all of the relevant variables and perform the computations necessary to make consistent choices across alternatives with economically equivalent consequences but different presentations.

The experiment revealed interesting effects of the wage description on labor/leisure decisions. First, subjects are much less willing to work if their compensation is framed as a low gross wage and a set of bonus payments than if it is described as a higher gross wage subject to a tax. There is also some evidence that subjects are less willing to work if the wage is framed as subject to a progressive tax rather than a flat tax. These results persist when the distribution of individual work/leisure decision patterns is examined; regardless of whether subjects worked once, twice, or three times, it was comparatively rare to work in the bonus condition. Second, we observe a significant correlation between the number of “decision errors” and the number of inconsistent choices made by subjects in the survey portion of the experiment across formally identical, but differently framed, alternatives.

3.2 Intensive Margin Results

We now explore the effect of our conditions on how much subjects worked conditional on their having chosen to work in the first place – the “intensive margin” of labor supply. Our measure for how much subjects work is the number of words they attempted to alphabetize.9 Table 8 shows that subjects’ work output was, on average, similar across the no tax, flat tax, and bonus conditions; however, they alphabetized significantly fewer words, on average, in the progressive tax condition.

TABLE 8 – NUMBER OF WORDS ALPHABETIZED WITHIN EACH CONDITION

Condition        # of Subjects Working   Mean     Median   Minimum   Maximum   Standard Dev.
Flat Tax         130                     105.34   105      31        150       25.91
No Tax           124                     101.51   101.5    0         150       29.55
Prog. Tax        120                     94.63    95.5     1         150       33.09
Bonus            82                      102.82   98.5     15        150       28.86
All Conditions   N/A                     101.02   100      0         150       29.63
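The "All Conditions" mean in Table 8 can be cross-checked as the average of the condition means weighted by the number of subjects working in each condition. The sketch below uses only the figures in the table; small discrepancies are expected because the condition means are themselves rounded to two decimals.

```python
# (subjects working, mean words alphabetized) by condition, from Table 8.
TABLE_8 = {
    "Flat Tax": (130, 105.34),
    "No Tax": (124, 101.51),
    "Prog. Tax": (120, 94.63),
    "Bonus": (82, 102.82),
}

total_rounds_worked = sum(n for n, _ in TABLE_8.values())
pooled_mean = sum(n * mean for n, mean in TABLE_8.values()) / total_rounds_worked

print(total_rounds_worked, round(pooled_mean, 2))
```

The weighted average agrees with the reported 101.02 to within rounding, and the 456 rounds worked match the sum of the per-round work counts reported in Section 3.1.1 (137 + 115 + 98 + 106).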

One cause of this difference is readily apparent in Figure 1, below, which illustrates the distributions of work output across the four conditions. The red vertical line appearing in each histogram indicates the 20-word threshold. This is the point at which the piecewise linear wage schedule breaks. In the flat tax and no tax conditions, this break is the result of a fall in the gross wage, and in the bonus condition this break results from a fall in the gross wage and the bonus payment. In the progressive tax condition, this break results from an increase in the marginal rate of tax from 40% to 91%. Whereas there is no clustering around the 20-word mark in any other treatment, seven subjects, or 5.83% of those choosing to work in the progressive tax condition, stopped working after alphabetizing 20 words. None of these subjects stopped working after 20 words in any other treatment; they alphabetized on average 100.5, 93.4, and 94.8 words in the flat tax, no tax, and bonus treatments, respectively. The clustering around the wage schedule discontinuity that is unique to the progressive tax condition suggests that the framing effect of a sharp rise in the marginal tax on the return to labor may have a greater effect on work output than a substantively identical fall in the gross wage; however, we do not have enough data to explore the robustness of this interesting result, and so this hypothesis is left for further research.

figure1

4 Conclusions

This experiment adds to the growing empirical literature on the importance of considering the framing of prices when predicting behavioral responses. Whereas data limitations have restricted most research on tax framing to analyzing only consumer purchasing decisions, we focus on how framing affects labor/leisure choices. In particular, our experiment provides evidence that the framing of direct tax instruments (such as income taxes, payroll taxes, cash-flow consumption taxes, and wage subsidies) might affect the impact of these tax instruments on work/leisure decisions. As these instruments are the primary tax mechanisms employed for conducting redistribution and for basing tax assessments on a taxpayer’s ability-to-pay, the question of whether framing can mitigate the labor/leisure distortions caused by these tax instruments may have first-order policy implications.

Our strongest result comes from our bonus condition. Subjects who chose to work in only 1, 2, or 3 of our four tax framing conditions overwhelmingly chose not to work in the bonus condition. Specifically, they only chose to work, on average, approximately 29% of the time in the bonus condition – a striking figure given that they chose to work approximately 81% of the time in the flat tax condition.10
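The 29% and 81% figures can be approximately reconstructed from numbers reported elsewhere in the paper. The sketch below is a back-of-the-envelope check, not the authors' own calculation: it infers the number of subjects who worked exactly once from the total number of rounds worked, and it assumes that every subject who worked in all four conditions is counted among the workers in each condition.

```python
# Subjects working in each condition (Table 8).
WORKERS_BY_CONDITION = {"flat tax": 130, "no tax": 124, "progressive tax": 120, "bonus": 82}

# Subjects working in exactly two, three, or four conditions (Section 3.1.2).
WORKED_TWICE, WORKED_THREE, WORKED_FOUR = 35, 54, 55

# Inferred, not reported: rounds worked that are not accounted for by the
# groups above must come from subjects who worked exactly once.
total_rounds_worked = sum(WORKERS_BY_CONDITION.values())
worked_once = total_rounds_worked - (2 * WORKED_TWICE + 3 * WORKED_THREE + 4 * WORKED_FOUR)

mixed_subjects = worked_once + WORKED_TWICE + WORKED_THREE


def mixed_work_rate(condition: str) -> float:
    """Work rate in `condition` among subjects who worked in 1, 2, or 3 conditions."""
    return (WORKERS_BY_CONDITION[condition] - WORKED_FOUR) / mixed_subjects


print(worked_once, mixed_subjects)
print(round(100 * mixed_work_rate("bonus")), round(100 * mixed_work_rate("flat tax")))
```

On these assumptions, 27 of 93 such subjects worked in the bonus condition and 75 of 93 worked in the flat tax condition, which rounds to the 29% and 81% reported above.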

Further research will be needed to determine the extent to which this result applies outside of laboratory conditions. Certainly the stakes of labor supply decisions made outside of the lab are much higher and it is possible that learning may, over time, cause individuals to become less responsive to these framing effects. To the extent that similar effects persist outside the lab, they may have implications for government programs designed to increase labor-force participation such as the EITC. In this regard, our findings complement Chetty and Saez’s (2009) evidence that educating taxpayers about the EITC increases its labor-supply effects. Both studies suggest that wage subsidies that are framed as bonuses may have a smaller effect than equivalent subsidies incorporated into gross wages. To the extent that increasing labor-force participation is a primary goal of the EITC and similar programs, our finding may suggest that this goal could be better achieved through direct wage subsidies such as those proposed by Phelps (1997). The magnitude and degree of statistical significance of our result support the need for further research into this phenomenon.

More broadly, the result for our bonus condition builds on and supports the general finding from the consumer behavior literature that decomposing an aggregate price into a lower base-price plus a surcharge can lead to consumers underestimating the aggregate price (Kim and Kachersky (2006)). Future research might explore how this effect relates to performance bonuses paid by employers in addition to bonuses paid through the tax system.

Beyond the bonus condition, our progressive tax condition produced suggestive results along both the extensive and intensive margins. With respect to the extensive margin, subjects were on average 6.7 percentage points less likely (p=0.096) to work in the progressive tax condition than in the flat tax condition. With respect to the intensive margin, we find some evidence of clustering around the number of words at which marginal rates within the progressive tax condition rise to 91%, clustering which is entirely absent in the other conditions. Subjects’ failure to cluster in our non-progressive tax conditions corresponds with Saez’s (2002) finding that taxpayers generally fail to bunch at kink points in their tax-rate schedules. Saez (2002) interpreted his finding as supporting rigidities and uncertainties in labor supply choices. Our result suggests an additional explanation: that the complexity of income-tax rate schedules may reduce the salience of kink points.

Taken as a whole, our experimental results suggest that the impact of taxation on labor/leisure decisions depends partially on the framing of the tax instruments. Further research will be needed to explore the robustness of these results to non-laboratory environments and to variations on our framing conditions. Yet when combined with the larger literature on how the framing of taxes impacts consumer purchasing decisions, our results support the importance of tax framing for both policymakers seeking to design efficient tax systems and for scholars seeking to understand the behavioral effects of existing tax instruments.

Acknowledgements: The authors would like to thank the Experimental Social Science Laboratory at UC Berkeley and the UC Berkeley School of Law for generous financial support. We also thank the UC Berkeley School of Law, Junior Working Ideas Group, and the Friday Forum participants from the Jurisprudence & Social Policy Program for helpful comments, as well as the participants in the 2008 and 2009 Annual Junior Tax Scholars Workshops, the 2010 American Economic Association Annual Conference, and the Spring 2010 Northern California Tax Roundtable. We would especially like to thank Jim Alm, Alan Auerbach, Lily Batchelder, Brad Borden, Tom Brennan, Patricia Cain, Aaron Edlin, Heather Field, Mark Gergen, Sara LaLumia, Jack McNulty, Susan Morse, Darien Shanske, Suzanne Scotchmer, Steven Sheffrin, and Dennis Ventry for helpful comments, and Steve Gomas, Brenda Naputi, and Farbod Faraji for assistance during the project.

Works Cited

de Bartolome, Charles, 1995. Which Tax Rate Do People Use: Average or Marginal? Journal of Public Economics. 56, 76-96.

Blumkin, Tomer, Bradley J. Ruffle & Yosef Ganun, 2008. Are Income and Consumption Taxes Ever Really Equivalent? Evidence from a Real-Effort Experiment with Real Goods. CESifo Working Paper Series No. 2194. Available at SSRN: http://ssrn.com/abstract=1079784

Chetty, Raj, 2009. The Simple Economics of Tax Salience. NBER Working Paper No. 15246.

Chetty, Raj, Adam Looney and Kory Kroft, 2009. Salience and Taxation: Theory and Evidence. American Economic Review. 99.4, 1145-1177.

Chetty, Raj and Emmanuel Saez, 2009. Teaching the Tax Code: Earnings Responses to an Experiment with EITC Recipients. NBER Working Paper No. 14836.

Gamage, David and Darien Shanske, 2010. On Tax Salience. University of California, Berkeley, mimeo.

Eissa, Nada and Hilary Hoynes, 2006. Behavioral Responses to Taxes: Lessons from the EITC and Labor Supply. NBER Book, Tax Policy and the Economy, Volume 20.

Feldman, Naomi and Peter Katuscak, 2005. Should the Average Tax Rate Be Marginalized? Ben Gurion University, mimeo.

Finkelstein, Amy, 2009. E-ZTax: Tax Salience and Tax Rates. The Quarterly Journal of Economics. 124.3, 969-1010.

Fischbacher, Urs, 2007. Z-Tree: Zurich Toolbox for Ready-Made Economic Experiments. Experimental Economics. 10.2, 171-178.

Kim, Hyeong Min and Luke Kachersky, 2006. Dimensions of Price Salience: A Conceptual Framework for Perceptions of Multi-Dimensional Prices. The Journal of Product and Brand Management. 15.139, 139-140.

Liebman, Jeffrey B. and Richard J. Zeckhauser, 2004. Schmeduling. Harvard KSG Working Paper, mimeo.

Phelps, Edmund S., 1997. Rewarding Work: How to Restore Participation and Self-Support to Free Enterprise. Harvard University Press.

Saez, Emmanuel, 2002. Do Taxpayers Bunch at Kink Points? NBER Working Paper No. 7366.

Tversky, Amos and Daniel Kahneman, 1986. Rational Choice and the Framing of Decisions. Journal of Business 59, S251-S278.

Appendix A: Language Shown to Subjects

Language Shown to Subjects for Tax, Bonus, and No Tax Conditions

Please choose whether you would like to perform the “work” task of alphabetizing words or the “leisure” task of watching entertaining videos. Once you have made your selection, you will not be able to switch to the other task during this round. This round will last nine minutes, and you will have the opportunity to make a new selection between the work and leisure task at the beginning of each round.

If you choose the work task, you will be presented with a series of screens, each of which has 15 words that were chosen by a random word generator. You will be asked to place these words in alphabetical order by entering a number beside each word that corresponds to its alphabetical rank. For example, the words: “anger, light, bottle” would be ranked 1, 3, 2, respectively. If you choose the leisure task, you will be able to choose from a number of popular YouTube videos to watch.

You are guaranteed a minimum payment of $5 if you choose to work. So, if you earn less than $5 (after tax), you will receive $5 from working. If you earn more than $5, you will keep how much you earned.

If you select the leisure task, you will be paid a flat sum of $10.00 for this round. There is no tax on your earnings from the leisure task.

1. Flat Tax Condition:

a. If you select the work task, your earnings will depend on how many words you correctly alphabetize. You will receive $1.00 for each of the first twenty words you correctly alphabetize. After that, you will receive $0.15 for each additional word that you correctly alphabetize.

b. If you choose the work task, 40% of your earnings will be deducted at the end of the round as a tax, leaving you with 60% of your initial earnings as the actual amount you will receive. There are ten alphabetizing screens in this section, with 15 words on each screen. So, the most that you can earn by performing the work task in this round, before the 40% tax is deducted, is $39.50.

2. No Tax Condition:

a. If you select the work task, your earnings will depend on how many words you correctly alphabetize. You will receive $0.60 for each of the first twenty words you correctly alphabetize. After that, you will receive $0.09 for each additional word that you correctly alphabetize.

b. There is no tax for this round. There are ten alphabetizing screens in this section, with 15 words on each screen. So, the most that you can earn by performing the work task in this round is $23.70.

3. Progressive Tax Condition:

a. If you select the work task, your earnings will depend on how many words you correctly alphabetize. You will receive $1.00 for each word you correctly alphabetize.

b. If you choose the work task, a tax will be deducted from your earnings at the end of the round. For the first $20.00 you earn, the tax rate will be 40%, so you will only receive 60% of the first $20.00 you earn as payment. For anything you earn above $20.00, the tax rate will be 91%, so you will receive only 9% of anything you earn above $20.00. There are ten alphabetizing screens in this section, with 15 words on each screen. So, the most that you can earn by performing the work task in this round, before the tax is deducted, is $150.00.

4. Bonus Tax Condition:

a. If you select the work task, your earnings will depend on how many words you correctly alphabetize. You will receive $0.20 for each of the first twenty words you correctly alphabetize. After that, you will receive $0.06 for each additional word that you correctly alphabetize.

b. If you choose the work task, you will also receive a bonus on top of your earnings. You will receive an extra $0.40 for each of the first twenty words you correctly alphabetize and an extra $0.03 for each word you correctly alphabetize after that. There are ten alphabetizing screens in this section, with 15 words on each screen. So, the most that you can earn by performing the work task in this round, excluding the bonus, is $11.80.
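The four pay schedules above are constructed so that take-home pay is the same for any number of correctly alphabetized words. As a minimal sketch of that arithmetic (the function names are ours and are not part of the experimental software; the $5 guarantee is applied to after-tax earnings, as the instructions describe), the equivalence can be checked as follows:

```python
from fractions import Fraction as F

KINK = 20         # words paid at the higher per-word rate
MAX_WORDS = 150   # ten screens of 15 words each

def split(words):
    """Return (words up to the kink, words beyond the kink)."""
    return min(words, KINK), max(words - KINK, 0)

def flat_tax(words):
    first, rest = split(words)
    gross = first * F("1.00") + rest * F("0.15")
    return gross * F("0.60")                     # 40% tax on all earnings

def no_tax(words):
    first, rest = split(words)
    return first * F("0.60") + rest * F("0.09")

def progressive_tax(words):
    gross = words * F("1.00")
    low, high = min(gross, 20), max(gross - 20, 0)
    return low * F("0.60") + high * F("0.09")    # 40% rate, then 91% rate

def bonus(words):
    first, rest = split(words)
    wage = first * F("0.20") + rest * F("0.06")
    extra = first * F("0.40") + rest * F("0.03")
    return wage + extra

def payout(schedule, words):
    """Take-home pay after applying the $5 minimum for choosing to work."""
    return max(schedule(words), F(5))

# Every schedule yields the same take-home pay at every word count.
for w in range(MAX_WORDS + 1):
    assert flat_tax(w) == no_tax(w) == progressive_tax(w) == bonus(w)

print(float(payout(flat_tax, MAX_WORDS)))  # 23.7
```

Exact fractions are used instead of floating-point numbers so that the equality checks are not affected by rounding.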

Post-Experimental Subject Survey

Question 1: “What is your major?”

1 = Science; 2 = Business; 3 = Social Science; 4 = Other

Question 2: “What is your gender?”

1 = Female; 2 = Male

Question 3: “What year are you?”

1 = Freshman; 2 = Sophomore; 3 = Junior; 4 = Senior; 5 = Other

Survey Questions 4 and 5

This pair of questions, reproduced below, is taken from Tversky and Kahneman (1986: S263-S264). For our purposes, the important thing to note is that each option is economically equivalent across Questions 4 and 5 and that Option B first-order stochastically dominates Option A. Although the decision problems are formally identical, this dominance is less transparent in Question 5. 81.3% of our subjects chose Option B in Question 4, but 57.3% of our respondents chose Option A in Question 5 (58% of the Tversky and Kahneman subjects chose Option A). Thus, at least 38.6% of the subjects in our sample chose Option B in Question 4 and Option A in Question 5, exhibiting a reversal in preferences attributable solely to the change in the description of the options.
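The lower bound on preference reversals follows from the two reported shares alone: if a share p of subjects chose Option B in Question 4 and a share q chose Option A in Question 5, then at least p + q - 100% of subjects must have done both. A minimal sketch of that calculation (the function name is ours):

```python
def min_overlap(share_first, share_second):
    """Smallest possible share (in percent) of subjects belonging to both
    groups, given only each group's share of the same sample."""
    return max(0.0, share_first + share_second - 100.0)

# Option B in Question 4 (81.3%) and Option A in Question 5 (57.3%)
print(round(min_overlap(81.3, 57.3), 1))  # 38.6
```

The bound is only a minimum; the true share of subjects who reversed could be higher, but it cannot be determined from the marginal shares.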

Question 4: “Consider the following two lotteries, described by the percentage of marbles of different colors in each box and the amount of money you would win or lose depending on the color of a randomly drawn marble. Which lottery do you prefer?”

Option A

90% White   6% Red    1% Green   1% Blue    2% Yellow
$0          Win $45   Win $30    Lose $15   Lose $15

Option B

90% White   6% Red    1% Green   1% Blue    2% Yellow
$0          Win $45   Win $45    Lose $10   Lose $15

Responses

Response Number Response Text Respondents % of Total Respondents
1 Prefer Option A 5 3.3
2 Prefer Option B 122 81.3
3 Prefer Neither 23 15.3

Question 5: “Consider the following two lotteries, described by the percentage of marbles of different colors in each box and the amount of money you would win or lose depending on the color of a randomly drawn marble. Which lottery do you prefer?”

Option A

90% White   6% Red    1% Green   3% Yellow
$0          Win $45   Win $30    Lose $15

Option B

90% White   7% Red    1% Green   2% Yellow
$0          Win $45   Lose $10   Lose $15

Responses

Response Number Response Text Respondents % of Total Respondents
1 Prefer Option A 86 57.3
2 Prefer Option B 48 32.0
3 Prefer Neither 16 10.7
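Both claims made about these lotteries, that Questions 4 and 5 describe the same pair of lotteries and that Option B dominates Option A, can be checked mechanically. The sketch below (helper names are ours) encodes each lottery as (percent, payoff) pairs taken from the tables above, merges outcomes with equal payoffs, and tests first-order stochastic dominance by comparing cumulative distributions:

```python
from collections import defaultdict

def reduce_lottery(outcomes):
    """Merge outcomes with equal payoffs; returns {payoff: percent}."""
    dist = defaultdict(int)
    for pct, payoff in outcomes:
        dist[payoff] += pct
    return dict(dist)

def cdf(dist, x):
    """Percent chance of receiving a payoff of at most x."""
    return sum(p for payoff, p in dist.items() if payoff <= x)

def dominates(b, a):
    """True if lottery b first-order stochastically dominates lottery a."""
    points = sorted(set(a) | set(b))
    weakly = all(cdf(b, x) <= cdf(a, x) for x in points)
    strictly = any(cdf(b, x) < cdf(a, x) for x in points)
    return weakly and strictly

q4_a = reduce_lottery([(90, 0), (6, 45), (1, 30), (1, -15), (2, -15)])
q4_b = reduce_lottery([(90, 0), (6, 45), (1, 45), (1, -10), (2, -15)])
q5_a = reduce_lottery([(90, 0), (6, 45), (1, 30), (3, -15)])
q5_b = reduce_lottery([(90, 0), (7, 45), (1, -10), (2, -15)])

print(q4_a == q5_a and q4_b == q5_b)  # True
print(dominates(q4_b, q4_a))          # True
```

In Question 4 the dominance is visible color by color; in Question 5 the colors have been merged, so the same comparison requires looking at the cumulative distributions.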

Survey Questions 6, 7, and 8

These three questions are also taken verbatim from Tversky and Kahneman (1986: S268-S269). The distribution of our responses closely matches that found in Tversky and Kahneman (1986: S269). The majority of our respondents (52.0%) were risk averse in Question 6, preferring the certainty of a lower average life expectancy in Treatment B over the higher expected longevity offered in Treatment A. Compared to Tversky and Kahneman’s physician-subjects, the percentage of our subjects preferring the certainty of Treatment B is slightly lower (52.0% versus 65%). However, if we add one half of the “Prefer Neither” responses (8.7%) to the “Prefer Option B” responses, we end up closer to the Tversky and Kahneman numbers (56.4% versus 65%).

For our purposes, it is important to note that Questions 7 and 8 are identical in terms of the set of outcomes. Subjects who prefer Treatment A in Question 7 should prefer Treatment A in Question 8. For Question 7, we observe similar responses to Tversky and Kahneman: 68% of Tversky and Kahneman’s subjects, and 61.3% of our respondents, preferred Treatment A. For Question 8 we also find results consistent with those of Tversky and Kahneman. In total, 54.7% (58.7% if one half of the “Prefer Neither” responses are grouped in with the “Prefer Option B” responses) expressed a preference for Option B.11 Thus, comparing choices across these two questions, at least 16% of our subjects exhibited a preference reversal between Treatments A and B, attributable to the different description of the treatments.

Question 6: “In the treatment of tumors there is sometimes a choice between two types of therapies: (i) a radical treatment such as extensive surgery, which involves some risk of imminent death, (ii) a moderate treatment, such as limited surgery or radiation therapy. Each of the following problems describes the possible outcome of two alternative treatments, for three different cases. In considering each case, suppose the patient is a 40-year-old male. Assume that without treatment death is imminent (within a month) and that only one of the treatments can be applied. Please indicate the treatment you would prefer in each case.”

Treatment A: 20% Chance of imminent death and 80% chance of normal life, with an expected longevity of 30 years.

Treatment B: Certainty of a normal life, with an expected longevity of 18 years.

Responses

Response Number Response Text Respondents % of Total Respondents (1 Missing)
1 Prefer Option A 58 38.7
2 Prefer Option B 78 52.0
3 Prefer Neither 13 8.7

Question 7: “In the treatment of tumors there is sometimes a choice between two types of therapies: (i) a radical treatment such as extensive surgery, which involves some risk of imminent death, (ii) a moderate treatment, such as limited surgery or radiation therapy. Each of the following problems describes the possible outcome of two alternative treatments, for three different cases. In considering each case, suppose the patient is a 40-year-old male. Assume that without treatment death is imminent (within a month) and that only one of the treatments can be applied. Please indicate the treatment you would prefer in each case.”

Treatment A: 80% Chance of imminent death and 20% chance of normal life, with an expected longevity of 30 years.

Treatment B: 75% Chance of imminent death and 25% chance of normal life, with an expected longevity of 18 years.

11 68% of Tversky and Kahneman’s respondents expressed this preference.

Responses

Response Number Response Text Respondents % of Total Respondents
1 Prefer Option A 92 61.3
2 Prefer Option B 31 20.7
3 Prefer Neither 27 18.0

Question8: “Consider a new case where there is a 25% chance that the tumor is treatable and a 75% chance that it is not. If the tumor is not treatable, death is imminent. If the tumor is treatable, the outcomes of the treatment are as follows.”

Treatment A: 20% Chance of imminent death and 80% chance of normal life, with an expected longevity of 30 years.

Treatment B: Certainty of a normal life, with an expected longevity of 18 years.

Responses

Response Number Response Text Respondents % of Total Respondents
1 Prefer Option A 56 37.3
2 Prefer Option B 82 54.7
3 Prefer Neither 12 8.0
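Questions 7 and 8 yield the same final outcomes because Question 8 is a two-stage version of Question 7: multiplying the 25% chance that the tumor is treatable by each treatment's conditional chance of a normal life reproduces the probabilities stated directly in Question 7. A minimal sketch of that reduction (variable names are ours):

```python
from fractions import Fraction

P_TREATABLE = Fraction(25, 100)  # Question 8: chance the tumor is treatable

# Chance of a normal life, conditional on a treatable tumor (Question 8)
q8_survival = {"A": Fraction(80, 100), "B": Fraction(1)}

# Unconditional chance of a normal life stated in Question 7
q7_survival = {"A": Fraction(20, 100), "B": Fraction(25, 100)}

# Reduce the two-stage problem to a single-stage one
reduced = {t: P_TREATABLE * p for t, p in q8_survival.items()}

print(reduced == q7_survival)  # True
```

The conditional problem in Question 8 is also identical to Question 6, which is why a subject who prefers Treatment B there but Treatment A in Question 7 has reversed preferences over the same outcomes.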

Appendix B: Listing of YouTube Videos Provided for the Leisure Task

Comedy

1. Best of Trigger Happy TV (5:20): A man with a gargantuan phone wreaks havoc in libraries, restaurants, toilets, and various places people go to relax.

2. Whose Line is it Anyway? (2:36): A hilarious rap song about a coming avalanche featuring Stephen Colbert and the regular cast of the show.

3. Usain Bolt Early Celebration Spoof (1:08): The gold medal winner pops open a bottle of champagne and calls his girlfriend, all before he crosses the finish line.

4. Evolution of Homer Simpson (1:29): From sperm to mouse to human, we see the true progression of Homer Simpson.

5. Top 10 Family Guy Moments (8:05): The funniest compilation of Family Guy clips ever made.

6. The Kim Jong IL Show (4:52): Mad TV concocts a show in which the ruthless North Korean leader invites Renée Zellweger on the show and kills those who refuse to laugh.

7. Stephen Colbert Roasts George W. Bush Part 1 (8:07): The classic roast of our former President; it is unbelievable that he is only 2 seats away.

8. Stephen Colbert Roasts George W. Bush Part 2 (8:24): “He believes the same thing Wednesday that he believed on Monday, no matter what happened Tuesday.”

9. Terry Tate Office Linebacker (3:41): A linebacker keeps order in this office and tackles people down if they go over their break or forget to recycle.

10. World’s Biggest Cheeto (2:17): A technology blogger attempts to eat the world’s biggest Cheeto over the world’s most expensive keyboard.

11. Borat Meets Letterman (2:31): Sacha Baron Cohen goes on Letterman as Borat from Kazakhstan to dance, play music, and celebrate the death of his wife.

12. TV News Bloopers (2:03): Crazy mishaps cause a random Spanish speaker to take the place of an American pundit, the lights to go off in the middle of a newscast, and much, much more.

13. People Getting Punched Right Before Eating (2:48): Just when they think it’s safe to put something in their mouths, an SNL character comes and punches them out cold.

14. David Blaine Street Magic Spoof (4:44): You do not want to run into David Blaine because once his magic starts, it will never stop!

15. War Room vs. Doom Bunker (3:12): Stephen Colbert emulates Glenn Beck’s “war room,” in which he plays out crazy scenarios like Mexico taking over Kansas.

16. Introducing Fwitter (1:14): YouTube meets Twitter. Why aren’t speeches by intellectuals like Henry Kissinger and Paul Krugman 4.5 seconds? Now they are.

Outrageous Outbursts

1. Bill O’Reilly Flips Out (1:32): O’Reilly goes berserk when he can’t understand what is written in his teleprompter.

2. Crazy German Kid (4:35): When his game takes a long time to load, this German boy screams, yells, and smashes his keyboard to pieces in his anger.

3. Worst Office Freak Out Ever (2:54): A man literally tears apart the office, breaks computers, smashes cubicles, and threatens people in the worst office meltdown ever caught on tape.

4. Celebrity Brawls (7:42): Famous news hosts, boxers, and singers lose their cool and take out their anger on camera crews and the people around them.

5. Businessman has a Meltdown (3:00): Waiting for a client, this man slowly goes insane, throwing his luggage, breaking a computer, and scaring everyone around him.

Politics

1. Glenn Beck calls Obama’s change Socialism (4:12): Beck argues that Obama’s policies and decisions are leading America towards communism.

2. Olbermann’s Special Comment on Prop. 8 (6:28): Keith Olbermann delivers a strong and deeply emotional argument against the passage of Proposition 8 in California.

3. Cheney in 1994 on Iraq (1:23): Dick Cheney describes why it would be a “quagmire” to invade Iraq and lead to nothing but sectarian violence.

4. Maddow on Tea Parties (5:52): Rachel Maddow takes a satirical approach to the tea parties waged against rising taxes and government spending.

5. McCain’s YouTube Problem (3:14): This video shows the many blatantly contradictory statements that Senator McCain made during the 2008 campaign for President.

6. Obama Plagiarizes Deval Patrick (1:19): A video that shows the nearly identical statements made first by Governor Deval Patrick and then by President Barack Obama.

7. The Reagan Wit (5:50): Jokes at the expense of the Soviets, the Democrats, and even himself, Reagan could lighten the mood any time he wanted.

8. Bill Clinton’s Interview with Fox News (5:23): Clinton gets heated with Chris Wallace when he accuses the former President of not doing enough to stop terrorists.

9. Universities and Affirmative Action (2:28): Top university officials from NYU, Columbia, and UC Berkeley argue over affirmative action policies.

Adorable Attractions

1. Charlie Bit My Finger (0:55): A boy and his younger brother are sitting together when the little baby suddenly bites his brother’s finger, causing him great pain.

2. Cutest Little Kitten (0:30): This little cat is no bigger than your hand and loves kissing her owner’s fingers.

3. Amazingly Smart Beagle (1:49): This beagle climbs to the top of a caged box and escapes through the roof.

4. The Sneezing Baby Panda (0:20): A baby panda sneezes, startling both the viewer and his big mother.

5. Funny Cat (1:29): A cat can’t stand sharing her food, so instead she keeps dragging the bowl away from her friend.

6. Laughing Baby (1:22): A little baby busts out laughing every time a piece of paper is torn in front of him.

7. Puppies vs. One Cat (1:10): Cute little puppies slowly close in on an equally cute little cat in the corner.

8. 3 Week Old Baby Tiger (1:02): This little guy cuddles with a toy tiger, gets fed milk from a bottle, and has trouble walking around his pen.

Dangerous Activities

1. Mentos and Coke Rocket (0:42): Mix the two in a bottle and you get an explosion that goes over a mansion!

2. Parked Car Crash (0:50): A parked police car gets smashed by a vehicle that veers off the freeway.

3. Crazy Stunt by a Little Girl (0:53): A family ties their daughter to a bungee cord and uses an ATV to fling her away.

4. Unbelievable Car Chase (1:07): This car gets spun around three times by cops in a high-speed chase and still ends up getting away.

5. Wild Motorcycle Tricks (3:20): People risk their lives standing, flipping, and jumping off motorcycles in the middle of the freeway.

6. Blending a Crowbar (1:13): Will it actually blend?

Music

1. Billie Jean by Michael Jackson (4:55): A stroll down memory lane with the music video from one of the King of Pop’s most famous songs.

2. Since You’ve Been Gone by Kelly Clarkson (3:12): The first American Idol’s music video.

3. Blame It by Jamie Foxx and T-Pain (5:12): Newest song and video by the multitalented Jamie Foxx.

4. Yo Yo Ma plays the prelude from Bach´s Cello Suite No. 1 (3:12): The best cellist in the world plays a beautiful piece of classical music.

5. Viva La Vida by Coldplay (4:05): The big song from the international phenomenon known as Coldplay.

6. Kiss Me Through the Phone by Soulja Boy (3:25): His newest hit music video, as fresh as ever.

7. Wall to Wall by Chris Brown (5:21): The dance hit by the star that’s now mired in controversy.

8. Mr. Brightside by The Killers (3:58): The very interesting video to the always catchy tune.

9. Good Life by Kanye West and T-Pain (3:50): The dance hit that puts a smile on your face.

10. Snow (Hey Oh) by Red Hot Chili Peppers (5:21): One of the newest hits from the famous band.

11. Crazy in Love by Jay-Z and Beyonce (3:56): The song that took the summer by storm, a classic.

12. Beautiful Day by U2 (4:05): The band has so many hits, but this may be the best.

Popular Commercials

1. Career Builder’s “Hate your Job?” (1:03): If your job is this bad and you’re this poorly treated, you need more than CareerBuilder.com to save you.

2. Wear a Condom (0:41): This is why you should wear a condom!

3. Stride Gum Attack (0:33): If you don’t spit out your gum, your groin is in danger.

4. Top 10 Super Bowl Ads of 2008 (4:35): A compilation of the most interesting and funniest ads from the past Super Bowl.

5. Outrageously Funny Commercial (1:58): Three of the funniest commercials you have never seen.

6. E-Trade Babies (0:33): Cuddly little babies telling you how to invest your money and singing songs while doing it.

7. Steve Nash’s Vitamin Water (1:00): Point guard Steve Nash poses for risqué pictures while telling why he is so much better than everyone else in the world.

Sports

1. Best of the English Premier League (3:29): A compilation that captures the best the league had to offer over the past 2 years.

2. LeBron’s Top 10 Dunks (2:27): King James can fly like no other; watch his best moments of 2008-2009.

3. The Play (0:57): Cal vs. Stanford, the enhanced video version. You’re a Cal student; you have to watch this.

4. NFL’s Best Touchdown Celebrations (3:05): Relive the euphoric stunts performed by your favorite players through the years.

5. Top 10 NHL Moments (3:26): Hockey fans have no doubt; this video presents the craziest moves performed on ice in the last season.

6. Kobe’s Top 10 Plays of 2008 (3:10): The Black Mamba (Kobe) shows no mercy as he crosses you over and jumps over your teammates to get to the hoop.

7. Insane Soccer Tricks (1:09): Break-dancing while juggling the ball and doing flips off buildings to pass the ball to teammates; this is a must-see video.

Television

1. Toby vs. Michael from The Office (4:16): The classic duel’s best moments. Michael tries so hard to ruin Toby’s life.

2. Tyra Banks Flips Out on America’s Next Top Model (2:22): Tyra screams at Tiffany for not caring about the competition, an epic moment.

3. 30 Rock’s Lemon and Cerie Talk (1:29): Tina Fey sits Cerie down to tell her to wear a bra at work because she is ruining the work environment.

4. Top 10 Lost Endings Part 1 (6:27): A collection of the wildest endings from the hit show Lost.

5. Top 10 Lost Endings Part 2 (7:49): A collection of the wildest endings from the hit show Lost.

6. Best of Phoebe from Friends (3:42): The funniest moments made possible by Phoebe, from one of the best shows of all time.

7. Top 25 Seinfeld Moments (7:28): A compilation of the greatest moments from “the show about nothing.”

8. Charlie from Two and a Half Men Drugged (3:25): Funny clip as Charlie is drugged and can’t get up and Jake sneaks out to see a concert.

9. Makeup Transformation (5:34): This video shows the true magic that makeup can do in just a few seconds.

Appendix C: Screens Shown to Subjects

[Image: Flat Tax Condition Screen]

[Image: No Tax Condition Screen]

[Image: Bonus Condition Screen]

[Image: Leisure Task Screen]

[Image: Work Task Screen]

[Image: Payout Screen]

Notes

1 For a review of this literature and a discussion of its implications for tax policy, see Gamage and Shanske (2010).

2 A notable exception is Blumkin et al. (2008), which finds that a wage tax has a greater impact on work effort than an economically equivalent consumption tax. Other studies have examined the related question of the effect of tax-schedule complexity on labor/leisure choices (Chetty and Saez (2009), Feldman and Katuscak (2006), Liebman and Zeckhauser (2004), de Bartolome (1995)).

3 A list of those pre-selected videos is available in Appendix B.

4 This is the employee contribution. The employer contribution is identical, and the total FICA rate is

5 For a summary of the empirical literature on the labor supply effect of the EITC, see Eissa and Hoynes (2006).

6 While 164 subjects participated in the experiment, due to technical difficulties, we were only able to retrieve data for 150 subjects.

7 We ran OLS and probit regressions of the binary work/leisure decision variable on a set of dummy variables, one for each round of the experiment. Results from these regressions are not reported here, but are available on request.

8 For both sets of survey questions, the distribution of our subjects’ responses very closely resembles the distribution of responses in Tversky and Kahneman’s study (1986). Complete data on the distribution of subjects’ responses is available in Appendix A.

9 We chose the number of words subjects attempted to alphabetize, rather than the number of words they correctly alphabetized, as our measure of work effort. Subjects were generally extremely accurate (subjects correctly alphabetized 94.55% of the words they attempted) and our results along the intensive margin do not depend on this choice.

10 These subjects chose to work approximately 74% of the time in the no tax condition and 70% of the time in the progressive tax condition.

Assistant Professor, University of California, Berkeley (Boalt Hall School of Law), USA