“Partimque figuras rettulit antiquas, partim nova monstra creavit.”
(“Partly we recovered the old, familiar things, partly we created something wondrous and new.”)
- Ovid, Metamorphoses, I: 436-37
At the confluence of computer science, economics, law, marketing, psychology, and technology.
Email: am253@cornell.edu; anirban@avyayamholdings.com
Github: https://github.com/anirban-mu
Google Scholar: https://scholar.google.com/citations?user=V7wCZ5EAAAAJ&hl=en
ORCID: https://orcid.org/0000-0001-6381-814X
SSRN: https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=431500
Chang, Hannah H., Anirban Mukherjee, and Amitava Chattopadhyay. More voices persuade: The attentional benefits of voice numerosity. Journal of Marketing Research 60, no. 4 (2023): 687-706. Direct link.
The authors posit that in an initial exposure to a broadcast video, hearing different voices narrate (in succession) a persuasive message encourages consumers’ attention and processing of the message, thereby facilitating persuasion; this is referred to as the voice numerosity effect. Across four studies (plus validation and replication studies)—including two large-scale, real-world data sets (with more than 11,000 crowdfunding videos and over 3.6 million customer transactions, and more than 1,600 video ads) and two controlled experiments (with over 1,800 participants)—the results provide support for the hypothesized effect. The effect (1) has consequential, economic implications in a real-world marketplace, (2) is more pronounced when the message is easier to comprehend, (3) is more pronounced when consumers have the capacity to process the ad message, and (4) is mediated by the favorability of consumers’ cognitive responses. The authors demonstrate the use of machine learning, text mining, and natural language processing to process and analyze unstructured (multimedia) data. Theoretical and marketing implications are discussed.
Moon, Sungkyun, Kapil R. Tuli, and Anirban Mukherjee. Does disclosure of advertising spending help investors and analysts? Journal of Marketing 87, no. 3 (2023): 359-382. Direct link.
Publicly listed firms have discretion to disclose (or not) advertising spending in their annual (10-K) reports. The disclosure of advertising spending can provide valuable information because advertising is a leading indicator of future performance. However, estimates of advertising spending are available from data providers, arguably mitigating the need for its formal disclosure. This article argues that firms’ disclosure of advertising spending provides more complete and public information and therefore lowers investor uncertainty about future firm performance (idiosyncratic risk). Empirical analyses show that the effect is largely driven by the negative effect of disclosure of advertising spending on analyst uncertainty. Consistent with agency theory, the negative effect of the disclosure of advertising spending on analyst uncertainty is stronger for firms with more financial resources, firms with lower disclosure quality, and firms that are in more competitive industries. Additional analyses show that the disclosure of advertising spending has a significant positive effect on firm value in specific sectors. These results, therefore, identify an avenue for chief marketing officers to play a greater role in managing investor relations. In addition, they suggest strong merit for the Securities and Exchange Commission and the Financial Accounting Standards Board to reconsider current regulations governing advertising spending disclosure.
Gielens, Katrijn, Marnik G. Dekimpe, Anirban Mukherjee, and Kapil R. Tuli. The future of private-label markets: A global convergence approach. International Journal of Research in Marketing 40, no. 1 (2023): 248-267. Direct link.
Private-label (PL) shares are characterized by considerable heterogeneity across both countries and categories, not only in their current level, but also in the rate at which they are growing. This creates ambiguity about their remaining growth potential. To offer insights into the likely long-run PL shares, we take a forward-looking perspective by means of a convergence model. We apply the model to two unique datasets that together span more than 50 countries, both emerging and developed, across more than 70 CPG categories. We find evidence of partial PL convergence: even though PL shares will become more similar, part of the currently observed heterogeneity will persist. The future evolution in two key marketing instruments, new-product introductions by both NB manufacturers and retailers and the NB-PL price gap, is found to play a substantial role in shaping the global PL landscape of the future. This impact is not uniform, however, but depends on the category, and varies with the retail, economic and cultural context. In addition, the long-run impact of both marketing drivers differs from what is currently observed, suggesting that managers should not adhere too strongly to earlier practices when planning for the future.
Mukherjee, Anirban, and Vrinda Kadiyali. The competitive dynamics of new DVD releases. Management Science 64, no. 8 (2018): 3536-3553. Direct link.
We study the market for new (movie) DVDs in the United States. Our demand model captures seasonality, freshness (i.e., time between theatrical and DVD release), and state dependence. We also develop a structural model of dynamic competition in which studios balance waiting for high-demand weeks, against reduced freshness, and against competitive crowding. We find that studios emphasize DVD revenues from larger movies (by theatrical revenue) over DVD revenues from smaller movies. Studios also emphasize revenue from consumers who prefer larger and fresher movies. These behaviors are consistent with managerial conservatism: studio executives forgo DVD revenues from smaller movies to ensure the DVD success of larger movies.
Mukherjee, Anirban, Ping Xiao, Li Wang, and Noshir Contractor. Does the opinion of the crowd predict commercial success? Evidence from Threadless. In Academy of management proceedings, vol. 1, p. 12728. Academy of Management Briarcliff Manor, 2018. Direct link.
Crowdsourcing new products involves an open call for creative ideas. To select among submissions, crowdsourcing portals ask the community (the “crowd”) to voice its opinion. Does the voice of the crowd predict the commercial success of a new product? This is an open question because over half a century of research in consumer behavior is inconclusive on how people's expressed attitudes predict their behavior. We study this question on a pioneering crowdsourcing portal, Threadless.com. We collect and examine a large-scale dataset tracking about 150,000 designs from 45,000 designers that received almost 150 million votes from 600,000 users between 2004 and 2010. We find that the counts of positive and neutral votes are consistent predictors of sales. However, the count of negative votes is an inconsistent predictor of sales – receiving more negative votes is associated with higher sales from the users who cast the votes, but lower sales from the users who did not cast the votes. These findings are consistent with users strategically voting down their best competitors to improve their odds of being selected.
Tuli, Kapil R., Anirban Mukherjee, and Marnik G. Dekimpe. On the value relevance of retailer advertising spending and same-store sales growth. Journal of Retailing 88, no. 4 (2012): 447-461. Direct link.
In response to recent calls to study factors that determine a retailer's stock price, this study draws on signaling theory to examine the impact of two key marketing metrics that are widely disclosed by retailers to investors, advertising spending and growth in same-store sales (COMPS), and highlights the moderating role of various firm- and sector-specific factors. Using a stock-response model estimated on a sample of 1,646 observations for 257 retailers, the authors find that the value relevance of advertising spending and COMPS depends on the financial condition of, and the competitive pressures faced by, the retailer. In addition, the positive effect of COMPS on stock returns is found to be stronger in the presence of decreases in advertising spending.
Mukherjee, Anirban, and Vrinda Kadiyali. Modeling multichannel home video demand in the US motion picture industry. Journal of Marketing Research 48, no. 6 (2011): 985-995. Direct link.
The U.S. motion picture industry has become increasingly reliant on posttheatrical channel profits. Two often-cited drivers of these profits are cross-channel substitution among posttheatrical channels and seasonality in consumer preferences for any movie. The authors use a differentiated products version of the multiplicative competitive interaction model to investigate these two phenomena. They estimate the model using data from 2000 and 2001 on two posttheatrical channels in the U.S. market: purchase and rental home viewing channels. Contrary to expectations based on business press commentary, after controlling for seasonality and movie attributes, the authors find low cross-channel price and availability elasticity for both channels. To measure the extent of cross-channel cannibalization, they simulate a 28-day window of sequential release with either purchase or rental channel going first. They find that windowing reduces the sum of revenues across both channels, because more consumers choose to not purchase or rent when faced with older movies in their favored channel rather than to switch to the alternative channel with newer movies.
Addressing Dynamic and Sparse Qualitative Data: A Hilbert Space Embedding of Categorical Variables (with Hannah H. Chang), https://arxiv.org/pdf/2308.11781.pdf
We propose a novel framework for incorporating qualitative data into quantitative models for causal estimation. Previous methods use categorical variables derived from qualitative data to build quantitative models. However, this approach can lead to data-sparse categories and yield inconsistent (asymptotically biased) and imprecise (finite sample biased) estimates if the qualitative information is dynamic and intricate. We use functional analysis to create a more nuanced and flexible framework. We embed the observed categories into a latent Baire space and introduce a continuous linear map (a Hilbert space embedding) from the Baire space of categories to a Reproducing Kernel Hilbert Space (RKHS) of representation functions. Through the Riesz representation theorem, we establish that the canonical treatment of categorical variables in causal models can be transformed into an identified structure in the RKHS. Transfer learning acts as a catalyst to streamline estimation: embeddings from traditional models are paired with the kernel trick to form the Hilbert space embedding. We validate our model through comprehensive simulation evidence and demonstrate its relevance in a real-world study that contrasts theoretical predictions from economics and psychology in an e-commerce marketplace. The results confirm the superior performance of our model, particularly in scenarios where qualitative information is nuanced and complex.
Agentic AI: Autonomy, Accountability, and the Algorithmic Society (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5123621
Agentic Artificial Intelligence (AI) systems can autonomously pursue long-term goals, make decisions, and execute complex, multi-turn workflows. Unlike traditional generative AI, which responds reactively to prompts, agentic AI proactively orchestrates processes, such as autonomously managing complex tasks or making real-time decisions. This transition from advisory roles to proactive execution challenges established legal, economic, and creative frameworks. In this paper, we explore challenges in three interrelated domains: creativity and intellectual property, legal and ethical considerations, and competitive effects. Central to our analysis is the tension between novelty and usefulness in AI-generated creative outputs, as well as the intellectual property and authorship challenges arising from AI autonomy. We highlight gaps in responsibility attribution and liability that create a ‘moral crumple zone’, a condition where accountability is diffused across multiple actors, leaving end-users and developers in precarious legal and ethical positions. We examine the competitive dynamics of two-sided algorithmic markets, where both sellers and buyers deploy AI agents, potentially mitigating or amplifying tacit collusion risks. We explore the potential for emergent self-regulation within networks of agentic AI (the development of an ‘algorithmic society’), raising critical questions: To what extent would these norms align with societal values? What unintended consequences might arise? How can transparency and accountability be ensured? Addressing these challenges will necessitate interdisciplinary collaboration to redefine legal accountability, align AI-driven choices with stakeholder values, and maintain ethical safeguards. We advocate for frameworks that balance autonomy with accountability, ensuring all parties can harness agentic AI’s potential while preserving trust, fairness, and societal welfare.
AI Knowledge and Reasoning: Emulating Expert Creativity in Scientific Research (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4738442, https://arxiv.org/pdf/2404.04436.pdf
We investigate whether modern AI can emulate expert creativity in complex scientific endeavors. We introduce novel methodology that utilizes original research articles published after the AI's training cutoff, ensuring no prior exposure and mitigating concerns of rote memorization and prior training. The AI is tasked with redacting findings, predicting outcomes from redacted research, and assessing prediction accuracy against reported results. Analysis of 589 published studies in four leading psychology journals over a 28-month period showcases the AI's proficiency in understanding specialized research, deductive reasoning, and evaluating evidentiary alignment, all cognitive hallmarks of human subject matter expertise and creativity. These findings suggest the potential of general-purpose AI to transform academia, with roles requiring knowledge-based creativity becoming increasingly susceptible to technological substitution.
Baire Space Embeddings: Handling Missing, Dynamic, and Sparse Data
Bridging the Gap: Using Interpretable AI to Incorporate Real-World Product Descriptions in Consumer Research Experiments (with Hannah H. Chang and Sachin Gupta)
This paper presents and demonstrates a novel AI-driven research design for consumer experiments. Conventional experiments often require the use of simplified, abbreviated, and stylized stimuli; constraints that may limit the realism, generalizability, and practical relevance of findings. In contrast, the proposed approach enables the use of myriad real-world, unstructured product descriptions as stimuli without simplification, abbreviation, or stylization. To process experimental data using the proposed method, the authors develop an innovative, interpretable AI model that they term labGPT. Comprising a partitioned deep learning neural network paired with a foundational large language model, labGPT generates low-dimensional, interpretable numerical representations of unstructured verbal descriptions. These representations are then employed in a statistical model of consumer responses. Theory testing is conducted by examining model estimates. To demonstrate the practical application and benefits of the proposed design, the authors study preference dynamics in a choice experiment with 1,000 consumers, who are shown about 50,000 wine descriptions randomly sampled from almost 120,000 wines in the market. Results suggest that the proposed design can improve the realism and external validity of consumer experiments by bridging the gap between the laboratory-based information environment and the real-world marketplace.
CAVIAR: Categorical Variable Embeddings for Accurate and Robust Inference (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4787016, https://arxiv.org/pdf/2404.04979.pdf
Social science research often hinges on the relationship between categorical variables and outcomes. We introduce CAVIAR, a novel method for embedding categorical variables that assume values in a high-dimensional ambient space but are sampled from an underlying manifold. Our theoretical and numerical analyses outline challenges posed by such categorical variables in causal inference. Specifically, dynamically varying and sparse levels can lead to violations of the Donsker conditions and a failure of estimation functionals to converge to a tight Gaussian process. Traditional approaches, including the exclusion of rare categorical levels and principled variable selection models like LASSO, fall short. CAVIAR embeds the data into a lower-dimensional global coordinate system. The mapping can be derived from both structured and unstructured data, and ensures stable and robust estimates through dimensionality reduction. In a dataset of direct-to-consumer apparel sales, we illustrate how high-dimensional categorical variables, such as zip codes, can be succinctly represented, facilitating inference and analysis.
Charting the Parrot's Song: A Maximum Mean Discrepancy Approach to Measuring AI Novelty, Originality, and Distinctiveness (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5213450, https://arxiv.org/abs/2504.08446
Current intellectual property frameworks struggle to evaluate the novelty of AI-generated content, relying on subjective assessments ill-suited for comparing effectively infinite AI outputs against prior art. This paper introduces a robust, quantitative methodology grounded in Maximum Mean Discrepancy (MMD) to measure distributional differences between generative processes. By comparing entire output distributions rather than conducting pairwise similarity checks, our approach directly contrasts creative processes, overcoming the computational challenges inherent in evaluating AI outputs against unbounded prior art corpora. Through experiments combining kernel mean embeddings with domain-specific machine learning representations (LeNet-5 for MNIST digits, CLIP for art), we demonstrate exceptional sensitivity: our method distinguishes MNIST digit classes with 95% confidence using just 5-6 samples and differentiates AI-generated art from human art in the AI-ArtBench dataset (n=400 per category; p<0.0001) using as few as 7-10 samples per distribution, despite human evaluators' limited discrimination ability (58% accuracy). These findings challenge the "stochastic parrot" hypothesis by providing empirical evidence that AI systems produce outputs from semantically distinct distributions rather than merely replicating training data. Our approach bridges technical capabilities with legal doctrine, offering a pathway to modernize originality assessments while preserving intellectual property law's core objectives. This research provides courts and policymakers with a computationally efficient, legally relevant tool to quantify AI novelty, a critical advancement as AI blurs traditional authorship and inventorship boundaries.
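For readers unfamiliar with MMD, the following is a minimal sketch of how a two-sample comparison of this kind can be computed. It is not the paper's implementation: the Gaussian (RBF) kernel, the fixed bandwidth, the permutation test, and all function names are assumptions chosen for illustration, and the inputs are plain numeric vectors standing in for the learned representations (such as LeNet-5 or CLIP features) that the abstract mentions.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def permutation_p_value(xs, ys, n_permutations=200, bandwidth=1.0, seed=0):
    """Permutation test (an assumed choice): how often does a random relabeling
    of the pooled sample give an MMD at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = mmd_squared(xs, ys, bandwidth)
    pooled = list(xs) + list(ys)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = mmd_squared(pooled[:len(xs)], pooled[len(xs):], bandwidth)
        if stat >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors drawn from two different distributions.
    rng = random.Random(42)
    sample_a = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(10)]
    sample_b = [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(10)]
    print("MMD^2:", mmd_squared(sample_a, sample_b))
    print("p-value:", permutation_p_value(sample_a, sample_b))
```

In practice the bandwidth is often set from the data (for example, by a median-distance heuristic) rather than fixed at 1.0; the fixed value here only keeps the sketch short.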
Cognitive Boundaries of Narrating Voices in Persuasive Videos (with Hannah H. Chang)
Product videos and video ads are prevalent and important in consumer decision making. These videos often use voices to narrate messages to consumers. Yet, there is a relative scarcity of scientific evidence showing how narrating voices in videos can impact consumer responses. We investigate when and why hearing the same versus different voices narrate a message in videos can exert enhancing versus backfiring effects. Results show that compared to hearing the same voice, hearing different voices (a) boosts persuasion when initial consumer attention is absent, as a change in voice can help draw attention; but (b) reduces persuasion when attention is present, as consumers incur greater processing costs in understanding speech by different voices. The effects manifest in five controlled experiments with almost 2,000 participants and a real-world video dataset with over 46 million “likes” among more than 10 billion views. This work sheds light on the role of narrating voices in asynchronous video communication and offers marketers theory-driven, practical advice on the design of the voice element in persuasive videos.
Dalal Street Blues: The Socio-Economic Environment and the Demand for Bollywood Movies (with Ping Xiao), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4776156
How far are we a product of our times—does what we watch vary with the macro environment? In this study, we investigate the influence of the socio-economic environment on movie demand in India. Through a detailed analysis of data describing revenues by movie theater, movie, and week, for all multiplex (multi-screen) movie theaters and all movies in India, we establish the influence of escapism (i.e., selective exposure to media to escape from reality) and positional consumption (i.e., consumption to obtain status) as key determinants of demand. Incorporating a rich set of attitudinal and economic measures, and accounting for variation in movie quality, market demand, and seasonality, we find that hard economic times increase the demand for more escapist movies. Conversely, during such times, demand decreases in theatrical locations where attendance is scarcer and hence more positional. Generalizing the results, our data suggest that the election of Narendra Modi in 2014, which ushered in a wave of economic optimism, decreased the demand for more escapist movies.
Describing Rosé: An Embedding-Based Method for Measuring Preferences (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3859740
Many products and services are best (and typically) described in prose. In extant preference-measurement methods, however, due to the challenge of numerically representing prose in econometric models, products can only be described to participants and portrayed in the utility model as a list of attributes. In this research, the authors develop an embedding-based utility model and preference method that addresses this limitation; in it, products are described to participants in (unstructured) prose. The proposed method provides three benefits: (1) in it, products can be described more completely, (2) it improves study realism, and (3) it enables a more detailed measurement of preferences. The authors employ the proposed method to measure consumer preferences in Australia, New Zealand, and the United States for wines made in 427 wine-growing regions in 44 wine-growing countries, from 708 wine-grape varietals. They find the proposed model has superior in-sample fit and generates better out-of-sample predictions than benchmark models. Importantly, the method is able to capture differences in consumers’ valuation for wines (products) that are observationally equivalent in categorical attributes, and therefore indistinguishable in classical categorical variable-based analysis. The use of the proposed model as a decision support system for marketing activities is demonstrated.
Does the Crowd Support Innovation? Innovation Claims and Success on Kickstarter (with Cathy Yang, Ping Xiao, and Amitava Chattopadhyay), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3003283
Online crowdfunding is a popular new tool for raising capital to commercialize product innovation. Product innovation must be both novel and useful (1-4). Therefore, we study the role of novelty and usefulness claims on Kickstarter. Startlingly, we find that a single claim of novelty increases project funding by about 200%, a single claim of usefulness increases project funding by about 1200%, and the co-occurrence of novelty and usefulness claims lowers funding by about 26%. Our findings are encouraging because they suggest the crowd strongly supports novelty and usefulness. However, our findings are disappointing because the premise of crowdfunding is to support projects that are innovative, i.e. that are both novel and useful, rather than projects that are only novel or only useful.
Emotional Appeals as Drivers of Social Media and Advertising Engagement in Real-World Marketplaces: Using AI to Code Variables in Consumer Research (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5153265
Marketers often employ emotional appeals to evoke specific feelings such as happiness, excitement, or nostalgia, thereby fostering a positive connection between consumers and brands. However, the efficacy of emotional appeals in driving consumer engagement in the real world remains unclear. While field data is readily available, the study of emotional appeals at scale is constrained by the high costs of coding data manually. In this research, we propose the use of AI systems to algorithmically code emotional appeals in large-scale, unstructured marketing data. We present novel methodology that leverages state-of-the-art automatic machine learning to develop custom AI systems. We show that directly including AI-coded variables in econometric models can lead to inconsistent estimates. We develop novel estimators based on measuring moment matrices in a calibration subsample of the data. We apply our approach to both synthetic data and real-world field data, examining the effectiveness of humor and physical appeals on Instagram and YouTube. Our methodology is computationally stable, consistent, and highly efficient. Results show that our proposed estimators effectively bridge the gap between highly precise but smaller-scale manual coding and less precise but large-scale AI-based coding.
Fido's Ball: An Application of the Yule-Simon Process to Generating and Categorizing Qualitative Independent Variables
Forecasting in Rapidly Changing Environments: An Application to the U.S. Motion Picture Industry (with Vrinda Kadiyali), https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=6010&context=lkcsb_research
Markets with rapidly changing environments provide forecasting challenges because of fewer similarities between past and future outcomes. In this paper, we provide a methodology that enables forecasting with relatively short histories. The application is to the U.S. motion picture industry where we forecast revenues in theatrical, sales (DVD and VHS), and rental channels. Using short market histories of similar products, we account for (1) observed and unobserved movie-specific characteristics, (2) seasonality of demand, (3) competition within and across multiple distribution channels, and (4) market expansion, substitution, and complementarity between movies inside and across distribution channels. We extend the multiplicative competitive interaction model (Cooper and Nakanishi 1988) to multiple distribution channels and build a novel two-step estimation method that allows for endogenous release schedules. We find our model outperforms existing models in most cases.
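As background, the basic single-channel form of the multiplicative competitive interaction (MCI) model can be sketched as follows. This illustrates only the textbook share equation, in which each product's attraction is a product of its attributes raised to response parameters and its share is its attraction divided by the total. It is not the multi-channel extension or the two-step estimator described above, and the attribute names and parameter values are hypothetical.

```python
def mci_shares(products, elasticities):
    """Market shares under a basic MCI model.

    products:     list of dicts mapping attribute name -> strictly positive value
    elasticities: dict mapping attribute name -> response parameter (beta)
    """
    attractions = []
    for product in products:
        attraction = 1.0
        for name, beta in elasticities.items():
            # Multiplicative form: each attribute enters as value ** beta.
            attraction *= product[name] ** beta
        attractions.append(attraction)
    total = sum(attractions)
    # Share of each product is its attraction relative to all attractions.
    return [a / total for a in attractions]


if __name__ == "__main__":
    # Hypothetical movies described by price and number of screens.
    movies = [
        {"price": 10.0, "screens": 2000.0},
        {"price": 20.0, "screens": 2000.0},
    ]
    betas = {"price": -1.0, "screens": 0.5}  # illustrative values only
    print(mci_shares(movies, betas))
```

With these illustrative inputs the two movies differ only in price, so the cheaper one receives the larger share; a log-centering transformation of the share equation is what typically makes such a model estimable by linear regression.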
Heuristic Reasoning in AI: Instrumental Use and Mimetic Absorption (with Hannah H. Chang), https://arxiv.org/pdf/2403.09404.pdf, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4754533
Deviating from conventional perspectives that frame artificial intelligence (AI) systems solely as logic emulators, we propose a novel program of heuristic reasoning. We distinguish between the ‘instrumental’ use of heuristics to match resources with objectives, and ‘mimetic absorption,’ whereby heuristics manifest randomly and universally. Through a series of innovative experiments, including variations of the classic Linda problem and a novel application of the Beauty Contest game, we uncover trade-offs between maximizing accuracy and reducing effort that shape the conditions under which AIs transition between exhaustive logical processing and the use of cognitive shortcuts (heuristics). We provide evidence that AIs manifest an adaptive balancing of precision and efficiency, consistent with principles of resource-rational human cognition as explicated in classical theories of bounded rationality and dual-process theory. Our findings reveal a nuanced picture of AI cognition, where trade-offs between resources and objectives lead to the emulation of biological systems, especially human cognition, despite AIs being designed without a sense of self and lacking introspective capabilities.
Intellectual Property Piracy and the Intersectionality of Artistic Merit, Gender, and Race (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4750624
We pivot from traditional theories of intellectual property piracy that focus on financial drivers (such as pricing, accessibility, and affordability) to investigate the intersectionality of artistic merit, gender, and race. Utilizing an 18-year dataset, we examine the illicit release (leaking) of films at the Academy of Motion Picture Arts and Sciences, the organization responsible for the Oscars. Despite stringent safeguards, 54% of films, amounting to $41 billion in production expenditures and $66 billion in U.S. box-office revenues, were leaked between 2003 and 2020. Employing interlocked hypotheses and falsification tests, we show the leak of films aligned with increased access to high-quality content featuring historically marginalized gender groups. We do not observe similar findings for historically marginalized racial groups. Films recognized for artistic excellence and those featuring white female Oscar nominees were more likely to be leaked. The impact of white female nominees exceeded that of white male nominees; both groups exceeded that of non-white nominees. We contrast such findings with leaks in for-profit channels where patterns align with traditional financial factors but not the intersectionality of artistic merit, gender, and race. Our findings invite a reassessment of intellectual property management strategies, advocating for a more balanced approach incorporating factors such as fairness and cultural inclusivity.
Is Volunteering a Gateway to Increased Monetary Giving? Evidence from a Field Experiment (with Sachin Gupta and Sungjin Kim)
This paper presents a field experiment conducted in the context of a nonprofit organization to investigate the causal link between volunteering and monetary giving. The study involved 149,480 individuals with no prior volunteering or giving history with the nonprofit. The treatment group received additional emails as part of a campaign designed to encourage participation as a volunteer in an upcoming citizen science event. Approximately seven weeks after the citizen science event, a fundraising campaign was launched, soliciting monetary donations. The study compares donation behaviors between the treatment and control groups. We find that the marketing intervention of additional emails increased volunteer participation by 10%. It also led to a 32% increase in donation participation rate, but did not affect the average amount donated by each participant. These findings are consistent with volunteering affecting the extensive margin (the decision to donate) but not the intensive margin (the amount donated) of donation behavior. Our paper makes three key contributions. First, it provides novel evidence on the causal relationship between volunteering and monetary giving. Second, it utilizes core marketing activities, such as outreach emails, for causal identification of behavioral spillovers, thereby proposing a novel identification strategy. Third, it documents the asymmetric effects of volunteering on the extensive versus intensive margins of giving behavior. The results have important implications for the strategic integration of volunteer management and fundraising efforts within nonprofits and offer guidance for the optimal design of solicitation campaigns.
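The distinction between the extensive and intensive margins can be made concrete with a little arithmetic: expected giving per person solicited is the participation rate multiplied by the average gift among donors. The sketch below uses the 32% participation lift reported above, but the baseline participation rate and the average gift are hypothetical placeholders, not figures from the study.

```python
def expected_gift_per_person(participation_rate, average_gift_among_donors):
    """Expected donation per solicited person = extensive margin x intensive margin."""
    return participation_rate * average_gift_among_donors


# Hypothetical baseline values, chosen only for illustration.
baseline_rate = 0.010   # assumed share of the control group that donates
average_gift = 50.0     # assumed average gift among donors, same in both groups

control = expected_gift_per_person(baseline_rate, average_gift)
# Extensive margin rises by 32% (reported); intensive margin is unchanged.
treated = expected_gift_per_person(baseline_rate * 1.32, average_gift)

lift = treated / control - 1.0
print(f"Lift in expected giving per person: {lift:.0%}")
```

Because the average gift is held fixed, the lift in expected giving per person equals the lift in participation, whatever baseline values are assumed.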
Meta-Learning Customer Preference Dynamics for Fast Customization on Digital Platforms (with Mingzhang Yin, Khaled Boughanmi, and Asim Ansari)
Digital media platforms need to quickly adapt to changing customer preferences to effectively customize and sequence content. Personalizing offerings to suit individual tastes is particularly challenging when data on customer interactions is limited, as is the case with new customers or for new customer sessions. In this research, we develop a novel meta-learning approach that enables fast, large-scale customization using limited customer interactions. Our method employs an encoder-decoder model, calibrated by meta-learning over multiple tasks derived from customer sessions. It builds a flexible encoder by leveraging Transformer neural networks while maintaining interpretability through a structural decoder. Methodologically, our framework quickly adapts to a few observed interactions from new sessions and infers time-varying individual parameters, effectively addressing the cold start challenge. We demonstrate our approach with a primary application for consumer listening sessions on a digital music streaming platform. Comparing the predictive performance against various state-of-the-art benchmark models shows superior accuracy and efficiency of our approach. Beyond prediction, the interpretable parameters uncover dynamic individual heterogeneity and identify meaningful customer segments. Managerially, we demonstrate how firms can use our model to enhance personalization strategies through optimal content sequencing, session completion, and sequential targeting with minimal individual data.
Plebeian Bias: Selecting Crowdsourced Creative Designs for Commercialization (with Ping Xiao, Hannah H. Chang, Li Wang, and Noshir Contractor), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3038775
We identify a new phenomenon – “Plebeian bias” – in the crowdsourcing of creative designs. Stardom, an emphasis on established individuals, has long been observed in many offline contexts. Does this phenomenon carry over to online communities? We investigate a large-scale dataset tracking all submissions, community votes on submissions, and revenues from commercialized submissions on a popular crowdsourcing portal, Threadless.com. In contrast to stardom, we find that the portal selects designs from “Plebeians” (i.e., users without an established fan base and track record) over “Stars” (i.e., users with an established fan base and track record). This tendency is suboptimal for both revenue and profit. The evidence is consistent with incentives for the portal to demonstrate procedural fairness to the online community.
Psittacines of Innovation? Assessing the True Novelty of AI Creations, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4764101, https://arxiv.org/pdf/2404.00017.pdf
We examine whether Artificial Intelligence (AI) systems generate truly novel ideas rather than merely regurgitating patterns learned during training. Utilizing a novel experimental design, we task an AI with generating project titles for hypothetical crowdfunding campaigns. Within the set of AI-generated project titles, we measure repetition and complexity. We then compare the AI-generated titles with observed field data using an extension of maximum mean discrepancy, a metric derived from the application of kernel mean embeddings of statistical distributions to high-dimensional machine learning (large language) embedding vectors, which yields a structured analysis of AI output novelty. Results suggest that (1) the AI generates unique content even under increasing task complexity and at the limits of its computational capabilities, (2) the generated content has face validity, being consistent with both inputs to other generative AI and in qualitative comparison to field data, and (3) the generated content diverges from field data, mitigating concerns relating to intellectual property rights. We discuss implications for copyright and trademark law.
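For readers unfamiliar with maximum mean discrepancy (MMD), the following is a minimal sketch of the standard unbiased estimator of squared MMD with a Gaussian (RBF) kernel. It is not the extension developed in the paper. The synthetic Gaussian vectors are stand-ins for embedding vectors, and the function names, dimension, sample size, and bandwidth are all illustrative assumptions.

```python
import math
import random


def rbf_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=2.0):
    """Unbiased estimate of squared MMD between two samples.

    Averages the kernel within each sample (excluding self-pairs)
    and subtracts twice the average kernel across samples.
    """
    m, n = len(xs), len(ys)
    k_xx = sum(rbf_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    k_yy = sum(rbf_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    k_xy = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


def sample(mean, size, dim=4):
    """Synthetic stand-in for embedding vectors (assumed, not real data)."""
    return [[random.gauss(mean, 1.0) for _ in range(dim)] for _ in range(size)]


random.seed(0)
# Two samples from the same distribution: estimate should be near zero.
same = mmd_squared(sample(0.0, 100), sample(0.0, 100))
# Two samples from shifted distributions: estimate should be clearly positive.
shifted = mmd_squared(sample(0.0, 100), sample(1.0, 100))
print(f"same distribution:    {same:.4f}")
print(f"shifted distribution: {shifted:.4f}")
```

A larger value indicates that the two samples are less likely to come from the same distribution. In practice the bandwidth is a tuning choice, and a common heuristic is to set it to the median pairwise distance.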
Safeguarding Marketing Research: The Generation, Identification, and Mitigation of AI-Fabricated Disinformation, https://arxiv.org/pdf/2403.14706.pdf, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4739488
Marketers often employ emotional appeals to evoke specific feelings such as happiness, excitement, desire, or nostalgia, thereby fostering a positive connection between consumers and brands or products. However, the efficacy of emotional appeals in driving consumer engagement and persuasion in real-world marketplaces remains unclear. While field data is readily available, the study of emotional appeals at scale is constrained by the high costs of manual data coding. In this research, we examine the use of automated AI systems to algorithmically code emotional appeals in large-scale, unstructured marketing data. We propose a novel methodology and assess the accuracy of state-of-the-art meta-learning AI systems. An important drawback of using AI to code data is that coding imprecision may correlate with data features. Consequently, the direct inclusion of AI-coded variables in econometric models can lead to inconsistent estimates, resulting in biased inference and imprecise predictions. To address this issue, we develop novel estimators based on measuring moment matrices in smaller-scale, auxiliary data. Our methodology is designed to be computationally stable and highly efficient, crucial attributes for inference and analysis in real-world contexts. We apply our approach to study the effectiveness of humor and physical appeals on Instagram (social media) and YouTube (advertising). Results show that bias correction significantly increases effect sizes and improves the precision of predictions compared to the direct inclusion of AI-coded variables. These findings highlight both the potential of AI as a coding tool and the critical importance of measuring and accounting for AI-coding errors in marketing research.
Silico-centric Theory of Mind (with Hannah H. Chang), https://arxiv.org/pdf/2403.09289.pdf, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4751507
Theory of Mind (ToM) refers to the ability to attribute mental states, such as beliefs, desires, intentions, and knowledge, to oneself and others, and to understand that these mental states can differ from one’s own and from reality. We investigate ToM in environments with multiple, distinct, independent AI agents, each possessing unique internal states, information, and objectives. Inspired by human false-belief experiments, we present an AI (‘focal AI’) with a scenario where its clone undergoes a human-centric ToM assessment. We prompt the focal AI to assess whether its clone would benefit from additional instructions. Concurrently, we give its clones the ToM assessment, both with and without the instructions, thereby engaging the focal AI in higher-order counterfactual reasoning akin to human mentalizing–with respect to humans in one test and to other AI in another. We uncover a discrepancy: Contemporary AI demonstrates near-perfect accuracy on human-centric ToM assessments. Since information embedded in one AI is identically embedded in its clone, additional instructions are redundant. Yet, we observe AI crafting elaborate instructions for their clones, erroneously anticipating a need for assistance. An independent referee AI agrees with these unsupported expectations. Neither the focal AI nor the referee demonstrates ToM in our ‘silico-centric’ test.
Stochastic, Dynamic, Fluid Autonomy in Agentic AI: Implications for Authorship, Inventorship, and Liability (with Hannah H. Chang), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5206082, https://arxiv.org/abs/2504.04058
Agentic Artificial Intelligence (AI) systems, exemplified by OpenAI’s DeepResearch, autonomously pursue goals, adapting strategies through implicit learning. Unlike traditional generative AI, which is reactive to user prompts, agentic AI proactively orchestrates complex workflows. It exhibits stochastic, dynamic, and fluid autonomy: its steps and outputs vary probabilistically (stochastic), it evolves based on prior interactions (dynamic), and it operates with significant independence within human-defined parameters, adapting to context (fluid). While this fosters complex, co-evolutionary human-machine interactions capable of generating uniquely synthesized creative outputs, it also irrevocably blurs boundaries—human and machine contributions become irreducibly entangled in intertwined creative processes. Consequently, agentic AI poses significant challenges to legal frameworks reliant on clear attribution: authorship doctrines struggle to disentangle ownership, intellectual property regimes strain to accommodate recursively blended novelty, and liability models falter as accountability diffuses across shifting loci of control. The central issue is not the legal treatment of human versus machine contributions, but the fundamental unmappability—the practical impossibility in many cases—of accurately attributing specific creative elements to either source. When retroactively parsing contributions becomes infeasible, applying distinct standards based on origin becomes impracticable. Therefore, we argue, legal and policy frameworks may need to treat human and machine contributions as functionally equivalent—not for moral or economic reasons, but as a pragmatic necessity.
The AI Ouroboros and Copyright Laundering: Why Copyright Needs a "Fruit of the Poisonous Tree" Doctrine for Generative AI, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5256625
Copyright enforcement rests on an evidentiary bargain: a plaintiff must show both the defendant's access to the work and substantial similarity in the challenged output. That bargain comes under strain when generative AI systems are built through multi-stage pipelines with recursive synthetic data. As each successive model is tuned on the outputs of its predecessors, any copyrighted material absorbed by an early model is further diffused into deep statistical abstractions. The result is potentially an evidentiary blind spot: overlaps that do emerge look like coincidence, while the chain of provenance is too attenuated to trace—conditions ripe for what might be called "copyright laundering." This Article argues that the only doctrinally workable response is to adapt the "fruit of the poisonous tree" (FOPT) principle. It proposes a novel AI-FOPT standard: if a foundational AI model's training is adjudged infringing (due to unauthorized copying not excused by fair use), then subsequent models and datasets principally derived from its outputs are presumptively tainted. The burden consequently shifts to downstream developers to affirmatively demonstrate a verifiably independent and lawfully sourced lineage for their systems. Absent such proof, commercial deployment of these tainted models and their outputs remains actionable. Drawing on existing legal precedents, this Article develops the AI-FOPT standard, addresses counterarguments concerning chilling innovation and fair use (which remains applicable at the initial ingestion stage), and demonstrates why this lineage-focused approach is both administrable and essential to preserve copyright's efficacy and incentive structure in this age of machine-trained machines.
The Impact of Macro Socio-Economic Drivers and Fiscal Policy on Expenditure Allocation and Attribute Preferences (with Andre Bonfrer), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2189381
Our article investigates the effect of macro socio-economic drivers on Australian households’ allocation of expenditure in a category (household appliances) and, conditional on the allocated category expenditure, their preferences for products (clothes washers) within the category. At the category level, we quantify the effect of changes in social mobility, disposable income, housing prices, and the 2009 stimulus payments on purchase propensity and expenditure. At the product level, we investigate how households trade off between price, energy efficiency, and loading capacity conditional on allocated category expenditure, measuring nonhomotheticity in preferences. We use the model to study a number of hypothetical scenarios, in which we simulate the effect of changes in macro socio-economic drivers and fiscal policies on market structure and revenue.