Wednesday, Aug 6: 10:30 AM - 12:20 PM
4182
Contributed Papers
Music City Center
Room: CC-102B
Main Sponsor
Business and Economic Statistics Section
Presentations
Transfer learning is a framework for specifying and refining knowledge about the effects that transfer between source (training) data and target (prediction) data. An open problem is addressing the empirical phenomenon of negative transfer, whereby the transfer learner performs worse on the target data after taking the source data into account than before. In this talk, I will introduce a Bayesian perspective on negative transfer and a method to address it. The key insight is that negative transfer can stem from misspecified prior information about non-transferable causes of the source data. Our proposed method does not require prior knowledge of the source data, and is thus applicable in the presence of latent confounders. Moreover, the learner need not have access to observations in the target task (that is, it need not be able to "fine-tune"), and instead makes use of proxy (indirect) information. Our theoretical results show that the threat of negative transfer does not depend on the informativeness of the proxy information, and our method applies even when only noisy indirect information is available. This talk is based on the paper available at https://arxiv.org/abs/2411.03263.
Keywords
Bayesian methods
machine learning
transfer learning
robust inference
latent confounders
nuisance variables
We study the problem of rank elicitation within the Mallows model, where a permutation $\pi$ is sampled with probability proportional to \( \exp(-\beta d(\pi, \sigma)) \), with $\sigma$ representing a central ranking and $\beta$ the dispersion parameter. Specifically, we study a general class of $L_\alpha$ distances \( d_\alpha(\pi, \sigma) = \sum_{i=1}^n |\pi(i) - \sigma(i)|^\alpha \). For any $\alpha \geq 1$ and any $\beta > 0$, we present a Polynomial-Time Approximation Scheme (PTAS) that achieves two key objectives: (i) estimating the partition function \( Z_n(\beta, \alpha) \) within a multiplicative factor of \( 1 \pm \epsilon \), and (ii) generating samples $\epsilon$-close (in total variation distance) to the true distribution. Leveraging this approximation, we design an efficient Maximum Likelihood Estimation (MLE) algorithm that jointly infers the true underlying ranking, the dispersion parameter, and the distance metric. This work is the first to study metric learning alongside preference learning in the context of Mallows models.
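For intuition about the quantities in this abstract, the following is a minimal brute-force sketch, not the PTAS described above. It assumes the $L_\alpha$ distance is taken over absolute differences, and it enumerates all $n!$ permutations, so it is only usable for very small $n$. The function names are hypothetical.

```python
import math
from itertools import permutations


def l_alpha_distance(pi, sigma, alpha):
    """d_alpha(pi, sigma) = sum_i |pi(i) - sigma(i)|^alpha (absolute value assumed)."""
    return sum(abs(p - s) ** alpha for p, s in zip(pi, sigma))


def partition_function(n, beta, alpha):
    """Exact Z_n(beta, alpha) by enumeration; cost grows as n!, so small n only."""
    # The distance is unchanged when both rankings are relabelled the same way,
    # so Z does not depend on the central ranking; use the identity.
    sigma = tuple(range(1, n + 1))
    return sum(
        math.exp(-beta * l_alpha_distance(pi, sigma, alpha))
        for pi in permutations(sigma)
    )


def mallows_pmf(pi, sigma, beta, alpha):
    """Probability of permutation pi under the Mallows model centred at sigma."""
    z = partition_function(len(sigma), beta, alpha)
    return math.exp(-beta * l_alpha_distance(pi, sigma, alpha)) / z
```

With $\beta = 0$ every permutation is equally likely, so `partition_function(n, 0.0, alpha)` equals $n!$. As $\beta$ grows, the probability mass concentrates on the central ranking.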
Keywords
Rank Elicitation
Mallows Model
MLE
Preference Elicitation
The selection of skilled funds in the financial market is critical for both investors and researchers. While existing methods for identifying skilled funds under multiple testing frameworks are abundant, most overlook the challenges posed by heavy-tailed and serially dependent data. In this talk, we propose a general framework for multiple testing of alpha in a simple signal-noise model, leveraging adaptive regression techniques based on the data's tail properties. For heavy-tailed data, we introduce a quantile-adjusted method to correct alpha bias. Using a sample-splitting strategy, we derive symmetrically distributed test statistics under the null and compute a data-driven significance threshold. We also extend the model to account for time series dependence. This work is supported by the startup fund of City University of Hong Kong.
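The abstract does not spell out how the data-driven threshold is computed. As an illustration only, the sketch below uses a generic rule common in knockoff-style and mirror-statistic procedures: when null statistics are symmetric about zero, the count in the left tail estimates the number of false discoveries in the right tail. The function names and the `1 +` correction are assumptions, not the authors' method.

```python
import math


def symmetric_fdr_threshold(stats, q):
    """Smallest t > 0 with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    Returns math.inf when no candidate threshold meets the target level q.
    """
    candidates = sorted({abs(w) for w in stats if w != 0})
    for t in candidates:
        # Left-tail count stands in for the unknown number of false discoveries.
        false_est = 1 + sum(1 for w in stats if w <= -t)
        discoveries = sum(1 for w in stats if w >= t)
        if false_est / max(1, discoveries) <= q:
            return t
    return math.inf


def select(stats, q):
    """Indices of statistics at or above the data-driven threshold."""
    t = symmetric_fdr_threshold(stats, q)
    return [j for j, w in enumerate(stats) if w >= t]
```

Here a large positive statistic would indicate a fund with nonzero alpha, and `q` is the target false discovery rate. An infinite threshold means that nothing is selected.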
Keywords
False discovery rate
Factor models
High-dimensional time series
Sample-splitting
Robust inference
Participation inequality, where most users contribute little content, threatens platform sustainability in the rapidly expanding creator economy. To address this, we introduce a novel theory-based, AI-driven approach for platform strategists involving three steps: (1) a generative AI (GenAI) agent perceives its environment through data understanding and pattern recognition; (2) it learns creators' engagement states and behaviors via dynamic learning; and (3) it conducts counterfactual analyses using real-world data to generate personalized, engagement-state-based strategies. Using online literary markets for empirical investigation, we show that our GenAI agent can identify multiple engagement states and generate tailored, AI-optimized incentives that enhance platform revenue while reducing participation inequality. Our research demonstrates how advanced AI models can improve platform managers' strategic decision-making and support content creators' entrepreneurial efforts.
Keywords
Generative AI
strategic management
participation inequality
creator economy
engagement
Fintech companies use advanced algorithms on non-traditional data to assess creditworthiness, but access to credit is often restricted both by supply-side barriers, such as limited financial infrastructure, high costs, and strict documentation requirements, and by demand-side barriers, such as poor financial literacy, financial instability, and cultural concerns. We aim to promote financial inclusion in this process by developing a multi-step Bayesian choice model that combines a model for loan defaults conditional on a loan product, a marginal model for the different loan products, and an intricate prior regularization to handle high dimensionality. The motivating application includes high-dimensional data from tens of thousands of customers, which brings major computational challenges. To ensure fairness, we conduct a counterfactual analysis by simulating random product assignments and studying the connection between product selection and default outcomes. This analysis confirmed the model's ability to perform reliably even for unseen data patterns. The model addresses bias and default risk while achieving strong performance metrics, and it provides a practical solution for sustainable lending practices using mobile footprint data.
Keywords
Fintech
Bayesian choice model
Financial inclusion
Counterfactual
Mobile footprints
Insurers are increasingly vulnerable to multiple sources of risk, including climate-related risks and market risks. Multi-risk stress testing models that simultaneously test a firm's likelihood of surviving multiple catastrophic events are scarce, and existing models do not properly account for the dynamic feedback that arises from agents' continual interactions with the environment, even though human actions, the environment, and risks are strongly interdependent.
This study develops a multi-risk stress testing model with dynamic feedback mechanisms based on multi-agent reinforcement learning (MARL). The model is employed to evaluate the vulnerability and resilience of property and casualty (P&C) insurers in the U.S. to catastrophic windstorm and market risks. Market risk is introduced in the model through volatility in the returns from investing premiums in the bond and equity markets and through the correlation between premium rates and the returns on investment. Results using new and unique firm-level and macro-level data sets offer a deeper understanding of P&C insurers' exposure to multiple risks, with significant regulatory implications.
Keywords
Insurance
Catastrophic risks
Stress-testing
Multi-Agent Reinforcement Learning
Dynamic portfolio optimization has benefited significantly from the wider adoption of deep learning (DL). While existing research has focused on how DL can be applied to solving the Hamilton-Jacobi-Bellman (HJB) equation, some very recent developments propose to forgo the derivation of the HJB equation in favor of empirical utility maximization over dynamic allocation strategies expressed through artificial neural networks. In addition to simplicity and transparency, this approach is universally applicable, as it is essentially agnostic about market dynamics. We apply it to optimal portfolio allocation between a cash account and a risky asset following the Heston model. The results appear to be on par with the theoretical ones.
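The idea of empirical utility maximization can be illustrated with a deliberately simplified sketch: simulate Heston paths, compute the average utility of terminal wealth for a parametrized strategy, and pick the parameter that maximizes it. The sketch below replaces the neural network with a single constant risky-asset fraction chosen by grid search, and it assumes CRRA utility, an Euler scheme with full truncation, and illustrative parameter values. None of these choices come from the abstract.

```python
import math
import random


def simulate_heston_returns(n_paths, n_steps, dt, mu, kappa, theta, xi, rho, v0, seed=0):
    """Per-step simple returns of the risky asset (Euler scheme, full truncation)."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        v, rets = v0, []
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)
            # Correlate the variance shock with the price shock.
            z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            vp = max(v, 0.0)  # truncate negative variance at zero
            log_ret = (mu - 0.5 * vp) * dt + math.sqrt(vp * dt) * z1
            rets.append(math.exp(log_ret) - 1.0)
            v += kappa * (theta - vp) * dt + xi * math.sqrt(vp * dt) * z2
        paths.append(rets)
    return paths


def average_utility(w, paths, r, dt, gamma):
    """Empirical mean CRRA utility of terminal wealth, holding fraction w in the risky asset."""
    cash_ret = math.exp(r * dt) - 1.0
    total = 0.0
    for rets in paths:
        wealth = 1.0
        for ret in rets:
            wealth *= 1.0 + w * ret + (1.0 - w) * cash_ret
        total += wealth ** (1.0 - gamma) / (1.0 - gamma)
    return total / len(paths)


def best_constant_fraction(paths, r, dt, gamma, grid):
    """Grid value in [0, 1] that maximizes the empirical utility."""
    return max(grid, key=lambda w: average_utility(w, paths, r, dt, gamma))


if __name__ == "__main__":
    dt = 1.0 / 20
    paths = simulate_heston_returns(2000, 20, dt, mu=0.08, kappa=2.0,
                                    theta=0.04, xi=0.3, rho=-0.5, v0=0.04)
    grid = [i / 10 for i in range(11)]
    print(best_constant_fraction(paths, r=0.02, dt=dt, gamma=3.0, grid=grid))
```

As a rough reference point, the constant-volatility Merton fraction $(\mu - r)/(\gamma \sigma^2)$ with $\sigma^2 = 0.04$ is 0.5 for these illustrative parameters. A neural-network strategy would replace the scalar `w` with a function of time, wealth, and variance, trained on the same empirical objective.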
Keywords
Merton problem
asset allocation
deep learning
artificial neural networks
empirical risk minimization
stochastic volatility