Publications | 潘劲豪 Jinhao Pan

2026

arXiv’26

Confidence-Orchestrated Self-Evolution against Uncertain LLM Feedback

Bowen Wei, Nan Wang, Yuqing Zhou, Jinhao Pan, and Ziwei Zhu

2026

@misc{wei2026confidenceorchestratedselfevolutionuncertainllm,
  title = {Confidence-Orchestrated Self-Evolution against Uncertain LLM Feedback},
  author = {Wei, Bowen and Wang, Nan and Zhou, Yuqing and Pan, Jinhao and Zhu, Ziwei},
  year = {2026},
  eprint = {2605.28010},
  archiveprefix = {arXiv},
  primaryclass = {cs.AI},
  url = {https://arxiv.org/abs/2605.28010},
}

ICML’26

Knowing Bias, Doing Better: Mitigating Social Bias in LLMs via Know-Bias Neuron Enhancement

Jinhao Pan, Chahat Raj, Anjishnu Mukherjee, Sina Mansouri, Bowen Wei, Shloka Yada, and Ziwei Zhu

2026

@misc{pan2026knowbiasmitigatingsocialbias,
  title = {Knowing Bias, Doing Better: Mitigating Social Bias in LLMs via Know-Bias Neuron Enhancement},
  author = {Pan, Jinhao and Raj, Chahat and Mukherjee, Anjishnu and Mansouri, Sina and Wei, Bowen and Yada, Shloka and Zhu, Ziwei},
  year = {2026},
  eprint = {2601.21864},
  archiveprefix = {arXiv},
  primaryclass = {cs.AI},
  url = {https://arxiv.org/abs/2601.21864},
}

ACL’26 Findings

Talent or Luck? Evaluating Attribution Bias in Large Language Models

Chahat Raj, Mahika Banerjee, Jinhao Pan, Aylin Caliskan, Antonios Anastasopoulos, and Ziwei Zhu

2026

@misc{raj2026talentluckevaluatingattribution,
  title = {Talent or Luck? Evaluating Attribution Bias in Large Language Models},
  author = {Raj, Chahat and Banerjee, Mahika and Pan, Jinhao and Caliskan, Aylin and Anastasopoulos, Antonios and Zhu, Ziwei},
  year = {2026},
  eprint = {2505.22910},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
  url = {https://arxiv.org/abs/2505.22910},
}

arXiv’26

ClawSafety: "Safe" LLMs, Unsafe Agents

Bowen Wei, Yunbei Zhang, Jinhao Pan, Kai Mei, Xiao Wang, Jihun Hamm, Ziwei Zhu, and Yingqiang Ge

2026

@misc{wei2026clawsafetysafellmsunsafe,
  title = {ClawSafety: "Safe" LLMs, Unsafe Agents},
  author = {Wei, Bowen and Zhang, Yunbei and Pan, Jinhao and Mei, Kai and Wang, Xiao and Hamm, Jihun and Zhu, Ziwei and Ge, Yingqiang},
  year = {2026},
  eprint = {2604.01438},
  archiveprefix = {arXiv},
  primaryclass = {cs.AI},
  url = {https://arxiv.org/abs/2604.01438},
}

arXiv’26

A Logical-Rule Autoencoder for Interpretable Recommendations

Jinhao Pan*, Bowen Wei*, and Ziwei Zhu

2026

@misc{pan2026logicalruleautoencoderinterpretablerecommendations,
  title = {A Logical-Rule Autoencoder for Interpretable Recommendations},
  author = {Pan, Jinhao and Wei, Bowen and Zhu, Ziwei},
  year = {2026},
  eprint = {2604.04270},
  archiveprefix = {arXiv},
  primaryclass = {cs.IR},
  url = {https://arxiv.org/abs/2604.04270},
}

AAAI’26

Bias Association Discovery Framework for Open-Ended LLM Generations

Jinhao Pan, Chahat Raj, and Ziwei Zhu

In Proceedings of the AAAI Conference on Artificial Intelligence, 2026

Social biases embedded in Large Language Models (LLMs) raise critical concerns, resulting in representational harms – unfair or distorted portrayals of demographic groups – that may be expressed in subtle ways through generated language. Existing evaluation methods often depend on predefined identity-concept associations, limiting their ability to surface new or unexpected forms of bias. In this work, we present the Bias Association Discovery Framework (BADF), a systematic approach for extracting both known and previously unrecognized associations between demographic identities and descriptive concepts from open-ended LLM outputs. Through comprehensive experiments spanning multiple models and diverse real-world contexts, BADF enables robust mapping and analysis of the varied concepts that characterize demographic identities. Our findings advance the understanding of biases in open-ended generation and provide a scalable tool for identifying and analyzing bias associations in LLMs.
@inproceedings{pan2026bias,
  title = {Bias Association Discovery Framework for Open-Ended LLM Generations},
  author = {Pan, Jinhao and Raj, Chahat and Zhu, Ziwei},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {40},
  number = {38},
  pages = {32637--32645},
  year = {2026},
}

2025

EMNLP’25 Findings

What’s Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in LLMs

Jinhao Pan, Chahat Raj, Ziyu Yao, and Ziwei Zhu

In Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Large Language Models (LLMs) often exhibit social biases inherited from their training data. While existing benchmarks evaluate bias by term-based mode through direct term associations between demographic terms and bias terms, LLMs have become increasingly adept at avoiding biased responses, leading to seemingly low levels of bias. However, biases persist in subtler, contextually hidden forms that traditional benchmarks fail to capture. We introduce the Description-based Bias Benchmark (DBB), a novel dataset designed to assess bias at the semantic level that bias concepts are hidden within naturalistic, subtly framed contexts in real-world scenarios rather than superficial terms. We analyze six state-of-the-art LLMs, revealing that while models reduce bias in response at the term level, they continue to reinforce biases in nuanced settings. Data, code, and results are available at https://github.com/JP-25/Description-based-Bias-Benchmark.
@inproceedings{pan-etal-2025-whats,
  title = {What{'}s Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in {LLM}s},
  author = {Pan, Jinhao and Raj, Chahat and Yao, Ziyu and Zhu, Ziwei},
  editor = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2025},
  year = {2025},
  address = {Suzhou, China},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2025.findings-emnlp.76/},
  pages = {1438--1459},
  isbn = {979-8-89176-335-7},
}

WSDM’25

Combating Heterogeneous Model Biases in Recommendations via Boosting

Jinhao Pan, James Caverlee, and Ziwei Zhu

In Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining, 2025

Collaborative Filtering (CF) based recommenders often exhibit model biases, delivering strong recommendation utility to certain users or items at the expense of others. Prior research approaches these biases as isolated and standalone issues, ignoring their interconnected nature and developing separate methods, thereby compromising the specialized debiasing efforts. Thus, we introduce a boosting-based framework designed to alleviate a broad spectrum of biases. This framework employs a series of sub-models, each tailored for different user and item subgroups. Theoretically, our model ensures an exponentially decreasing upper bound on the training loss across all user and item types with increasing boosting iterations. Extensive experiments demonstrate its superior debiasing capabilities against state-of-the-art methods across four model bias types. Appendix, data and code are available at https://github.com/JP-25/CFBoost
@inproceedings{pan2025combating,
  title = {Combating Heterogeneous Model Biases in Recommendations via Boosting},
  author = {Pan, Jinhao and Caverlee, James and Zhu, Ziwei},
  booktitle = {Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining},
  pages = {222--231},
  year = {2025},
  doi = {10.1145/3701551.3703505},
}

NeurIPS LAW 2025

CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage

Bowen Wei, Yuan Shen Tay, Howard Liu, Jinhao Pan, Kun Luo, Ziwei Zhu, and Chris Jordan

2025

@misc{wei2025cortexcollaborativellmagents,
  title = {CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage},
  author = {Wei, Bowen and Tay, Yuan Shen and Liu, Howard and Pan, Jinhao and Luo, Kun and Zhu, Ziwei and Jordan, Chris},
  year = {2025},
  eprint = {2510.00311},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
  url = {https://arxiv.org/abs/2510.00311},
}

2024

ECIR’24 IR4Good

Countering Mainstream Bias via End-to-End Adaptive Local Learning

Jinhao Pan, Ziwei Zhu, Jianling Wang, Allen Lin, and James Caverlee

In European Conference on Information Retrieval, 2024

Collaborative filtering (CF) based recommendations suffer from mainstream bias – where mainstream users are favored over niche users, leading to poor recommendation quality for many long-tail users. In this paper, we identify two root causes of this mainstream bias: (i) discrepancy modeling, whereby CF algorithms focus on modeling mainstream users while neglecting niche users with unique preferences; and (ii) unsynchronized learning, where niche users require more training epochs than mainstream users to reach peak performance. Targeting these causes, we propose a novel end-To-end Adaptive Local Learning (TALL) framework to provide high-quality recommendations to both mainstream and niche users. TALL uses a loss-driven Mixture-of-Experts module to adaptively ensemble experts to provide customized local models for different users. Further, it contains an adaptive weight module to synchronize the learning paces of different users by dynamically adjusting weights in the loss. Extensive experiments demonstrate the state-of-the-art performance of the proposed model. Code and data are provided at https://github.com/JP-25/end-To-end-Adaptive-Local-Leanring-TALL-.
@inproceedings{pan2024countering,
  title = {Countering Mainstream Bias via End-to-End Adaptive Local Learning},
  author = {Pan, Jinhao and Zhu, Ziwei and Wang, Jianling and Lin, Allen and Caverlee, James},
  booktitle = {European Conference on Information Retrieval},
  pages = {75--89},
  year = {2024},
  organization = {Springer},
  isbn = {978-3-031-56069-9},
}