Publications

2026

  1. arXiv’26
    ClawSafety: "Safe" LLMs, Unsafe Agents
    Bowen Wei, Yunbei Zhang, Jinhao Pan, Kai Mei, Xiao Wang, Jihun Hamm, Ziwei Zhu, and Yingqiang Ge
    2026
  2. arXiv’26
    A Logical-Rule Autoencoder for Interpretable Recommendations
    Jinhao Pan*, Bowen Wei*, and Ziwei Zhu
    2026
  3. arXiv’26
    KnowBias: Mitigating Social Bias in LLMs via Know-Bias Neuron Enhancement
    Jinhao Pan, Chahat Raj, Anjishnu Mukherjee, Sina Mansouri, Bowen Wei, Shloka Yada, and Ziwei Zhu
    2026
  4. AAAI’26
    Bias Association Discovery Framework for Open-Ended LLM Generations
    Jinhao Pan, Chahat Raj, and Ziwei Zhu
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2026

2025

  1. EMNLP’25 Findings
    What’s Not Said Still Hurts: A Description-Based Evaluation Framework for Measuring Social Bias in LLMs
    Jinhao Pan, Chahat Raj, Ziyu Yao, and Ziwei Zhu
    In Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
  2. WSDM’25
    Combating Heterogeneous Model Biases in Recommendations via Boosting
    Jinhao Pan, James Caverlee, and Ziwei Zhu
    In Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining, 2025
  3. NeurIPS LAW 2025
    CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage
    Bowen Wei, Yuan Shen Tay, Howard Liu, Jinhao Pan, Kun Luo, Ziwei Zhu, and Chris Jordan
    2025

2024

  1. ECIR’24 IR4Good
    Countering Mainstream Bias via End-to-End Adaptive Local Learning
    In European Conference on Information Retrieval, 2024