Papers

  • Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization

    Multi-objective optimization (MOO) arises in many real-world applications where trade-offs between competing objectives must be carefully balanced. In the offline setting, where only a static dataset is available, the main challenge is generalizing beyond observed data. We introduce Pareto-Conditioned Diffusion (PCD), a novel framework that formulates offline MOO as a …
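The Pareto trade-offs the abstract refers to can be illustrated with a minimal sketch (not the paper's method): extracting the non-dominated points from a set of objective vectors, assuming all objectives are minimized.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    A point q dominates p if q is <= p in every objective and
    differs in at least one (all objectives minimized).
    """
    front = []
    for p in points:
        dominated = any(
            q != p and all(q[i] <= p[i] for i in range(len(p)))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Example: two competing objectives; (3, 4) and (5, 5) are dominated.
pts = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
front = pareto_front(pts)
```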

  • Gauss-Newton Natural Gradient Descent for Shape Learning

    We explore the use of the Gauss-Newton method for optimization in shape learning, including implicit neural surfaces and geometry-informed neural networks. The method addresses key challenges in shape learning, such as the ill-conditioning of the underlying differential constraints and the mismatch between the optimization problem in parameter space and the …
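As background for the abstract above, a generic Gauss-Newton iteration for nonlinear least squares can be sketched as follows; the shape-learning setting itself is not reproduced here, and the exponential-fit example and parameter names are purely illustrative.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=20):
    """Minimize 0.5 * ||r(x)||^2 with Gauss-Newton updates.

    Each step solves the linearized least-squares problem
    min ||J dx + r||, i.e. the normal equations J^T J dx = -J^T r,
    instead of using the full Hessian as Newton's method would.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        dx = np.linalg.lstsq(J, -r, rcond=None)[0]
        x = x + dx
        if np.linalg.norm(dx) < 1e-12:
            break
    return x

# Illustrative use: fit y = a * exp(b * t) to noise-free data (a=2, b=0.5).
t = np.linspace(0.0, 1.0, 10)
y = 2.0 * np.exp(0.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t),
                          p[0] * t * np.exp(p[1] * t)], axis=1)
p = gauss_newton(res, jac, [1.0, 0.0])
```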

  • Reasoning about Intent for Ambiguous Requests

    Large language models often respond to ambiguous requests by implicitly committing to one interpretation. Intent misunderstandings can frustrate users and create safety risks. To address this, we propose generating multiple interpretation-answer pairs in a single structured response to ambiguous requests. Our models are trained with reinforcement learning and customized reward …

  • Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

    Agentic AI systems, powered by large language models (LLMs) and endowed with planning, tool use, memory, and autonomy, are emerging as powerful, flexible platforms for automation. Their ability to autonomously execute tasks across web, software, and physical environments creates new and amplified security risks, distinct from both traditional AI safety …
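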

  • On Hardness and Approximation of Broadcasting in Structured Graphs

    We study the Telephone Broadcasting problem in graphs with restricted structure. Given a designated source in an undirected graph, the goal is to disseminate a message to all vertices in the minimum number of rounds, where in each round every informed vertex may inform at most one neighbor. For general …
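For the special case of trees, the minimum broadcast time under this model has a well-known exact recursion: inform the child with the largest remaining broadcast time first. The sketch below illustrates that tree case only and is not the paper's algorithm for general structured graphs.

```python
def broadcast_time(tree, root):
    """Minimum rounds to broadcast from `root` in a tree.

    `tree` maps each vertex to its list of neighbors.  In each round
    an informed vertex may inform one uninformed neighbor, so a vertex
    should contact its slowest subtree first.
    """
    def b(v, parent):
        child_times = sorted(
            (b(c, v) for c in tree[v] if c != parent), reverse=True)
        # The i-th contacted child (1-indexed) is reached in round i
        # and then needs b(child) further rounds in its subtree.
        return max((i + t for i, t in enumerate(child_times, 1)), default=0)
    return b(root, None)

# Example: a star with center 0 and three leaves takes 3 rounds.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```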

  • Provable Training Data Identification for Large Language Models

    Identifying the training data of large-scale models is critical for copyright litigation, privacy auditing, and ensuring fair evaluation. However, existing works typically treat this task as instance-wise identification without controlling the error rate of the identified set, and therefore cannot provide statistically reliable evidence. In this work, we formalize training data …
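One standard way to control the error rate of a selected set, rather than making instance-wise calls, is the Benjamini-Hochberg procedure. The sketch below is a generic illustration of that idea, not the paper's method; the p-values (e.g. derived from some membership-inference score) are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return indices selected while controlling the false discovery
    rate of the selected set at level `alpha`.

    Reject the k smallest p-values, where k is the largest rank such
    that p_(k) <= alpha * k / m.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, 1):
        if pvalues[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])

# Hypothetical p-values for five candidate training examples.
selected = benjamini_hochberg([0.01, 0.02, 0.03, 0.5, 0.6], alpha=0.05)
```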