Posts
All the articles I've posted.
- 3.2
Meet Dynamic Individual Preferences: Resolving Conflicting Human Value with Paired Fine-Tuning
arXiv:2604.12479v1 Announce Type: new Abstract: Recent advances in large language models (LLMs) have significantly improved the alignment of models with general human preferences. However, a major cha…
- 3.2
AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow
arXiv:2604.11945v1 Announce Type: new Abstract: High-fidelity numerical simulation of subsurface flow is computationally intensive, especially for many-query tasks such as uncertainty quantification a…
- 3.0
Safe-SAIL: Towards a Fine-grained Safety Landscape of Large Language Models via Sparse Autoencoder Interpretation Framework
arXiv:2509.18127v3 Announce Type: replace-cross Abstract: Sparse autoencoders (SAEs) enable interpretability research by decomposing entangled model activations into monosemantic features. However, un…
- 3.0
SEW: Self-Evolving Agentic Workflows for Automated Code Generation
arXiv:2505.18646v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have demonstrated effectiveness in code generation tasks. To enable LLMs to address more complex coding challenge…
- 3.0
Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data
arXiv:2503.10676v2 Announce Type: replace-cross Abstract: We study the efficacy of fine-tuning Large Language Models (LLMs) for the specific task of report (government archives, news, intelligence rep…
- 3.0
PILOT: Planning via Internalized Latent Optimization Trajectories for Large Language Models
arXiv:2601.19917v2 Announce Type: replace Abstract: Strategic planning is critical for multi-step reasoning, yet compact Large Language Models (LLMs) often lack the capacity to formulate global strate…
- 3.0
Retrieval as a Decision: Training-Free Adaptive Gating for Efficient RAG
arXiv:2511.09803v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) improves factuality, but retrieving for every query often hurts quality while inflating tokens and latency. We p…
- 3.0
Enhancing Agentic Textual Graph Retrieval with Synthetic Stepwise Supervision
arXiv:2510.03323v2 Announce Type: replace Abstract: Integrating textual graphs into Large Language Models (LLMs) is promising for complex graph-based QA. However, a key bottleneck is retrieving inform…
- 3.0
LLM as Attention-Informed NTM and Topic Modeling as Long-Input Generation: Interpretability and Long-Context Capability
arXiv:2510.03174v2 Announce Type: replace Abstract: Topic modeling aims to produce interpretable topic representations and topic--document correspondences from corpora, but classical neural topic mode…
- 3.0
Fine-tuning Factor Augmented Neural Lasso for Heterogeneous Environments
arXiv:2604.12288v1 Announce Type: cross Abstract: Fine-tuning is a widely used strategy for adapting pre-trained models to new tasks, yet its methodology and theoretical properties in high-dimensional…
- 3.0
Advancing Multi-Agent RAG Systems with Minimalist Reinforcement Learning
arXiv:2505.17086v4 Announce Type: replace Abstract: Large Language Models (LLMs) equipped with modern Retrieval-Augmented Generation (RAG) systems often employ multi-turn interaction pipelines to inte…
- 3.0
Joint Flashback Adaptation for Forgetting-Resistant Instruction Tuning
arXiv:2505.15467v2 Announce Type: replace Abstract: Large language models have achieved remarkable success in various tasks. However, it is challenging for them to learn new tasks incrementally due to…
- 3.0
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation
arXiv:2604.13010v1 Announce Type: new Abstract: On-policy distillation (OPD) has emerged as an efficient post-training paradigm for large language models. However, standard OPD requires a live teacher…
- 3.0
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
arXiv:2604.12374v1 Announce Type: cross Abstract: We describe the pre-training, post-training, and quantization of Nemotron 3 Super, a 120 billion (active 12 billion) parameter hybrid Mamba-Attention…
- 3.0
Interpretable Relational Inference with LLM-Guided Symbolic Dynamics Modeling
arXiv:2604.12806v1 Announce Type: new Abstract: Inferring latent interaction structures from observed dynamics is a fundamental inverse problem in many-body interacting systems. Most neural approaches…
- 3.0
Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization
arXiv:2604.12290v1 Announce Type: cross Abstract: Current LLM agent benchmarks, which predominantly focus on binary pass/fail tasks such as code generation or search-based question answering, often ne…
- 3.0
GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees
arXiv:2604.12757v1 Announce Type: new Abstract: Adversarial robustness is essential for deploying neural networks in safety-critical applications, yet standard evaluation methods either require expens…
- 3.0
Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage
arXiv:2603.08819v3 Announce Type: replace-cross Abstract: Retrieval-augmented generation (RAG) systems combine document retrieval with a generative model to address complex information seeking tasks l…
- 3.0
HintMR: Eliciting Stronger Mathematical Reasoning in Small Language Models
arXiv:2604.12229v1 Announce Type: cross Abstract: Small language models (SLMs) often struggle with complex mathematical reasoning due to limited capacity to maintain long chains of intermediate steps…
- 3.0
El Agente Quntur: A research collaborator agent for quantum chemistry
arXiv:2602.04850v2 Announce Type: replace-cross Abstract: Quantum chemistry is a foundational enabling tool for the fields of chemistry, materials science, computational biology, and others. Despite…