Tag: Image Generation (图像生成)

All the articles with the tag "Image Generation".

6.5
Gemini can now pull from Google Photos to generate personalized images
2026-04-17
· The Verge · collected 04/17 06:31
Gemini Personal Intelligence can now draw on Google Photos data and, together with the Nano Banana 2 model, generate personalized images grounded in personal context.
6.4
TS-Agent: Understanding and Reasoning Over Raw Time Series via Iterative Insight Gathering
2026-04-08
· cs.AI updates on arXiv.org · collected 04/08 12:31
arXiv:2510.07432v2 Announce Type: replace Abstract: Large language models (LLMs) exhibit strong symbolic and compositional reasoning, yet they stru...
6.4
IBISAgent: Reinforcing Pixel-Level Visual Reasoning in MLLMs for Universal Biomedical Object Referring and Segmentation
2026-04-08
· cs.AI updates on arXiv.org · collected 04/08 12:31
arXiv:2601.03054v4 Announce Type: replace-cross Abstract: Recent research on medical MLLMs has gradually shifted its focus from image-level underst...
6.0
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
2026-04-08
· cs.CL updates on arXiv.org · collected 04/08 12:31
arXiv:2511.04570v2 Announce Type: replace-cross Abstract: The "Thinking with Text" and "Thinking with Images" paradigms significantly improve the r...
6.0
Sim-CLIP: Unsupervised Siamese Adversarial Fine-Tuning for Robust and Semantically-Rich Vision-Language Models
2026-04-08
· cs.CL updates on arXiv.org · collected 04/08 12:31
arXiv:2407.14971v3 Announce Type: replace-cross Abstract: Vision-Language Models (VLMs) rely heavily on pretrained vision encoders to support downs...
6.7
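Luma AI releases Uni-1: a single architecture unifying image understanding and generation, beating Google and OpenAI on benchmarks
2026-03-24
· VentureBeat · collected 03/24 10:32
Luma AI has launched the Uni-1 model, which handles both image understanding and generation in one unified architecture, outperforms Google and OpenAI on benchmarks, and cuts cost by 30%.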
