arxiv:2601.09708
Min-Hung Chen
AI & ML interests
Multimodal AI, Transfer Learning, Unsupervised Learning, Video Understanding, Vision Transformer, Computer Vision, Deep Learning
Recent Activity
upvoted a paper 2 days ago
Why Far Looks Up: Probing Spatial Representation in Vision-Language Models
upvoted a paper 4 days ago
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning
upvoted an article 12 days ago
Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation