Machine Learning — Research Repository

AI & Data Science · Preprint

Generalizable Sparse-View 3D Reconstruction from Unconstrained Images

Vinayak Gupta, Chih-Hao Lin, Shenlong Wang, Anand Bhattad, Jia-Bin Huang · 2026

Reconstructing 3D scenes from sparse, unposed images remains challenging under real-world conditions with varying illumination and transient occlusions. Existing methods rely on scene-specific optimiz…

Engineering · Preprint

LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models

Hao Chen, Jiaming Liu, Zhonghao Yan, Nuowei Han, Renrui Zhang, Chenyang Gu, Jialin Gao, Ziyu Guo, Siyuan Qian, Yinxi Wang, Peng Jia, Chi-Wing Fu, Shanghang Zhang, Pheng-Ann Heng · 2026

Vision-Language-Action (VLA) models have increasingly incorporated reasoning mechanisms for complex robotic manipulation. However, existing approaches share a critical limitation: whether employing ex…

AI & Data Science · Preprint

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

Keming Wu, Zuhao Yang, Kaichen Zhang, Shizun Wang, Haowei Zhu, Sicong Leng, Zhongyu Yang, Qijie Wang, Sudong Wang, Ziting Wang, Zili Wang, Hui Zhang, Haonan Wang, Hang Zhou, Yifan Pu, Xingxuan Li, Fangneng Zhan, Bo Li, Lidong Bing, Yuxin Song, Ziwei Liu, Wenhu Chen, Jingdong Wang, Xinchao Wang, Xiaojuan Qi, Shijian Lu, Bin Wang · 2026

Recent visual generation models have made major progress in photorealism, typography, instruction following, and interactive editing, yet they still struggle with spatial reasoning, persistent state, …

AI & Data Science · Preprint

Exploration Hacking: Can LLMs Learn to Resist RL Training?

Eyon Jang, Damon Falck, Joschka Braun, Nathalie Kirch, Achu Menon, Perusha Moodley, Scott Emmons, Roland S. Zimmermann, David Lindner · 2026

Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment. Successful RL relies on sufficient exploration …

AI & Data Science · Preprint

Synthetic Computers at Scale for Long-Horizon Productivity Simulation

Tao Ge, Baolin Peng, Hao Cheng, Jianfeng Gao · 2026

Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and organized through directory structures and content…

AI & Data Science · Preprint

LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis

Lincan Li, Zheng Chen, Yushun Dong · 2026

Electroencephalogram (EEG) signals are vital for automated seizure detection, but their inherent noise makes robust representation learning challenging. Existing graph construction methods, whether co…

Physics · Preprint

Defending Quantum Classifiers against Adversarial Perturbations through Quantum Autoencoders

Emma Andrews, Sahan Sanjaya, Prabhat Mishra · 2026

Machine learning models can learn from data samples to carry out various tasks efficiently. When data samples are adversarially manipulated, such as by insertion of carefully crafted noise, it can cau…

AI & Data Science · Preprint

Strait: Perceiving Priority and Interference in ML Inference Serving

Haidong Zhao, Nikolaos Georgantas · 2026

Machine learning (ML) inference serving systems host deep neural network (DNN) models and schedule incoming inference requests across deployed GPUs. However, limited support for task prioritization an…

AI & Data Science · Preprint

Action Motifs: Self-Supervised Hierarchical Representation of Human Body Movements

Genki Kinoshita, Shu Nakamura, Ryo Kawahara, Shohei Nobuhara, Yasutomo Kawanishi, Ko Nishino · 2026

Effective human behavior modeling requires a representation of the human body movement that capitalizes on its compositionality. We propose a hierarchical representation consisting of Action Atoms tha…

AI & Data Science · Preprint

PhyCo: Learning Controllable Physical Priors for Generative Motion

Sriram Narayanan, Ziyu Jiang, Srinivasa Narasimhan, Manmohan Chandraker · 2026

Modern video diffusion models excel at appearance synthesis but still struggle with physical consistency: objects drift, collisions lack realistic rebound, and material responses seldom match their un…

Physics · Preprint

Mapping the Phase Diagram of the Vicsek Model with Machine Learning

Grace T. Bai, Brandon B. Le · 2026

In this study, we use machine learning to classify and interpolate the phase structure of the Vicsek flocking model across the three-dimensional parameter space $(\eta,\rho,v_0)$. We construct a datas…

Engineering · Preprint

Sequential Inference for Gaussian Processes: A Signal Processing Perspective

Daniel Waxman, Fernando Llorente, Petar M. Djuric · 2026

The proliferation of capable and efficient machine learning (ML) models marks one of the strongest methodological shifts in signal processing (SP) in its nearly 100-year history. ML models support the…

Physics · Preprint

Reorganizing Quantum Measurement Records Improves Time-Series Prediction

Markus Baumann, Maximilian Zorn, Thomas Gabor, Claudia Linnhoff-Popien, Jonas Stein · 2026

Near-term quantum computers are accessed through repeated circuit executions, which produce finite measurement records rather than exact deterministic outputs. In quantum reservoir computing, these re…

AI & Data Science · Preprint

Continuous-tone Simple Points: An $\ell_0$-Norm of Cyclic Gradient for Topology-Preserving Data-Driven Image Segmentation

Wenxiao Li, Faqiang Wang, Yuping Duan, Li Cui, Liqiang Zhang, Jun Liu · 2026

Topological features play an essential role in ensuring geometric plausibility and structural consistency in image analysis tasks such as segmentation and skeletonization. However, integrating topolog…

Engineering · Preprint

FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems

Binghao Huang, Yunzhu Li · 2026

We present FlexiTac, a low-cost, open-source, and scalable piezoresistive tactile sensing solution designed for robotic end-effectors. FlexiTac is a practical "plug-in" module consisting of (i) thin, …

AI & Data Science · Preprint

Global Optimality for Constrained Exploration via Penalty Regularization

Florian Wolf, Ilyas Fatkhullin, Niao He · 2026

Efficient exploration is a central problem in reinforcement learning and is often formalized as maximizing the entropy of the state-action occupancy measure. While unconstrained maximum-entropy explor…

AI & Data Science · Preprint

Beyond Pixel Fidelity: Minimizing Perceptual Distortion and Color Bias in Night Photography Rendering

Furkan Kınlı · 2026

Night Photography Rendering (NPR) poses a significant challenge due to the extreme contrast between dark and illuminated areas in scenes, stemming from concurrent capture of severely dark regions alon…

AI & Data Science · Preprint

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons

Kehong Gong, Zhengyu Wen, Dao Thien Phong, Mingxi Xu, Weixia He, Qi Wang, Ning Zhang, Zhengyu Li, Guanli Hou, Dongze Lian, Xiaoyu He, Mingyuan Zhang, Hanwang Zhang · 2026

Recent methods for arbitrary-skeleton motion capture from monocular video follow a factorized pipeline, where a Video-to-Pose network predicts joint positions and an analytical inverse-kinematics (IK)…

AI & Data Science · Preprint

PRISM: Pre-alignment via Black-box On-policy Distillation for Multimodal Reinforcement Learning

Sudong Wang, Weiquan Huang, Xiaomin Yu, Zuhao Yang, Hehai Lin, Keming Wu, Chaojun Xiao, Chen Chen, Wenxuan Wang, Beier Zhu, Yunjian Zhang, Chengwei Qin · 2026

The standard post-training recipe for large multimodal models (LMMs) applies supervised fine-tuning (SFT) on curated demonstrations followed by reinforcement learning with verifiable rewards (RLVR). H…

AI & Data Science · Preprint

Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces

Andrew Bond, Ilkin Umut Melanlioglu, Erkut Erdem, Aykut Erdem · 2026

Modern visual world modeling systems increasingly rely on high-capacity architectures and large-scale data to produce plausible motion, yet they often fail to preserve underlying 3D geometry or physic…
