Zhang-Wei Hong

Research: My aim is to address practical search problems using tools for sequential decision-making, such as reinforcement learning (RL) and the multi-armed bandit framework. I broadly interpret search challenges as the discovery of near-optimal solutions within a huge search space. Addressing these challenges is akin to striking a balance between exploration (deciding when to try the unknown) and exploitation (deciding when to leverage what's known). Thus, I believe that the sequential decision-making principles of balancing exploration and exploitation are crucial for devising effective solutions to search challenges. My research focuses on the following key areas:

  • Sample-Efficient RL: Given that solving many search problems requires costly real-world interactions or high-fidelity simulations, my early work is dedicated to creating RL algorithms that learn effective policies with less data and minimal human intervention. (References: NeurIPS'23, ICML'23, ICML'23, ICLR'23, ICLR'22, and ICLR'22)
  • Domain-agnostic Exploration Strategies: At the core of solving search challenges is the ability to try new solutions efficiently, minimizing repetition and accurately estimating outcomes from fewer trials. Existing exploration methods often rely on estimating uncertainty through domain-specific features, requiring significant human effort and trial and error. This limitation prevents us from applying a single exploration strategy across many problems. My goal is to develop domain-agnostic exploration strategies. (References: NeurIPS'23)
  • Applications: In addition to algorithms, I'm interested in addressing real-world problems by framing them as search challenges and applying my algorithms to these problems. My recent interests include applications in AI safety (e.g., red teaming), cybersecurity, software testing, and scientific endeavors. (References: ICLR'24)
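The exploration-exploitation tradeoff underlying the areas above can be illustrated with a minimal epsilon-greedy multi-armed bandit sketch. This is purely illustrative: the function name `epsilon_greedy_bandit` and the Bernoulli-reward setup are my own choices, not taken from any of the papers referenced above.

```python
import random

def epsilon_greedy_bandit(arm_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy on a Bernoulli multi-armed bandit.

    With probability epsilon the agent explores a random arm;
    otherwise it exploits the arm with the highest empirical mean.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # number of pulls per arm
    values = [0.0] * n_arms    # empirical mean reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean

    return values, counts

# With enough steps, pulls should concentrate on the best arm (mean 0.8).
values, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

Even this simple strategy shows the core tension: too little exploration and the agent can lock onto a mediocre arm; too much and it wastes trials on arms already known to be poor.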

Bio: I'm a 4th-year Ph.D. student in Electrical Engineering and Computer Science (EECS) at the Massachusetts Institute of Technology (MIT), advised by Prof. Pulkit Agrawal. I completed my B.S. and M.S. degrees at National Tsing Hua University in close collaboration with Prof. Chun-Yi Lee and Prof. Min Sun. Previously, I was fortunate to work with Prof. Jan Peters at TU Darmstadt in Germany. I also worked at Preferred Networks with Dr. Guilherme Maeda and Prabhat Nagarajan.


  Experience

Research Intern Jun. 2023 - Sep. 2023
MIT-IBM Watson AI Lab
Advisor: Akash Srivastava
Research Intern Jun. 2022 - Oct. 2022
Microsoft Research Montreal
Advisors: Romain Laroche and Remi Tachet des Combes

Research Intern Jun. 2019 - Oct. 2019
Preferred Networks
Advisors: Prabhat Nagarajan and Dr. Guilherme Maeda

Research Intern Feb. 2019 - Jun. 2019
Appier
Advisor: Prof. Min Sun

Visiting Researcher Jul. 2018 - Oct. 2018
Intelligent Autonomous System (IAS) group at TU Darmstadt
Advisor: Prof. Jan Peters

Research Assistant Jul. 2017 - Mar. 2020
ELSA Lab at National Tsing Hua University
Advisor: Prof. Chun-Yi Lee

Research Collaboration Oct. 2016 - Mar. 2017
Vision Science Lab at National Tsing Hua University
Advisor: Prof. Min Sun

Teaching Assistant 2017 - 2018
Taiwan NVIDIA Deep Learning Institute
Advisor: Prof. Chun-Yi Lee

Software Engineering Intern Jul. 2016 - Nov. 2016
Mediatek
Advisor: Anthony Liu

Contract Software Engineer Oct. 2015 - Dec. 2015
Industrial Technology Research Institute (ITRI)


  Selected Publications

Curiosity-driven Red-teaming for Large Language Models
Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James R. Glass, Akash Srivastava, Pulkit Agrawal
Accepted as a conference paper in ICLR 2024
Paper | Code | Bibtex
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets
Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal
Accepted as a conference paper in NeurIPS 2023
Paper | Code
Maximizing Velocity by Minimizing Energy
Srinath Mahankali*, Chi-Chang Lee*, Gabriel B. Margolis, Zhang-Wei Hong, Pulkit Agrawal
Accepted as a conference paper in ICRA 2024
Paper (coming soon)
TGRL: An Algorithm for Teacher Guided Reinforcement Learning
Idan Shenfeld, Zhang-Wei Hong, Aviv Tamar, Pulkit Agrawal
Accepted as a conference paper in ICML 2023
Paper | Website | Code
Parallel Q-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation
Zechu Li*, Tao Chen*, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal
Accepted as a conference paper in ICML 2023
Paper | Code
Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Reweighting
Zhang-Wei Hong, Pulkit Agrawal, Remi Tachet des Combes, and Romain Laroche
Accepted as a conference paper in ICLR 2023
Paper | Code
Redeeming Intrinsic Rewards via Constrained Optimization
Eric Chen*, Zhang-Wei Hong*, Joni Pajarinen, and Pulkit Agrawal (* indicates co-first author)
Accepted as a conference paper in NeurIPS 2022
Paper | Website | Code | MIT News
Bilinear Value Networks for Multi-goal Reinforcement Learning
Zhang-Wei Hong*, Ge Yang*, and Pulkit Agrawal (* indicates co-first author)
International Conference on Learning Representations (ICLR) 2022 - Conference paper
Paper | Code
Topological Experience Replay
Zhang-Wei Hong, Tao Chen, Yen-Chen Lin, Joni Pajarinen, and Pulkit Agrawal
International Conference on Learning Representations (ICLR) 2022 - Conference paper
Paper | Code
Stubborn: A Strong Baseline for Indoor Object Navigation
Haokuan Luo, Albert Yue, Zhang-Wei Hong, Pulkit Agrawal
IEEE/RSJ International Conference on Intelligent Robots and Systems 2022 - Conference paper
Paper | Code
Reducing the Deployment-Time Inference Control Costs of Deep Reinforcement Learning Agents via an Asymmetric Architecture
Chin-Jui Chang, Yu-Wei Chu, Chao-Hsien Ting, Hao-Kang Liu, Zhang-Wei Hong, and Chun-Yi Lee
International Conference on Robotics and Automation (ICRA) 2021 - Conference paper
Paper
Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning
Zhang-Wei Hong, Prabhat Nagarajan, and Guilherme Maeda
European Conference on Machine Learning (ECML) 2021 - Conference paper
Paper
Adversarial Active Exploration for Inverse Dynamics Model Learning
Zhang-Wei Hong, Tsu-Jui Fu, Tzu-Yun Shann, Yi-Hsiang Chang, and Chun-Yi Lee
Conference on Robot Learning (CoRL) 2019 - Oral
Paper | Project

Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
Zhang-Wei Hong, Tzu-Yun Shann, Shih-Yang Su, Yi-Hsiang Chang, Tsu-Jui Fu, and Chun-Yi Lee
Neural Information Processing Systems (NeurIPS) 2018 - Poster
International Conference on Learning Representations (ICLR) Workshop 2018
Paper | Project


Virtual-to-Real: Learning to Control in Visual Semantic Segmentation
Zhang-Wei Hong, Yu-Ming Chen, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, Hsuan-Kung Yang, Brian Hsi-Lin Ho, Chih-Chieh Tu, Yueh-Chuan Chang, Tsu-Ching Hsiao, Hsin-Wei Hsiao, Sih-Pin Lai, and Chun-Yi Lee
International Joint Conference on Artificial Intelligence (IJCAI) 2018 - Oral
Paper | Project

Deep Policy Inference Q-Network for Multi-Agent Systems
Zhang-Wei Hong, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, and Chun-Yi Lee
International Conference On Autonomous Agents and Multi-Agent Systems (AAMAS) 2018 - Oral
Paper

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents
Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, and Min Sun
International Joint Conference on Artificial Intelligence (IJCAI) 2018 - Poster
Paper | Project


  Book

Lecture notes, 6.8200 (previously 6.484) Computational Sensorimotor Learning, MIT


  Talks

Invited talk at Macro eyes, Host: Prof. Suvrit Sra, MIT
Invited talk at Toronto AI in Robotics Seminar, Host: Prof. Igor Gilitschenski, University of Toronto


  Teaching

Teaching Assistant, 6.484 Computational Sensorimotor Learning, MIT Spring 2022

Teaching Assistant, 6.S090 Deep learning for control, MIT Spring 2021

Teaching Assistant, Deep Learning Institute, NVIDIA Taiwan Spring 2018



template from jonbarron