An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models.

ACL (Findings), pages 6549-6560. Association for Computational Linguistics, (2024)

Other publications of authors with the same name

What Can Neural Networks Reason About? CoRR, (2019)
Gradient Descent Finds Global Minima of Deep Neural Networks. CoRR, (2018)
Near-Optimal Randomized Exploration for Tabular Markov Decision Processes. NeurIPS, (2022)
Planning with General Objective Functions: Going Beyond Total Rewards. NeurIPS, (2020)
Agnostic $Q$-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity. NeurIPS, (2020)
On Reward-Free Reinforcement Learning with Linear Function Approximation. NeurIPS, (2020)
Is Long Horizon RL More Difficult Than Short Horizon RL? NeurIPS, (2020)
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. COLT, volume 195 of Proceedings of Machine Learning Research, pages 1155-1198. PMLR, (2023)
Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap. COLT, volume 134 of Proceedings of Machine Learning Research, pages 4438-4472. PMLR, (2021)
On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP. ICML, volume 139 of Proceedings of Machine Learning Research, pages 11296-11306. PMLR, (2021)