I make systems that learn in real time from experience. Currently, I work in a small team led by John Carmack. Previously, I developed efficient reinforcement learning algorithms with Richard S. Sutton. I also participated in the 55th International Mathematical Olympiad (Honorable Mention) and the XXVI Asian Pacific Mathematical Olympiad (Bronze Medal).
My research is driven by the big world hypothesis (Javed & Sutton, 2024) [PDF, Talk], which is the idea that no matter how large and complex our agents become, they will always be small compared to the world they interact with. Some consequences of the big world hypothesis are that no amount of prior learning is sufficient, continual learning is the only way to maintain strong performance, and computationally efficient learning algorithms are essential.
Khurram Javed
Khurram Javed, Arsalan Sharifnassab, Richard S. Sutton
SwiftTD is a TD learning algorithm that can learn to assign credit to important signals. An algorithm like SwiftTD will be a key ingredient for robust real-time learning from rich and noisy sensory streams.
Khurram Javed, Richard S. Sutton
Khurram Javed, Haseeb Shah, Richard S. Sutton, Martha White
This paper presents one algorithm for backprop-free recurrent learning and shows that such learning is possible. Some variant of efficient recurrent learning that does not require backpropagation is essential for learning an effective agent state, which in turn is essential for learning in big worlds.
Khurram Javed, Martha White
Learning online from a stream requires representations that are suitable for online updating. This paper shows that such representations exist. The method used in this paper to learn them (gradient-based meta-learning, with the gradient computed using backpropagation) scales poorly, but there are alternatives that scale better.