I am working on developing scalable and decentralized algorithms for real-time reinforcement learning from first principles. Currently, I am a research scientist at Keen AGI. In the past, I have worked with Prof. Richard S. Sutton, Prof. Martha White, Prof. Yoshua Bengio, and Prof. Faisal Shafait. I also represented my home country at the 55th International Mathematical Olympiad (Honorable Mention) and the XXVI Asian Pacific Mathematical Olympiad (Bronze Medal).
We propose a more robust and sample-efficient algorithm for temporal difference learning and evaluate it on prediction problems in the Arcade Learning Environment (ALE).
Paper | Demo | RLC 2024 | Outstanding Paper Award
We propose an algorithm for scalable recurrent learning and evaluate it on prediction problems in the Arcade Learning Environment (ALE).
Paper | JMLR
We clarify the difference between step-size normalization and step-size optimization using simple examples.
Paper | arXiv
We propose OML, an objective for learning representations that uses catastrophic interference as a training signal. The resulting representations are naturally sparse, accelerate future learning, and are robust to forgetting under online updates in continual learning.
Paper | Code | Talk | Poster | NeurIPS19
We propose a simple drop-in procedure for approximating the Bayesian credible regions of patient-specific survival functions that can be applied to many ISD models.
Paper | Code | IJCAI19
We isolate the truly effective existing ideas for incremental classifier learning from those that only work under certain conditions. Moreover, we propose a dynamic threshold moving algorithm that can successfully remove bias from an incrementally learned classifier when learning by knowledge distillation.
Paper | Poster | Code | ACCV18
We propose a computationally efficient document segmentation algorithm that applies convolutional neural networks recursively to precisely localize a document in a natural image in real time.
Paper | Slides | Code | ICDAR17
We propose a method for learning models that do not rely on spurious correlations. Our work builds on IRM (Arjovsky et al., 2019). Unlike IRM, it can be implemented online to (1) detect spurious features among a given set of features and (2) learn non-spurious features from sensory data.
Paper | Code | arXiv