News

Right now I cannot get back to Boston since my US visa is still being processed!

  • February 6, 2025
    I added my new paper on Value Iteration analysis with absolute probability sequences to the website!

  • January 22, 2025
    My MDP Geometry paper was accepted to AISTATS 2025!

  • January 8 & 9, 2025
    I presented my MDP Geometry paper at the Technion (Shie Mannor's lab) and TAU (Tomer Koren's lab).

  • October 2, 2024
    I presented my MDP Geometry paper at a seminar at the University of Cyprus (announcement).

  • September 16, 2024
    Since my US visa is still not ready, I came to Cyprus, where I will stay as a visiting researcher at the University of Cyprus.

  • July 17, 2024
    My TD-SVRG paper was accepted to TMLR 2024!

Arsenii Mustafin

I am looking for a postdoc position starting in summer 2025. I want to continue working on Reinforcement Learning and ideally dedicate some time to further developing the MDP Geometry framework. If you have an open position or know someone who does, please send me an email.


I am a sixth-year PhD student in the Department of Computer Science at Boston University. I am advised by Prof. Alex Olshevsky and Prof. Yannis Paschalidis. My main interest is reinforcement learning, both theoretical and practical. I am also interested in other topics in machine learning, including computer vision, explainability, and deepfake detection. Previously I worked on the DARPA SemaFor project with the BU team led by Prof. Kate Saenko and on an Explainable AI project with Prof. Sarah Adel Bargal.

Completed coursework: EC700: Reinforcement Learning, CS591: Deep Learning, CS542: Machine Learning, EC724: Advanced Optimization, CS565: Algorithmic Data Mining, CS537: Randomness in Computing, CS531: Advanced Optimization Algorithms, CS655: Computer Networks, EC 500 A2: Robot Learning and Vision for Navigation.

I previously studied economics and took several undergraduate and graduate-level courses in econometrics and time-series analysis. Prior to coming to BU, I worked as a machine learning developer. I spent four years in Mainland China and speak fluent Mandarin, in addition to fluent English and native Russian.

aam [at] bu.edu  |  CV  |  LinkedIn  |  Github

Skills

Algorithmic: Reinforcement Learning, Deep Learning, classic Machine Learning algorithms, Time Series analysis.

Software: PyTorch, TensorFlow (1 & 2), Linux (+ Git), SQL.

Papers & Patents:

MDP Geometry, Normalization and Reward Balancing Solvers
Mustafin, A., Pakharev, A., Olshevsky, A., Paschalidis, I.
Accepted to AISTATS 2025

In this work, we suggest a new geometric view of Markov decision processes (MDPs), which allows the main MDP problems to be interpreted as geometric problems. Inspired by this view, we propose a Normalization procedure that transforms an MDP into a form in which finding the optimal policy is trivial. This enables a new class of MDP-solving algorithms, which we call Reward Balancing Solvers. One algorithm from this class, Safe Reward Balancing (RB-S), achieves and exceeds the state-of-the-art convergence guarantees in the known MDP (Value Iteration), unknown MDP (Q-learning), and federated unknown MDP (Federated Q-learning) settings.

Analysis of Value Iteration Through Absolute Probability Sequences
Mustafin, A., Colla, S., Olshevsky, A., Paschalidis, I.
Currently under review, 2025

In this work, we use absolute probability sequences to develop a new line of analysis of the Value Iteration algorithm and examine its convergence in terms of the L² norm, offering a new perspective on its behavior and performance.

Closing the Gap Between SVRG and TD-SVRG with Gradient Splitting
Mustafin, A., Olshevsky, A., Paschalidis, I.
TMLR, 2024

In this paper, my coauthors and I significantly improve the theoretical guarantees of the SVRG method applied to the TD update, show that it exhibits the same convergence as SVRG in the convex optimization setting, and provide theoretical guarantees for a practical algorithm.

Ani-GIFs: A Benchmark Dataset for Domain Generalization of Action Recognition from GIFs
Mustafin, A.*, Jain, S.*, Lteif, D.*, Majumdar, S.*, Tourni, I.*, Bargal, S., Saenko, K., Sclaroff, S.
Frontiers in Computer Science, 2022

This paper, published in Frontiers in Computer Science, presents a dataset for domain generalization problems in the video domain.

Patent: SYSTEMS AND METHODS FOR IMAGE OR VIDEO PERFORMANCE HEAT MAP GENERATION
Mustafin, A., Saraee, E., Hamedi, J., Halloran, Z.
US Patent, 2021

The patent application resulted from my internship at Vizit Labs, Inc., during which I played a key role in developing an explainability technique for computer vision models.

