Learning in Games:

In this project we study learning in games. Learning in a game-theoretic framework is interesting because it does not necessarily guarantee convergence to a Nash equilibrium, or to any equilibrium concept in general. In this work we analyze one of the simplest and earliest learning algorithms, known as fictitious play.
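
As an illustration of the algorithm named above, here is a minimal sketch of simultaneous fictitious play on a two-player matrix game: each player best-responds to the empirical frequency of the opponent's past actions. The function names, the uniform initial counts, the lowest-index tie-breaking rule, and the matching-pennies example are assumptions made for this sketch and are not taken from the project report.

```python
def best_response(payoff, opponent_counts):
    """Action maximizing expected payoff against the opponent's empirical
    action frequencies (ties are broken toward the lowest index)."""
    total = sum(opponent_counts)
    expected = [
        sum(payoff[a][b] * opponent_counts[b] for b in range(len(opponent_counts))) / total
        for a in range(len(payoff))
    ]
    return max(range(len(expected)), key=lambda a: expected[a])


def fictitious_play(payoff_row, payoff_col, rounds=1000):
    """Run simultaneous fictitious play on a two-player matrix game.

    payoff_row[a][b] and payoff_col[a][b] are the payoffs to the row and
    column player when row plays a and column plays b.
    Returns the empirical action frequencies of both players.
    """
    n_row, n_col = len(payoff_row), len(payoff_row[0])
    # Initial counts of 1 act as a uniform prior (an assumption of this sketch).
    row_counts = [1] * n_row
    col_counts = [1] * n_col
    # Re-index the column player's payoffs by its own action first.
    payoff_col_t = [[payoff_col[a][b] for a in range(n_row)] for b in range(n_col)]
    for _ in range(rounds):
        a = best_response(payoff_row, col_counts)
        b = best_response(payoff_col_t, row_counts)
        row_counts[a] += 1
        col_counts[b] += 1
    row_freq = [c / sum(row_counts) for c in row_counts]
    col_freq = [c / sum(col_counts) for c in col_counts]
    return row_freq, col_freq


# Matching pennies: zero-sum, with a unique mixed Nash equilibrium at (1/2, 1/2).
pennies_row = [[1, -1], [-1, 1]]
pennies_col = [[-1, 1], [1, -1]]
row_freq, col_freq = fictitious_play(pennies_row, pennies_col, rounds=5000)
```

In zero-sum games such as this one, the empirical frequencies are known to approach the equilibrium even though the actions actually played keep cycling. In general games no such guarantee holds, which is the point made above.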

Project Report

Options in Swarm Robotics:

We study the problem of reinforcement learning (RL) in robot swarms. Robotic swarms are useful in activities such as firefighting, construction, surveillance, and sports coverage. In all these activities, swarms add robustness, adaptivity, and flexibility (reconfigurability) compared with single robots. Most of these activities require coordinated behaviour by all robots in the swarm. In several cases this coordinated behaviour can be represented by the trajectories of the robots' physical locations (configuration trajectories). In this work, we study adapted versions of conventional single-agent RL algorithms for swarm robotics. We conduct numerical studies in simulated environments and provide preliminary suggestions based on the results.

Project Report


Value Iteration Algorithms for POMDPs:

This project outlines the challenges of performing value iteration for POMDPs. It involved a critical analysis of existing value iteration algorithms that exploit the piecewise-linear and convex (PWLC) structure of the value function, using the belief state as the information state.
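
The two ingredients mentioned above can be sketched briefly: the belief state is updated by a Bayes filter, and a PWLC value function is evaluated as the maximum over a finite set of alpha vectors, V(b) = max_alpha (alpha . b). The array layouts (transition[a][s][s'], observation[a][s'][o]), the function names, and the two-state example numbers are hypothetical choices for this sketch and do not come from the project report.

```python
def belief_update(belief, transition, observation, action, obs):
    """Bayes filter for a POMDP belief state.

    b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b(s).
    """
    n = len(belief)
    unnormalized = [
        observation[action][s_next][obs]
        * sum(transition[action][s][s_next] * belief[s] for s in range(n))
        for s_next in range(n)
    ]
    norm = sum(unnormalized)
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / norm for p in unnormalized]


def pwlc_value(belief, alpha_vectors):
    """Evaluate a PWLC value function: the max over alpha vectors of alpha . b."""
    return max(
        sum(a_s * b_s for a_s, b_s in zip(alpha, belief))
        for alpha in alpha_vectors
    )


# Hypothetical two-state model with one action that leaves the state unchanged
# and yields an observation matching the true state with probability 0.85.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.85, 0.15], [0.15, 0.85]]]
alphas = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]  # illustrative alpha vectors

prior = [0.5, 0.5]
posterior = belief_update(prior, T, O, action=0, obs=0)
value_prior = pwlc_value(prior, alphas)
value_posterior = pwlc_value(posterior, alphas)
```

The difficulty for exact value iteration is that each backup can multiply the number of alpha vectors, so practical algorithms differ mainly in how they generate and prune this set.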

Project Report


Evaluation of Value-based and Policy-based methods in Dynamic Multi Drug Therapies for HIV Treatment:

Performed a detailed analysis of value-based and policy-based reinforcement learning methods for learning optimal structured treatment interruption (STI) strategies from a set of trajectories generated during clinical trials of different STI protocols.

Project Report



Unifying On-Policy and Off-Policy Learning in TD Learning and Actor Critic Methods:

Combined the stability of on-policy TD learning with the efficiency of off-policy learning and proposed a unified approach in which the control algorithm uses either the on-policy sampled action or off-policy samples, depending on the amount of exploration. The idea was further extended to actor-critic methods and to the Q(σ) algorithm with eligibility traces.
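
For context, the sketch below shows the standard one-step Q(σ) backup target, assuming the algorithm referred to above is Q(σ), which interpolates between a sampled next action (σ = 1, Sarsa) and an expectation over the target policy (σ = 0, Expected Sarsa). The function and parameter names are hypothetical, and the rule the project uses to choose between on-policy and off-policy samples based on the amount of exploration is not specified here, so it is not reproduced.

```python
def q_sigma_target(reward, gamma, q_next, policy_next, next_action, sigma):
    """One-step Q(sigma) target.

    q_next[a]      : current estimates Q(s', a) for every action a
    policy_next[a] : target-policy probabilities pi(a | s')
    next_action    : the action actually sampled in s'
    sigma          : 1.0 uses the sampled action only, 0.0 uses the full expectation
    """
    sampled = q_next[next_action]
    expected = sum(p * q for p, q in zip(policy_next, q_next))
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


def td_update(q_sa, target, step_size):
    """Move the current estimate Q(s, a) a fraction step_size toward the target."""
    return q_sa + step_size * (target - q_sa)


# Illustrative numbers only.
q_next = [1.0, 3.0]
policy_next = [0.5, 0.5]
sarsa_like = q_sigma_target(1.0, 0.9, q_next, policy_next, next_action=1, sigma=1.0)
expected_like = q_sigma_target(1.0, 0.9, q_next, policy_next, next_action=1, sigma=0.0)
mixed = q_sigma_target(1.0, 0.9, q_next, policy_next, next_action=1, sigma=0.5)
```

Intermediate values of sigma trade the variance of a sampled backup against the extra computation and potential bias of the expectation.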

Project Report



Android Application for an Indigenous Portable ECG Machine:

A low-cost portable ECG machine has been developed by the Department of Biomedical Physics and Technology, University of Dhaka. An Android app was developed that can display the combined ECG traces of 12 leads. The app allows near-real-time data transfer to and from the patient, facilitating the use of ECG in telemedicine applications.



Automatic frequency domain analysis from evoked EMG response:

This was an extension of previous work by the Department of Biomedical Physics and Technology, University of Dhaka, namely the Distribution of F Latency (DFL) as a new method in nerve conduction. To extend the functionality of the existing software, further work was done to take the EMG responses and automatically plot and represent the distribution of F latency, in order to test the effectiveness of DFL as a diagnostic tool.




Learning Algorithm to Automatically Classify QRS Complexes for Acceptable ECG traces:

A learning algorithm was used for automatic detection and delineation of the QRS complexes in ECG signals. The baseline wander of the signal was removed using the continuous wavelet transform. We used the property that the R peak has the maximum amplitude, together with a KNN classifier, to detect the QRS interval. Once the QRS complexes have been detected, their regularity is checked to ensure that an acceptable ECG trace has been obtained.
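
The final regularity check could look roughly like the sketch below, which assumes R-peak positions are already available as sample indices. The criterion used here (coefficient of variation of the RR intervals below a threshold) and the threshold value are hypothetical stand-ins, since the actual acceptance rule is not described above.

```python
from statistics import mean, pstdev


def rr_intervals(r_peak_samples, sampling_rate_hz):
    """Convert consecutive R-peak sample indices into RR intervals in seconds."""
    return [
        (later - earlier) / sampling_rate_hz
        for earlier, later in zip(r_peak_samples, r_peak_samples[1:])
    ]


def is_regular(r_peak_samples, sampling_rate_hz, max_cv=0.15):
    """Accept a trace when the RR intervals vary little relative to their mean.

    max_cv is an illustrative threshold on the coefficient of variation,
    not a value taken from the original work.
    """
    rr = rr_intervals(r_peak_samples, sampling_rate_hz)
    if len(rr) < 2:
        return False  # too few beats to judge regularity
    return pstdev(rr) / mean(rr) <= max_cv


# Illustrative R-peak positions at a 250 Hz sampling rate.
steady = is_regular([0, 250, 500, 750, 1000], 250)
erratic = is_regular([0, 100, 500, 600, 1000], 250)
```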