In my research, I use computers to solve challenging decision-making and control problems. In particular, I focus on developing general tools that enable a machine to adaptively learn a solution. This approach is far superior to executing a hand-coded solution, because the problems we face today contain uncertainty and therefore do not admit static solutions.

I build on the strength of current computer technology: the ability to manipulate numbers and complex data structures extremely quickly. Machines routinely outperform people at extracting statistics from huge data sets, evaluating complex functions, and solving large systems of equations. I exploit this computational power, in conjunction with solid statistical methods, to provide a basis for handling the inherent uncertainty and nondeterminism that characterize most outstanding decision-making problems.

In joint work with Michael Littman at Duke and AT&T Labs, I focused on learning methods for the algorithm selection problem: adaptively and recursively choosing the most appropriate algorithm for each incoming problem instance. Eventually, I came to realize that there is a need for more efficient and powerful reinforcement-learning methods, so during my Ph.D. work with Ronald Parr at Duke, I focused on general-purpose reinforcement-learning algorithms with goals in two directions: efficient utilization of data and scaling to large problems. I used techniques from linear algebra, function approximation, linear programming, Monte Carlo estimation, and classification to devise learning algorithms that meet the challenges of a wide range of decision-making problems.
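To make the recursive algorithm-selection idea concrete, here is a minimal, hypothetical sketch (not the method of the papers listed further below): a sorting routine that, at every recursive call, chooses between insertion sort and quicksort via an epsilon-greedy rule over learned running-average cost estimates, updated from measured wall-clock time. The size-bucketing feature and the update rule are illustrative assumptions.

```python
import random
import time

# Hypothetical sketch: recursive algorithm selection for sorting. At every
# recursive call a learned selector picks between insertion sort and
# quicksort based on the size of the (sub)instance; Q[(size_bucket, algo)]
# is a running average of observed cost-to-go, updated from measured time.

Q, COUNTS = {}, {}   # cost estimates and visit counts per (bucket, algo)
EPSILON = 0.1        # exploration rate for epsilon-greedy selection

def bucket(n):
    """Coarse instance feature: size bucketed on a log2 scale."""
    b = 0
    while n > 1:
        n //= 2
        b += 1
    return b

def select(n):
    """Epsilon-greedy choice of algorithm for an instance of size n."""
    if random.random() < EPSILON:
        return random.choice(["insertion", "quick"])
    return min(["insertion", "quick"],
               key=lambda a: Q.get((bucket(n), a), 0.0))

def update(n, algo, cost):
    """Incremental running-average update of the cost estimate."""
    key = (bucket(n), algo)
    COUNTS[key] = COUNTS.get(key, 0) + 1
    Q[key] = Q.get(key, 0.0) + (cost - Q.get(key, 0.0)) / COUNTS[key]

def insertion_sort(xs):
    for i in range(1, len(xs)):
        j, v = i, xs[i]
        while j > 0 and xs[j - 1] > v:
            xs[j] = xs[j - 1]
            j -= 1
        xs[j] = v
    return xs

def adaptive_sort(xs):
    """Sort xs, re-selecting the algorithm at every recursive call."""
    if len(xs) <= 1:
        return xs
    algo = select(len(xs))
    start = time.perf_counter()
    if algo == "insertion":
        result = insertion_sort(xs)
    else:  # one quicksort partition, then recurse adaptively on the parts
        pivot = xs[len(xs) // 2]
        result = (adaptive_sort([x for x in xs if x < pivot])
                  + [x for x in xs if x == pivot]
                  + adaptive_sort([x for x in xs if x > pivot]))
    # measured time includes the recursive calls, i.e. the full cost-to-go
    update(len(xs), algo, time.perf_counter() - start)
    return result

if __name__ == "__main__":
    for _ in range(100):  # learn from repeated random instances
        adaptive_sort([random.randint(0, 10**6) for _ in range(500)])
    print({k: round(v, 6) for k, v in sorted(Q.items())})
```

After enough instances, the estimates typically favor insertion sort on small buckets and quicksort on large ones, which is exactly the kind of instance-dependent policy a learned selector is meant to discover.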

I am currently doing research in this area at Georgia Tech, applying my methods to industrial problems (disassembly planning, routing in re-entrant lines), with plans to continue on problems from operations research (economic markets and auctions, scheduling), networking (dynamic packet routing, router/server reconfiguration), robotics (coordination, planning), and meta-computation (using learning and reasoning to build "smarter" computers that make better use of their resources, as opposed to simply faster computers). In the process, I expect to raise more fundamental and theoretical questions. In parallel, I collaborate with medical doctors from Emory University on supervised learning methods for computer-aided diagnosis of gastrointestinal bleeding.

In the past, I established a solid basis in neural systems (Hopfield networks, neural maps), robotics (path planning, motion control), human-computer interaction (speech interfaces), and DNA computation (self-assembly). I still maintain a strong interest in all of these fields and am actively seeking collaborations along these lines.

My long-term vision is two-fold and bidirectional in terms of interdisciplinary collaboration. On one hand, I want to spread machine-learning technology by using it to solve outstanding problems in other disciplines. Computational biology, medical diagnosis, operations research, and robotics are just a few sample areas where uncertainty is inherent and machine learning can make a difference. On the other hand, I want to bring new tools, developed in other disciplines, into machine learning. I believe that mathematics, statistics, control theory, and physics will be some of the main such contributors in the years to come. Tools such as wavelets, experimental design, adaptive control, and chaotic dynamics have the potential to contribute significantly to the core challenges of machine learning. In this view, I envision my research career crossing the narrow borders of strict specialization while maintaining scholarly expertise in a very promising field.

My Ph.D. dissertation is on *"Efficient Approximate Policy Iteration Methods for Sequential Decision Making in Reinforcement Learning"*. Briefly, I developed two algorithms for the control problem in reinforcement learning.

Here is my Ph.D. dissertation committee:

- Prof. Ronald Parr (chair)
- Prof. Michael Littman (Rutgers)
- Prof. Xiaobai Sun
- Prof. Leslie P. Kaelbling (MIT)

Check out the publications page for
related published papers.

An online copy of my thesis will appear on this page soon.

As part of my Ph.D. dissertation, I worked on **"Least-Squares Methods in Reinforcement Learning for Control"**. Briefly, I used linear architectures and least-squares projections for value function approximation in order to solve large-scale control problems using relatively few training samples. The core algorithm, Least-Squares Policy Iteration (LSPI), appears in the papers below (a minimal sketch of the least-squares fit at its core follows the list):

Model-Free Least-Squares Policy Iteration

Michail G. Lagoudakis and Ronald Parr

Proceedings of NIPS 2001: Neural Information Processing Systems: Natural and Synthetic

Vancouver, BC, December 2001, pp. 1547-1554.

Least-Squares Policy Iteration

Michail G. Lagoudakis and Ronald Parr

Journal of Machine Learning Research, 4, 2003, pp. 1107-1149.

Check out the publications page for more
related published papers.

For additional information on LSPI and a code distribution go to the LSPI page.
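As a concrete illustration of the least-squares machinery, here is a minimal sketch of an LSTDQ-style fit inside an approximate policy-iteration loop. The toy two-state MDP, the indicator basis, and the fixed sample batch are all illustrative assumptions, not material from the papers; see the papers above and the LSPI page for the actual algorithm and code distribution.

```python
import numpy as np

# Minimal, hypothetical sketch of an LSTDQ-style least-squares fit inside a
# policy-iteration loop, on a toy 2-state, 2-action MDP. All numbers below
# are illustrative assumptions.

GAMMA = 0.9
N_STATES, N_ACTIONS = 2, 2
K = N_STATES * N_ACTIONS          # one indicator basis per (state, action)

def phi(s, a):
    """Indicator basis features for the (state, action) pair."""
    f = np.zeros(K)
    f[s * N_ACTIONS + a] = 1.0
    return f

def lstdq(samples, policy, gamma=GAMMA):
    """Solve A w = b from a batch of (s, a, r, s') samples for the
    action-value function of `policy`, in the least-squares sense."""
    A = np.zeros((K, K))
    b = np.zeros(K)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy[s_next])
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.lstsq(A, b, rcond=None)[0]

def greedy(w):
    """Improve the policy: pick argmax_a Q(s, a) = phi(s, a) . w."""
    return [int(np.argmax([phi(s, a) @ w for a in range(N_ACTIONS)]))
            for s in range(N_STATES)]

# A fixed, made-up batch of experience: (state, action, reward, next_state).
samples = [(0, 0, 0.0, 0), (0, 1, 1.0, 1), (1, 0, 0.0, 0), (1, 1, 2.0, 1),
           (0, 1, 1.0, 1), (1, 1, 2.0, 1)]

policy = [0, 0]                    # arbitrary initial policy
for _ in range(10):                # approximate policy iteration
    w = lstdq(samples, policy)
    new_policy = greedy(w)
    if new_policy == policy:       # reached a fixed point of improvement
        break
    policy = new_policy

print("learned weights:", np.round(w, 3), "policy:", policy)
```

The key property this sketch shares with the least-squares approach is data reuse: the same fixed batch of samples is used to evaluate every policy encountered during policy iteration, rather than collecting new experience at each step.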

Algorithm Selection using Reinforcement Learning

Michail G. Lagoudakis and Michael L. Littman

Proceedings of the 17th International Conference on Machine Learning

Stanford, CA, June 2000, pp. 511-518

Learning to Select Branching Rules in the DPLL Procedure for Satisfiability

Michail G. Lagoudakis and Michael L. Littman

Electronic Notes in Discrete Mathematics (ENDM), Vol. 9, Elsevier

LICS 2001 Workshop on Theory and Applications of Satisfiability Testing (SAT 2001)

Boston, MA, June 14-15, 2001 (ENDM Volume 9 online)

You can also check out **RLSAT**, a DPLL-like solver for SAT and #SAT problems with learning capabilities; a minimal sketch of where a branching rule plugs into DPLL follows below.
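Here is a minimal, hypothetical Python sketch (not RLSAT itself): a bare-bones DPLL procedure that takes its branching rule as a parameter, so a reinforcement-learning selector could be swapped in for the naive rule shown. The helper names and the example formula are illustrative.

```python
# Hypothetical sketch (not RLSAT): a bare-bones DPLL solver with a
# pluggable branching rule, showing where a learned selector could choose
# among branching heuristics at every node of the search tree.
# Clauses are lists of nonzero ints; literal k means variable |k| with sign.

def unit_propagate(clauses, assignment):
    """Repeatedly assign literals forced by unit clauses; False on conflict."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                          # clause already satisfied
            unassigned = [l for l in clause if abs(l) not in assignment]
            if not unassigned:
                return False                      # conflict: falsified clause
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0    # forced assignment
                changed = True
    return True

def first_unassigned(clauses, assignment):
    """Naive branching rule: first unassigned variable in clause order."""
    for clause in clauses:
        for l in clause:
            if abs(l) not in assignment:
                return abs(l)
    return None

def dpll(clauses, assignment, branch_rule):
    """Return a satisfying assignment (dict) or None if unsatisfiable."""
    assignment = dict(assignment)                 # work on a local copy
    if not unit_propagate(clauses, assignment):
        return None
    var = branch_rule(clauses, assignment)
    if var is None:
        return assignment                         # every clause is satisfied
    for value in (True, False):                   # branch on both values
        result = dpll(clauses, {**assignment, var: value}, branch_rule)
        if result is not None:
            return result
    return None

# Example: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
cnf = [[1, 2], [-1, 3], [-2, -3]]
print(dpll(cnf, {}, first_unassigned))
```

Because `branch_rule` is just a function of the current clauses and partial assignment, a learned selector can be dropped in without touching the search itself, which is the point of treating branching-rule choice as a decision-making problem.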

2D DNA Self-Assembly for Satisfiability

Michail G. Lagoudakis and Thomas H. LaBean

Proceedings of the 5th DIMACS International Meeting on DNA Based Computers

MIT, Boston, MA, June 1999, pp. 139-152

Here is my M.Sc. thesis committee:

- Prof. Anthony S. Maida (chair)
- Prof. Kimon P. Valavanis (now at the University of South Florida)
- Prof. Bill Z. Manaris (now at the College of Charleston)

And here is a poster describing the main points of my thesis.

Neural Maps for Mobile Robot Navigation

Michail G. Lagoudakis and Anthony S. Maida

Proceedings of the 1999 IEEE International Joint Conference on Neural Networks

Washington, D.C., July 1999

Mobile Robot Local Navigation with a Polar Neural Map

M.Sc. Thesis

Center for Advanced Computer Studies, University of Louisiana, Lafayette, 1998