My general area of research is at the intersection of control, optimization and machine learning. My applications include mobile robotics, transportation engineering, and connected health.

The problems I am generally interested in focus on integrating novel data sources into mathematical learning models. These models typically represent either online processes (for example, repeated decision making in online games, or surge pricing in two-sided markets for mobility-on-demand systems such as Uber and Lyft) or physical processes (for example, water flowing in distribution networks or vehicles moving in the transportation network).

Under the umbrella of overused buzzwords such as “big data”, “IoT”, “IoP”, or “cyberphysical systems”, my research focuses on inference and optimal control with new sets of data, mostly collected from connected devices or infrastructure.

  • The term “inference” is commonly used in machine learning to describe the process of integrating data into a mathematical model (deterministic or stochastic) to provide a best estimate given the data and the model. Inference has many other names in various fields, in particular “estimation” in control theory, “filtering” in signal processing, “data assimilation” in hydrodynamics, “nowcast” in atmospheric sciences, etc. This process is usually performed on models that have gone through a calibration process, again known under various names depending on the discipline, for example “system ID” in control theory, “learning” in machine learning, “inverse modeling” in the physical sciences, etc.
  • The term “optimal control” usually refers to the optimization of a process given a time horizon and a “cost,” “scoring,” or “payoff” function, which can be achieved via various optimization techniques including closed form solutions, Lyapunov analysis or convex optimization when possible, and various forms of gradient descent otherwise (stochastic, mirror, adjoint-based).
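
As a minimal sketch of the inference step described in the first bullet, the example below runs a scalar Kalman filter, one standard instance of “estimation” or “filtering”. The linear model, the noise variances q and r, and the measurement sequence are illustrative assumptions, not taken from any of the projects described here.

```python
def kalman_filter_1d(measurements, a=1.0, q=0.01, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the assumed model
    x[k+1] = a * x[k] + w[k],  y[k] = x[k] + v[k],
    with process noise variance q and measurement noise variance r."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for y in measurements:
        # Prediction: propagate the estimate through the model.
        x = a * x
        p = a * p * a + q
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (y - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical data: a sensor repeatedly reporting the value 5.0.
estimates = kalman_filter_1d([5.0] * 50)
```

Starting from an initial guess of 0, the estimate moves toward the measured value as data accumulates, which is the “best estimate, given the data, and given the model” in its simplest form.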

The contributions made in my lab cover both inference and optimal control, and build on mathematical analysis of established models used by practitioners in application disciplines (civil engineering, health, etc.).


Sample theoretical and applied work

Learning and repeated decision-making

One of the problems I am interested in studying is learning the behavioral dynamics of humans: can we learn how people make decisions (i.e. how people learn), and can we use the corresponding models to predict what their future decisions will be? To answer these questions, I use the framework of sequential decision-making to model the learning and decision-making processes of humans. Players making repeated decisions, in which they learn over time to optimize their choices, can be efficiently modeled by a sequential process in which they optimize a payoff function at each step, linked to the “reward” they experience. The techniques developed leverage known models, such as the replicator dynamics and the hedge algorithm. In order to understand whether these decision-making process models converge to an equilibrium, we use optimization methods commonly developed in machine learning, such as mirror descent and stochastic gradient descent. Depending on the assumptions made on the learning process humans use to make their decisions, we prove convergence of these processes to a set of equilibria (in some cases Nash equilibria of an underlying game). The more specific the assumptions on the learning process humans use, the more guarantees there are on the type of convergence (i.e. on average, no-regret, almost surely, etc.), and potentially on convergence rates. For illustration purposes, we have implemented an online gaming framework on Mechanical Turk (Amazon’s crowdsourcing service for parallelizable tasks). In this online game (see video to gain a better understanding of the experiments), distributed online players use Mechanical Turk to play against each other, while we watch the convergence rate of their game, and use our algorithms to predict the decisions they will make, based on what we observed they did in the past.
The results are very exciting: humans indeed converge to the equilibria predicted by our models, and the learning rates we infer represent their behavior well enough to predict their actions a few steps into the future. This type of work has numerous applications, in particular to systems in which players (for example companies such as Waze/Google, INRIX, and Apple Maps, routing the motorists who run their apps) compete selfishly for a given resource (for example road capacity), and improve their decision-making process over time by learning the dynamics of their agents (motorists).
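
To make the hedge algorithm mentioned above concrete, here is a minimal sketch of a single player repeatedly choosing among a few actions and reweighting them by their observed losses. The loss values, the learning rate eta, and the function name are hypothetical; this is not the code used in the Mechanical Turk experiments.

```python
import math


def hedge(loss_rounds, eta=0.5):
    """Hedge (multiplicative weights) for one player.

    loss_rounds: list of loss vectors, one per round, entries assumed in [0, 1].
    eta: assumed constant learning rate.
    Returns the distribution played at each round and the final distribution.
    """
    n = len(loss_rounds[0])
    probs = [1.0 / n] * n  # start from the uniform distribution
    history = []
    for losses in loss_rounds:
        history.append(list(probs))
        # Exponentially down-weight each action by its observed loss.
        unnormalized = [p * math.exp(-eta * l) for p, l in zip(probs, losses)]
        total = sum(unnormalized)
        probs = [u / total for u in unnormalized]
    return history, probs


# Hypothetical setting: three routes with fixed losses; route 1 is the best.
history, final_probs = hedge([[0.9, 0.2, 0.5]] * 100)
```

With fixed losses the distribution concentrates on the lowest-loss action. The same update can be read as mirror descent on the simplex with an entropy regularizer, which is one way the optimization tools above connect to models of human learning.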

Learning and modeling behavioral changes in transportation networks

I am interested in studying the use of distributed networks to model large-scale mobility patterns in urban environments. Several scales of the problem are challenging and interesting. At very large scale, we have studied the integration of cell tower records (mainly CDR data) into user equilibrium models, to perform user equilibrium inference using a new approach called cellpath. At this scale, I am also interested in understanding the effect on congestion of the massive adoption of new services (for example routing services like Waze/Google, INRIX, Apple Maps, or Mobility as a Service (MaaS) apps). In particular, I am interested in understanding how selfish routing algorithms redistribute traffic into previously uncongested areas, and what impact they have on the overall optimality of traffic. At smaller scales, we have used filtering and estimation techniques to integrate GPS and mobile data into traffic flow models, following work started with the Mobile Millennium project [URL], which is still ongoing and the focus of great interest. Using the same types of models (networks of hyperbolic PDEs discretized with the Godunov scheme), we have created adjoint-based optimal control schemes to produce “on demand” congestion patterns, showing that with proper design of cost functions, one can create near-arbitrary patterns in time-space diagrams. We illustrated this by creating time-space diagrams shaped like the “Cal” logo. Go Bears! More recently, my work has focused on modeling Mobility as a Service (MaaS) companies such as Lyft and Uber using Jackson networks, to study the problems of rebalancing fleets and surge pricing. Using convex optimization through block-coordinate descent, I have created new modeling frameworks to characterize the impact of cyberattacks on MaaS companies (for example through fake reservation requests to capture competitors’ rides).
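
The Godunov discretization mentioned above can be sketched on the simplest possible case: a single ring road governed by a scalar conservation law for vehicle density. The Greenshields flux, the normalized parameters, the periodic boundary, and the initial condition are all assumptions made for illustration; the network models used in the projects are more involved.

```python
def greenshields_flux(rho, v_max=1.0, rho_max=1.0):
    """Assumed flux function: flow = density * speed, speed linear in density."""
    return v_max * rho * (1.0 - rho / rho_max)


def godunov_step(rho, dt, dx, v_max=1.0, rho_max=1.0):
    """One Godunov update on a ring road (periodic boundary).

    For a concave flux, the Godunov interface flux is the minimum of the
    upstream demand and the downstream supply.
    """
    rho_c = rho_max / 2.0  # critical density, where the flux is maximal

    def demand(r):
        return greenshields_flux(min(r, rho_c), v_max, rho_max)

    def supply(r):
        return greenshields_flux(max(r, rho_c), v_max, rho_max)

    n = len(rho)
    # flux[i] is the flow across the interface between cell i-1 and cell i.
    flux = [min(demand(rho[i - 1]), supply(rho[i])) for i in range(n)]
    # Conservation: density changes by inflow minus outflow.
    return [rho[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]


def simulate(rho, steps, dt, dx):
    for _ in range(steps):
        rho = godunov_step(rho, dt, dx)
    return rho


# Hypothetical initial condition: a congested block followed by free flow.
n = 40
dx = 1.0 / n
dt = 0.5 * dx  # satisfies the CFL condition v_max * dt / dx <= 1
initial = [0.8] * 10 + [0.2] * 30
final = simulate(initial, 100, dt, dx)
```

Because each interface flux is added to one cell and subtracted from its neighbor, the total number of vehicles on the ring is conserved, which is the property that makes this scheme a natural building block for estimation and adjoint-based control.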



Most of the work done in my group is implemented in projects that bridge the world of the lab and the world of practice. At the present time, two projects in my group are in expansion.

Discovery of emergent behaviors in traffic using reinforcement learning

Traffic systems can often be modeled by complex (nonlinear and coupled) dynamical systems for which classical analysis tools struggle to provide the understanding sought by transportation agencies, planners, and control engineers, mostly because analytical results are difficult to obtain for such systems. This project studies complex traffic control problems involving interactions of humans, automated vehicles, and sensing infrastructure, using the framework of deep reinforcement learning (RL). The resulting control laws and emergent behaviors of the vehicles provide insight and understanding of the potential for automation of traffic through mixed fleets of autonomous and manned vehicles. We are building Flow, a new computational framework integrating open-source deep learning and simulation tools, to support the development of controllers for automated vehicles in the presence of complex nonlinear dynamics in traffic. Leveraging recent advances in deep RL, Flow enables the use of RL methods such as policy gradient for traffic control and allows for benchmarking of the performance of classical (including hand-designed) controllers against learned control laws. Model-free RL methods naturally select policies that yield traffic flow improvements previously discovered by model-driven approaches, such as stabilization, platooning, and efficient vehicle spacing, known to improve ring road and intersection efficiency. Remarkably, by effectively leveraging the structure of human driving behavior, the learned policies surpass the performance of state-of-the-art controllers designed for automated vehicles.
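
For readers unfamiliar with policy gradient methods, the toy sketch below shows the basic REINFORCE update on a two-action problem. It does not use Flow or any traffic simulator; the rewards, learning rate, and function names are hypothetical and chosen only to show how a sampled action and its reward adjust the policy parameters.

```python
import math
import random


def softmax(theta):
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(rewards, steps=2000, lr=0.1, seed=0):
    """REINFORCE with a softmax policy over a fixed set of actions.

    rewards[a] is the (assumed deterministic) reward of action a.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # d/d theta[j] of log pi(action) is 1[j == action] - probs[j].
        for j in range(len(theta)):
            grad_log = (1.0 if j == action else 0.0) - probs[j]
            theta[j] += lr * reward * grad_log
    return softmax(theta)


# Hypothetical problem: action 1 is rewarded, action 0 is not.
policy = reinforce([0.0, 1.0])
```

The learned policy shifts its probability mass toward the rewarded action without ever being given a model of the reward, which is the sense in which such methods are “model-free”.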

Connected Corridors

The project focuses on defining new paradigms for mobility at the scale of a corridor, i.e. an urban area comprising highways, arterial streets and public transit systems. The project is supported by a coalition of public agencies, focused around the I-210 corridor in LA: Caltrans (the California Department of Transportation), LA Metro, the cities of Pasadena, Monrovia, Duarte and Arcadia, and SCAG (the Southern California Association of Governments). The Connected Corridors project is a 10-year project, hosted at PATH (URL), in which a team of practitioners is developing simulation tools at the scale of an entire corridor. The tools were initially focused on highway traffic, and now include arterial traffic and transit; they will ultimately include other modes of transportation and commuting (biking, telework, MaaS, carpooling, etc.). The simulation platform will support a series of playbooks and mobility improvement measures currently under development under a new Concept of Operations developed by our team for the stakeholders of the corridor. The platform can be used by students to illustrate the algorithms they are currently developing, in particular through professional implementations of network flow models (running on Amazon’s EC2) or professional software calibrated by a team of traffic engineers (in particular using TSS’ Aimsun).


The project will complete the prototyping of a hardware ecosystem for in-home monitoring of patients with Alzheimer’s disease (AD), to enable clinical data collection, and to test novel algorithms based on this data. The hardware ecosystem consists of cameras, radar sensors, Android Wear smartwatches, Android phones, and Bluetooth in-home sensors. Data collection is achieved through a partnership with clinicians at UCSF. Machine learning algorithms are applied to common problems encountered in early stages of AD, such as falls, leaving appliances on (stove, faucet, etc.), leaving the house, etc. The primary goals of the project include the development of a joint hardware ecosystem to be tested at UCSF, and a roadmap for creating a longer term clinical study with a cohort of 300 patients.

In a joint STTR project funded by the National Science Foundation, UC Berkeley has partnered with various memory care facilities in California and other states. We are looking for more partnerships to deploy an Alzheimer’s patient monitoring system. Our research project is recruiting new facilities and networks to test the system at broader scales. A description of the UC Berkeley study on fall prevention systems for memory care is available here. The process of the study is explained here.

Past projects

In the past, the group has worked on various projects related to mobile sensing, in the field of mobile robotics (floating sensor network), and transportation engineering. These projects are still generating academic work as part of the group, or in collaboration with other groups.

Floating Sensor Network, field deployments of a fleet of 100 aquatic robots, in the Georgiana Slough and Sacramento River, to collect Lagrangian sensor data and to deploy a submarine and static sensing equipment, 2012-2014.

Mobile Millennium, Launched November 10, 2008 in Northern California, to enroll up to 10,000 users to participate in traffic data collection using cellular phones (number of users to this day: more than 4,000).

Mobile Century, February 8, 2008, involving 100 cars measuring traffic on I-880 in California to demonstrate the possibility of traffic reconstruction using cellular phones.




The videos below show some research activities of the group; more videos and photos are available in the gallery section.

Alexandre M. Bayen

Department of Civil & Environmental Engineering
Department of Electrical Engineering and Computer Sciences
University of California, Berkeley