Description of the Lab

In this Lab, we go through the code of a PyRat program that walks randomly, and extend it to make it better. The aim of this Lab is to get familiarized with the environment.

Before engaging in the Lab, make sure that you followed Lab 1, that guided you through the tutorial.py file, as functions used there will be used again in this Lab.

The provided programs

When you download PyRat from the GitHub repository, you are provided with a few sample programs:

  • template.py: This program basically does nothing. It provides a blank code skeleton that you can use for defining your own programs;
  • tutorial.py: As mentioned earlier, this file contains functions to manipulate the maze. It allows to abstract the choice of representation between a dictionary and a numpy ndarray;
  • random_1.py: This program returns a random move at every turn of the game. It is detailed in the documentation. If you haven’t read it already, please do it now;
  • random_2.py & random_3.py: We will go through these files below.

The random_2.py program

It is probably no surprise, but the random_1.py (that you encountered in the documentation) is not very efficient at finding a piece of cheese. Indeed, choosing at random between all possible actions leads to hitting walls, doing nothing, or going to the wrong direction.

While we cannot correct the third point without more advanced algorithms (that we will study in the PyRat course), we can already do something about the other two limitations. The random_2.py program does that, by restricting the set of actions from which we should randomly draw at each turn.

Let’s analyze the new elements in random_2.py.

# Previously developed functions
from tutorial import get_neighbors, locations_to_action

Here, we just indicate that we will use some previously defined function from the tutorial.py file.

The main difference lies in the turn function:

def turn ( maze:             Union[numpy.ndarray, Dict[int, Dict[int, int]]],
           maze_width:       int,
           maze_height:      int,
           name:             str,
           teams:            Dict[str, List[str]],
           player_locations: Dict[str, int],
           player_scores:    Dict[str, float],
           player_muds:      Dict[str, Dict[str, Union[None, int]]],
           cheese:           List[int],
           possible_actions: List[str],
           memory:           threading.local
         ) ->                str:

    """
        This function is called at every turn of the game.
        It should return an action within the set of possible actions.
        You can access the memory you stored during the preprocessing function by doing memory.my_key.
        You can also update the memory with new information, or create new entries as memory.my_key = my_value.
        In:
            * maze:             Map of the maze, as data type described by PyRat's "maze_representation" option.
            * maze_width:       Width of the maze in number of cells.
            * maze_height:      Height of the maze in number of cells.
            * name:             Name of the player controlled by this function.
            * teams:            Recap of the teams of players.
            * player_locations: Locations for all players in the game.
            * player_scores:    Scores for all players in the game.
            * player_muds:      Indicates which player is currently crossing mud.
            * cheese:           List of available pieces of cheese in the maze.
            * possible_actions: List of possible actions.
            * memory:           Local memory to share information between preprocessing, turn and postprocessing.
        Out:
            * action:           One of the possible actions, as given in possible_actions.
    """

    # Choose a random neighbor
    neighbors = get_neighbors(player_locations[name], maze)
    neighbor = random.choice(neighbors)
    
    # Retrieve the corresponding action
    action = locations_to_action(player_locations[name], neighbor, maze_width)
    return action

Contrary to random_1.py, we now make use of the maze description (variable maze). Here, the firt two lines exploit the get_neighbors function to find the cells that can be accessed from the current location, given by player_locations[name], and to return a random cell from it.

Then, locations_to_action is used to convert this pair of locations into a move that the player should perform next turn.

The random_3.py program

At this point, we just have a random program that does not hit walls and avoids doing nothing. Still, there is a good chance that it will not be super efficient. Let’s improve it a bit.

Here, let’s code a simple strategy:

  • Everytime we reach a cell, we mark it as visited;
  • Then, if we have an unvisited neighbor, we choose one at random to go there;
  • If no such neighbor exists, we just move at random, hoping to reach a cell with an unvisited neighbor later.

This strategy is what random_3.py does. To do so, we need to add some elements. In particular, we need to initialize a list of visited cells. Let’s do that at the beginning of the game, in the preprocessing function:

def preprocessing ( maze:             Union[numpy.ndarray, Dict[int, Dict[int, int]]],
                    maze_width:       int,
                    maze_height:      int,
                    name:             str,
                    teams:            Dict[str, List[str]],
                    player_locations: Dict[str, int],
                    cheese:           List[int],
                    possible_actions: List[str],
                    memory:           threading.local
                  ) ->                None:

    """
        This function is called once at the beginning of the game.
        It is typically given more time than the turn function, to perform complex computations.
        Store the results of these computations in the provided memory to reuse them later during turns.
        To do so, you can crete entries in the memory dictionary as memory.my_key = my_value.
        In:
            * maze:             Map of the maze, as data type described by PyRat's "maze_representation" option.
            * maze_width:       Width of the maze in number of cells.
            * maze_height:      Height of the maze in number of cells.
            * name:             Name of the player controlled by this function.
            * teams:            Recap of the teams of players.
            * player_locations: Locations for all players in the game.
            * cheese:           List of available pieces of cheese in the maze.
            * possible_actions: List of possible actions.
            * memory:           Local memory to share information between preprocessing, turn and postprocessing.
        Out:
            * None.
    """

    # To store the already visited cells
    memory.visited_cells = []

Note that we define visited_cells within the memory variable. As this variable will be updated across turns, we need to make sure that it continues to exist from a function call to another.

Then, let’s update the turn function to do what we described above:

def turn ( maze:             Union[numpy.ndarray, Dict[int, Dict[int, int]]],
           maze_width:       int,
           maze_height:      int,
           name:             str,
           teams:            Dict[str, List[str]],
           player_locations: Dict[str, int],
           player_scores:    Dict[str, float],
           player_muds:      Dict[str, Dict[str, Union[None, int]]],
           cheese:           List[int],
           possible_actions: List[str],
           memory:           threading.local
         ) ->                str:

    """
        This function is called at every turn of the game.
        It should return an action within the set of possible actions.
        You can access the memory you stored during the preprocessing function by doing memory.my_key.
        You can also update the memory with new information, or create new entries as memory.my_key = my_value.
        In:
            * maze:             Map of the maze, as data type described by PyRat's "maze_representation" option.
            * maze_width:       Width of the maze in number of cells.
            * maze_height:      Height of the maze in number of cells.
            * name:             Name of the player controlled by this function.
            * teams:            Recap of the teams of players.
            * player_locations: Locations for all players in the game.
            * player_scores:    Scores for all players in the game.
            * player_muds:      Indicates which player is currently crossing mud.
            * cheese:           List of available pieces of cheese in the maze.
            * possible_actions: List of possible actions.
            * memory:           Local memory to share information between preprocessing, turn and postprocessing.
        Out:
            * action:           One of the possible actions, as given in possible_actions.
    """

    # Mark current cell as visited
    if player_locations[name] not in memory.visited_cells:
        memory.visited_cells.append(player_locations[name])

    # Go to an unvisited neighbor in priority
    neighbors = get_neighbors(player_locations[name], maze)
    unvisited_neighbors = [neighbor for neighbor in neighbors if neighbor not in memory.visited_cells]
    if len(unvisited_neighbors) > 0:
        neighbor = random.choice(unvisited_neighbors)
        
    # If there is no unvisited neighbor, choose one randomly
    else:
        neighbor = random.choice(neighbors)
    
    # Retrieve the corresponding action
    action = locations_to_action(player_locations[name], neighbor, maze_width)
    return action

Here, we have four blocks of code:

  • First, we mark the current cell as visited. More precisely, we add the index of the current cell in the list that we defined earlier;
  • Second, we get the list of neighbors of the current cell, and filter out those that were already visited. We then draw a random neighbor from that list if there is one;
  • If there is no such neighbor, we choose a random neighbor;
  • Finally, we determine the decision that leads to the neighbor we selected.

Contrary to random_2.py, we now need a preprocessing function. Therefore, we need to change the main a bit as follows:

if __name__ == "__main__":

    # Map the function to the character
    players = [{"name": "Random 3", "preprocessing_function": preprocessing, "turn_function": turn}]

    # Customize the game elements
    config = {"maze_width": 15,
              "maze_height": 11,
              "mud_percentage": 0.0,
              "nb_cheese": 1,
              "trace_length": 1000}

    # Start the game
    game = PyRat(players, **config)
    stats = game.start()

    # Show statistics
    print(stats)

Your turn to work!

A first improvement of random_3.py

Now, let’s work on a new improvement of random_3.py. Please copy-paste it into a new random_4.py file.

While it improves random_2.py, there is still something that feels wrong with random_3.py. Indeed, when no unvisited neighbor exists, it just behaves like random_2.py. Therefore, if by no chance it enters an area where it visits everything, it will be stuck into that area for long.

To avoid this, a simple trick is to remember the trajectory (either the series of locations, or the series of actions) that we have taken. Then, if no unvisited neighbor exists, we can just go back on our steps until we find a previously visited cell with still unvisited neighbors (typically a crossroad).

Hint: when coding random_4.py, you may need to extract an element from a list. Here is the official documentation page on lists.

Too easy? Here is a more challenging task

If you succeed coding random_4.py too early, you can try to improve it again. If you struggle on random_4.py, it’s ok if you skip this part. For those who wish to continue, let’s copy random_4.py into a new random_5.py file.

We are going to add a small preprocessing step. The maze may have a lot of dead ends, that are no use exploring (unless they contain a cheese). It could be interesting to update the maze description to remove these cells.

Propose a simple methodology that will transform the maze variable in the preprocessing function, so that it does not contain useless dead ends anymore. Do not forget to save your result in the memory so that you can use that new maze during the turns instead of the regular one.

Comparing algorithms

OK, we have spent some time working on new strategies for improving the random_1.py program, but does that mean they indeed work better?

What could you do to answer this question?

Note: We have written a slightly more complete page on the website about statistics. Have a look at it, you may find interesting information.