Computer science is a difficult science: concepts are often linked to discrete mathematics and proving intuitive results can be complex.

Rather than proving a property with a mathematical formalism, it often happens that we try to support a property by numerous and well thought simulations.

Let us show you on an example how such a scientific approach is implemented to analyze a problem.

## Problem

The problem we are considering here is that of the famous Syracuse sequence.

A Syracuse sequence is obtained by induction as follows:

• The first term is a parameter , which is a strictly positive integer.
• The term is computed as follows:
• if is even, then .
• if is odd, then .

As soon as a 1 is encountered in this sequence, the cycle 1,4,2,1,4,2,… repeats itself indefinitely.

It is supposed that for any choice of , the sequence will end up passing through 1 and thus repeating this cycle. However, no one has proved this.

We will try to analyze the impact of on the number of values the sequence takes before meeting the value 1.

## Python program

In order to analyze this impact, we start by writing the function which calculates the Syracuse sequence:

# The syracuse function takes as an argument an integer n strictly higher than 0
# It outputs the number of steps before encountering 1
def syracuse (n) :
if n == 1 :
return 0
else :
if n % 2 == 0 :
return 1 + syracuse(n / 2)
else :
return 1 + syracuse(3 * n + 1)

Let us save this function in a file called syracuse.py.

## Small values of u0

First, we will look at what happens with small values of . To do this, we are going to make our program interfaceable with a script, i.e., ease its execution with different parameters.

The Python sys module is used for this:

import sys

# We retrieve the first argument as u0
u0 = int(sys.argv)

# The syracuse function takes as an argument an integer n strictly higher than 0
# It outputs the number of steps before encountering 1
def syracuse (n) :
if n == 1 :
return 0
else :
if n % 2 == 0 :
return 1 + syracuse(n / 2)
else :
return 1 + syracuse(3 * n + 1)

# We execute the function and output its result
print(syracuse(u0))

Thanks to this module, we can now start our program from the terminal by setting as argument. To do this, we write in a terminal:

python syracuse.py 14

The output printed in the terminal is 17.

## Average results

We will execute our script for values of between 1 and 100 to see how it behaves. We therefore write in a terminal:

for i in {1..100}; do python syracuse.py i = 1; i++) {py -3 syracuse.py u_0 i >> results.csv; done

The first line ensures that the old results.csv file, which might exist, is removed before new data is added to the file.

Again, for Windows users, the command is a bit different:

for ( i -le 100; i | Out-File results.csv -Append}

To plot these data, we use LibreOffice:

libreoffice results.csv

We specify that the data are separated by spaces and we can then draw the following plot:

### Behavior analysis

What we have obtained is already interesting. It seems that globally our curve is increasing even if there are strong local variations.

In order to verify this statement, we propose to check the situation for larger values of . We cannot be exhaustive so we will look at what happens for the first multiples of 1,000,000. The problem is that given the variations, we cannot just draw a plot with a point for every multiple of 1,000,000.

What we propose then is to look at a large number of numbers close to multiples of 1,000,000 and average them. For that, we will use the random module of Python:

import sys
import random

# We retrieve the first argument as u0
u0 = int(sys.argv)

# We perform 10.000 tests
TESTS = 10000

# We study a neighborhood of ± 50.000 around u0
NEIGHBORHOOD = 50000

# The syracuse function takes as an argument an integer n strictly higher than 0
# It outputs the number of steps before encountering 1
def syracuse (n) :
if n == 1 :
return 0
else :
if n % 2 == 0 :
return 1 + syracuse(n / 2)
else :
return 1 + syracuse(3 * n + 1)

# We start the computations and store the results in total
total = 0
for _ in range(TESTS) :
total = total + syracuse(u0 + random.randint(-NEIGHBORHOOD, NEIGHBORHOOD))

# We write the output
print(u0, total / TESTS)

If we run the tests multiple times with

python syracuse.py 1000000

we notice that the values do not change much, which means that 10,000 tests for each value is reasonable.

We still have to calculate the result for our different values, and write the result in a file results.csv:

rm -f results.csv
for i in {1..100}; do python syracuse.py i=1; i++) {py -3 syracuse.py u_0 i; python syracuseopt.py i * 4 * 10000000 ] >> results.csv; done

For Windows users:

for ( i -le 100; i * 4 * 10000000) | Out-File results3.csv -Append}

We obtain the following plot:

This plot is obtained with the Gnuplot tool and the following script:

# Header
set encoding utf8

# Labels
set xlabel "Central value of u0"
set ylabel "Number of steps before obtaining a 1"
set style line 11 lc rgb '#606060' lt 1
set border 3 back ls 11
set tics nomirror out scale 0.75

# Ranges
set yrange [0:]

# Tics
set xtics 1000000000

# Zoom
set size 0.8,0.8

# Caption position
set key at 0.7,0.95

# Output options
set terminal png
set output "syracuse.png"

# Lines definitions
set linestyle 1 lt 1 lw 4

# Plot
plot "results.csv" using 1:2 notitle with linespoints ls 1 smooth bezier