We present the first application to utilize linear programming to generate running routes. Route generation is beneficial for runners visiting new and unfamiliar locations. Current applications forgo linear programming, utilizing various other methods to create custom routes. Our objective is to create a linear programming model that produces optimal closed-loop routes based on user preferences. The model proposed is an extension of ATSP [1]. The model processes a street network. One set of decision variables represent the inclusion of network edges in the optimal solution, and the second set of decision variables are the sequence variables in which the route nodes are selected. The objective function minimizes the total distance traveled in the route. A modification of the flow conservation constraint is introduced to ensure that a node can be visited and departed at most once. Novel constraints that force the solver to optimize for distance, elevation, and route friendliness are introduced in the model. Subtour elimination constraints found in the Miller-Tucker-Zemlin ATSP formulation are introduced to prevent subtours and ensure connectivity. The model UI is implemented as a website. This enabled feedback for verification and validation testing. Athletes input parameters such as their starting location, preferred distance, terrain type, and elevation to help further optimize their route. In conclusion, the linear programming model helps runners plan the most optimal running routes according to their preferences.
operations research, linear programming, travelling salesperson problem, solutions engineering
Approximately 50 million people in the U.S. participate in running or jogging [1]. There are many reasons individuals of all ages incorporate running into their daily routine, from exercise to providing a social outlet. Strava, a social network comprised of over 100 million athletes [2], provides tracking and analysis of participating athletes' activities, such as running and cycling. Athletes can share the routes they complete, as well as other metrics such as time elapsed and elevation gained. Sharing routes and training regimens has become central to running and has revolutionized how athletes celebrate each other's accomplishments.
While the route sharing process has become automated for many athletes, the route planning process remains manual. Route planning is time consuming, requiring athletes to analyze the environment in which they are training. These athletes must factor criteria in the route planning process such as the distance they would like to run and the overall terrain of the environment in which they are running. Situations will arise where athletes may be unfamiliar with the city they are in and are unable to effectively plan their routes.
Heuristic algorithms have been applied towards the generation of running routes. While heuristics operate at a low time complexity and can offer permissible approximations, the potential inaccuracy of the approximation produced by the heuristic may be deemed unacceptable by athletes. The problem of identifying an optimal running route for a set of criteria draws parallels to the Travelling Salesperson Problem [3]. The application of linear programming to the running route generation context has not yet been explored and offers an alternative to current heuristic solutions.
As the popularity of running continues to rise, athletes will begin to seek new running routes to introduce more options into their training routine. The process of identifying potential routes to complete can be daunting for inexperienced runners as well as for more experienced athletes in unfamiliar environments. This solution aims to solve the run route generation problem in a scalable methodology. The solution must yield optimal results; therefore, the solution could naturally rely on a linear programming model instead of a heuristic.
We shall define the linear programming model name to be Generate Optimal Running Route. The input for the model (1) is a bidirectional incomplete graph representing the underlying infrastructure network in which the athlete is deciding to run, a positive real number (2) representing the distance in miles the athlete would like to run, a positive real number (3) representing the upper bound of elevation gain in feet the athlete can gain over the route, and a number (4) representing the minimum percentage of the route required to be in residential and park pathways.
The output of the model is defined as the set of edges (5) that comprise the optimal closed loop running route. Secondary information such as the total distance travelled in miles (6), elevation gain over the route in feet (7), and percentage of route comprised of residential and park pathways (8) mut be obtainable from the optimal edge set returned from the model.
The required research to implement a linear programming solution for the running route generation problem was comprised of identifying dependable data sources for model generation, selecting an appropriate linear programming solver based on optimization benchmarks, and extending existing linear programming models in literature.
Selecting appropriate data sources was an important step in formulating the model. OpenStreetMap (OSM) was the data source of choice due to the completeness of the database and the availability of the underlying infrastructures' network. OSM's geospatial data was deemed to have equal accuracy as to Google's [4]. NetworkX, a Python package dedicated towards complex networks and graphs, was utilized to transform data from OSM. In addition to OSM, the Google Maps Elevation API was leveraged to obtain raw elevation data for each node fetched in the API call to OSM.
A solver is software that optimizes a given linear programming model's decision variable to achieve the best objective function value given the model's constraints. Solvers find solutions utilizing root finding algorithms. These algorithms work by seeking the root of a linear function through continuously iterating over the solution space and converging at the root value that results in a linear function quantity of zero [5]. Gurobi was the selected solver for this solution. Gurobi is a powerful mathematical optimization solver commercially available, ideal for processing large models quickly. Gurobi offers a cloud-based Optimization as a Service, making it ideal for cloud native web applications. The Gurobi Cloud SaaS is also free for academic use.
The Travelling Salesperson Problem has been extensively studied within Computer Science. TSP is concerned with finding the shortest route between all nodes within a graph, on the condition that each node is visited exactly once, and the route finishes on the starting node. There are many variants of TSP, each variant having its own motivation [3]. A particular variation of interest is the Asymmetric Travelling Salesperson Problem, where the graph of interest is incomplete and potentially undirected. ATSP is of interest particularly for its asymmetric property, in which the graph is not well connected, leading to the requirement of subtour elimination constraints that are not necessary in TSP. To remove illegal subtours from their ATSP solution, Miller, Tucker, and Zemlin [6] introduced real number decision variables (9) representing each node within the graph. The purpose of these variables is to track the order in which each city is visited, which is then leveraged to identify and eliminate subtours. The subtour elimination constraints from Miller, Tucker, and Zemlin (10) ensure that the order the nodes are selected in the path contain no subtours, a requirement of an optimal solution. Sawik, building on the MTZ solution, introduced a variation of ATSP in which there is a predetermined starting location. Sawik also introduced constraints (11), (12) to ensure the ordering of the nodes begins and ends with the predetermined starting node.
The ATSP, decision variables, and constraints defined above served as the basis of extension for the Route Generation Running Problem.
The model was then formulated with scope well defined and existing literature analyzed. The model's scope was to identify the optimal subtour for the given graph and preferences. Due to the possibility of an athlete's starting location on a node with less than two incident edges, Dijkstra's algorithm and Breadth First Search were utilized to find the closest viable root node from the user's location but were not leveraged within the linear programming model itself.
The first set of decision variables are binary and represent the inclusion of an edge in the solution (13). A given binary edge variable which represents the connection from node i to node j. The edges of the graph (1) are associated with three properties: distance which represents the length in feet of each edge, the elevation delta of an edge, and the terrain type. The second set of decision variables (9) are the sequence variables which apply to the nodes selected for the tour
The objective function (14) of the route running problem remains the same as ATSP. by setting the objective function to minimize the total distance traveled, we can then set constraints to enforce our route properties that the solver will optimize.
The constraint at (15) is a modification of the flow conservation constraint. Flow conservation ensures that the sum of the binary variables of the possible edges entering a node is equal to the sum of the binary variables of the edges leaving the node. This constraint deviates from the original solution in TSP in that a node can be visited and departed at most once. The constraint at (16) ensures that at least one edge beginning out of the root node is selected. Constraint (11) is also modified to set the root node to the first sequence variable.
The constraint (16), (17), and (18) are novel constraints introduced to the model. (16) ensures the distance of the route to be greater than the input distance, dist. (17) ensures that the elevation gained in the optimal route must not exceed elevation. (18) ensures that at least the fraction of the route specified by terrain / 100 is travelled over terrain types included in the terrains set (pathways, residential neighborhoods, trails). Depending on input parameters, specifying a low terrain value can yield faster compute times, and in some cases, high terrain values may lead the model to become infeasible. The constraints (10), (11), and (12) are included to prevent subtours and ensure connectivity.
The method of delivery decided for the linear programming model was a web application [7]. This was decided for ease of accessibility, allowing athletes to generate their own routes and provide feedback and validation. The application architecture [8] consists of a user interface, a backend, an in-memory database, a persistent store database, Heroku Platforms for compute power, and the Gurobi Optimization SaaS.
The user interface enables athletes to generate and interact with their running routes. The UI is written in HTML, CSS, and JavaScript. The UI frameworks used were Bootstrap and Folium. The backend serves requests from the UI and delegates tasks to the worker process. The backend was written with Flask, a python web micro framework. The in-memory database utilized was Redis and served as a queue between the backend and worker. The persistent database utilized was Postgres SQL, which stores the generated routes for future use.
The athlete first inputs their parameters into a form within the UI. Once the form is submitted, the parameters are posted to the backend. The backend then collects the route parameters and passes the parameters to the worker process via the Redis Queue. This enables the backend to continue serving the athlete, while the worker process handles computationally intensive tasks. The worker process then utilizes the Gurobi SaaS to compute the optimal route per the parameters. Once an optimal solution is found, the result is stored in the Postgres Database. The worker process then notifies the backend that processing is completed through the RQ. The backend then fetches the newly generated route from the Postgres Database and serves the newly generated route through the UI to the athlete [9].
A survey was conducted to identify the pain points of athletes when planning their running routes. Over 650 athletes responded to the survey [10]. Key performance indicators were used to measure model success in addressing athlete needs. Three were used: One to measure the accuracy of the model, another to measure the model against competing alternatives, and the last to measure the satisfaction of the athletes.
The KPI chosen to measure the accuracy of the model was percent variance (20) from user preferences. This compares the input parameter for distance (2) to the result returned by the model (6). The KPI was set to 10%. The current average variance is 7.184% and was computed from routes generated by athletes utilizing the application.
The KPI measuring success against the competition is the number of unique preferences offered. This benchmark was set to a three-preference minimum to ensure the model is novel compared to other solutions. This KPI was achieved, as four parameters, location, distance, elevation, and terrain are unique preferences meaningfully utilized within the application.
The last KPI is the customer satisfaction score (CSAT). This KPI measures the athlete's satisfaction on a scale of 1-5 with 5 being the highest CSAT. The CSAT KPI was set to 4.25. To measure the KPI, eight athletes performed testing and validation by generating routes through the application and running their routes. Seven of the eight athletes reported a score of 3 or higher regarding their satisfaction with the applications ability to generate routes according to their distance specifications. While the CSAT KPI was not met, as one athlete reported a score of 2, the average CSAT score was 4.625, exceeding the 4.25 benchmark.
To measure the effectiveness of the elevation and terrain features, KPIs must be established. Further analysis will be conducted over candidate KPIs to identify meaningful methods of measuring these features. Root cause analysis will be conducted to determine what led to low CSAT for individuals who had unfavorable experience with the routes generated by the application. Additional features geared towards athletes are in development.
Initial market research was done to validate the concept and gather user feedback regarding preferences. Over 650 athletes submitted feedback. The respondents confirmed the value proposition of running route generation as well as terrain and elevation as desired preferences. An optimization model was developed around capturing the core preferences submitted by athletes. A web application was architected and developed around the core model. The web application was then deployed and made accessible through the internet. After deployment, verification and user validation were attained. The initial application measured near-sufficiently against the predetermined KPIs, and athletes were generally happy with the application and its value. Further improvements to the application will be implemented to achieve higher CSAT scores.
[1] Statista Research Department, "Running & Jogging - statistics & Facts," Statista, 10 January 2024. [Online]. Available: https://www.statista.com/topics/1743/running-and-jogging/#statisticChapter. [Accessed 1 February 2024].
[2] Strava, "About," Strava, 2024. [Online]. Available: https://www.strava.com/about. [Accessed 1 February 2024].
[3] N. Cummings, "A Brief History of the Travelling Salesman Problem," The Operational Research Society, [Online]. Available: https://www.theorsociety.com/about-or/or-methods/heuristics/a-brief-history-of-the-travelling-salesman-problem/. [Accessed 1 February 2024].
[4] A. Braun, "What Is OpenStreetMap and Should You Be Using It?," 2 July 2018. [Online]. Available: https://www.maketecheasier.com/what-is-openstreetmap/. [Accessed 30 January 2024].
[5] P. Bartlett, "An Introduction to Root-Finding Algorithms," in Root Finding Algorithms, Santa Barbara, UC Santa barbara, 2013, pp. 1-9.
[6] T. Sawik, "A note on the Miller-Tucker-Zemlin model for the asymmetric traveling salesman problem," BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES, vol. 64, no. 3, pp. 517-520, 2016.
[7] Y. Levanoni, "Opti Run," 10 January 2024. [Online]. Available: https://optirungen.com/. [Accessed 10 January 2024].
[8] Gurobi, "Facility Location Architecture," Gurobi, 26 May 2020. [Online]. Available: https://demos.gurobi.com/facility-location/doc/architecture?scenario=TaDgstiCBz.ztf4KL0NST. [Accessed 10 January 2024].
[9] Y. Levanoni, "OptiRunRender," 9 February 2024. [Online]. Available: https://github.com/ylevanon/OptiRunRender.
[10] O. Works, "Runner Survey (Responses)," 9 February 2024. [Online]. Available: https://docs.google.com/spreadsheets/d/1MsYgiqs8sg6KrcZ324AaCkyXNCM3wpziAxAdBfZpOVo