Where's my Voi scooter: [3] Coding the Voi scooter location getter

In this blog I am going to code the Voi scooter location getter. The program should request an access token only when it needs one, request the scooter locations every minute, and store them efficiently.

Read config file

I started with make_api_requests.py. First, I need to be able to read my private information from an uncommitted source, a config file. According to Xiaoxu Gao's article, there are four main options: YAML, JSON, TOML and INI. I summarized the advantages of each of them:

  • INI: simple format, no curly brackets, no quotes, very straightforward, can contain comments
  • YAML: like INI, but with nested structures, can contain comments
  • JSON: similar to YAML, but with curly brackets and quotes; supported in most programming languages, but does not allow comments
  • TOML: similar to INI, but supports more data types and nested tables

In my case, I will use INI since it is the simplest and I don't need the other features. In my secret.ini I have:

[MAIN]
authenticationToken = <my authentication token>
zoneId = <zone id>

Then to read the config, I used:

import configparser

secret_config = configparser.ConfigParser()
secret_config.read('secret.ini')
authentication_token = secret_config['MAIN']['authenticationToken']
zone_id = secret_config['MAIN']['zoneId']
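
One thing to be aware of: ConfigParser.read() silently skips a file it cannot find and returns the list of files it actually parsed. Below is a minimal sketch of a loader that fails loudly instead, assuming the same MAIN section and key names as above. The load_secrets helper is hypothetical and not part of the repo.

```python
import configparser


def load_secrets(path="secret.ini"):
    """Return (authentication_token, zone_id), or raise if the config is unusable."""
    config = configparser.ConfigParser()
    # read() returns the list of files it managed to parse;
    # a missing file is skipped without an error, so check the result.
    if not config.read(path):
        raise FileNotFoundError(f"Could not read config file: {path}")
    if "MAIN" not in config:
        raise KeyError("Config file has no [MAIN] section")
    main = config["MAIN"]
    # Option names are case-insensitive by default in configparser.
    return main["authenticationToken"], main["zoneId"]
```

With this, a typo in the file name shows up as a clear FileNotFoundError rather than a KeyError on 'MAIN' later on.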

Make API requests

Now I need to request an access token and location data. To request an access token, I can simply do:

url = "https://api.voiapp.io/v1/auth/session"
obj = {"authenticationToken": authentication_token}
re = requests.post(url, json=obj)
access_token_json = json.loads(re.text)
access_token = access_token_json["accessToken"]

But I only want to request it when I need it. I know that the token expires after 15 minutes, so I thought about recording the timestamps of my requests and only asking for a new access token once 15 minutes have passed. But this method is not robust: my clock may not match the API server's clock, and my program might have a delay, so I cannot count on my program to know when it should request an access token.

The other way is to request the location data anyway, and only ask for an access token if that request does not go through. I know that the response to an unauthorized request is {"code":"401.2","detail":"Unauthorized, Token Invalid"} with a status code of 401, so I can use that to my advantage. My plan is:

[Image: image.png, showing the plan]

I implemented this by having get_scooter_locations() request the locations. If the status code is 401, it calls update_access_token() to update access_token, which is a global variable, then requests the locations again and returns them.

import requests

# authentication_token and zone_id come from the config file read earlier
access_token = ""

def update_access_token():
    """Request a fresh access token and store it in the global variable."""
    global access_token
    url = "https://api.voiapp.io/v1/auth/session"
    obj = {"authenticationToken": authentication_token}
    response = requests.post(url, json=obj)
    access_token = response.json()["accessToken"]

def get_scooter_locations():
    """Return the location data as a dict, or None if the request failed."""
    url = f"https://api.voiapp.io/v2/rides/vehicles?zone_id={zone_id}"
    response = requests.get(url, headers={"x-access-token": access_token})
    if response.status_code == 401:
        # The token is missing or no longer valid: get a new one and retry once
        update_access_token()
        response = requests.get(url, headers={"x-access-token": access_token})
    if response.status_code == 200:
        return response.json()
    return None
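
The same "try first, refresh on 401, retry once" idea can also be written without a global variable. The following is only a sketch: fetch_with_refresh, fetch and refresh_token are hypothetical names, and the two fake functions stand in for the real requests calls so the logic can be tried without touching the Voi API.

```python
def fetch_with_refresh(fetch, refresh_token, token=""):
    """Call fetch(token); on a 401, refresh the token and retry exactly once.

    fetch(token) must return (status_code, body); refresh_token() returns a new token.
    Returns (body_or_None, token_to_use_next_time).
    """
    status, body = fetch(token)
    if status == 401:
        token = refresh_token()
        status, body = fetch(token)
    return (body if status == 200 else None), token


# Stand-in functions that imitate an API accepting only the token "fresh".
def fake_fetch(token):
    return (200, {"ok": True}) if token == "fresh" else (401, None)


def fake_refresh():
    return "fresh"


body, token = fetch_with_refresh(fake_fetch, fake_refresh, token="stale")
print(body, token)
```

Returning the token alongside the body lets the caller keep it for the next request, which is the role the global variable plays in the version above.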

Now make_api_requests.py is functioning correctly, so I will move on to the next part. The complete code can be found on github.com/chit-uob/getVoiLocation.

Handle response

I created handle_response.py to process the response and store the results.

Process content

As mentioned in the previous blog, of all the data that is returned for a Voi scooter (id, short, battery, location in lng and lat, zoneId, category, locked, lockType and lock status), I only need short, battery, lng and lat, plus a generated timestamp and vehicle count. I did this by creating a function store_response(response_json). Inside it, I first get the scooter_list, then create a new vehicle_data list and append the minimized data to it. Finally, I generate the timestamp (I know I spelt it time_stamp in the program, but for compatibility purposes I will leave it like that) and the vehicle count, and put them inside the same object, record.

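from datetime import datetime

def store_response(response_json):
    # Only the first vehicle group is used
    scooter_list = response_json["data"]["vehicle_groups"][0]["vehicles"]
    vehicle_data = []
    for scooter in scooter_list:
        # Keep only the fields listed in data_format, in the same order
        vehicle_data_item = [scooter['short'], scooter['battery'], scooter['location']['lng'],
                             scooter['location']['lat']]
        vehicle_data.append(vehicle_data_item)

    # Local time, without timezone information
    iso_time = datetime.now().isoformat()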

    record = {
        'time_stamp': iso_time,
        'vehicle_count': len(scooter_list),
        'data_format': ["short", "battery", "lng", "lat"],
        'vehicle_data': vehicle_data
    }
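
Because each vehicle is stored as a bare list, whoever reads the data later has to pair the values with data_format again. Here is a small sketch of how that could look. expand_record is a hypothetical helper and the sample values are made up.

```python
def expand_record(record):
    """Turn the compact rows of a record back into one dict per scooter."""
    keys = record["data_format"]
    return [dict(zip(keys, row)) for row in record["vehicle_data"]]


# Made-up sample in the same shape that store_response() builds.
sample_record = {
    "time_stamp": "2022-01-01T12:00:00",
    "vehicle_count": 1,
    "data_format": ["short", "battery", "lng", "lat"],
    "vehicle_data": [["ab12", 87, -2.59, 51.45]],
}
print(expand_record(sample_record))
```

Storing the key names once per record instead of once per scooter is what keeps the files small.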

Store content inside a file

Then I need to store the content inside files. I wanted the content to be grouped by hour, so I thought about building a JSON object in the program and dumping it every hour. However, that would use more RAM (random access memory), as the program needs to hold all the content for the hour, and if the program fails midway, all the content from that hour would be lost. Therefore I want the program to write to the data file every time it generates data. I could do that by appending to a file line by line, but I also want the file to be a valid JSON document for easier parsing, so I need to come up with a solution.

Using correct JSON syntax

A JSON list starts with a [ and ends with a ]. Each object inside is followed by a , except the last one, because JSON does not allow trailing commas. Something like this:

[
{"id": "1"},
{"id": "2"},
{"id": "3"}
]
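
Python's own parser can confirm the trailing-comma rule. A quick check using only the standard json module:

```python
import json

valid_text = '[\n{"id": "1"},\n{"id": "2"},\n{"id": "3"}\n]'
trailing_comma_text = '[\n{"id": "1"},\n{"id": "2"},\n{"id": "3"},\n]'

# The well-formed list parses into three dicts.
print(json.loads(valid_text))

# The same list with a comma after the last object is rejected.
try:
    json.loads(trailing_comma_text)
except json.JSONDecodeError as error:
    print("Rejected:", error.msg)
```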

The [ part is easy: I just need to write [\n whenever I create a new file. \n stands for the newline character; it is like pressing Enter on your keyboard.

For the commas, I can write the JSON content with json.dumps() and file.write(), then write a ,\n.

For the ] and removing the trailing comma, I need to delete the newline character and the comma character. For this task, I will use Python's file.seek() method, telling it to seek backwards two characters from the end of the file. Then I use file.truncate() to cut the file off at that position. Note that I opened the file in ab+ mode:

  • a: I want to append to the file.
  • b: I am editing in binary, I want to seek backwards two bytes.
  • +: I am opening a file for updating (reading and writing)

You can learn more about Python I/O here.

with open(Path(f"data/location_data/scooter_data_{current_date_string_precise_to_hour}.json"), 'ab+') as f:
    f.seek(-2, os.SEEK_END)
    f.truncate()

However, to my surprise, this did not work correctly: it did not remove the comma as I wished. I investigated by calling f.seek(-2, os.SEEK_END) and then print(f.read()) to see what was read. It was b'\r\n' (the b in front means the value is bytes). Apparently there is an extra \r character, a carriage return: on Windows, Python's text mode writes each \n as \r\n. After changing it to f.seek(-3, os.SEEK_END), print(f.read()) shows b',\r\n', which includes the comma, so it works now.
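
The offset depends on the line ending, so the 3 that works here would cut into the data on a system that writes a plain \n. One way to avoid hard-coding the offset is to look at the tail of the file first. This is only a sketch, not the code used in the repo, and strip_trailing_separator is a hypothetical helper name.

```python
import os


def strip_trailing_separator(path):
    """Remove a trailing ',\\n' or ',\\r\\n' from the file; return the bytes removed."""
    with open(path, "rb+") as f:
        size = f.seek(0, os.SEEK_END)
        # Read at most the last three bytes, enough to hold ',\r\n'.
        tail_length = min(3, size)
        f.seek(size - tail_length)
        tail = f.read(tail_length)
        for separator in (b",\r\n", b",\n"):
            if tail.endswith(separator):
                f.truncate(size - len(separator))
                return len(separator)
    # Nothing to strip: the file does not end with a comma and a newline.
    return 0
```

After calling it, the closing "\n]\n" can be appended in the same way as before.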

Deciding when to change file

The second challenge is deciding when to end the previous file and start writing to a new one: it is when the hour changes. I did this by introducing a current_date_string_precise_to_hour variable (very descriptive, I know); we also have date_string_precise_to_hour for the new data item. If current_date_string_precise_to_hour is different from date_string_precise_to_hour, the hour has changed, so we end the previous file, open a new one, and update current_date_string_precise_to_hour.

I also had to deal with the initial case, when we don't have a current_date_string_precise_to_hour yet. I set the initial value to "", the empty string, then added a case to the if statement. The completed code looks like this:

import json
import os
from pathlib import Path

current_date_string_precise_to_hour = ""

def put_inside_data_file(date_string_precise_to_hour, record_json):
    global current_date_string_precise_to_hour

    # The hour changed: close the JSON list in the previous file
    if current_date_string_precise_to_hour != "" and \
            current_date_string_precise_to_hour != date_string_precise_to_hour:
        with open(Path(f"data/location_data/scooter_data_{current_date_string_precise_to_hour}.json"), 'ab+') as f:
            # Drop the trailing ",\r\n" (3 bytes with Windows line endings)
            f.seek(-3, os.SEEK_END)
            f.truncate()
        with open(Path(f"data/location_data/scooter_data_{current_date_string_precise_to_hour}.json"), 'a') as f:
            f.write("\n]\n")
    current_date_string_precise_to_hour = date_string_precise_to_hour

    data_file_path = Path(f"data/location_data/scooter_data_{date_string_precise_to_hour}.json")
    if not data_file_path.is_file():
        with open(data_file_path, 'w') as f:
            f.write("[\n")
    with open(data_file_path, 'a') as f:
        f.write(json.dumps(record_json))
        f.write(",\n")

    print(f"Placed data of {record_json['time_stamp']} in a file")

I also added put_inside_data_file(iso_time[:13], record) to the end of store_response(response_json), so this is now settled. Again, the complete code can be found in the GitHub repo.
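
The slice iso_time[:13] works because an ISO 8601 timestamp has a fixed layout: the first 13 characters cover the date and the hour. A small illustration with a made-up time:

```python
from datetime import datetime

# A made-up moment, used only to show the string layout.
example_time = datetime(2022, 3, 5, 14, 37, 12)
iso_time = example_time.isoformat()
hour_key = iso_time[:13]

print(iso_time)  # 2022-03-05T14:37:12
print(hour_key)  # 2022-03-05T14
print(f"data/location_data/scooter_data_{hour_key}.json")
```

Every record created within the same hour produces the same key, so they all land in the same file.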

Main loop

Now I just need the main function to run my script every minute:

import time
import handle_response
import make_api_requests

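while True:
    response = make_api_requests.get_scooter_locations()
    # get_scooter_locations() returns None when the request fails, so skip storing then
    if response is not None:
        handle_response.store_response(response)
    time.sleep(60)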
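
Note that time.sleep(60) starts counting only after the request and the file write have finished, so each cycle takes slightly longer than a minute. If that drift matters, one possible approach is to compute the sleep time from a fixed schedule. A sketch, where seconds_until_next_tick is a hypothetical helper:

```python
import time


def seconds_until_next_tick(start, now, interval=60.0):
    """Return how long to sleep so that ticks stay at start + n * interval."""
    elapsed = now - start
    return interval - (elapsed % interval)


# Example: the work took 2.5 seconds, so only 57.5 seconds of sleep remain.
print(seconds_until_next_tick(start=0.0, now=2.5))

# In a loop it could be used like this (commented out so the sketch does not run forever):
# start = time.monotonic()
# while True:
#     do_work()  # placeholder for the request and storage calls
#     time.sleep(seconds_until_next_tick(start, time.monotonic()))
```

time.monotonic() is used rather than time.time() in the commented loop because it is not affected by system clock adjustments.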

What I plan to do in the future

I need to run this script 24/7, but I don't want to keep my computer running all the time, as that wears it out and uses electricity. So I will look into VPS (virtual private server) hosting in the next blog.