ChatGPT Impresses Vitalik Buterin in Extracting Data from GeoLife Dataset

In a recent blog post, Vitalik Buterin, the co-founder of Ethereum, explored an interesting question: how does the time it takes to travel from point A to point B scale with distance in the real world? He used the GeoLife dataset to randomly select points where people actually are, and an API to get the public transit travel time between those points.

Buterin found that travel time grows more slowly than linearly with distance: the farther one has to go, the more opportunity there is to use forms of transportation that are faster but carry some fixed overhead. For distances under 500 km, the power-law fit given by the linear regression is travel_time = 965.8020738916074 * distance^0.6138556361612214 (time in seconds, distance in km).
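To make the fit concrete, here is a minimal Python sketch that evaluates it. The two constants are the ones quoted above; the function name is hypothetical and not taken from Buterin's post, and the fit is only claimed for distances under 500 km.

```python
# Coefficients quoted in the article (time in seconds, distance in km).
COEFFICIENT = 965.8020738916074
EXPONENT = 0.6138556361612214


def predicted_travel_time(distance_km: float) -> float:
    """Predicted public transit travel time in seconds (fit covers < 500 km)."""
    return COEFFICIENT * distance_km ** EXPONENT


if __name__ == "__main__":
    for d in (1, 10, 100):
        minutes = predicted_travel_time(d) / 60
        print(f"{d:>4} km -> {minutes:.1f} min")
```

Because the exponent is below 1, doubling the distance multiplies the predicted time by 2^0.614, roughly 1.53, rather than by 2.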

To gather data for longer distances, Buterin manually obtained travel time data for 16 pairs of points that were more than 500 km apart. For each pair, he obtained the public transit travel time from the starting point to the nearest airport, the public transit travel time from the nearest airport to the end point, and the flight time between the two. He computed the travel time as (to_airport) * 1.5 + (90 if international else 60) + flight_time + from_airport.
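The long-distance formula translates directly into code. The sketch below is an illustration, not Buterin's own script: the function and parameter names are hypothetical, and it assumes all inputs are in minutes, which the 60 and 90 overhead values suggest but the text above does not state.

```python
def long_distance_travel_time(
    to_airport: float,
    from_airport: float,
    flight_time: float,
    international: bool,
) -> float:
    """Total travel time for a flight-based trip.

    Assumption: every value is in minutes (the article does not say).
    """
    # Fixed airport overhead: 90 for international flights, 60 otherwise.
    overhead = 90 if international else 60
    # The leg to the departure airport is weighted by 1.5, as in the formula.
    return to_airport * 1.5 + overhead + flight_time + from_airport


if __name__ == "__main__":
    # Hypothetical trip: 40 to the airport, 120 in the air, 30 from the airport.
    print(long_distance_travel_time(40, 30, 120, international=False))
```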

Although he encountered a few bugs and had to fix some code manually, he found that ChatGPT 3.5 was particularly good at teaching him libraries and APIs that he had never heard of before but that others had been using. He asked ChatGPT for help with four tasks: getting an API key for the Google Maps Directions API; writing a function to compute the straight-line distance between two GPS coordinates; given a list of (distance, time) pairs, drawing a scatter plot with time and distance as axes, both logarithmically scaled; and doing a linear regression on the logarithms of distance and time to try to fit the data to a power law.
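Two of these tasks can be sketched with the Python standard library alone. This is not the code ChatGPT produced for Buterin: it assumes the haversine formula with a mean Earth radius of 6371 km for the straight-line distance, and an ordinary least-squares fit on the logarithms for the power law. All names are hypothetical, and the scatter plot is left out to avoid a plotting dependency.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def fit_power_law(pairs: list[tuple[float, float]]) -> tuple[float, float]:
    """Fit time = a * distance**b by least squares on log(distance), log(time).

    Returns (a, b). All distances and times must be positive.
    """
    xs = [math.log(d) for d, _ in pairs]
    ys = [math.log(t) for _, t in pairs]
    n = len(pairs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


if __name__ == "__main__":
    # Synthetic data generated from time = 2 * distance**0.5, for illustration only.
    samples = [(d, 2 * d ** 0.5) for d in (1, 4, 9, 16)]
    a, b = fit_power_law(samples)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

The slope of the regression line in log-log space is the exponent of the power law, and the exponential of the intercept is its coefficient, which is how a fit of the form travel_time = a * distance^b falls out of a linear regression.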

Overall, the study provides valuable insights into the relationship between travel time and distance, and shows the potential of using AI to extract meaningful insights from large datasets.
