Distance v Demand using O&D Survey
7/24/2019 - M. Lawder
As I started looking through some of the Origin & Destination (O&D) Survey data available in the DB1B database from the Bureau of Transportation Statistics, I kept getting more excited about the breadth of data it contains. There will be more analysis in future posts, but I'll start with a question I had before digging into the data: to which airports does St. Louis send more passengers than "expected", and where is demand lower than expected?
There are a lot of ways to measure the expected demand for an O&D pair, but I went with a crude measure to start: take an airport's total departures in the O&D Survey and divide by the total O&D departures for the whole USA. This simple math gives the percentage of passengers that the airport should be sending to every other airport in the country (if everything else were equal). For STL, there were 5.2M passengers completing (domestic) travel at the airport compared to 567.8M total completed domestic trips in the O&D Survey for 2018, so STL's expected share of passengers across all of its O&D pairs is 0.92%.
A little more info about the O&D Survey data: it contains information on about 10% of all domestic tickets (so to get total passenger numbers you multiply by 10). For this analysis I looked specifically at the Market database, which records a passenger's initial starting point and final destination regardless of whether the trip is nonstop or includes a layover. So you can measure how many people are flying between any two airports in the country even if those airports don't have direct flights between them. This survey is about as good a source as is available for estimating O&D demand between airports, although the actual routes flown between the airports will affect the values. If you want to look at route data, check out our Airline Browser Tool; you can also look more closely at specific routes in the Survey's Coupon data.
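The expected-share arithmetic can be sketched in a few lines of Python. This is only an illustration of the calculation described above, not the code used for the post: the function and constant names are made up for the example, and it assumes the 5.2M and 567.8M figures have already been scaled up from the 10% sample.

```python
# Figures quoted in the post (2018, domestic). Assumed to be already
# scaled up from the 10% ticket sample.
STL_PASSENGERS = 5.2e6
US_PASSENGERS = 567.8e6
SAMPLE_SCALE = 10  # the survey covers about 10% of tickets, so multiply by 10


def scale_sample(sampled_count: int, scale: int = SAMPLE_SCALE) -> int:
    """Estimate total passengers from a raw count in the 10% sample."""
    return sampled_count * scale


def expected_share_pct(airport_passengers: float, total_passengers: float) -> float:
    """Percentage of all completed trips that one airport accounts for."""
    return 100.0 * airport_passengers / total_passengers


stl_expected = expected_share_pct(STL_PASSENGERS, US_PASSENGERS)
print(f"STL expected share: {stl_expected:.2f}%")  # prints 0.92%
```

Under the "everything else equal" assumption, this single number is the share of arriving passengers STL would supply to every destination, which is what the observed percentages get compared against.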
Back to STL: looking at all of the tickets flying out of Lambert, we can see that the airlines are sending a larger portion of passengers than "expected" to the following airports (minimum 1M annual departing passengers):
Top 10 airports from STL
Note that the PCT of Total Arriving PAX column represents the percentage of travelers arriving at the specific airport who began their trips at STL. For example, STL supplies roughly 14 of every 1,000 passengers that arrive in Omaha. There are a couple of interesting groupings in the top 10. Five of the airports are in major markets with multiple airports, where one airport takes a large portion of the overall traffic to that city (DAL, HOU, MDW, LGA, DCA). Three airports fall into the "popular destinations that are getting a little far to drive to" category (RSW, DEN, SAT). OMA and DSM are much closer than those three, but they have multiple direct flights a day to STL, which could be encouraging air travel over driving. All of the top 10 destinations have direct service from Southwest (the largest airline at STL). Only 3 of the bottom 10 destinations have direct service from STL and none are served by Southwest. Here are the lowest (11) pairs:
Bottom 11 airports from STL
Several of the airports in the lowest pairs are again in larger markets with multiple airports (several of the same markets appear in both the top and bottom groups), but they draw fewer travelers from STL because the market's other airports take that traffic (ORD, BUR, IAD, LGB, JFK). Several are in cities too close to STL (and with few direct flight options) to make flying worthwhile (IND, MEM, CVG, SDF), although I'm still not sure why MEM is so amazingly low.
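One way to put the top and bottom groups on a common footing is to divide each observed percentage by the 0.92% expectation. The sketch below does that for a handful of airports using values from the two tables. The "index" name and the `demand_index` helper are invented for this example; the post itself only compares the raw percentages.

```python
EXPECTED_PCT = 0.92  # STL's expected share, from the post

# Share of each destination's arriving passengers that started at STL (%),
# copied from the top and bottom tables in the post.
observed_pct = {
    "DAL": 1.9049,
    "OMA": 1.3917,
    "DEN": 1.0994,
    "ORD": 0.3983,
    "JFK": 0.0370,
    "MEM": 0.0020,
}


def demand_index(observed: float, expected: float = EXPECTED_PCT) -> float:
    """Ratio of observed to expected share; 1.0 means exactly as expected."""
    return observed / expected


for airport, pct in sorted(observed_pct.items(), key=lambda kv: kv[1], reverse=True):
    per_thousand = pct * 10  # percent -> passengers per 1,000 arrivals
    print(f"{airport}: {per_thousand:5.1f} per 1,000 arrivals, "
          f"index {demand_index(pct):.2f}")
```

An index above 1 means STL supplies more of that airport's passengers than the crude expectation, and an index below 1 means fewer. The per-1,000 column is the same conversion behind the "roughly 14 of every 1,000" figure for Omaha.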
Now, this measure by itself does not show that St. Louis has weak travel demand to a given destination, because many of those destinations (as you might be able to tell) are close by, and people are most likely just using other means to get there. For a non-STL example, even though Newark (EWR) and Philadelphia (PHL) are both large airports, the number of trips between them is much lower than the "expected" value because more people would drive, take a bus, or ride a train to cover the 90 or so miles. So we expect nearby destinations to have lower air travel demand than the rough "expectation" even though the overall travel demand is probably very high. So how does the distance between destinations influence demand?
Below we plotted each destination from STL, with the actual percentage of the destination's passengers that STL supplied against the nonstop distance between the airports. In red is a trendline for the data:
All destinations from STL (nonstop and connecting service) with over 1/2M annual arriving passengers. Chart shows the percentage of each airport's annual arriving passengers that originated from STL plotted against the nonstop distance between the airports. Hover over each dot to see annual passenger numbers traveling from STL to a destination. Data aggregated from the DB1B Survey.
The trendline shows the (not unexpected) result that nearby destinations have a smaller percentage of their passengers coming from STL. The percentage rises sharply and sees its highest values at destinations between 400 and 1,000 miles away before beginning to drop again for destinations farther out. This trend would seem to follow the thought that overall travel demand for a destination decreases as the destination gets farther away. However, once a destination gets close enough that other forms of travel make more sense than flying, air travel demand drops. Note again that this uses percentages and not raw passenger counts.
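The post does not say how the red trendline was fit, so the sketch below swaps in a much cruder stand-in: averaging the STL share within three distance bands, with cut points at the 400 and 1,000 mile marks mentioned above. It uses only the 21 airports from the top and bottom tables, which are the extremes rather than the full set of plotted destinations, so treat the result as an illustration of the technique and not a reproduction of the chart. All names in the code are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# (destination, miles from STL, % of arriving passengers from STL),
# copied from the top and bottom tables in the post. These are the
# extremes only, not every destination in the chart.
ROWS = [
    ("DAL", 546, 1.9049), ("HOU", 687, 1.5391), ("MDW", 251, 1.4278),
    ("OMA", 342, 1.3917), ("LGA", 888, 1.3789), ("RSW", 979, 1.2666),
    ("DCA", 719, 1.2468), ("DEN", 770, 1.0994), ("DSM", 259, 1.0809),
    ("SAT", 786, 1.0423), ("ORD", 258, 0.3983), ("CVG", 308, 0.3596),
    ("SJU", 2024, 0.3464), ("BUR", 1583, 0.3015), ("MYR", 728, 0.2916),
    ("IAD", 696, 0.2533), ("LGB", 1581, 0.0871), ("SDF", 254, 0.0512),
    ("IND", 229, 0.0464), ("JFK", 892, 0.0370), ("MEM", 256, 0.0020),
]

# Band edges are an assumption based on the 400 to 1,000 mile peak in the text.
BANDS = [
    (0, 400, "under 400 mi"),
    (400, 1000, "400 to 1,000 mi"),
    (1000, float("inf"), "over 1,000 mi"),
]


def band_label(miles: float) -> str:
    """Return the label of the distance band that contains `miles`."""
    for low, high, label in BANDS:
        if low <= miles < high:
            return label
    raise ValueError(f"distance out of range: {miles}")


def mean_share_by_band(rows):
    """Average the STL share (%) of the destinations in each distance band."""
    grouped = defaultdict(list)
    for _dest, miles, pct in rows:
        grouped[band_label(miles)].append(pct)
    return {label: mean(grouped[label]) for _low, _high, label in BANDS if label in grouped}


band_means = mean_share_by_band(ROWS)
for label, value in band_means.items():
    print(f"{label}: {value:.2f}% average share")
```

Even on this small, skewed sample the middle band comes out highest, which matches the rise-then-fall shape described above. A smoother fitted over every destination would be a better match for the actual trendline.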
Is this trend happening only because of St. Louis' location relative to other airports? Heading east, most destinations are within 1,000 miles, while many destinations on the West Coast are over 1,500 miles away. But we still generally see a similar trend at other airports around the country. Not all airports follow the trend, and for the ones that do, the distance of the peak varies from only a couple hundred miles to over a thousand. Below are several more airports that exhibit the trend to various degrees.
Similar plot to above for six additional airports.
While the shape of the trend is similar, the actual scatter plots depend on the airport's location, which determines the distances (x-axis). DEN appears to buck the trend of nearby airports having weaker connections, but in reality there are only 4 major airports within 500 miles of DEN, so we aren't going to be able to see how nearby airports influence DEN. The trend does appear to be more obvious for airports that have a large number of airports within 1,000 miles. Additionally, larger markets will generally have higher percentages for all destinations since these markets ultimately account for more total passengers around the US.
An extension of this analysis could normalize for each airport's location relative to the other airports. However, I don't plan on spending much more time on the distance-to-demand connection because there are a lot of other interesting analyses to do with the O&D Survey beyond this first look!
Top 10 airports from STL

| Destination | PAX Originating @ STL | PCT of Total Arriving PAX | MILES from STL |
|---|---|---|---|
| DAL | 99,890 | 1.9049 | 546.0 |
| HOU | 68,120 | 1.5391 | 687.0 |
| MDW | 94,370 | 1.4278 | 251.0 |
| OMA | 30,940 | 1.3917 | 342.0 |
| LGA | 173,840 | 1.3789 | 888.0 |
| RSW | 54,350 | 1.2666 | 979.0 |
| DCA | 115,850 | 1.2468 | 719.0 |
| DEN | 202,210 | 1.0994 | 770.0 |
| DSM | 13,020 | 1.0809 | 259.0 |
| SAT | 44,750 | 1.0423 | 786.0 |
Bottom 11 airports from STL

| Destination | PAX Originating @ STL | PCT of Total Arriving PAX | MILES from STL |
|---|---|---|---|
| ORD | 69,790 | 0.3983 | 258.0 |
| CVG | 13,070 | 0.3596 | 308.0 |
| SJU | 11,190 | 0.3464 | 2024.0 |
| BUR | 7,830 | 0.3015 | 1583.0 |
| MYR | 3,510 | 0.2916 | 728.0 |
| IAD | 10,940 | 0.2533 | 696.0 |
| LGB | 1,560 | 0.0871 | 1581.0 |
| SDF | 840 | 0.0512 | 254.0 |
| IND | 1,900 | 0.0464 | 229.0 |
| JFK | 3,980 | 0.0370 | 892.0 |
| MEM | 40 | 0.0020 | 256.0 |