Joint / School Trip Detection and Imputation
Households with children who participate via the rMove app are asked
to proxy-report travel information for children under 18. These
proxy-reported trips are often underreported in surveys and must be
imputed to provide a more complete household travel survey dataset.
There are two primary types of known missing trips in raw datasets:
- Joint Trips: Trips on which a member is
present (based on other household members’ records) that are not copied
to the member’s travel diary.
- School Trips: Non-reported school trips on
days when a parent reported that a child attended school.
While the primary purpose of this imputation effort is to impute
child school trips, a necessary first step is to impute missing joint
trips in general. The imputation process is therefore divided into two
stages. The first stage detects joint trip records and associates them
with each other to determine which proxy-reported trips, if any, are
missing. In the second stage, if a person still lacks a school trip
after the joint trip imputation but should have one based on reported
school attendance, individual outbound and inbound trips are synthesized
for that person.
The process to impute each trip type is described below.
Joint Trip Detection and Imputation
To identify and impute missing joint trips, RSG first performs a
joint trip detection process to match trip records across household
members. Because smartphones traveling at the same time may detect
information slightly differently, this process is not exact. User-added
trips (which may include inexact times and locations) add to the
complexity. To best match trips across members, RSG treats two trips as
a candidate pair when they depart and arrive less than 15 minutes apart
and their start and end locations are less than 100 meters apart. The
resulting list of trip pairs forms a network graph in which each unique
joint trip is identified as a clique (i.e., a fully connected subgraph
of two or more trips that completely overlap with each other).
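The matching and clique steps above can be sketched as follows. This is a minimal illustration, not RSG's implementation: the field names (`trip_id`, `person`, `depart_min`, `arrive_min`, `origin`, `dest`) are hypothetical, times are assumed to be minutes after midnight, locations are assumed to be (latitude, longitude) pairs, and cliques are enumerated with a basic Bron-Kerbosch search. The source does not say how overlapping cliques are resolved, so the sketch simply returns every maximal clique of two or more trips.

```python
import math
from itertools import combinations

MAX_TIME_GAP_MIN = 15   # threshold from the text
MAX_DISTANCE_M = 100    # threshold from the text


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(a))


def trips_match(a, b):
    """True when two trips by different people fall within both thresholds."""
    if a["person"] == b["person"]:
        return False
    return (abs(a["depart_min"] - b["depart_min"]) < MAX_TIME_GAP_MIN
            and abs(a["arrive_min"] - b["arrive_min"]) < MAX_TIME_GAP_MIN
            and haversine_m(a["origin"], b["origin"]) < MAX_DISTANCE_M
            and haversine_m(a["dest"], b["dest"]) < MAX_DISTANCE_M)


def build_graph(trips):
    """Adjacency sets keyed by trip_id; an edge means the two trips match."""
    graph = {t["trip_id"]: set() for t in trips}
    for a, b in combinations(trips, 2):
        if trips_match(a, b):
            graph[a["trip_id"]].add(b["trip_id"])
            graph[b["trip_id"]].add(a["trip_id"])
    return graph


def find_joint_trips(trips):
    """Return maximal cliques (size >= 2) of mutually matching trips."""
    graph = build_graph(trips)
    cliques = []

    def expand(r, p, x):
        # Bron-Kerbosch without pivoting: r is the growing clique,
        # p the candidates, x the already-processed vertices.
        if not p and not x:
            if len(r) >= 2:
                cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Hypothetical household: persons 1 and 2 travel together in the morning.
trips = [
    {"trip_id": "t1", "person": 1, "depart_min": 480, "arrive_min": 500,
     "origin": (44.0000, -72.0000), "dest": (44.0500, -72.0500)},
    {"trip_id": "t2", "person": 2, "depart_min": 483, "arrive_min": 502,
     "origin": (44.0001, -72.0000), "dest": (44.0500, -72.0501)},
    {"trip_id": "t3", "person": 1, "depart_min": 900, "arrive_min": 920,
     "origin": (44.0500, -72.0500), "dest": (44.0000, -72.0000)},
]
joint_trips = find_joint_trips(trips)
```

In this example only `t1` and `t2` satisfy both thresholds, so they form the single detected joint trip. For large households or long travel periods, a graph library would likely be a better fit than this hand-rolled search.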
Once all trips are associated into sets of unique joint trips, the
missing joint trips can be identified as the difference between the
number of joint trip records and the total reported household members in
a unique joint trip. For example, if a joint trip has only three trip
records but those records report four household members, then one trip
record is missing.
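A small sketch of this bookkeeping, assuming each record carries a hypothetical `person_num` field and a hypothetical `hh_members_on_trip` field listing the household members the respondent reported as present. Taking the union of the reported members across records is an assumption made here; the source only states that the count is a difference.

```python
def missing_members(joint_trip_records):
    """Return person numbers reported on a joint trip that have no record."""
    reported = set()
    for rec in joint_trip_records:
        reported |= set(rec["hh_members_on_trip"])
    have_record = {rec["person_num"] for rec in joint_trip_records}
    return sorted(reported - have_record)


# Three records, each reporting four household members on the trip.
records = [
    {"person_num": 1, "hh_members_on_trip": [1, 2, 3, 4]},
    {"person_num": 2, "hh_members_on_trip": [1, 2, 3, 4]},
    {"person_num": 3, "hh_members_on_trip": [1, 2, 3, 4]},
]
missing = missing_members(records)
```

Here `missing` contains one person number, matching the three-records, four-members example in the text.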
These missing trips can then be imputed by copying the trip
attributes from a host trip. The host is determined as the person with
the lowest person number in the household. One caveat is that the trip
purpose cannot always be copied when the purpose of the host trip is
“escort,” which is common for missing child trips. To mitigate this, the
purpose is imputed when the destination is a known fixed location, such
as home or school. A buffer radius of 100 meters is used to check whether
the destination is associated with one of those known fixed locations.
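The copy-and-relabel logic can be illustrated as below. This is a sketch under stated assumptions: the field names (`person_num`, `purpose`, `dest_xy`, `imputed`) are hypothetical, coordinates are assumed to be in a projected system measured in meters so that a plain Euclidean distance is adequate, and leaving the purpose empty when no fixed location matches is a choice made for the example rather than something the source specifies.

```python
import math

PURPOSE_BUFFER_M = 100  # buffer radius from the text


def pick_host(records):
    """Host is the record of the person with the lowest person number."""
    return min(records, key=lambda r: r["person_num"])


def impute_joint_trip(records, missing_person_num, fixed_locations):
    """Copy the host trip for a missing member and adjust the purpose.

    fixed_locations maps a purpose label (e.g. "home", "school") to the
    missing member's known (x, y) location in meters.
    """
    host = pick_host(records)
    new_trip = dict(host)  # copy attributes without mutating the host
    new_trip["person_num"] = missing_person_num
    new_trip["imputed"] = True
    if host["purpose"] == "escort":
        # An escort purpose describes the host, not the escorted person.
        new_trip["purpose"] = None
        for label, xy in fixed_locations.items():
            if math.dist(host["dest_xy"], xy) <= PURPOSE_BUFFER_M:
                new_trip["purpose"] = label
                break
    return new_trip
```

For example, if the host trip ends 50 meters from the child's school location, the copied trip would be labeled with the school purpose; if no fixed location falls within the buffer, the purpose stays unset in this sketch.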
School Trip Detection and Imputation
Missing school trips cannot be copied from a host trip and must be
entirely synthesized. The origin and destination are assumed to be the
home and school locations, and the departure and arrival times are
randomly sampled from the distribution of school trip departures and
durations in the data itself. Two trips are synthesized for each missing
school day, one outbound and one inbound. The trip mode is assumed to be
the typical mode to school as reported in the survey. If the mode is
“drive,” there are no escort trips, and the child is not of driving age,
the child is assumed to be a passenger of someone outside the household
(e.g., carpool).
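A minimal sketch of the synthesis step follows. Several details are assumptions not stated in the source: the field names, the driving-age cutoff of 16, the `passenger_non_household` mode label, and the choice to draw departure time and duration together from a single observed school trip (a "donor") rather than sampling them independently.

```python
import random

DRIVING_AGE = 16  # assumption: the source does not state the cutoff


def resolve_mode(typical_mode, age, has_household_escort):
    """Apply the drive-mode rule for children below driving age."""
    if (typical_mode == "drive" and not has_household_escort
            and age < DRIVING_AGE):
        return "passenger_non_household"  # hypothetical mode label
    return typical_mode


def synthesize_school_trips(child, observed_school_trips, rng):
    """Create one outbound and one inbound trip for a missing school day."""
    legs = (("outbound", "home", "school"), ("inbound", "school", "home"))
    mode = resolve_mode(child["typical_school_mode"], child["age"],
                        child["has_household_escort"])
    trips = []
    for direction, origin, dest in legs:
        pool = [t for t in observed_school_trips
                if t["direction"] == direction]
        donor = rng.choice(pool)  # empirical draw from observed trips
        trips.append({
            "person_num": child["person_num"],
            "direction": direction,
            "origin": origin,
            "dest": dest,
            "depart_min": donor["depart_min"],
            "arrive_min": donor["depart_min"] + donor["duration_min"],
            "mode": mode,
            "imputed": True,
        })
    return trips
```

Passing a seeded `random.Random` instance as `rng` keeps the imputation reproducible across runs. The sketch assumes at least one observed trip exists in each direction; `rng.choice` raises an error on an empty pool.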