2017
DOI: 10.1109/tiv.2017.2720459

How Much Data is Enough? A Statistical Approach with Case Study on Longitudinal Driving Behavior

Abstract: Big data has shown its uniquely powerful ability to reveal, model, and understand driver behaviors. The amount of data affects the experiment cost and conclusions in the analysis. Insufficient data may lead to inaccurate models while excessive data waste resources. For projects that cost millions of dollars, it is critical to determine the right amount of data needed. However, how to decide the appropriate amount has not been fully studied in the realm of driver behaviors. This paper systematically investigate…

Cited by 71 publications (58 citation statements)
References 64 publications (86 reference statements)
“…The predominant reason for this is the low exposure of the safety-critical events, such as crashes, that makes the training dataset sufficient for making meaningful statistical inference to be extremely large to the extent that collecting it under Naturalistic Field-Operational Test (N-FOT), i.e. on-road deployment, becomes time-consuming, risky, and costly [2], [3], [4]. Google cars, Waymo, for instance, have logged over 3.5 million miles from four states in the USA: Washington, California, Arizona, and Texas [5].…”
Section: Introduction (mentioning)
confidence: 99%
“…We used driving encounter data collected by the University of Michigan Transportation Research Institute (UMTRI) [4] from about 3,500 vehicles equipped with on-board GPS. The latitude and longitude data was recorded by the GPS device to represent vehicle positions.…”
Section: Experiments, A. Dataset and Preprocessing (mentioning)
confidence: 99%
“…In addition, if the relative range was less than 10 m, the event ended, ensuring that no Stop-&-Go case was included. • The duration for a singular car-following event should be larger than 50 s. • Any driver with less than 500 car-following events was eliminated, which was able to guarantee the collected data were enough to capture the underlying driving styles [46]. Based on the above criteria, 49 drivers were selected out from 56 drivers.…”
Section: B. Data Extraction (mentioning)
confidence: 99%
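
The extraction criteria quoted in the last statement (end an event once the relative range drops below 10 m, keep only events longer than 50 s, drop drivers with fewer than 500 events) can be sketched in Python as below. This is a minimal illustration, not the citing authors' code: the data layout (a list of relative-range samples per event), the fixed sample period, and the names `truncate_at_close_range` and `select_drivers` are all assumptions made for the example.

```python
def truncate_at_close_range(ranges_m, min_range_m=10.0):
    """Cut an event at the first sample whose relative range is below min_range_m."""
    trimmed = []
    for r in ranges_m:
        if r < min_range_m:
            break  # the event ends here, which excludes stop-and-go segments
        trimmed.append(r)
    return trimmed


def select_drivers(events_by_driver, sample_period_s,
                   min_duration_s=50.0, min_events=500, min_range_m=10.0):
    """Apply the three quoted criteria to hypothetical per-driver event data.

    events_by_driver: dict mapping a driver id to a list of events, where each
    event is a list of relative-range samples in metres (assumed layout).
    sample_period_s: time between consecutive samples (assumed constant).
    """
    selected = {}
    for driver, events in events_by_driver.items():
        kept = []
        for ranges_m in events:
            trimmed = truncate_at_close_range(ranges_m, min_range_m)
            # Keep the event only if it lasts strictly longer than the threshold.
            if len(trimmed) * sample_period_s > min_duration_s:
                kept.append(trimmed)
        # Drivers with fewer than min_events remaining events are eliminated.
        if len(kept) >= min_events:
            selected[driver] = kept
    return selected
```

With the defaults, a driver is retained only if at least 500 of their events survive the range and duration checks; passing a smaller `min_events` makes the function easy to try on toy data.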