Sports Videos in the Wild (SVW): A Video Dataset for Sports Analysis

The volume of digital video being created is increasing exponentially; for example, YouTube has reached an upload rate of 100 hours of video per minute. A great deal of this growth is due to the tremendous popularity of smartphones and ubiquitous Internet access. As a result, videos generated by amateur users form the new trend in content generation, and there is an immediate need for robust algorithms to automatically analyze and retrieve these videos. At the same time, many computer vision problems are data-driven, and representative, realistic datasets are necessary for developing robust algorithms. Therefore, we present a highly unconstrained dataset of sports videos, called Sports Videos in the Wild (SVW).

SVW comprises 4200 videos captured solely with smartphones by users of the Coach’s Eye smartphone app, a leading app for sports training developed by TechSmith Corporation. SVW includes 30 categories of sports and 44 different actions. Due to the imperfect practice of amateur players and unprofessional capturing by amateur users, SVW is very challenging for automated analysis.

Potential applications of SVW include: genre categorization, action recognition, action detection, and spatio-temporal alignment.

Sample Frames

Labelling

Each video is annotated with its sport genre. In addition, for 40% of the videos, the time span of each action and a bounding box showing the spatial extent of the action at its start and end frames are also specified.

In SVW, unlike existing datasets, there are multiple actions from the same sport genre, making appearance-based recognition infeasible.

Volleyball Labels
Annotated action categories ([343, 359, Forearm], [380, 400, Set], [438, 454, Spike]) within a video from the Volleyball genre category.
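To make the time-span annotation concrete, the sketch below holds the three Volleyball triples from the caption in a small Python structure. It assumes each triple reads as [start frame, end frame, action label] with inclusive frame indices. The names `ActionSpan` and `actions_at` are hypothetical and chosen only for illustration; they do not describe the format of the distributed label files.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ActionSpan:
    """One annotated action (hypothetical container, not the SVW file format)."""
    start: int   # assumed: first frame of the action (inclusive)
    end: int     # assumed: last frame of the action (inclusive)
    label: str   # action name as given in the annotation

    @property
    def length(self) -> int:
        # Number of frames covered, counting both endpoints.
        return self.end - self.start + 1


# The three actions listed in the Volleyball example above.
volleyball = [
    ActionSpan(343, 359, "Forearm"),
    ActionSpan(380, 400, "Set"),
    ActionSpan(438, 454, "Spike"),
]


def actions_at(frame: int, spans: List[ActionSpan]) -> List[str]:
    """Return the labels of all annotated actions that cover the given frame."""
    return [s.label for s in spans if s.start <= frame <= s.end]
```

A structure like this makes it straightforward to ask which action, if any, is in progress at a given frame, which is the kind of query action detection needs.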

Comparison with existing datasets

Dataset	Purpose	Categ. #	Clip #	Avg. length	Unconst. actions	Unconst. capturing	Camera vibration	Orientation	Sources
KTH	AR	6	100	NA	No	No	No	Landscape	Staged
Weizmann	AR	9	9	NA	No	No	No	Landscape	Staged
IXMAS	AR	11	30	NA	No	No	No	Landscape	Staged
UCF Sports	AR	9	14+	NA	Yes	No	No	Landscape	Broadcast TV
Olympic	AR	16	50	NA	Yes	No	No	Landscape	YouTube
Hollywood2	AR SU	A: 12 S: 10	61+ 62+	NA	Yes	No	No	Landscape	Movies
UCF50	AR	50	100+	NA	Yes	No	Slight	Landscape	YouTube
HMDB	AR	51	101+	NA	Yes	No	Slight	Landscape	Movies & Internet
UCF101	AR	101	100+	7.2	Yes	No	Slight	Landscape	YouTube
THUMOS	AR/AD	101	100+	NA	Yes	No	Slight	Landscape	YouTube
SVW	AR/AD GC	A:44 G: 30	50+ 110+	15.1	Yes	Yes	Yes	Landscape & Portrait	Smartphone & Tablet

Statistics

Video length

Camera Orientation

Field

Evaluation protocol

Genre categorization accuracy is used as the performance metric and is defined as the fraction of testing videos whose genres are correctly classified. Three splits of 70% training and 30% testing are generated for this purpose.
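The metric can be sketched in a few lines of Python. The function names `genre_accuracy` and `mean_accuracy_over_splits` are hypothetical. Averaging over the three splits is a common convention and is assumed here; the protocol text above only states that three splits are generated, not how their results are combined.

```python
from typing import List, Sequence, Tuple


def genre_accuracy(true_genres: Sequence[str],
                   predicted_genres: Sequence[str]) -> float:
    """Fraction of testing videos whose genre is correctly classified."""
    if len(true_genres) != len(predicted_genres):
        raise ValueError("label and prediction lists must have equal length")
    if not true_genres:
        raise ValueError("at least one testing video is required")
    correct = sum(t == p for t, p in zip(true_genres, predicted_genres))
    return correct / len(true_genres)


def mean_accuracy_over_splits(
        splits: List[Tuple[Sequence[str], Sequence[str]]]) -> float:
    """Average per-split accuracy (assumed way of combining the splits)."""
    accuracies = [genre_accuracy(truth, pred) for truth, pred in splits]
    return sum(accuracies) / len(accuracies)


# Toy example with made-up labels; these are not SVW results.
truth = ["golf", "tennis", "golf", "diving"]
pred = ["golf", "tennis", "tennis", "diving"]
print(genre_accuracy(truth, pred))
```

In practice each entry of `splits` would hold the ground-truth and predicted genres for the testing portion of one of the three 70/30 splits.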

For questions regarding this dataset please contact Morteza Safdarnejad (safdarne [at] egr.msu.edu).

SVW videos and labels can be downloaded from here.

Citation

If you use the SVW dataset, please cite this paper in your publications:

Sports Videos in the Wild (SVW): A Video Dataset for Sports Analysis
