CREMA-D (Crowd-sourced Emotional Multimodal Actors Dataset)

Notice: If cloning from GitHub fails, try cloning from GitLab at our CREMA-D mirror

Summary

CREMA-D is a data set of 7,442 original clips from 91 actors. The clips are from 48 male and 43 female actors between the ages of 20 and 74, from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified).

Actors spoke from a selection of 12 sentences. The sentences were presented using one of six different emotions (Anger, Disgust, Fear, Happy, Neutral, and Sad) and four different emotion levels (Low, Medium, High, and Unspecified).

Participants rated the emotion and emotion levels based on the combined audiovisual presentation, the video alone, and the audio alone. Because of the large number of ratings needed, this effort was crowd-sourced: a total of 2,443 participants each rated 90 unique clips, 30 audio, 30 visual, and 30 audio-visual. 95% of the clips have more than 7 ratings.

The description below specifies the data made available in this repository.

For a more complete description of how CREMA-D was created, see the paper cited below.

Access

If you access the GitHub repository, please fill out this form. That way we can keep a record of the community of CREMA-D users.

Contact/Questions

If you have questions about this data set, please submit a new issue to the repository or contact dcooper@wcupa.edu.

Storage requirements

Note: This repository uses Git Large File Storage (git-lfs). You will need to install it on top of your git installation in order to get the video and audio files. If you just download the zip file (~24MB zipped, ~163MB unzipped), the video and audio files will be git-lfs pointer files rather than the actual media. For more information, see the git-lfs documentation.
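The git-lfs workflow above can be sketched as the following shell commands. This is a sketch, assuming git-lfs is already installed via your package manager (e.g. `apt install git-lfs` or `brew install git-lfs`); the URL is the public GitHub repository for this project.

```shell
# One-time setup: register the LFS filters with your git installation.
git lfs install

# Clone the repository; LFS-tracked media files (.flv/.mp3/.wav)
# are fetched during checkout instead of being left as pointer files.
git clone https://github.com/CheyneyComputerScience/CREMA-D.git

# If you previously cloned without git-lfs, fetch the real files afterwards:
cd CREMA-D
git lfs pull
```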

This Directory holds files used in the paper:

Cao H, Cooper DG, Keutmann MK, Gur RC, Nenkova A, Verma R. CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset. IEEE transactions on affective computing. 2014;5(4):377-390. doi:10.1109/TAFFC.2014.2336244.

The collection of the videos is described in this paper:

Keutmann, M. K., Moore, S. L., Savitt, A., & Gur, R. C. (2015). Generating an item pool for translational social cognition research: methodology and initial validation. Behavior research methods, 47(1), 228-234.

License:

This Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D) is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/

Description

Text Data Files:
  1. SentenceFilenames.csv - list of movie files used in the study
  2. finishedEmoResponses.csv - the first emotional response with timing
  3. finishedResponses.csv - the final emotional responses with emotion levels, with repeated and practice responses removed; used to tabulate the votes
  4. finishedResponsesWithRepeatWithPractice.csv - the final emotional responses with emotion levels, with repeated and practice responses intact; used to observe repeated and practice responses
  5. processedResults/tabulatedVotes.csv - the tabulated votes for each movie file
  6. VideoDemographics.csv - a mapping of ActorID (the first 4 digits of each video file) to Age, Sex, Race, and Ethnicity
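As a minimal sketch of working with the demographics file, the snippet below maps each ActorID to its record using only the standard library. The header names follow the column list above (ActorID, Age, Sex, Race, Ethnicity) but should be checked against the actual file, and the sample row is invented for illustration.

```python
import csv
import io

# Stand-in for open("VideoDemographics.csv"); the row is illustrative only.
sample = io.StringIO(
    "ActorID,Age,Sex,Race,Ethnicity\n"
    "1001,51,Male,Caucasian,Not Hispanic\n"
)

# Map each 4-digit ActorID to its demographic record.
demographics = {row["ActorID"]: row for row in csv.DictReader(sample)}

# The actor ID is the first 4 characters of any clip filename,
# e.g. "1001_DFA_ANG_XX.wav" belongs to actor 1001.
actor_id = "1001_DFA_ANG_XX.wav"[:4]
print(demographics[actor_id]["Sex"])  # -> Male
```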
R Scripts
  1. processFinishedResponses.R - converts finishedResponses.csv to the tabulated votes in processedResults/tabulatedVotes.csv
  2. readTabulatedVotes.R - reads processedResults/tabulatedVotes.csv
Column descriptions are documented for the following files:
  1. Finished Responses (finishedResponses.csv and finishedResponsesWithRepeatWithPractice.csv)
  2. Finished EmoResponses (finishedEmoResponses.csv)
  3. Summary Table (processedResults/summaryTable.csv)
  4. Tabulated Votes (processedResults/tabulatedVotes.csv)
  5. Video Demographics (VideoDemographics.csv)
Filename labeling conventions

The actor ID is a 4-digit number at the start of the filename. Each subsequent identifier is separated by an underscore (_).

Actors spoke from a selection of 12 sentences (in parentheses is the three-letter acronym used in the second part of the filename):

  1. It's eleven o'clock (IEO)
  2. That is exactly what happened (TIE)
  3. I'm on my way to the meeting (IOM)
  4. I wonder what this is about (IWW)
  5. The airplane is almost full (TAI)
  6. Maybe tomorrow it will be cold (MTI)
  7. I would like a new alarm clock (IWL)
  8. I think I have a doctor's appointment (ITH)
  9. Don't forget a jacket (DFA)
  10. I think I've seen this before (ITS)
  11. The surface is slick (TSI)
  12. We'll stop in a couple of minutes (WSI)

The sentences were presented using one of six emotions (in parentheses is the three-letter code used in the third part of the filename): Anger (ANG), Disgust (DIS), Fear (FEA), Happy (HAP), Neutral (NEU), and Sad (SAD).

Each sentence was spoken at one of four emotion levels (in parentheses is the two-letter code used in the fourth part of the filename): Low (LO), Medium (MD), High (HI), and Unspecified (XX).

The suffix of the filename is based on the type of file: flv is flash video, used for presenting both the video-only and the audio-visual clips; mp3 is used for the audio files used in the audio-only presentation of the clips; wav is used for files intended for computational audio processing.
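Putting the convention together, a clip filename can be split into its four labeled parts plus the suffix. The sketch below assumes the underscore layout described above; the example filenames are illustrative.

```python
import os

def parse_clip_name(filename: str) -> dict:
    """Split e.g. '1001_DFA_ANG_XX.flv' into its four identifiers."""
    stem, ext = os.path.splitext(os.path.basename(filename))
    actor, sentence, emotion, level = stem.split("_")
    return {
        "actor_id": actor,          # 4-digit actor ID
        "sentence": sentence,       # 3-letter sentence acronym
        "emotion": emotion,         # 3-letter emotion code
        "level": level,             # 2-letter emotion-level code
        "suffix": ext.lstrip("."),  # flv / mp3 / wav
    }

info = parse_clip_name("VideoFlash/1001_DFA_ANG_XX.flv")
print(info["emotion"], info["suffix"])  # -> ANG flv
```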

Video Files

Flash Video Files used for presentation to the Raters are stored in the VideoFlash directory.

Audio Files

MP3 Audio files used for presentation to the Raters are stored in the AudioMP3 directory.

Processed Audio

WAV Audio files converted from the original video into a format appropriate for computational audio processing are stored in the AudioWAV directory.

Funding Sources

All data collection and method development was supported by the following funding sources: