The total duration of videos is 2 hours. The dataset adopted different cameras to shoot fire videos. The shooting time includes day and night.The dataset can be used for tasks such as fire detection.
These expressions are produced at two levels of emotional intensities (regular and strong) except for the neutral emotion that only contains regular intensity.
Human lipreading performance increases for longer words, indicating the importance of features capturing temporal context in an ambiguous communication channel.