Static and Dynamic Cues to Male Attractiveness

Most studies on facial attractiveness have relied on attractiveness judged from photographs rather than video clips. Only a few studies combined images and video sequences as stimuli. In order to determine static and dynamic cues to male attractiveness, we performed behavioural and computational analyses of the Mr. World 2014 contestants. We asked 365 participants to assess the attractiveness of images or video sequences (thin slices) taken from the profile videos of the Mr. World 2014 contestants. Each participant rated the attractiveness on a 7-point scale, ranging from very unattractive to very attractive. In addition, we performed computational analyses of the landmark representations of faces in images and videos to determine which types of static and dynamic facial information predict the attractiveness ratings. The behavioural study revealed that (1) the attractiveness assessments of images and video sequences are highly correlated, and (2) the attractiveness assessment of videos was on average 0.25 point above that of images. The computational study showed that (i) for images and video sequences, three established measures of attractiveness correlate with the attractiveness ratings, and (ii) mouth movements correlate negatively with attractiveness ratings. The conclusion of the study is that thin slices of dynamic facial expressions contribute to the attractiveness of males in two ways, one positive and one negative. The positive contribution is that presenting a male face in a dynamic way leads to a slight increase in attractiveness rating. The negative contribution is that mouth movements correlate negatively with attractiveness ratings.


Introduction
Facial appearance has been claimed to be the most important component of physical attractiveness (1). Most studies on facial attractiveness have relied on attractiveness judged from photographs rather than from video clips (2; 3). Only a few studies combined images and video sequences as stimuli (4; 5; 6). The focus of this paper is on the behavioural and computational study of static and dynamic male attractiveness. The analyses will be performed on short video sequences, so-called thin slices. Very brief encounters with persons have been found to allow for accurate and reliable assessments of their traits or qualities (7). Such thin-slice encounters enable human assessors to make split-second decisions on the suitability or capabilities of individuals. Our research question for this paper reads as follows.

RQ1: To what extent do thin slices of dynamic facial expressions contribute to the attractiveness of males?
To answer this question, we focus on two sub-questions.
RQ1a: To what extent do the attractiveness ratings differ for static and dynamic male faces?

RQ1b: What static and dynamic characteristics of male faces predict the attractiveness ratings?
RQ1a will be addressed by means of a behavioural study in which participants are instructed to rate the attractiveness of images and short video sequences (thin slices) of males. RQ1b will be addressed through computational analyses of the static and dynamic stimuli used in the behavioural experiment.
The paper is organised as follows. We first discuss the three main visual cues of male attractiveness. We then review previous behavioural findings on the relative contribution of static and dynamic facial information. Next, we describe the research method for the behavioural study (addressing RQ1a) and specify the video collection and the statistical and computational analyses (addressing RQ1b). We then provide a general discussion of the results. Finally, we answer RQ1.
Seminar Nasional "Archipelago Engineering" (ALE) 2018 Ambon, 26 April 2018 Fakultas Teknik Universitas Pattimura, ISSN : 2620-3995

Three Visual Cues of Male Attractiveness
In previous work, three visual cues to male attractiveness have been discovered: symmetry, averageness, and masculinity (8; 9; 10). We briefly discuss each of these three cues below. The first cue is symmetry. Faces that have a vertical symmetry are generally rated as more attractive than those that do not (11). Symmetry may signal an individual's genetic quality in defence against parasites (12). Symmetry also provides other cues to good health. Highly symmetric faces are assessed as being more attractive, healthier and more physically fit than their low-symmetry counterparts (13).
The second cue is averageness. An average face, obtained by averaging over a number of different faces, is typically assessed to be more attractive than a non-averaged face (10). Persons with average faces are assessed to be healthier (14). The more similar an individual face is to the average face, the more attractive it becomes.
The third cue is masculinity. Male faces that score high on masculinity are assessed to be more attractive than those that score low on masculinity (8).
Examples of facial traits of masculinity are large jaws and prominent eyebrows (15).
The evidence supporting these three cues in relation to attractiveness prompts us to focus our analyses on them.

Static and Dynamic Cues to Attractiveness
The visual cues of symmetry, averageness and masculinity can all be assessed from static images of frontal faces. Assuming that these cues are sufficient to determine the attractiveness of males, presenting participants with either images or videos of male faces is likely to result in the same attractiveness ratings. In case facial dynamics provide additional cues that affect the attractiveness in a way that differs from the static cues, this may give rise to different attractiveness ratings. In previous work, three studies focussed on the relationship between attractiveness assessments of static and dynamic faces (4; 5; 6). Below, we will review these three studies. Here, we remark that the studies differ in (1) the gender of assessors (female or male) and (2) the gender of the to-be-assessed models (female or male). Our focus is on female assessors that assess the attractiveness of males. However, the static versus dynamic presentation of men to women (our case), men to men, women to men, and women to women seems to have similar effects for both genders. Hence, our review includes studies of males and females assessing either males or females. We are now ready for our comparison.
The first study is by Rubenstein (4). He found a difference in the evaluation of dynamic and static images. He conducted two experiments. In the first experiment he compared the attractiveness ratings of female models (rather than male models) displayed as a static image or as a 10-second video clip. In the clip, the models read a text while maintaining a neutral expression. The static image was defined as a single frame taken from the clip. The assessors consisted of males and females. Half of the assessors were first provided with 50% of the static models and then with 50% of the thin slices. For the other half of the assessors, the order was reversed. The average attractiveness ratings on a five-point scale for the static and dynamic stimuli differed slightly: 2.87 (SD = 0.95) and 2.94 (SD = 0.92), respectively. The correlation of the ratings assigned to the same models presented in static and dynamic format was quite low (r = 0.19), suggesting a clear difference between attractiveness assessments for both formats. In the second experiment, Rubenstein examined how ratings of the emotional expression related to attractiveness. He found that the valence of emotion is a relevant cue in the dynamic format, but not in the static format. Positive emotions were strongly related to attractiveness for dynamic faces (r = 0.48), but not for static ones (r = 0.11).
The second study is by (5). In contrast to (4), (5) found no difference in the evaluation of the attractiveness of static and dynamic faces. Their study used 10-second video sequences of males, rather than females, performing the following actions: (i) rotating the head from left to right with a neutral expression, (ii) facing the camera with a neutral expression while counting from 7 to 13, and (iii) smiling. In addition, static images taken from the video showed the male face looking directly at the camera with a neutral expression. The assessors were all females. The results revealed a high correlation (r = 0.83) between the attractiveness assessments of the static and dynamic faces. The average ratings on a ten-point scale for the still images and videos were the same: 4.1 (SD = 1.1 and 1.2, respectively). (5) argued that assessors were able to quickly form a robust assessment of attractiveness from a single image only. The addition of dynamic information did not seem to contribute to the attractiveness assessments.
The third study is by (6). He found further evidence that dynamic information does not contribute to assessments of attractiveness. He conducted an extensive experiment using images and videos of 220 (115 female and 105 male) models. The static stimuli consisted of frontal faces of male and female actors with neutral expressions. For the dynamic stimuli, the models were instructed to act as if they encountered an attractive person of the opposite sex. The length of the videos varied from about 2 to 7.5 seconds. Again, a high correlation was obtained between the attractiveness ratings for static and dynamic stimuli (r = 0.7). The average attractiveness ratings on a seven-point scale given by females were 2.78 (SD = 0.86) for static stimuli and 3.28 (SD = 0.78) for dynamic stimuli. When given by males, these ratings equalled 2.86 (SD = 0.26) and 3.15 (SD = 0.76), respectively.
The main point of disagreement between Rubenstein on the one hand and Rhodes and Koscinski on the other hand concerns the correlation of attractiveness ratings for static and dynamic stimuli. Rubenstein finds a low correlation, whereas Rhodes and Koscinski report a high correlation. The difference between the findings of Rubenstein and Rhodes may be caused by the fact that Rubenstein employed female models that were assessed by males and females, whereas Rhodes used male models assessed by female assessors. Possibly, the assessment of female models depends on the presentation mode (static versus dynamic). However, if that were the case, Koscinski would have found different results for the assessment of male and female models. So, in the end there is no agreement, and the question remains open.
In the general case, there is less disagreement among the three studies regarding the absolute difference between the attractiveness ratings for static and dynamic faces. (5) report no difference between the average ratings for both formats, whereas Rubenstein and Koscinski find a higher rating for video sequences. In summary, there is some conflicting evidence regarding the correlation of attractiveness ratings for static and dynamic stimuli that cannot be explained by the gender of the to-be-assessed model.
Our behavioural study aims at determining (1) the correlation between the attractiveness ratings of static and dynamic male faces and (2) their absolute ratings for attractive males in the somewhat more natural setting of Mister World self-presentation videos. Subsequently, our computational study aims at determining (3) which type of dynamic information in the facial stimuli predicts the attractiveness ratings (16). To this end we analyse the dynamics of the landmark configurations. In addition, we examine the dynamics of head pose in terms of the dynamic cues yaw, pitch and roll.

Research Methodology
In this section, we present the method of the survey study, outline the results of the survey study, describe the method of the computational study, and discuss the results of the computational study.

Method Survey Study
This section describes the methods used for performing the behavioural study (addressing RQ1a). The behavioural study has a between-subjects design in which the two formats of facial stimuli are counterbalanced. Attractiveness ratings for the static and dynamic stimuli were collected via an online survey.
Below, we describe (A) the participants (who act as assessors), (B) the stimuli, and (C) the experimental procedure.

A: Participants
Female participants were invited to participate in the survey through a message distributed via social media. In total, 365 participants accepted the invitation (average age = 21.3 years, SD = 4.20). They were assigned to one of two counterbalanced versions of the survey (see procedure under C). Version 1 was completed by 93 respondents and version 2 by 102 respondents. The remaining participants (170 in total) either did not complete the survey or were males. The average age was 21.81 years for version 1 and 21.25 years for version 2.

B: Static and Dynamic Stimuli
The stimuli for the behavioural experiment (i.e., the video collection) were obtained from www.youtube.com. In total, 46 profile videos of Mister World 2014 contestants were obtained. The dynamic stimuli consisted of the initial 10 seconds of each video, corresponding to 300 frames per video. These short sequences contain reasonably standardised presentations of the contestants, who present themselves by providing some personal information and by explaining their reason for joining the Mister World competition. Throughout the video, the contestants were facing the camera. The static images were defined as single frames of the videos showing the contestants in a representative and neutral pose. Figure 1 shows six examples of such images. The two versions of the survey were defined as follows. Version 1 consisted of the static stimuli corresponding to the first half of the alphabetically ordered Mr World contestants, followed by the dynamic stimuli for the remaining contestants. For version 2, the dynamic stimuli of the first half were followed by the static stimuli of the second half. In this way, each contestant was rated on the basis of his image and of his video, and each participant saw each contestant in only one of the two formats (see Table 1). Participants were instructed to rate the attractiveness of the candidates on a 7-point Likert scale ranging from very unattractive (1) to very attractive (7). All stimuli were presented on single pages. After rating a stimulus, participants could scroll back to previous pages (stimuli and ratings).

Results of the Survey Study
Below we present the results of the behavioural study. Table 2 lists for each contestant the average ratings and standard deviations for the static and dynamic stimuli. The rows are ordered by descending attractiveness rating of the static images. From left to right, the columns in the table represent the following: the rank of the attractiveness rating for static images (No.), the name of the country of origin of the contestant (Countries), the average and standard deviation of the attractiveness score for the static image (Static Images and SD), and the average and standard deviation of the attractiveness score for the dynamic stimulus (Dynamic Images and SD). The final column lists the difference scores (i.e., attractiveness score for the static image minus attractiveness score for the dynamic stimulus).
From the table, we make two observations. First, the attractiveness scores for the static and dynamic stimuli are quite similar. Second, the difference scores are predominantly negative, indicating higher attractiveness scores for the dynamic stimuli.
To support these observations, we computed the correlation between the attractiveness scores for the static and dynamic stimuli and computed the histogram of the difference scores.
The correlation between the average ratings for the static and dynamic stimuli was very high: r = 0.93, N = 46, p < 0.01. The left part of Figure 2 plots the average static ratings against the dynamic ratings. Each point corresponds to a Mr World contestant. The solid line is the best-fitting regression line. Figure 3 shows the distribution of difference scores (cf. Table 2). A clear bias towards negative scores is visible, indicating higher ratings for dynamic stimuli than for their static counterparts. The average difference score is equal to 0.40 (SD = 0.25).
These results allow us to answer RQ1a (To what extent do attractiveness ratings differ for static and dynamic male faces?). Our findings revealed that the attractiveness ratings of male faces in static or dynamic form are highly correlated. This indicates that the relative attractiveness of male faces is barely affected by presentation mode. However, the attractiveness ratings do differ in an absolute sense. Male dynamic faces are assessed to be slightly more attractive than their static counterparts.
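The two summary statistics used in this analysis, the Pearson correlation between per-contestant mean ratings and the static-minus-dynamic difference score, can be illustrated with a minimal sketch. The function names and the four rating pairs below are made up for illustration and are not the study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def difference_scores(static, dynamic):
    """Per-contestant difference score: static rating minus dynamic rating."""
    return [s - d for s, d in zip(static, dynamic)]

# Hypothetical mean ratings for four contestants (NOT the study data).
static_means = [4.0, 3.0, 5.0, 2.0]
dynamic_means = [4.5, 3.5, 5.0, 2.5]

r = pearson_r(static_means, dynamic_means)
diffs = difference_scores(static_means, dynamic_means)
mean_diff = sum(diffs) / len(diffs)
print(f"r = {r:.2f}, mean difference = {mean_diff:.3f}")
```

With this sign convention, a negative mean difference means that the dynamic stimuli received the higher ratings on average.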

Method Computational Study
This section describes the research method used for performing the computational study (addressing RQ1b). The computational analysis of attractiveness is performed on facial landmarks as extracted with Intraface (16) from still images and video sequences. The facial landmarks correspond to fiducial points situated on the face, i.e., facial locations at or near the mouth, nose, and eyes. In what follows we describe (A) the dataset, (B) the landmark extraction, and (C) the computational procedure used for analysing the data.

A: Dataset
The computational analysis was performed on the same static and dynamic stimuli as used in the behavioural study.

B: Landmark Extraction
In order to be able to perform computational measurements of symmetry and averageness, the facial landmarks were extracted from the images and videos using the Supervised Descent Method (SDM) (16), which is part of the publicly available Intraface software. SDM takes an image or video frame as input and returns estimates of the locations of 49 landmarks: 2×5 landmarks representing the two eyebrows, 2×6 landmarks for the eyes, 9 landmarks for the nose, and 18 landmarks for the mouth. Figure 4 shows the extracted landmarks together with their estimates. In addition, SDM estimates the three-dimensional head pose for each image or frame. Head pose is represented by yaw (the direction of shaking "no"), pitch (the direction of nodding "yes"), and roll (the in-plane rotation of the face). All landmarks were normalised in position, scale and orientation. Position normalisation was obtained by defining the landmark at the tip of the nose (landmark 17) as the origin. All landmark configurations were rescaled to have the same distance between the centres of the two eyes. Finally, using the roll pose estimate, the landmark configurations were rotated to the upright position.
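The three normalisation steps described above can be sketched as follows. This is a minimal illustration and not the Intraface implementation: the function name, the way eye centres are computed (as the centroid of the eye landmarks), the target eye distance, and the sign convention of the roll angle are all assumptions.

```python
import math

def _centroid(points, indices):
    """Mean position of the landmarks selected by `indices`."""
    xs = [points[i][0] for i in indices]
    ys = [points[i][1] for i in indices]
    return sum(xs) / len(xs), sum(ys) / len(ys)

def normalise_landmarks(points, nose_idx, left_eye_idx, right_eye_idx,
                        roll_deg, eye_dist=1.0):
    """Normalise one landmark configuration in position, scale and orientation.

    points        -- list of (x, y) landmark coordinates
    nose_idx      -- index of the nose-tip landmark (becomes the origin)
    left_eye_idx  -- indices of the landmarks outlining the left eye
    right_eye_idx -- indices of the landmarks outlining the right eye
    roll_deg      -- estimated in-plane rotation in degrees (assumed to be
                     counter-clockwise positive in the coordinate system used)
    eye_dist      -- target distance between the two eye centres (assumed)
    """
    # 1. Position: translate so that the nose tip lies at the origin.
    nx, ny = points[nose_idx]
    shifted = [(x - nx, y - ny) for x, y in points]

    # 2. Scale: rescale to a fixed distance between the eye centres.
    lx, ly = _centroid(shifted, left_eye_idx)
    rx, ry = _centroid(shifted, right_eye_idx)
    current = math.hypot(rx - lx, ry - ly)
    if current == 0:
        raise ValueError("eye centres coincide; cannot rescale")
    s = eye_dist / current
    scaled = [(x * s, y * s) for x, y in shifted]

    # 3. Orientation: undo the estimated roll by rotating by -roll.
    a = math.radians(-roll_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in scaled]
```

Because both the rescaling and the rotation are taken about the origin (the nose tip), the order of steps 2 and 3 does not affect the result.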

C: Computational Procedure
In the computational procedure we distinguish between measuring the static visual cues (three) and the dynamic visual cues (one: the dynamics of the landmarks). The three static visual cues to attractiveness were measured as follows.
Symmetry (static). Facial symmetry was determined by comparing the landmarks at the left and right sides of the face. More specifically, symmetry was defined as the difference between the average distance of the landmarks on the left side of the face to the vertical midline and the average distance of the landmarks on the right side of the face to the vertical midline.
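A minimal sketch of this symmetry measure is given below. It assumes that the landmarks have already been normalised so that the vertical midline is the line x = 0 through the nose tip, and that the caller supplies which landmark indices belong to the left and right sides of the face; the function name is hypothetical, and returning the signed difference (rather than its absolute value) is also an assumption.

```python
def symmetry_score(points, left_idx, right_idx, midline_x=0.0):
    """Difference between the mean distance to the vertical midline of the
    left-side landmarks and that of the right-side landmarks.

    points    -- list of (x, y) normalised landmark coordinates
    left_idx  -- indices of the landmarks on the left side of the face
    right_idx -- indices of the landmarks on the right side of the face
    midline_x -- x-coordinate of the vertical midline (assumed to be 0)
    """
    left = sum(abs(points[i][0] - midline_x) for i in left_idx) / len(left_idx)
    right = sum(abs(points[i][0] - midline_x) for i in right_idx) / len(right_idx)
    return left - right

# Toy configuration: the left landmarks lie further from the midline.
toy = [(-2.0, 0.0), (-4.0, 1.0), (1.0, 0.0), (3.0, 1.0)]
print(symmetry_score(toy, left_idx=[0, 1], right_idx=[2, 3]))
```

A value of zero corresponds to a configuration whose two sides lie, on average, equally far from the midline.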

Averageness (static).
For the normalised landmark configurations obtained from the still images, an average configuration was computed in which each landmark was assigned the location of that landmark averaged over all contestants. The distance of each landmark configuration to the average (the mean Euclidean distance of each landmark to its average) was defined as a measure of averageness.
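The averageness measure can be sketched in a few lines, assuming each contestant is represented by a list of normalised (x, y) landmarks in the same order. The function names and the toy data are illustrative only.

```python
import math

def mean_configuration(configs):
    """Average landmark configuration over all contestants.

    configs -- list of configurations, each a list of (x, y) landmarks
    """
    n = len(configs)
    n_landmarks = len(configs[0])
    return [
        (sum(c[j][0] for c in configs) / n, sum(c[j][1] for c in configs) / n)
        for j in range(n_landmarks)
    ]

def distance_to_average(config, mean_config):
    """Mean Euclidean distance of each landmark to its average position."""
    return sum(math.dist(p, q) for p, q in zip(config, mean_config)) / len(config)

# Two toy configurations with two landmarks each (NOT the study data).
configs = [[(0.0, 0.0), (2.0, 0.0)], [(0.0, 2.0), (4.0, 0.0)]]
mean_config = mean_configuration(configs)
scores = [distance_to_average(c, mean_config) for c in configs]
print(mean_config, scores)
```

Note that under this definition a smaller value indicates a face that is closer to the average configuration.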

Masculinity (static).
Finally, masculinity was measured by means of the "Gender" detector of CERT (17). For each image or frame of the video, this detector returns a real number with the magnitude indicating the degree of femininity if positive and the degree of masculinity if negative.
To measure the dynamic visual cue, viz. the dynamics of the landmarks, the following computational procedure was used.
Landmarks (dynamic). For each frame in the video, the distance of a landmark to its initial position was computed. For all but one landmark, the resulting distance vector represents the movements of that landmark during the 10-second period of the video. Only for landmark 17 (tip of the nose), which is fixed at the origin, does the distance vector contain all zeros. We defined the standard deviation of the individual landmark distance vectors as a measure of the temporal variation of the landmarks.
For the static and dynamic measurements we computed their correlations with the corresponding average attractiveness ratings. Significant correlations indicate that the measurement may be of relevance to the perceived attractiveness.
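The temporal-variation measure for the landmarks can be sketched as follows, assuming a video is given as a list of normalised landmark configurations, one per frame. The function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption.

```python
import math
import statistics

def temporal_variation(frames):
    """Standard deviation, per landmark, of the distance to its initial position.

    frames -- list of configurations (one per video frame), each a list of
              (x, y) landmark coordinates in the same landmark order
    """
    first = frames[0]
    variation = []
    for j in range(len(first)):
        # Distance of landmark j to its position in the first frame, per frame.
        distances = [math.dist(frame[j], first[j]) for frame in frames]
        variation.append(statistics.pstdev(distances))
    return variation

# Toy video with three frames and two landmarks: the first landmark is fixed
# (like the nose tip at the origin), the second one moves upwards.
toy_frames = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (1.0, 2.0)],
    [(0.0, 0.0), (1.0, 3.0)],
]
print(temporal_variation(toy_frames))
```

A landmark that never moves receives a variation of zero, which matches the behaviour of the nose-tip landmark in the normalised configurations.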

Results of the Computational Study
Below we present the results of the computational study. Figure 5 summarises the results of extracting the landmarks from the videos of the 46 contestants. Each pane shows the 49 landmark positions for the 300 frames superimposed. The amount of movement and the orientation of the face and its components are reflected in each pane. For instance, the plot of the contestant from The Netherlands reveals that he remained relatively stable with his face in a vertical position. In contrast, the contestant from Northern Ireland moved his eyebrows (they are detached from the eyes) and had his face in a tilted orientation.
The results obtained for the measurements of the static cues to attractiveness, viz. symmetry, averageness, and masculinity, are listed in Table 3. Only results with a p-value smaller than 0.05 are shown. The static measurements of symmetry, averageness, and masculinity correlate significantly with the attractiveness ratings given to the still images. Masculinity has the largest correlation of −0.39. The negative sign is due to the negative value of the CERT variable for males. For the dynamic ratings, symmetry is not significantly correlated with attractiveness, but averageness (0.30) and masculinity (−0.40) are.
Table 4 lists the landmarks with significant correlations. As can be seen in Figure 4, all landmarks listed correspond to the mouth region. Their dynamics are, of course, highly correlated, because they reflect a single underlying factor: the mouth dynamics. Given that the signs are negative, these values indicate that attractiveness correlates negatively with mouth movements.
These findings lead to the following answer to RQ1b (What static and dynamic characteristics of male faces predict the attractiveness ratings?). Our results show that the static characteristics of symmetry, averageness, and masculinity correlate with attractiveness ratings. The same is true for their dynamic counterparts, except for symmetry (for which no significant correlation was obtained). In addition, for the dynamic measurements the movements of the mouth correlate negatively with attractiveness.

General Discussion
In this paper we examined the contribution of three visual cues to attractiveness (symmetry, averageness, and masculinity) in static (photographs) and dynamic (video) stimuli. The results of the behavioural study are largely in agreement with those by (5) and (6). First, we find a large correlation between the assessment of static and dynamic stimuli (r = 0.93). Similarly, (5) and (6) found correlations of r = 0.83 and r = 0.7, respectively. Our correlation is the highest of the three. The reason may be that our static stimuli were extracted from the video sequences. Hence, the static images are contained in the dynamic sequences. This was not the case in the (5) and (6) studies, which may have resulted in relatively lower correlation values. Second, we consider the absolute differences between the attractiveness ratings of static and dynamic stimuli. Similar to (6), we find higher attractiveness ratings for videos than for still images, whereas (5) reported no difference between the average ratings. This adds to the evidence that individuals are considered to be a bit more attractive when viewed in a video than when seen in a picture. The disagreement of our results with those by (4) may be partly attributable to the fact that he used female faces as stimuli, whereas we used male faces as stimuli. Here, we remark that the contribution of facial dynamics to attractiveness has been found to be different for male and female stimuli (18). Repeating our study with female stimuli may help to determine whether the correlation between static and dynamic stimuli is specific to the use of male stimuli.
If the participants in our behavioural study relied on the three visual cues to assess attractiveness, then a computational analysis of these cues should result in similar outcomes for static and dynamic stimuli. Indeed, our results show only small differences for averageness and masculinity (see Table 3): the correlations for the static and dynamic faces are about the same. This outcome supports the idea that the assessors in the behavioural study relied on averageness and masculinity as cues to the assessment of attractiveness. For symmetry in the dynamic stimuli, no significant correlation was obtained, which may be due to disruptions in the symmetry calculations caused by head pose variations.
At the end of this discussion, we remark that we studied the effect of landmark dynamics on attractiveness ratings. We found that mouth movements (as measured by standard deviation) contribute negatively to attractiveness. The eight landmarks listed in Table 4 (and ordered by their p-value) all belong to the mouth region. Hence their dynamics are highly correlated and, as a consequence, their correlations (C) with the corresponding attractiveness ratings are all quite similar (ranging from −0.37 to −0.29). Post-hoc visual inspection of the videos revealed that the lower-ranked contestants tended to make prominent movements with their mouths during speech, while the top-ranked contestants made more subtle mouth movements while talking. A recent study of the facial dynamics of males and females revealed a specific temporal mouth movement pattern in females, but not in males (19). This suggests that prominent movements are rather typical for males. The observation may give rise to lower attractiveness ratings.

Answer to RQ1
The results of the behavioural and computational studies have allowed us to answer the two research questions RQ1a and RQ1b (see the results of the survey study and the results of the computational study). Here, we repeat the main research question that guided our research in this paper. RQ1: To what extent do thin slices of dynamic facial expressions contribute to the attractiveness of males?
The answer to RQ1 is as follows. Thin slices of dynamic facial expressions contribute to the attractiveness of males in two ways: (1) in a positive way and (2) in a negative way.

The positive contribution is that, on average, presenting a male face in a dynamic way leads to a slight increase in attractiveness rating (0.25 point on a 7-point scale). The negative contribution is that, on average, mouth movements correlate negatively with attractiveness ratings.
These answers may be translated into two recommendations for male pageant contestants: (1) they should present themselves as much as possible in a dynamic fashion (i.e., through video or live appearances), and (2) they should restrict their mouth movements as much as possible.