Systematic map of meta-analyses related to sexual selection

if (!require("pacman")) {install.packages("pacman")}
pacman::p_load(circlize,
               cowplot,
               DT,
               effects,
               ggpattern,
               ggraph,
               ggtext,
               igraph,
               janitor,
               kableExtra,
               lubridate,
               mdthemes,
               MuMIn,
               patchwork,
               performance,
               tidygraph,
               tidyverse,
               VGAM)

options(DT.options = list(rownames = FALSE,
                          dom = "Blfrtip",
                          scrollX = TRUE,
                          pageLength = 5,
                          columnDefs = list(list(targets = '_all', 
                                                 className = 'dt-center')),
                          buttons = c('copy', 'csv', 'excel', 'pdf')))

source("load_data.R")

Appendix S1.

Changes from pre-registration

Objectives have slightly changed (i.e. broad questions remain the same, but we added more details and subquestions).
Interlocus sexual conflict was not specified in the pre-registration.
Phylogenetic comparative analyses were intended to be excluded from our map (study-design criterion), but because meta-analyses that incorporate phylogenetic relationships in their models can be considered to be phylogenetic comparative analyses, we removed this selection criterion.
Some variables related to the content of meta-analyses were modified (e.g. how questions were described and classified, see below) or removed because of a mismatch between questions and models (e.g. ‘main_question_method’; see Section III.5).
Some methodological variables extracted from meta-analyses were not reported here because they were too subjective or often unclear in meta-analyses, such as ‘inclusion_decisions’ and ‘meta_estimate’.
Bibliometrics data extraction was only vaguely mentioned in the pre-registration, and gender assignment and affiliation analyses were not mentioned in the pre-registration.
Post-hoc analyses (Sections II.4 and III.6.c) were not initially planned.

Table S1.

ROSES form.

read_csv("csv_files/roses_checklist.csv", 
         locale = locale(encoding = "latin1")) %>%
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Appendix S2.

Literature searches

Search strings used for each database were as follows.

Scopus: TITLE-ABS-KEY((“metaanal*” OR “meta-anal*” OR “metaregres*” OR “meta-regres*” OR (quantitativ* w/3 review* ) OR (quantitativ* w/3 synthe*) OR (global* w/3 synthe*) OR “comprehensive evidence”) AND (“sexual* select*” OR “*male choice” OR “mate cho*” OR “mat* prefer*” OR “*male prefer*” OR “intrasexual competition” OR “intra-sexual competition” OR “intersexual competition” OR “inter-sexual competition” OR “sperm competition” OR “mating pattern*” OR “assortative mating” OR “mating success” OR “polyandr*” OR “polygy*” OR “extra-pair” OR “extrapair” OR “mate guarding” OR “reproductive tactic*” OR “remating” OR “honest signal*” OR “sexual signal*” OR “ornament*” OR “sperm transfer” OR “good genes” OR “good-genes” OR “ejaculate trait*” OR “ejaculate production” OR “bird song*” OR “mating strateg*” OR “bateman gradient*”))

Web of Science: TOPIC((“metaanal*” OR “meta-anal*” OR “metaregres*” OR “meta-regres*” OR (quantitativ* NEAR/3 review* ) OR (quantitativ* NEAR/3 synthe*) OR (global* NEAR/3 synthe*) OR “comprehensive evidence”) AND (“sexual* select*” OR “*male choice” OR “mate cho*” OR “mat* prefer*” OR “*male prefer*” OR “intrasexual competition” OR “intra-sexual competition” OR “intersexual competition” OR “inter-sexual competition” OR “sperm competition” OR “mating pattern*” OR “assortative mating” OR “mating success” OR “polyandr*” OR “polygy*” OR “extra-pair” OR “extrapair” OR “mate guarding” OR “reproductive tactic*” OR “remating” OR “honest signal*” OR “sexual signal*” OR “ornament*” OR “sperm transfer” OR “good genes” OR “good-genes” OR “ejaculate trait*” OR “ejaculate production” OR “bird song*” OR “mating strateg*” OR “bateman gradient*”))

Google Scholar:
Simplified Chinese: ((“荟萃分析” OR “元分析”) AND (“性选择” OR “性别选择”))
Traditional Chinese: ((“薈萃分析” OR “元分析”) AND (“性選擇” OR “性别選擇”))
Croatian: (“meta-analiza” AND (“spolni odabir” OR “seksualna selekcija” OR “spolna selekcija”))
Japanese: ((“メタ分析” OR “メタ解析”) AND “性選択”)
Polish: (“metaanaliza” AND “dobór płciowy”)
Portuguese: (“meta-análise” AND “seleção sexual”)
Russian: (“мета-анализ” AND “половой отбор”)
Spanish: (“meta-análisis” AND “selección sexual”)

Appendix S3.

Screening

A few studies employed meta-analytical methods but the data they used to make inferences were not from empirical papers [e.g. Winternitz et al. (2013) used genbank entries, Holman (2016) used simulated data, Dobson et al. (2018) used estimated data, and Friis, Dabelsteen & Cardoso (2021) used citizen data]. We therefore deemed these studies invalid according to our criteria for broad meta-analyses and excluded them from our systematic map during full-text screening. Furthermore, we did not consider plasma concentration of carotenoids as a sexual trait, despite evidence of a connection to the expression of ornaments. As a consequence we excluded Simons, Groothuis & Verhulst (2015) from our systematic map as this study exclusively investigated plasma concentrations of carotenoids. By contrast, we included both Simons, Cohen & Verhulst (2012) and Koch, Wilson & Hill (2016) as they explored individual traits that could be considered ornaments (e.g. plumage) in addition to plasma concentration of carotenoids.

Table S2.

Decisions made at the full-text screening stage, with reasons for exclusion.

read_csv("csv_files/full_text_screeening_results.csv",
         locale = locale(encoding = "latin1")) %>% 
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Table S3.

Variables related to content extracted from meta-analyses related to sexual selection.

read_csv("csv_files/systematic_map_extraction.csv") %>% 
  slice(-2) %>% 
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Appendix S4.

Data extraction

We tried to summarise questions from meta-analytical studies (see Table S4) by clumping similar subquestions into a single one, as described for Garamszegi (2005) in Section III.5. However, if a summarised or original question fitted more than two topics within our classification framework, we split it into multiple questions as long as authors provided results for each of them. For instance, Alissa (2018) presented multiple questions, each belonging to a different topic related to sexual selection. Thus, in that case (and similar ones) we listed questions separately as the study showed findings that specifically answered them. This system worked well for all but four studies (Thornhill & Moller, 1998; Yasukawa et al., 2010; McLean, Bishop & Nakagawa, 2012; Parker, 2013), which had questions that fitted three different topics (pre-copulatory sexual traits, mate choice, and mating success) but could not be split. This is because they mixed mate choice outcomes with observations of mating success (as others described in Section III.2.e.iv), without any form of distinction (e.g. moderator) in their results. Thus, we chose to remove the relevant questions from these four cases from the mate choice category, leaving them in pre-copulatory sexual traits and mating success only. Additionally, it was difficult to classify the study by Garcia-Roa et al. (2020) as it used several different measures to calculate effect sizes. Because many of its effect sizes were related to sperm number and genital traits, we placed this study within the topic of post-copulatory intrasexual competition. We also note that some meta-analyses might actually connect different topics but our classification system might not have considered them as such if these links were not central to the study. For instance, studies that evaluate whether mating is assortative consider many traits, including ornaments (e.g. Jiang, Bolnick & Kirkpatrick, 2013; Wang et al., 2019; Moura et al., 2021). Yet, because ornaments were not the main focus of these questions, we did not attribute the topic ‘pre-copulatory sexual traits’ to these studies.
Because the meta-analytical questions we summarised resulted from our interpretation of meta-analyses, we also recorded direct quotes from meta-analyses regarding their goals (Table S4). These quotes were extracted from the meta-analyses’ last two paragraphs of the introduction or background section.

Table S4.

Questions extracted from meta-analyses related to sexual selection (variable ‘question description’). The variable ‘quote’ contains excerpts extracted from meta-analyses that state their goals. Variables after ‘question description’ are the categories we used to classify questions, in which 1 represents that a given question fitted in the category and 0 that it did not. The variable ‘sex roles classification’ refers to the classification regarding its conformity with sex roles depending on the sex focused by the question (see details in Section II.3.f). EPP = extra-pair paternity.

questions_list %>% 
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Table S5.

Variables related to methods extracted from meta-analyses related to sexual selection. We initially planned to assess whether meta-analyses provided selection criteria (see Pollo et al., 2023), but we found that this was highly subjective, so we excluded this variable from our data extraction.

read_csv("csv_files/reporting_appraisal_extraction.csv") %>% 
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Table S6.

General systematic map results. See Table S3 for metadata for the variables in this table.

systematic_map_results %>% 
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Table S7.

Results of generalised linear models (GLMs) and Spearman correlations between the standardised number of species and the proportion of species represented by the two most abundant animal groups in meta-analytical studies with multiple species from different animal taxa. p values < 0.001 are reported as zero. ‘GLM SE’ stands for standard error of the GLM coefficient.

cor_1 <-
  with(n_species_df,
       cor.test(prop_sum[number_of_species < 70],
                number_of_species[number_of_species < 70],
                method = "spearman"))

glm_n_species_1 <- 
  n_species_df %>% 
  filter(number_of_species < 70) %>%
  glm(data = .,
      formula = prop_sum ~ 
        s_number_of_species,
      family = binomial,
      weights = number_of_species)

cor_2 <-
  with(n_species_df,
       cor.test(prop_sum[number_of_species >= 70],
                number_of_species[number_of_species >= 70],
                method = "spearman"))

glm_n_species_2  <-
  n_species_df %>% 
  filter(number_of_species >= 70) %>% 
  glm(data = .,
      formula = prop_sum ~ 
        s_number_of_species,
      family = binomial,
      weights = number_of_species)

cor_3 <-
  cor.test(n_species_df$prop_sum,
           n_species_df$number_of_species,
           method = "spearman")

glm_n_species_3 <-
  n_species_df %>% 
  glm(data = .,
      formula = prop_sum ~ 
        s_number_of_species,
      family = binomial,
      weights = number_of_species)

table_s7 <- 
  map_df(list(glm_n_species_1,
              glm_n_species_2,
              glm_n_species_3),
         broom::tidy) %>% 
  add_column(set = rep(c("less than 70 species",
                         "more than or equal to 70 species",
                         "all data points"),
                       each = 2),
             sample_size = rep(c(nrow(n_species_df[n_species_df$number_of_species < 70, ]),
                                 nrow(n_species_df[n_species_df$number_of_species >= 70, ]),
                                 nrow(n_species_df)),
                               each = 2),
             .before = "term") %>% 
  add_column(correlation_coefficient = rep(c(cor_1$estimate,
                                             cor_2$estimate,
                                             cor_3$estimate),
                                           each = 2),
             correlation_p_value = rep(c(cor_1$p.value,
                                         cor_2$p.value,
                                         cor_3$p.value),
                                       each = 2)) %>% 
  rename(glm_term = term,
         glm_coefficient = estimate,
         glm_std_error = std.error,
         glm_z_value = statistic,
         glm_p_value = p.value) %>% 
  mutate(set = str_to_sentence(set),
         across(where(is.numeric),
                ~round(., digits = 3)))

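# Rows alternate between intercept and slope within each of the three models,
# so the pair of labels is repeated three times (times = 3, not each = 3)
table_s7[, 3] <- rep(c("Intercept",
                       "Standardised number of species"),
                     times = 3)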

names(table_s7) <- 
  c("Set",
    "Sample size",
    "GLM term",
    "GLM coefficient",
    "GLM SE",
    "GLM z value",
    "GLM p value",
    "Correlation coefficient",
    "Correlation p value")

datatable(table_s7,
          extensions = "Buttons",
          rownames = FALSE)
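
The note that p values < 0.001 are reported as zero follows from rounding every numeric column to three decimal places before display. A minimal sketch of this behaviour, using made-up p values that are not taken from our results:

```r
# Hypothetical p values, for illustration only
p_values <- c(0.2468, 0.0104, 0.0004, 0.00000012)

# Same rounding as applied to the numeric columns of the tables
rounded <- round(p_values, digits = 3)

# Values below 0.0005 are displayed as 0 after rounding
data.frame(p_values, rounded)
```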

Table S8.

Pairwise comparisons (using Mann-Whitney tests) between meta-analyses of different taxonomic scope regarding the number of species, empirical studies, and effect sizes used by them. p values < 0.001 are reported as zero.

wilcox.diffs <- function(df = clean_names(systematic_map_results)) {
  tax <- df %>%
    select(taxonomic_scope)
  
  lvl_tax <- c("Single species",
               "Specific group",
               "All taxa")
  
  number <- c("number_of_species",
              "number_of_studies",
              "number_of_effect_sizes")
  
  wilcox_comps <-
    tibble(comparison = NA,
           variable = NA,
           sample_size_1 = NA,
           sample_size_2 = NA,
           u_value = NA,
           p_value = NA)
  
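  # For each count variable (i), run the three pairwise comparisons (j):
  # j = 3: all taxa vs. specific group
  # j = 2: specific group vs. single species
  # j = 1: all taxa vs. single species
  for (i in 1:3) {
    for (j in 3:1) {
      k <- j
      if(j == 1) {k <- 3}
      
      # First group of the comparison, with missing values dropped
      x1 <- unlist(as.vector(df[tax == lvl_tax[k],
                                number[i]]))
      x1 <- x1[!is.na(x1)]
      
      # Second group of the comparison, with missing values dropped
      if(j == 1) {k <- 2}
      x2 <- unlist(as.vector(df[tax == lvl_tax[k - 1],
                                number[i]]))
      x2 <- x2[!is.na(x2)]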
      
      res <- wilcox.test(x1, 
                         x2)
      
      wilcox_comps <- 
        wilcox_comps %>% 
        add_row(comparison = j,
                variable =  number[i],
                sample_size_1 = length(x1),
                sample_size_2 = length(x2),
                u_value = res$statistic,
                p_value = res$p.value)
      
    }
  }
  wilcox_comps <-
    wilcox_comps %>% 
    slice(-1)
  return(wilcox_comps)
}

table_s8 <- 
  wilcox.diffs() %>% 
  mutate(comparison = case_when(comparison == 3 ~ "All taxa vs. specific group",
                                comparison == 2 ~ "Specific group vs. single species",
                                comparison == 1 ~ "All taxa vs. single species"),
         variable = str_to_sentence(str_replace_all(variable,
                                                    "_",
                                                    " ")),
         across(where(is.numeric),
                ~round(., 3)))

names(table_s8) <- 
  c("Comparison",
    "Variable",
    "Sample size 1",
    "Sample size 2",
    "U value",
    "p value")

datatable(table_s8,
          extensions = "Buttons",
          rownames = FALSE)

Table S9.

Trait modality for meta-analytical questions that fitted the pre-copulatory sexual trait category.

questions_modality %>% 
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Table S10.

General reporting appraisal results. Table S5 provides metadata for the variables in this table.

reporting_appraisal_results %>% 
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)

Table S11.

Results of a generalised linear model (GLM) and Spearman correlation between meta-analyses’ number of affiliations and standardised number of authors. p values < 0.001 are reported as zero. ‘GLM SE’ stands for standard error of the GLM coefficient.

cor_affiliations_number <-
  with(df_analyses,
       cor.test(total_number_authors,
                total_number_countries,
                method = "spearman"))

glm_affiliations_number <- 
  df_analyses %>% 
  glm(data = .,
      formula = total_number_countries ~ 
        scale(total_number_authors),
      family = poisson)

table_s11 <-
  broom::tidy(glm_affiliations_number) %>% 
  add_column(sample_size = nrow(df_analyses),
             .before = "term") %>% 
  add_column(correlation_coefficient = cor_affiliations_number$estimate,
             correlation_p_value = cor_affiliations_number$p.value) %>% 
  rename(glm_term = term,
         glm_coefficient = estimate,
         glm_std_error = std.error,
         glm_z_value = statistic,
         glm_p_value = p.value) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3))) 

names(table_s11) <- 
  c("Sample size",
    "GLM term",
    "GLM coefficient",
    "GLM SE",
    "GLM z value",
    "GLM p value",
    "Correlation coefficient",
    "Correlation p value")

table_s11[, 2] <- c("Intercept",
                    "Standardised number of authors")

datatable(table_s11,
          extensions = "Buttons",
          rownames = FALSE)

Table S12.

Institutional affiliations listed in meta-analyses related to sexual selection. The following countries were classified as part of the Global South: Argentina, Brazil, China, Costa Rica, India, Iran, Mexico, and Taiwan. The remaining countries of affiliation in our data set were classified as part of the Global North.

affiliations %>%
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)
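
The classification described in the caption of Table S12 can be sketched as follows. This is only an illustration: the data frame `example_affiliations` and its columns are hypothetical, and computing the proportion per author is an assumption about how such a proportion could be obtained rather than a description of the code in `load_data.R`.

```r
# Countries classified as Global South, as listed in the caption of Table S12
global_south <- c("Argentina", "Brazil", "China", "Costa Rica",
                  "India", "Iran", "Mexico", "Taiwan")

# Hypothetical affiliations for three authors of one study (illustration only)
example_affiliations <- data.frame(
  author  = c("A", "B", "C"),
  country = c("Brazil", "Australia", "Mexico")
)

# Every country not in the list is treated as Global North
example_affiliations$region <- ifelse(example_affiliations$country %in% global_south,
                                      "Global South",
                                      "Global North")

# Proportion of authors from the Global South in this hypothetical study
prop_global_south <- mean(example_affiliations$region == "Global South")
```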

Table S13.

Gender of authors of meta-analyses related to sexual selection. This data set contains, on each row, the name of an author from a meta-analytical study. ‘Author order’ represents the order in which the name appeared in the authorship list of that study and ‘Total number of authors’ represents the total number of authors of that study. ‘Automated gender’ shows the gender assigned to ‘Author given name’ using the package genderizeR, with its certainty as the variable ‘Automated certainty’. ‘Manual gender’ is the revised gender assignment, including manual insertions when the certainty of the automated process was lower than 0.95.

gender %>%
  datatable(.,
            extensions = "Buttons",
            rownames = FALSE)
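
The 0.95 certainty threshold mentioned in the caption of Table S13 can be illustrated with a short sketch. The data frame and its column names are hypothetical and do not reproduce the output of genderizeR; the sketch only shows how low-certainty assignments could be flagged for manual revision.

```r
# Hypothetical result of an automated gender assignment (illustration only);
# column names are assumptions and do not mirror genderizeR output
example_gender <- data.frame(
  author_given_name   = c("Maria", "John", "Alex", "Kim"),
  automated_gender    = c("female", "male", "male", "female"),
  automated_certainty = c(0.99, 0.98, 0.87, 0.62)
)

# Flag assignments with certainty lower than 0.95 for manual checking
example_gender$needs_manual_check <- example_gender$automated_certainty < 0.95

# Automated assignments at or above the threshold are carried over;
# the remaining ones stay NA until they are assigned manually
example_gender$manual_gender <- ifelse(example_gender$needs_manual_check,
                                       NA_character_,
                                       example_gender$automated_gender)
```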

Table S14.

Model selection of generalised linear models (GLMs) with taxonomic scope as the response variable (unrestricted = 1 versus specific species or animal group = 0; unclear excluded) with all possible combinations of nine predictor variables: P1 = binary gender of first author (man = 1 versus woman = 0); P2 = standardised proportion of women as authors; P3 = standardised number of authors; P4 = standardised number of affiliations; P5 = standardised number of countries affiliated; P6 = standardised number of continents affiliated; P7 = continent of the first affiliation listed (European = 1 versus non-European = 0); P8 = standardised proportion of authors from the Global South; and P9 = standardised publication year. The following countries were classified as part of the Global South: Argentina, Brazil, China, Costa Rica, India, Iran, Mexico, and Taiwan. The remaining countries of affiliation in our data set were classified as part of the Global North. Empty cells show that the predictor variable was not in the model selected. ‘df’ stands for degrees of freedom, ‘logLik’ stands for the log-likelihood of each model, ‘AICc’ stands for Akaike information criterion corrected for small sample sizes, and ‘Delta AICc’ stands for the AICc of the model minus the AICc of the model with the lowest AICc (first row).

options(na.action = "na.fail")

# taxonomic scope ----
glm_taxonomic_scope <- 
  glm(data = df_taxonomic_scope,
      unrestricted ~ 
        first_author_gender +
        s_prop_women +
        s_total_number_authors +
        s_total_number_affiliations +
        s_total_number_countries +
        s_total_number_continents +
        first_europe +
        s_prop_global_south +
        s_publication_year,
      family = "binomial")

table_s14 <-
  dredge(glm_taxonomic_scope) %>% 
  filter(delta < 2) %>% 
  tibble() %>% 
  rename(intercept = `(Intercept)`) %>% 
  add_column(sample_size = nrow(df_taxonomic_scope)) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3))) %>%
  select(sample_size,
         intercept,
         first_author_gender,
         s_prop_women,
         s_total_number_authors,
         s_total_number_affiliations,
         s_total_number_countries,
         s_total_number_continents,
         first_europe,
         s_prop_global_south,
         s_publication_year,
         df,
         logLik,
         AICc,
         delta,
         weight)

names_model_selection <-
  c("Sample size",
    "Intercept",
    paste0("P", 1:9),
    "df",
    "logLik",
    "AICc",
    "Delta AICc",
    "Weight")

names(table_s14) <- 
  names_model_selection

datatable(table_s14,
          extensions = "Buttons",
          rownames = FALSE)

options(na.action = "na.omit")
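
The quantities reported in the model selection tables (AICc, Delta AICc, and the Delta AICc < 2 cut-off) can be illustrated without MuMIn. The sketch below uses simulated data and only three predictors, so it is a simplified stand-in for the `dredge()` call above rather than a reproduction of it; all object names are hypothetical.

```r
set.seed(1)

# Simulated data (illustration only): binary response, three standardised predictors
n <- 100
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(0.8 * sim$x1))

# AICc = AIC + 2k(k + 1) / (n - k - 1), with k the number of estimated parameters
aicc <- function(model) {
  k <- attr(logLik(model), "df")
  n_obs <- nobs(model)
  AIC(model) + (2 * k * (k + 1)) / (n_obs - k - 1)
}

# All 2^3 = 8 combinations of predictors, including the intercept-only model
predictors <- c("x1", "x2", "x3")
subsets <- c(list(character(0)),
             unlist(lapply(seq_along(predictors),
                           function(m) combn(predictors, m, simplify = FALSE)),
                    recursive = FALSE))

# Fit one binomial GLM per combination and store its AICc
candidates <- do.call(rbind, lapply(subsets, function(vars) {
  rhs <- if (length(vars) == 0) "1" else paste(vars, collapse = " + ")
  fit <- glm(as.formula(paste("y ~", rhs)), data = sim, family = binomial)
  data.frame(model = rhs, df = attr(logLik(fit), "df"), AICc = aicc(fit))
}))

# Delta AICc: difference from the model with the lowest AICc (first row)
candidates <- candidates[order(candidates$AICc), ]
candidates$delta <- candidates$AICc - min(candidates$AICc)

# Models retained under the cut-off used in this document (delta < 2)
subset(candidates, delta < 2)
```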

Table S15.

Results of a generalised linear model (GLM) with taxonomic scope as the response variable (unrestricted = 1 versus specific species or animal group = 0; unclear excluded) with only predictor variables that appeared in all models selected (see Table S14). p values < 0.001 are reported as zero. ‘GLM SE’ stands for standard error of the GLM coefficient.

glm_taxonomic_scope_selected <-
  glm(data = df_taxonomic_scope,
      unrestricted ~ 
        s_prop_women +
        s_publication_year,
      family = "binomial")

table_s15 <-
  broom::tidy(glm_taxonomic_scope_selected) %>% 
  add_column(sample_size = nrow(df_taxonomic_scope),
             .before = "term") %>% 
  rename(glm_coefficient = estimate,
         glm_std_error = std.error,
         glm_z_value = statistic,
         glm_p_value = p.value) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3))) 

names_selected_model <- 
  c("Sample size",
    "GLM term",
    "GLM coefficient",
    "GLM SE",
    "GLM z value",
    "GLM p value")

names(table_s15) <-
  names_selected_model

table_s15[, 2] <- c("Intercept",
                    "Standardised proportion of women as authors",
                    "Standardised publication year")

datatable(table_s15,
          extensions = "Buttons",
          rownames = FALSE)
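
Identifying the predictors that appear in all selected models amounts to finding the columns of the model selection table without empty cells. A minimal sketch with a hypothetical table (the predictor names and coefficients are invented for illustration and are not our results):

```r
# Hypothetical model selection table (illustration only): each row is a selected
# model and NA means that the predictor was not in that model
example_selection <- data.frame(
  pred_a = c(0.41, 0.39, 0.44),
  pred_b = c(NA, 0.12, NA),
  pred_c = c(-0.52, -0.50, -0.55)
)

# Predictors with a coefficient in every selected model
in_all_models <- names(example_selection)[colSums(is.na(example_selection)) == 0]
```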

Table S16.

Model selection of generalised linear models (GLMs) with focal sex as the response variable (conformist = 1 versus nonconformist, neutral, or hermaphrodite = 0; unclear excluded) with all possible combinations of nine predictor variables: P1 = binary gender of first author (man = 1 versus woman = 0); P2 = standardised proportion of women as authors; P3 = standardised number of authors; P4 = standardised number of affiliations; P5 = standardised number of countries affiliated; P6 = standardised number of continents affiliated; P7 = continent of the first affiliation listed (European = 1 versus non-European = 0); P8 = standardised proportion of authors from the Global South; and P9 = standardised publication year. The following countries were classified as part of the Global South: Argentina, Brazil, China, Costa Rica, India, Iran, Mexico, and Taiwan. The remaining countries of affiliation in our data set were classified as part of the Global North. Empty cells show that the predictor variable was not in the model selected. ‘df’ stands for degrees of freedom, ‘logLik’ stands for the log-likelihood of each model, ‘AICc’ stands for Akaike information criterion corrected for small sample sizes, and ‘Delta AICc’ stands for the AICc of the model minus the AICc of the model with the lowest AICc (first row).

options(na.action = "na.fail")

glm_conf <-
  glm(data = df_conf,
      conf ~ 
        first_author_gender +
        s_prop_women +
        s_total_number_authors +
        s_total_number_affiliations +
        s_total_number_countries +
        s_total_number_continents +
        first_europe +
        s_prop_global_south +
        s_publication_year,
      family = "binomial")

table_s16 <- 
  dredge(glm_conf) %>% 
  filter(delta < 2) %>% 
  tibble() %>% 
  rename(intercept = `(Intercept)`) %>% 
  add_column(sample_size = nrow(df_conf)) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3))) %>%
  select(sample_size,
         intercept,
         first_author_gender,
         s_prop_women,
         s_total_number_authors,
         s_total_number_affiliations,
         s_total_number_countries,
         s_total_number_continents,
         first_europe,
         s_prop_global_south,
         s_publication_year,
         df,
         logLik,
         AICc,
         delta,
         weight)

names(table_s16) <- 
  names_model_selection

datatable(table_s16,
          extensions = "Buttons",
          rownames = FALSE)

options(na.action = "na.omit")

Table S17.

Results of a generalised linear model (GLM) with focal sex as the response variable (conformist = 1 versus nonconformist, neutral, or hermaphrodite = 0; unclear excluded) with only predictor variables that appeared in all models selected (see Table S16). p values < 0.001 are reported as zero. ‘GLM SE’ stands for standard error of the GLM coefficient.

glm_conf_selected <-
  glm(data = df_conf,
      conf ~ 
        s_publication_year,
      family = "binomial")

table_s17 <- 
  broom::tidy(glm_conf_selected) %>% 
  add_column(sample_size = nrow(df_conf),
             .before = "term") %>% 
  rename(glm_coefficient = estimate,
         glm_std_error = std.error,
         glm_z_value = statistic,
         glm_p_value = p.value) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3)))

names(table_s17) <-
  names_selected_model

table_s17[, 2] <- c("Intercept",
                    "Standardised publication year")

datatable(table_s17,
          extensions = "Buttons",
          rownames = FALSE)

Table S18.

Model selection of generalised linear models (GLMs) with focal sex as the response variable (nonconformist = 1 versus conformist, neutral, or hermaphrodite = 0; unclear excluded) with all possible combinations of nine predictor variables: P1 = binary gender of first author (man = 1 versus woman = 0); P2 = standardised proportion of women as authors; P3 = standardised number of authors; P4 = standardised number of affiliations; P5 = standardised number of countries affiliated; P6 = standardised number of continents affiliated; P7 = continent of the first affiliation listed (European = 1 versus non-European = 0); P8 = standardised proportion of authors from the Global South; and P9 = standardised publication year. The following countries were classified as part of the Global South: Argentina, Brazil, China, Costa Rica, India, Iran, Mexico, and Taiwan. The remaining countries of affiliation in our data set were classified as part of the Global North. Empty cells show that the predictor variable was not in the model selected. ‘df’ stands for degrees of freedom, ‘logLik’ stands for the log-likelihood of each model, ‘AICc’ stands for Akaike information criterion corrected for small sample sizes, and ‘Delta AICc’ stands for the AICc of the model minus the AICc of the model with the lowest AICc (first row).

options(na.action = "na.fail")

glm_nonconf <-
  glm(data = df_nonconf,
      nonconf ~ 
        first_author_gender +
        s_prop_women +
        s_total_number_authors +
        s_total_number_affiliations +
        s_total_number_countries +
        s_total_number_continents +
        first_europe +
        s_prop_global_south +
        s_publication_year,
      family = "binomial")

table_s18 <-
  dredge(glm_nonconf) %>% 
  filter(delta < 2) %>% 
  tibble() %>% 
  rename(intercept = `(Intercept)`) %>% 
  add_column(sample_size = nrow(df_nonconf)) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3))) %>%
  select(sample_size,
         intercept,
         first_author_gender,
         s_prop_women,
         s_total_number_authors,
         s_total_number_affiliations,
         s_total_number_countries,
         s_total_number_continents,
         first_europe,
         s_prop_global_south,
         s_publication_year,
         df,
         logLik,
         AICc,
         delta,
         weight)

names(table_s18) <- 
  names_model_selection

datatable(table_s18,
          extensions = "Buttons",
          rownames = FALSE)

options(na.action = "na.omit")

Table S19.

Results of a generalised linear model (GLM) with focal sex as the response variable (nonconformist = 1 versus conformist, neutral, or hermaphrodite = 0; unclear excluded) with only predictor variables that appeared in all models selected (see Table S18). p values < 0.001 are reported as zero. ‘GLM SE’ stands for standard error of the GLM coefficient.

glm_nonconf_selected <-
  glm(data = df_nonconf,
      nonconf ~ 
        s_prop_women,
      family = "binomial")

table_s19 <-
  broom::tidy(glm_nonconf_selected) %>% 
  add_column(sample_size = nrow(df_nonconf),
             .before = "term") %>% 
  rename(glm_coefficient = estimate,
         glm_std_error = std.error,
         glm_z_value = statistic,
         glm_p_value = p.value) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3)))

names(table_s19) <-
  names_selected_model

table_s19[, 2] <- c("Intercept",
                    "Standardised proportion of women as authors")

datatable(table_s19,
          extensions = "Buttons",
          rownames = FALSE)

Table S20.

Model selection of generalised linear models (GLMs) with methodological transparency as the response variable with all possible combinations of nine predictor variables: P1 = binary gender of first author (man = 1 versus woman = 0); P2 = standardised proportion of women as authors; P3 = standardised number of authors; P4 = standardised number of affiliations; P5 = standardised number of countries affiliated; P6 = standardised number of continents affiliated; P7 = continent of the first affiliation listed (European = 1 versus non-European = 0); P8 = standardised proportion of authors from the Global South; and P9 = standardised publication year. The following countries were classified as part of the Global South: Argentina, Brazil, China, Costa Rica, India, Iran, Mexico, and Taiwan. The remaining countries of affiliation in our data set were classified as part of the Global North. Empty cells show that the predictor variable was not in the model selected. ‘df’ stands for degrees of freedom, ‘logLik’ stands for the log-likelihood of each model, ‘AICc’ stands for Akaike information criterion corrected for small sample sizes, and ‘Delta AICc’ stands for the AICc of the model minus the AICc of the model with the lowest AICc (first row).

options(na.action = "na.fail")

glm_transparency <-
  glm(data = df_transparency,
      prop_transparent ~ 
        first_author_gender +
        s_prop_women +
        s_total_number_authors +
        s_total_number_affiliations +
        s_total_number_countries +
        s_total_number_continents +
        first_europe +
        s_prop_global_south +
        s_publication_year,
      weights = valid_n,
      family = "binomial")

table_s20 <-
  dredge(glm_transparency) %>% 
  filter(delta < 2) %>% 
  tibble() %>% 
  rename(intercept = `(Intercept)`) %>% 
  add_column(sample_size = nrow(df_transparency)) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3))) %>%
  select(sample_size,
         intercept,
         first_author_gender,
         s_prop_women,
         s_total_number_authors,
         s_total_number_affiliations,
         s_total_number_countries,
         s_total_number_continents,
         first_europe,
         s_prop_global_south,
         s_publication_year,
         df,
         logLik,
         AICc,
         delta,
         weight)

names(table_s20) <- 
  names_model_selection

datatable(table_s20,
          extensions = "Buttons",
          rownames = FALSE)

options(na.action = "na.omit")
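
The transparency models use a proportion as the response of a binomial GLM, with the number of trials supplied through `weights`. The sketch below, based on simulated data, shows that this is equivalent to modelling successes and failures directly. It assumes that `valid_n` is the number of items that could be assessed in each meta-analysis, which is an interpretation of the variable name and not something stated in the text; `n_transparent` and `year` are hypothetical.

```r
set.seed(2)

# Simulated data (illustration only): number of assessable items per study
# and how many of them were reported transparently
n_studies <- 50
sim <- data.frame(year = rnorm(n_studies),
                  valid_n = sample(5:15, n_studies, replace = TRUE))
sim$n_transparent <- rbinom(n_studies, size = sim$valid_n,
                            prob = plogis(0.3 + 0.5 * sim$year))
sim$prop_transparent <- sim$n_transparent / sim$valid_n

# Proportion as the response, number of trials as weights
fit_prop <- glm(prop_transparent ~ year, weights = valid_n,
                family = binomial, data = sim)

# Equivalent formulation: successes and failures as a two-column response
fit_counts <- glm(cbind(n_transparent, valid_n - n_transparent) ~ year,
                  family = binomial, data = sim)

# Compare the coefficient estimates of the two formulations
all.equal(coef(fit_prop), coef(fit_counts))
```

The same weighting logic applies to the models in Table S7, where `prop_sum` is weighted by `number_of_species`.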

Table S21.

Results of a generalised linear model (GLM) with methodological transparency as the response variable with only predictor variables that appeared in all models selected (see Table S20). p values < 0.001 are reported as zero. ‘GLM SE’ stands for standard error of the GLM coefficient.

glm_transparency_selected <-
  glm(data = df_transparency,
      prop_transparent ~ 
        first_europe +
        s_publication_year +
        s_prop_women +
        s_total_number_continents,
      weights = valid_n,
      family = "binomial")

table_s21 <-
  broom::tidy(glm_transparency_selected) %>% 
  add_column(sample_size = nrow(df_transparency),
             .before = "term") %>% 
  rename(glm_coefficient = estimate,
         glm_std_error = std.error,
         glm_z_value = statistic,
         glm_p_value = p.value) %>% 
  mutate(across(where(is.numeric),
                ~round(., digits = 3)))

names(table_s21) <-
  names_selected_model

table_s21[, 2] <- c("Intercept",
                    "First author with European affiliation",
                    "Standardised publication year",
                    "Standardised proportion of women as authors",
                    "Standardised number of continents listed as affiliations")

datatable(table_s21,
          extensions = "Buttons",
          rownames = FALSE)