In what follows, I wish to discuss two complementary books that stimulate debate on the virtues and weaknesses of democracy, at a time when climate change, as well as confrontation with hostile authoritarian regimes, urgently raises the question of its capacity to face and manage crises.
Leading political figures of La France Insoumise have been claiming for several months that nuclear power is an “intermittent” source of electricity [1], and even that it will soon be “more intermittent than wind power”, if Jean-Luc Mélenchon is to be believed [2]. This discourse, of course, aims to promote the transition to 100% renewable energy advocated by the party. Such a transition would involve shutting down the 56 reactors currently in operation, which supplied France with 70% of its electricity in 2019 [3]. Such a task is not only difficult; it is also open to criticism, since nuclear power remains at this stage the cleanest source in terms of greenhouse gas emissions. While the merits of Germany's nuclear phase-out are being called into question by current threats to the gas supply and by its consequences in terms of greenhouse gas emissions, LFI and its allies in EELV need to strengthen their arguments against nuclear power. Luckily for them, the French nuclear fleet is currently facing major difficulties. About half of the reactors are shut down for maintenance, for various reasons, including the well-known problem of abnormal stress corrosion identified on several of them. As a result, the availability of the nuclear fleet is currently very limited, and there is a risk of blackout this winter. It is mainly on the basis of this situation that La France Insoumise describes nuclear power as “intermittent”, and even as “soon more intermittent than wind power”.
The volume Social Epistemology: Essential Readings [1], edited by Alvin I. Goldman and Dennis Whitcomb, introduces the topic of social epistemology by bringing together a variety of previously published works that belong to it. These chapters are valuable not so much for the originality of their results as for illustrating the wide scope of questions that social epistemology encompasses. Readers who, like me, are interested in topics as diverse as the sociology of science, the cultural brain hypothesis in anthropology, or institutional design in political science will find that social epistemology may provide useful insights into each of these matters, which is what makes this introduction exciting to read and the whole enterprise attractive. It is in light of this diversity that Alvin I. Goldman proposes splitting the topic into three categories.
[1] Goldman, Alvin I. and Dennis Whitcomb (eds), 2011, Social Epistemology: Essential Readings, New York: Oxford University Press.
Every Sunday from eleven to noon, Émilie Aubry hosts “L’Esprit Public” on France Culture, presented as a “putting into perspective of political news through a debate among engaged intellectuals”. The show's page warns its listeners: they should expect “controversy”, and even “impertinence”.
Abstract: In this paper, I applied sentiment analysis and emotion detection to press articles and illustrations to explore differences in the journalistic treatment of various political figures.
Abstract: In this article, I analyzed how signatures on a nationwide constitutional petition correlated with various socioeconomic/political variables in French cities. Low education turned out to be a strongly negative factor. I explored the possibility that this could be the result of poor media coverage. I then measured media coverage of the petition by applying a speech-to-text model to public television news archives. The article has been cited in a book, in research papers, and in an appeal to the French Constitutional Court to increase media coverage of these petitions.
Politics, Data mining, Statistical and Bayesian Inference
Abstract: We analyzed voter trajectories between the French presidential and European elections using a Bayesian ecological inference model. We assessed how these trajectories were influenced by socio-economic factors. This revealed, among other things, the rallying of the right-wing bourgeoisie behind Macron.
Epidemics, Data mining, Statistical and Bayesian Inference
Abstract: In this paper, we compared mortality data with the official death toll attributed to Covid. We showed that the number of deaths attributed to Covid significantly underestimated the actual number of deaths. We then showed that the government later reduced the discrepancy by accounting for Covid-related deaths occurring in nursing homes, but that there remained an unaccounted-for excess mortality among deaths occurring at home that could be attributed to Covid. This article has been cited in research papers.
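To make the comparison described in this abstract concrete, here is a minimal sketch of an excess-mortality calculation. It assumes that the baseline is the average of deaths recorded over the same period in previous years, which may differ from the method actually used in the paper. The function names and all the figures are hypothetical and purely illustrative; they are not data from the article.

```python
from statistics import mean


def baseline_deaths(previous_years):
    """Average deaths per period over several reference years.

    previous_years: list of lists, one list of per-period death counts per year.
    Assumption: a simple average of past years serves as the expected mortality.
    """
    return [mean(period) for period in zip(*previous_years)]


def unattributed_excess(observed, baseline, attributed):
    """Excess deaths not covered by the official toll.

    observed:   per-period deaths from all causes
    baseline:   per-period expected deaths
    attributed: per-period deaths officially attributed to the epidemic
    """
    excess = sum(o - b for o, b in zip(observed, baseline))
    return excess - sum(attributed)


# Hypothetical weekly figures (NOT real data).
previous = [[1800, 1860, 1800], [1800, 1840, 1840]]
baseline = baseline_deaths(previous)      # one expected value per week
observed = [1900, 2400, 2600]             # all-cause deaths
official = [50, 300, 450]                 # deaths attributed to the epidemic

gap = unattributed_excess(observed, baseline, official)
```

A positive `gap` means that observed mortality exceeds what the baseline and the official toll together can explain, which is the kind of discrepancy the abstract refers to.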
Epidemics, Data mining, Statistical and Bayesian Inference
Abstract: In this article, we used data from the Oxford Government Response Tracker to show that France took containment measures against Covid relatively late compared to other countries, given the timing of the epidemic. We also estimated how many deaths could have been avoided if certain measures had been taken a few days earlier, by adapting a simulation from Imperial College.
Gautheron L., Lavechin M., Riad R., Scaff C., Cristia A. “Longform recordings: Opportunities and challenges”, in the proceedings of LIFT 2020 - 2èmes journées scientifiques du Groupement de Recherche "Linguistique informatique, formelle et de terrain", 2020 (Writing - Original Draft)
Abstract: Language use in everyday life can be studied using lightweight, wearable recorders that collect long-form recordings—that is, audio (including speech) over whole days. The hardware and software underlying this technique are increasingly accessible and inexpensive, and these data are revolutionizing the language acquisition field. We first place this technique into the broader context of the current ways of studying both the input being received by children and children's own language production, laying out the main advantages and drawbacks of long-form recordings. We then go on to argue that a unique advantage of long-form recordings is that they can fuel realistic models of early language acquisition that use speech to represent children's input and/or to establish production benchmarks. To enable the field to make the most of this unique empirical and conceptual contribution, we outline what this reverse engineering approach from long-form recordings entails, why it is useful, and how to evaluate success.
Abstract: The technique of long-form recordings via wearables is gaining momentum in different fields of research, notably linguistics and neurology. This technique, however, poses several technical challenges, some of which are amplified by the peculiarities of the data, including their sensitivity and their volume. In this paper, we begin by outlining key problems related to the management, storage, and sharing of the corpora that emerge when using this technique. We continue by proposing a multi-component solution to these problems, specifically in the case of daylong recordings of children. As part of this solution, we release ChildProject, a Python package for performing the operations typically required by such datasets and for evaluating the reliability of annotations using a number of measures commonly used in speech processing and linguistics. This package builds upon an annotation management system, which allows the importation of annotations from a wide range of existing formats, as well as upon data validation procedures, which assert the conformity of the data, or, alternatively, produce detailed and explicit error reports. Our proposal could be generalized to populations other than children and beyond linguistics.
Language acquisition, Statistical and Bayesian Inference
Abstract: What are the vocal experiences of children growing up on Malakula island, Vanuatu, where multilingualism is the norm? Long-form audio-recordings captured spontaneous speech behavior by, and around, 38 children (5–33 months, 23 girls) from 11 villages. Automated analyses revealed most children's vocal input came from female adults and other children's voices, with small contributions from male adult voices. The greatest changes with age involved an increase in the input vocalizations from other children. Total input (collapsing across child-directed and overheard speech, and across languages) was ∼11 min per hour, which was at least 5 min (31%) lower than that found in other populations studied using comparable methods in previous literature, as well as in archival American data analyzed with the same algorithm. In contrast, children's own vocalization counts were two to four times higher than previous reports for North-American English-learning monolingual infants at matched ages, and comparable to estimates from archival American data, consistent with a resilient language-learning cognitive system for this aspect of vocal development. The strongest association between input and output was with vocalizations by other children, rather than those by adults, which is consistent with research in anthropology but less so with current theoretical trends in developmental psychology. These results invite further research in populations that are under-represented in developmental science.
Gautheron L. “La désunité de la physique des hautes-énergies”, in XIVe Congrès de la Société française d'histoire des sciences et des techniques: symposium "La physique de l'après Seconde guerre mondiale, entre ruptures et continuités", Bordeaux, France, 2023
Science and Collective Intelligence, Natural language processing, Networks
Abstract: According to Peter Galison, the coordination of different “subcultures” within a scientific field happens through local exchanges within “trading zones.” In his view, the workability of such trading zones is not guaranteed, and science is not necessarily driven towards further integration. In this paper, we develop and apply quantitative methods (using semantic, authorship, and citation data from scientific literature), inspired by Galison’s framework, to the case of the disunity of high-energy physics. We give prominence to supersymmetry, a concept that has given rise to several major but distinct research programs in the field, such as the formulation of a consistent theory of quantum gravity or the search for new particles. We show that “theory” and “phenomenology” in high-energy physics should be regarded as distinct theoretical subcultures, between which supersymmetry has helped sustain scientific “trades.” However, as we demonstrate using a topic model, the phenomenological component of supersymmetry research has lost traction and the ability of supersymmetry to tie these subcultures together is now compromised. Our work supports the view that even fields with an initially strong sentiment of unity may eventually generate diverging research programs, and it demonstrates the fruitfulness of the notion of trading zones for informing quantitative approaches to scientific pluralism.
Science and Collective Intelligence, Natural language processing, Networks, Statistical and Bayesian Inference, Inverse problems
Abstract: How do scientists navigate between the need to capitalize on their prior knowledge through specialization, and the urge to adapt to evolving research opportunities? Drawing from diverse perspectives on adaptation, this paper proposes an unsupervised Bayesian approach, motivated by Optimal Transport, to the evolution of scientists' research portfolios in response to transformations in their field. The model relies on 186,162 scientific abstracts and authorship data to evaluate the influence of intellectual, social, and institutional resources on scientists' trajectories within a cohort of 2,195 high-energy physicists between 2000 and 2019. Using Inverse Optimal Transport, the reallocation of research efforts is shown to be shaped by learning costs, thus enhancing the utility of the scientific capital disseminated among scientists. Two dimensions of social capital, namely "diversity" and "power", have opposite associations with the magnitude of change in scientists' research interests: while "diversity" disrupts and expands research interests, "power" is associated with more stable research agendas. Social capital plays a more crucial role in shifts between cognitively distant research areas. More generally, this work suggests new approaches for understanding, measuring and modeling collective adaptation using Optimal Transport.
Language acquisition, Statistical and Bayesian Inference
Abstract: Long-form audio recordings are increasingly used to study individual variation, group differences, and many other topics in theoretical and applied fields of developmental science, particularly for the description of children's language input (typically speech from adults) and children’s language output (ranging from babble to sentences). The proprietary LENA software has been available for over a decade, and with it, users have come to rely on derived metrics like adult word count (AWC) and child vocalization counts (CVC), which have also more recently been derived using an open-source alternative, the ACLEW pipeline. Yet, there is relatively little work assessing the reliability of long-form metrics in terms of the stability of individual differences across time. Filling this gap, we analyzed eight spoken-language datasets: four from North American English-learning infants, and one each from British English-, French-, American English-Spanish, and Quechua-Spanish-learning infants. The audio data were analyzed using two types of processing software: LENA and the ACLEW open-source pipeline. When all corpora were included, we found relatively low to moderate reliability (across multiple recordings, the intraclass correlation coefficient attributed to child identity (Child ICC) was below 50% for most metrics). There were few differences between the two pipelines. Exploratory analyses suggested some differences as a function of child age and corpora. These findings suggest that, while reliability is likely sufficient for various group-level analyses, caution is needed when using either LENA or ACLEW tools to study individual variation. We also encourage improvement of extant tools, specifically targeting accurate measurement of individual variation.
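For readers unfamiliar with the intraclass correlation coefficient mentioned in this abstract, the sketch below computes a one-way random-effects ICC from a balanced set of repeated measurements. This is an assumption made for illustration: the paper may well rely on a different estimator (for instance, variance components from a mixed model). The function name `icc_oneway` and the toy values are hypothetical.

```python
from statistics import mean


def icc_oneway(scores_by_child):
    """One-way random-effects ICC for a balanced design.

    scores_by_child: dict mapping a child identifier to a list of metric
    values, one per recording (same number of recordings for every child).
    Returns the share of variance attributable to child identity.
    """
    groups = list(scores_by_child.values())
    n = len(groups)                      # number of children
    k = len(groups[0]) if groups else 0  # recordings per child
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need at least 2 children, each with the same number (>= 2) of recordings")

    grand_mean = mean(v for g in groups for v in g)
    # Mean square between children and mean square within children (ANOVA).
    ms_between = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Toy example (NOT real data): a vocalization count per recording, two recordings per child.
toy = {"child_a": [10.0, 12.0], "child_b": [20.0, 22.0], "child_c": [30.0, 28.0]}
toy_icc = icc_oneway(toy)
```

In this toy example the children differ much more from one another than each child differs across recordings, so the coefficient is close to 1. Values below 0.5, as reported in the abstract for most metrics, indicate that less than half of the variance is stable across recordings of the same child.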