From sunblock to softblock: Analyzing the correlates of neology in published writing and on social media
arXiv:2602.13123v1
Abstract: Living languages are shaped by a host of conflicting internal and external evolutionary pressures. While some of these pressures are universal across languages and cultures, others differ depending on the social and conversational context: language use in newspapers is subject to very different constraints than language use on social media. Prior distributional semantic work on English word emergence (neology) identified two factors correlated with creation of new words by analyzing a corpus consisting primarily of historical published texts (Ryskina et al., 2020, arXiv:2001.07740). Extending this methodology to contextual embeddings in addition to static ones and applying it to a new corpus of Twitter posts, we show that the same findings hold for both domains, though the topic popularity growth factor may contribute less to neology on Twitter than in published writing. We hypothesize that this difference can be explained by the two domains favouring different neologism formation mechanisms.
Executive Summary
The article 'From sunblock to softblock: Analyzing the correlates of neology in published writing and on social media' examines factors correlated with the creation of new words (neology) in two domains. Extending prior distributional semantic work (Ryskina et al., 2020) from static to contextual embeddings, the study compares neologism emergence in a corpus of mostly historical published texts with neologism emergence in a new corpus of Twitter posts. The same two factors correlate with neology in both domains, though topic popularity growth may contribute less on Twitter than in published writing. The authors hypothesize that this difference arises because the two domains favour different neologism formation mechanisms.
Key Points
- ▸ Languages are shaped by competing pressures: some are universal, while others depend on the social and conversational context.
- ▸ The study extends prior methodology to include contextual embeddings and a new corpus of Twitter posts.
- ▸ Topic popularity growth may contribute less to neology on Twitter than in published writing.
- ▸ Different neologism formation mechanisms are hypothesized to explain the observed differences.
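The "topic popularity growth" factor named in the key points can be illustrated with a toy calculation. The sketch below is an assumption about how such a quantity might be computed, not the authors' implementation: it treats a word's topic as the set of words within a cosine-similarity threshold of it and measures how the summed frequency of those neighbors changes between two periods. The vectors, frequencies, threshold, and function names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def neighborhood(target, vectors, threshold):
    """Words whose cosine similarity to `target` is at least `threshold`."""
    return [
        word
        for word, vec in vectors.items()
        if word != target and cosine(vectors[target], vec) >= threshold
    ]


def popularity_growth(target, vectors, freq_early, freq_late, threshold=0.8):
    """Relative change in summed neighbor frequency between two periods.

    Returns None when the neighbors never occur in the early period,
    since relative growth is undefined in that case.
    """
    neighbors = neighborhood(target, vectors, threshold)
    early = sum(freq_early.get(w, 0) for w in neighbors)
    late = sum(freq_late.get(w, 0) for w in neighbors)
    if early == 0:
        return None
    return (late - early) / early


# Hypothetical 2-dimensional embeddings and per-period counts (toy data).
vectors = {
    "softblock": [1.0, 0.0],
    "block": [0.9, 0.1],
    "mute": [0.8, 0.3],
    "banana": [0.0, 1.0],
}
freq_early = {"block": 100, "mute": 50, "banana": 500}
freq_late = {"block": 150, "mute": 150, "banana": 400}

growth = popularity_growth("softblock", vectors, freq_early, freq_late)
```

In this toy data the neighbors of "softblock" are "block" and "mute", whose combined frequency doubles from 150 to 300, so `growth` is 1.0. A study of this kind would compare such values for neologisms against values for comparable control words.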
Merits
Comprehensive Methodology
The study runs its analysis with both static and contextual embeddings, so its conclusions do not rest on a single way of representing word meaning.
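The difference between the two embedding types matters for this kind of analysis: a static model assigns one vector per word, while a contextual model produces one vector per occurrence. The summary does not say how the authors aggregate contextual vectors; one common option, shown here purely as an assumption, is to average a word's occurrence vectors into a single type-level vector. The function name and toy vectors are hypothetical.

```python
def type_level_vector(occurrence_vectors):
    """Average per-occurrence (contextual) vectors into one vector per word."""
    if not occurrence_vectors:
        raise ValueError("need at least one occurrence vector")
    n = len(occurrence_vectors)
    dim = len(occurrence_vectors[0])
    return [sum(vec[i] for vec in occurrence_vectors) / n for i in range(dim)]


# Hypothetical 2-dimensional contextual vectors for three uses of "softblock".
softblock_occurrences = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]
softblock_vector = type_level_vector(softblock_occurrences)
```

The averaged vector can then be used wherever a static vector would be, for example when finding a word's nearest neighbors.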
Cross-Domain Comparison
By comparing neology in published writing and social media, the study offers valuable insights into the differing linguistic pressures and mechanisms at play in these contexts.
Empirical Evidence
The conclusions rest on corpus analysis in two domains: the correlates previously reported for published writing are shown to hold for Twitter posts as well.
Demerits
Limited Scope
The study addresses English neology only, so the findings may not generalize to other languages and cultures.
Hypothesis Without Direct Testing
While the hypothesis regarding different neologism formation mechanisms is intriguing, it is not directly tested within the study, leaving room for further investigation.
Corpus Limitations
The reliance on a specific corpus of Twitter posts may introduce biases or limitations that could affect the generalizability of the findings to other social media platforms.
Expert Commentary
The article presents a well-structured analysis of neology in two linguistic contexts. Extending the prior methodology to contextual embeddings and to a new corpus of Twitter posts is a meaningful step beyond the earlier work on published texts. The most notable finding is that topic popularity growth may contribute less to neology on Twitter than in published writing. The proposed explanation, that the two domains favour different neologism formation mechanisms, remains a hypothesis and would benefit from direct empirical testing. The restriction to English and the reliance on a single Twitter corpus are limitations that future research could address. Overall, the article makes a valuable contribution to the understanding of language evolution and of the factors correlated with lexical innovation in different contexts.
Recommendations
- ✓ Future research should explore the generalizability of these findings to other languages and cultures to provide a more comprehensive understanding of neology.
- ✓ Direct empirical testing of the hypothesis regarding different neologism formation mechanisms would strengthen the study's conclusions and provide deeper insights into the processes driving lexical innovation.