Displaying 1 - 10 of 10
-
Dang, A., Raviv, L., & Galke, L. (2024). Testing the linguistic niche hypothesis in large with a multilingual Wug test. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (
Eds. ), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 91-93). Nijmegen: The Evolution of Language Conferences. -
Dang, A., Raviv, L., & Galke, L. (2024). Morphology matters: Probing the cross-linguistic morphological generalization abilities of large language models through a Wug Test. In CMCL 2024 - 13th Edition of the Workshop on Cognitive Modeling and Computational Linguistics, Proceedings of the Workshop (pp. 177-188). Kerrville, TX, USA: Association for Computational Linguistics (ACL).
-
Galke, L., Ram, Y., & Raviv, L. (2024). Learning pressures and inductive biases in emergent communication: Parallels between humans and deep neural networks. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (
Eds. ), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 197-201). Nijmegen: The Evolution of Language Conferences. -
Galke, L., Ram, Y., & Raviv, L. (2024). Deep neural networks and humans both benefit from compositional language structure. Nature Communications, 15: 10816. doi:10.1038/s41467-024-55158-1.
Abstract
Deep neural networks drive the success of natural language processing. A fundamental property of language is its compositional structure, allowing humans to systematically produce forms for new meanings. For humans, languages with more compositional and transparent structures are typically easier to learn than those with opaque and irregular structures. However, this learnability advantage has not yet been shown for deep neural networks, limiting their use as models for human language learning. Here, we directly test how neural networks compare to humans in learning and generalizing different languages that vary in their degree of compositional structure. We evaluate the memorization and generalization capabilities of a large language model and recurrent neural networks, and show that both deep neural networks exhibit a learnability advantage for more structured linguistic input: neural networks exposed to more compositional languages show more systematic generalization, greater agreement between different agents, and greater similarity to human learners.Additional information
https://www.nature.com/articles/s41467-024-55158-1#Sec23 -
Melnychuk, T., Galke, L., Seidlmayer, E., Bröring, S., Förstner, K. U., Tochtermann, K., & Schultz, C. (2024). Development of similarity measures from graph-structured bibliographic metadata: An application to identify scientific convergence. IEEE Transactions on Engineering Management, 71, 9171 -9187. doi:10.1109/TEM.2023.3308008.
Abstract
Scientific convergence is a phenomenon where the distance between hitherto distinct scientific fields narrows and the fields gradually overlap over time. It is creating important potential for research, development, and innovation. Although scientific convergence is crucial for the development of radically new technology, the identification of emerging scientific convergence is particularly difficult since the underlying knowledge flows are rather fuzzy and unstable in the early convergence stage. Nevertheless, novel scientific publications emerging at the intersection of different knowledge fields may reflect convergence processes. Thus, in this article, we exploit the growing number of research and digital libraries providing bibliographic metadata to propose an automated analysis of science dynamics. We utilize and adapt machine-learning methods (DeepWalk) to automatically learn a similarity measure between scientific fields from graphs constructed on bibliographic metadata. With a time-based perspective, we apply our approach to analyze the trajectories of evolving similarities between scientific fields. We validate the learned similarity measure by evaluating it within the well-explored case of cholesterol-lowering ingredients in which scientific convergence between the distinct scientific fields of nutrition and pharmaceuticals has partially taken place. Our results confirm that the similarity trajectories learned by our approach resemble the expected behavior, indicating that our approach may allow researchers and practitioners to detect and predict scientific convergence early. -
Seidlmayer, E., Melnychuk, T., Galke, L., Kühnel, L., Tochtermann, K., Schultz, C., & Förstner, K. U. (2024). Research topic displacement and the lack of interdisciplinarity: Lessons from the scientific response to COVID-19. Scientometrics, 129, 5141-5179. doi:10.1007/s11192-024-05132-x.
Abstract
Based on a large-scale computational analysis of scholarly articles, this study investigates the dynamics of interdisciplinary research in the first year of the COVID-19 pandemic. Thereby, the study also analyses the reorientation effects away from other topics that receive less attention due to the high focus on the COVID-19 pandemic. The study aims to examine what can be learned from the (failing) interdisciplinarity of coronavirus research and its displacing effects for managing potential similar crises at the scientific level. To explore our research questions, we run several analyses by using the COVID-19++ dataset, which contains scholarly publications, preprints from the field of life sciences, and their referenced literature including publications from a broad scientific spectrum. Our results show the high impact and topic-wise adoption of research related to the COVID-19 crisis. Based on the similarity analysis of scientific topics, which is grounded on the concept embedding learning in the graph-structured bibliographic data, we measured the degree of interdisciplinarity of COVID-19 research in 2020. Our findings reveal a low degree of research interdisciplinarity. The publications’ reference analysis indicates the major role of clinical medicine, but also the growing importance of psychiatry and social sciences in COVID-19 research. A social network analysis shows that the authors’ high degree of centrality significantly increases her or his degree of interdisciplinarity. -
Galke, L., & Scherp, A. (2022). Bag-of-words vs. graph vs. sequence in text classification: Questioning the necessity of text-graphs and the surprising strength of a wide MLP. In S. Muresan, P. Nakov, & A. Villavicencio (
Eds. ), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (pp. 4038-4051). Dublin: Association for Computational Linguistics. doi:10.18653/v1/2022.acl-long.279. -
Galke, L., Cuber, I., Meyer, C., Nölscher, H. F., Sonderecker, A., & Scherp, A. (2022). General cross-architecture distillation of pretrained language models into matrix embedding. In Proceedings of the IEEE Joint Conference on Neural Networks (IJCNN 2022), part of the IEEE World Congress on Computational Intelligence (WCCI 2022). doi:10.1109/IJCNN55064.2022.9892144.
Abstract
Large pretrained language models (PreLMs) are rev-olutionizing natural language processing across all benchmarks. However, their sheer size is prohibitive for small laboratories or for deployment on mobile devices. Approaches like pruning and distillation reduce the model size but typically retain the same model architecture. In contrast, we explore distilling PreLMs into a different, more efficient architecture, Continual Multiplication of Words (CMOW), which embeds each word as a matrix and uses matrix multiplication to encode sequences. We extend the CMOW architecture and its CMOW/CBOW-Hybrid variant with a bidirectional component for more expressive power, per-token representations for a general (task-agnostic) distillation during pretraining, and a two-sequence encoding scheme that facilitates downstream tasks on sentence pairs, such as sentence similarity and natural language inference. Our matrix-based bidirectional CMOW/CBOW-Hybrid model is competitive to DistilBERT on question similarity and recognizing textual entailment, but uses only half of the number of parameters and is three times faster in terms of inference speed. We match or exceed the scores of ELMo for all tasks of the GLUE benchmark except for the sentiment analysis task SST-2 and the linguistic acceptability task CoLA. However, compared to previous cross-architecture distillation approaches, we demonstrate a doubling of the scores on detecting linguistic acceptability. This shows that matrix-based embeddings can be used to distill large PreLM into competitive models and motivates further research in this direction. -
Vagliano, I., Galke, L., & Scherp, A. (2022). Recommendations for item set completion: On the semantics of item co-occurrence with data sparsity, input size, and input modalities. Information Retrieval Journal, 25(3), 269-305. doi:10.1007/s10791-022-09408-9.
Abstract
We address the problem of recommending relevant items to a user in order to "complete" a partial set of items already known. We consider the two scenarios of citation and subject label recommendation, which resemble different semantics of item co-occurrence: relatedness for co-citations and diversity for subject labels. We assess the influence of the completeness of an already known partial item set on the recommender performance. We also investigate data sparsity through a pruning parameter and the influence of using additional metadata. As recommender models, we focus on different autoencoders, which are particularly suited for reconstructing missing items in a set. We extend autoencoders to exploit a multi-modal input of text and structured data. Our experiments on six real-world datasets show that supplying the partial item set as input is helpful when item co-occurrence resembles relatedness, while metadata are effective when co-occurrence implies diversity. This outcome means that the semantics of item co-occurrence is an important factor. The simple item co-occurrence model is a strong baseline for citation recommendation. However, autoencoders have the advantage to enable exploiting additional metadata besides the partial item set as input and achieve comparable performance. For the subject label recommendation task, the title is the most important attribute. Adding more input modalities sometimes even harms the result. In conclusion, it is crucial to consider the semantics of the item co-occurrence for the choice of an appropriate recommendation model and carefully decide which metadata to exploit. -
Seidlmayer, E., Voß, J., Melnychuk, T., Galke, L., Tochtermann, K., Schultz, C., & Förstner, K. U. (2020). ORCID for Wikidata. Data enrichment for scientometric applications. In L.-A. Kaffee, O. Tifrea-Marciuska, E. Simperl, & D. Vrandečić (
Eds. ), Proceedings of the 1st Wikidata Workshop (Wikidata 2020). Aachen, Germany: CEUR Workshop Proceedings.Abstract
Due to its numerous bibliometric entries of scholarly articles and connected information Wikidata can serve as an open and rich
source for deep scientometrical analyses. However, there are currently certain limitations: While 31.5% of all Wikidata entries represent scientific articles, only 8.9% are entries describing a person and the number
of entries researcher is accordingly even lower. Another issue is the frequent absence of established relations between the scholarly article item and the author item although the author is already listed in Wikidata.
To fill this gap and to improve the content of Wikidata in general, we established a workflow for matching authors and scholarly publications by integrating data from the ORCID (Open Researcher and Contributor ID) database. By this approach we were able to extend Wikidata by more than 12k author-publication relations and the method can be
transferred to other enrichments based on ORCID data. This is extension is beneficial for Wikidata users performing bibliometrical analyses or using such metadata for other purposes.
Share this page