We use current text mining and natural language processing methods as well as audio, video, and image analysis in order to examine large collections of data from the humanities and social sciences.

Furthermore, we examine the theoretical foundations of our work and reflect upon our methodology in order to learn more about the epistemological framework of our research, as well as the implications and consequences of computational research practices in the humanities and social sciences.

Ongoing third-party funded projects

In the BMBF-funded project "FakeNarratives – Understanding Narratives of Disinformation in Public and Alternative News Videos" we use distant viewing methods to analyse the underlying structures of narratives of disinformation. For more information, see the corresponding BMBF page and the project's website. The project is part of our research focus "Distant Viewing & Video Analytics".

Publications:

  • Tseng, Chiao-I, Liebl, Bernhard, Burghardt, Manuel & Bateman, John (to appear 2023). FakeNarratives – First Forays in Understanding Narratives of Disinformation in Public and Alternative News Videos. In: Proceedings of the DHd Conference 2023, Luxembourg/Trier.
  • Liebl, Bernhard & Burghardt, Manuel (in review). Zoetrope – Interactive Feature Exploration in News Videos. In: Proceedings of the ADHO Conference on Digital Humanities, Graz.
  • Bateman, John, Tseng, Chiao-I & Burghardt, Manuel (in review). Narrative Strukturen als Mittel der indirekten Kritik in der audiovisuellen Nachrichtenberichterstattung. In: Edited volume "Textualität des Films".

In the project "More than a Feeling: Media Sentiment as a Mirror of Investors' Expectations at the Berlin Stock Exchange, 1872-1930", which has been running since 2019, we are using various methods from the field of deep learning to implement OCR and layout detection as well as text mining and sentiment analysis for a historical German stock exchange newspaper. More information can be found in the corresponding GEPRIS entry and on the project's website. The project is part of our research focus "Text Mining and Natural Language Processing".

Publications:

  • Janos Borst, Lino Wehrheim, Manuel Burghardt (in review): “Money Can’t Buy Love?” Creating a Historical Sentiment Index for the Berlin Stock Exchange, 1872–1930. In: Proceedings of the ADHO Conference on Digital Humanities, Graz.
  • Lino Wehrheim, Janos Borst, Manuel Burghardt & Andreas Niekler (to appear 2023): „Auch heute war die Stimmung im Allgemeinen fest.“ Zero-Shot Klassifikation zur Bestimmung des Media Sentiment an der Berliner Börse zwischen 1872 und 1930. In: Proceedings of the DHd 2023, Luxembourg/Trier.
  • Lino Wehrheim, Bernhard Liebl, Manuel Burghardt (2022): Extracting Textual Data from Historical Newspaper Scans and its Challenges for “Guerilla-Projects”. In: Regensburg Economic and Social History (RESH) Discussion Paper Series.
  • Lino Wehrheim, Bernhard Liebl (2020): What’s in the news? (Erfolgs-)Rezepte für das wissenschaftliche Arbeiten mit digitalisierten Zeitungen. In: Schöch, Christof (ed.) DHd 2020 Spielräume: Digital Humanities zwischen Modellierung und Interpretation. Konferenzabstracts, pp. 70–72.
  • Liebl, B. & Burghardt, M. (2020). From Historical Newspapers to Machine-Readable Data: The Origami OCR Pipeline. Proceedings of the 1st Workshop on Computational Humanities Research (CHR).
  • Liebl, B. & Burghardt, M. (2020). An Evaluation of DNN Architectures for Page Segmentation of Historical Newspapers. 25th International Conference on Pattern Recognition, Milan. (Preprint: https://arxiv.org/abs/2004.07317)

Current Research Foci

One focus of our research is Distant Viewing & Video Analytics, i.e. the analysis of news videos, films and series with computational methods. 

Publications:

  • Luhmann, J., Burghardt, M. & Tiepmar, J. (2020). SubRosa: Determining Movie Similarities based on Subtitles. 3. InfDH-Workshop “Methoden und Anwendungen der Computational Humanities”, INFORMATIK 2020 Workshops, Lecture Notes in Informatics (LNI), Karlsruhe.
  • Luhmann, J., Burghardt, M. & Tiepmar, J. (2020). SubRosa – Multi-Feature-Ähnlichkeitsvergleiche von Untertiteln. Book of Abstracts, DHd 2020, Paderborn.
  • Burghardt, M., Heftberger, A., Pause, J., Walkowski, N., & Zeppelzauer, M. (2020). Film and Video Analysis in the Digital Humanities – An Interdisciplinary Dialog. Special Issue for Digital Humanities Quarterly, 14(4).
  • Burghardt, M., Pause, J. & Walkowski, N.-O. (2019). Scalable Viewing in den Filmwissenschaften. In Book of Abstracts, DHd 2019.
  • Burghardt, M., Meyer, S., Schmidtbauer, S. & Molz, J. (2019). “The Bard meets the Doctor” – Computergestützte Identifikation intertextueller Shakespearebezüge in der Science Fiction-Serie Dr. Who. In Book of Abstracts, DHd 2019.
  • Burghardt, M., Kao, M. & Walkowski, N.-O. (2018). Scalable MovieBarcodes – An Exploratory Interface for the Analysis of Movies. 3rd IEEE VIS Workshop on Visualization for the Digital Humanities, Berlin.
  • Burghardt, M., Hafner, K., Edel, L., Kenaan, S. & Wolff, C. (2017). An Information System for the Analysis of Color Distributions in MovieBarcodes. In Proceedings of the 15th International Symposium of Information Science (ISI 2017).
  • Burghardt, M., Kao, M. & Wolff, C. (2016). Beyond Shot Lengths – Using Language Data and Color Information as Additional Parameters for Quantitative Movie Analysis. In Book of Abstracts of the International Digital Humanities Conference (DH).
  • Burghardt, M. & Wolff, C. (2016). Digital Humanities in Bewegung: Ansätze für die computergestützte Filmanalyse. In Book of Abstracts, DHd 2016.

In the field of Digital Environmental Humanities, we pursue an interdisciplinary approach, dovetailing questions and methods from literary studies and biodiversity research. To this end, we utilise methods from the fields of NLP and IR.

Publications:

  • Langer, Lars, Burghardt, Manuel, Borgards, Roland, Köhring, Esther & Wirth, Christian (2022). Digital Environmental Humanities – Zum Potential von „Computational and Literary Biodiversity Studies“ (CoLiBiS). In Book of Abstracts of the “Digital Humanities im deutschsprachigen Raum” (DHd) Conference, Potsdam.
  • Langer, L., Burghardt, M., Borgards, R., Böhning-Gaese, K., Seppelt, R. & Wirth, C. (2021). The rise and fall of awareness for biodiversity – A comprehensive quantification of historical changes in the use of vernacular labels for biological taxa in Western creative literature. In People and Nature.

Our research in DH Scientometrics & Theory of DH is dedicated to surveying disciplinary boundaries and academic research practices as well as their theoretical foundations, particularly in the Digital and Computational Humanities.

Publications:

  • Ruth, Nicolas, Niekler, Andreas & Burghardt, Manuel (2023). Peeking Inside the DH Toolbox – Detection and Classification of Software Tools in DH Publications. In: Proceedings of the CHR Conference 2023.
  • Kleymann, Rabea, Niekler, Andreas & Burghardt, Manuel (2022). Conceptual Forays: A Corpus-based Study of “Theory” in Digital Humanities Journals. In Journal of Cultural Analytics.
  • Gutiérrez De la Torre, Silvia E., Equihua, Julián, Niekler, Andreas & Burghardt, Manuel (2022). Into the bibliography jungle: using random forests to predict dissertations’ reference section. Proceedings of the Workshop on Understanding LIterature references in academic full TExt (ULITE), co-located with ACM/IEEE Joint Conference on Digital Libraries 2022.
  • Walkowski, Niels-Oliver & Burghardt, Manuel (2022). Executable Papers in den Computational Humanities. In Book of Abstracts of the “Digital Humanities im deutschsprachigen Raum” (DHd) Conference, Potsdam.
  • Burghardt, Manuel, Luhmann, Jan & Niekler, Andreas (2022). Tools as Epistemologies in DH? A Corpus-Based Exploration. In Book of Abstracts of the ADHO Conference on Digital Humanities, Tokyo.
  • Gutiérrez, Silvia, Kleymann, Rabea, Niekler, Andreas & Burghardt, Manuel (2022). The many faces of theory in DH. In Book of Abstracts of the ADHO Conference on Digital Humanities, Tokyo.
  • Luhmann, J. & Burghardt, M. (to appear 2021). Digital Humanities – A Discipline in its Own Right? An Analysis of the Role and Position of DH in the Academic Landscape. In Journal of the Association for Information Science and Technology (JASIST), Special issue on Digital Humanities.
  • Burghardt, M. & Luhmann, J. (2021). Same same, but different? On the Relation of Information Science and the Digital Humanities. A Scientometric Comparison of Academic Journals Using LDA and Hierarchical Clustering. In Proceedings of the 16th International Symposium of Information Science (ISI2021): “Information between Data and Knowledge – Information Science and its Neighbors from Data Science to Digital Humanities”.

Text mining and natural language processing have a long tradition in the digital humanities. We are particularly interested in current methods for text reuse detection, text similarity and sentiment analysis.

Publications:

  • Liebl, Bernhard & Burghardt, Manuel (2022). The Vectorian API – A Research Framework for Semantic Textual Similarity (STS) Searches. In Book of Abstracts of the ADHO Conference on Digital Humanities, Tokyo.
  • Liebl, Bernhard & Burghardt, Manuel (2022). “Embed, embed! There’s knocking at the gate.” Detecting Intertextuality with Embeddings and the Vectorian. In “Fabrikation von Erkenntnis – Experimente in den Digital Humanities”, edited by Manuel Burghardt, Lisa Dieckmann, Timo Steyer, Peer Trilcke, Niels-Oliver Walkowski, Joëlle Weis & Ulrike Wuttke. Melusina Press.
  • Akiki, C. & Burghardt, M. (2020). Toward a Musical Sentiment (MuSe) Dataset for Affective Distant Hearing. Proceedings of the 1st Workshop on Computational Humanities Research (CHR).
  • Bryan, M., Burghardt, M. & Molz, J. (2020). A Computational Expedition into the Undiscovered Country - Evaluating Neural Networks for the Identification of Hamlet Text Reuse. Proceedings of the 1st Workshop on Computational Humanities Research (CHR).
  • Liebl, B. & Burghardt, M. (2020). “Shakespeare in The Vectorian Age” – An Evaluation of Different Word Embeddings and NLP Parameters for the Detection of Shakespeare Quotes. Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH), co-located with COLING’2020.
  • Liebl, B. & Burghardt, M. (2020). „The Vectorian“ – Eine parametrisierbare Suchmaschine für intertextuelle Referenzen. Book of Abstracts, DHd 2020, Paderborn.

Immersive and Interactive Humanities explore the use of AR/VR/XR, eye tracking, tangible interfaces and other HCI technologies for the Digital Humanities.

Publications:

  • Piontkowitz, Vera & Burghardt, Manuel (2021). Best Practices für die Gestaltung virtueller Museumsräume. In „Fabrikation von Erkenntnis – Experimente in den Digital Humanities“, edited by Manuel Burghardt, Lisa Dieckmann, Timo Steyer, Peer Trilcke, Niels Walkowski, Joëlle Weis, Ulrike Wuttke. Wolfenbüttel 2021. (= Zeitschrift für digitale Geisteswissenschaften / Sonderbände, 5) (https://zfdg.de/sb005_005)
  • Schwappach, F. & Burghardt, M. (2019). Augmentierte Notizbücher und Natürliche Interaktion – Unterstützung der Kulturtechnik Handschrift in einer digitalen Forschungswelt. Book of Abstracts, DHd 2019.
  • Dechant, M. & Burghardt, M. (2015). Virtuelle Rekonstruktion des Regensburger Ballhauses. In Book of Abstracts, DHd 2015.
  • Haas, B., Kautetzky, M., Voit, M., Burghardt, M. & Wolff, C. (2015). ResearchSherlock: Toward a seamless integration of printed books into the digital academic workflow. In Proceedings of the 14th International Symposium of Information Science (ISI 2015).
  • Bazo, A., Burghardt, M. & Wolff, C. (2013). Virtual Bookshelf – Ein Natural User Interface zur Organisation und Exploration digitaler Dokumente. In Proceedings of the 13th International Symposium of Information Science.

Computational Game Studies, i.e. the study of (video) games with computational methods, is a comparatively new area of research. Our focus here is particularly on methods for the empirical, multimodal analysis of video games.

Publications:

  • Burghardt, M. & Tiepmar, J. (2021). The Game Walkthrough Corpus (GWTC) – A Resource for the Analysis of Textual Game Descriptions. In Journal of Open Humanities Data.

Computational Spatial Humanities are concerned with the computer-assisted evaluation of spatial data in the context of questions from the humanities.

Computational Social Science (CSS) is an interdisciplinary field that primarily uses computational methods to develop and test theories about human behavior in the social sciences, psychology, and economics. These methods are applied to large data sets about human behavior, such as those found in social media or digital archives, and often involve the use of Big Data. However, CSS also includes text analytics and simulations.

In our group, we mainly deal with content analysis and cooperate with communication and media sciences, political sciences and behavioral research at the Leipzig site.

Publications:

  • Wiedemann, G., & Niekler, A. (2017). Hands-On: A Five Day Text Mining Course for Humanists and Social Scientists in R. Proceedings of the Workshop on Teaching NLP for Digital Humanities (Teach4DH) 2017, Berlin, Germany, September 12, 2017, pp. 57–65.

  • Maier, D., Waldherr, A., Miltner, P., Wiedemann, G., Niekler, A., Keinert, A., Pfetsch, B., Heyer, G., Reber, U., Häussler, T., Schmid-Petri, H., & Adam, S. (2018). Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology. Communication Methods and Measures, 12(2–3), 93–118.

  • Maier, D., Niekler, A., Wiedemann, G., & Stoltenberg, D. (2020). How Document Sampling and Vocabulary Pruning Affect the Results of Topic Models. In Computational Communication Research (Vol. 2, No. 2, pp. 139–152). Amsterdam University Press.

  • Niekler, A. (2018). Automatisierte Verfahren für die Themenanalyse nachrichtenorientierter Textquellen. Herbert von Halem Verlag.

  • Niekler, A., Bleier, A., Kahmann, C., Posch, L., Wiedemann, G., Erdogan, K., Heyer, G., & Strohmaier, M. (2018). ILCM – A Virtual Research Infrastructure for Large-Scale Qualitative Data. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA).