This page offers an overview of our previous research endeavours.
Completed third-party funded projects
The DFG-funded project "Development of a Model Repository and Automatic Font Recognition for OCR-D" aimed to improve the recognition rates of OCR procedures for historical prints. Existing models have usually been trained either on modern corpora or on unfiltered historical corpora with a large variety of fonts, so they are only partly suited to this task. By training font-specific OCR models, the project sought to make text recognition in digitised images of historical prints more reliable.
More information on the project can be found in the corresponding GEPRIS entry. The project was part of our former research focus "OCR & Layout Recognition".
Publications:
- Weichselbaumer, N., Seuret, M., Limbach, S., Dong, R., Burghardt, M. & Christlein, V. (2020). New Approaches to OCR for Early Printed Books. In DigItalia 1-2020, DOI: 10.36181/DIGITALIA-00014
As part of the BMBF-funded project "FakeNarratives – Understanding Narratives of Disinformation in Public and Alternative News Videos" we used distant viewing methods to analyse the underlying structures of narratives of disinformation in videos from public media and alternative media shared on social networks. Further information can be found on the BMBF information page and the project website.
Publications:
- Tseng, C.-I.; Bateman, J.; Burghardt, M.; Liebl, B. (2023). FakeNarratives – First Forays in Understanding Narratives of Disinformation in Public and Alternative News Videos. In: Trilcke, P.; Busch, A.; Helling, P. (eds.), DHd2023: Open Humanities, Open Culture. 2023. pp. 138-142. DOI: 10.5281/zenodo.7715277
- Liebl, B.; Burghardt, M. (2023). Zoetrope – Interactive Feature Exploration in News Videos. In: Baillot, A.; Scholger, W.; Tasovac, T.; Vogeler, G. (eds.), Digital Humanities 2023: Book of Abstracts. 2023. pp. 432-434. DOI: 10.5281/zenodo.8107770
- Burghardt, M.; Tseng, C.-I.; Bateman, J. (2023). Narrative Strukturen als Mittel der indirekten Kritik in der audiovisuellen Nachrichtenberichterstattung. In: Schlickers, S.; Preußer, H.-P. (eds.), Bestimmte Unbestimmtheit – Offene Struktur und funktionale Lenkung in audiovisuellen Medien. Marburg: Schüren Verlag. 2023. ISBN: 978-3-7410-0426-1
- Liebl, B.; Burghardt, M. (2023). Designing a Prototype for Visual Exploration of Narrative Patterns in News Videos. In: Wohlgemuth, V.; Klein, M.; Krupka, D.; Winter, C. (eds.), INFORMATIK 2023 - Designing Futures: Zukünfte gestalten. Bonn: Gesellschaft für Informatik e.V. 2023. pp. 831-840. DOI: 10.18420/inf2023_93
- Burghardt, M.; Liebl, B.; Ruth, N. (2023). From Clusters to Graphs – Toward a Scalable Viewing of News Videos. In: Šeļa, A.; Jannidis, F.; Romanowska, I. (eds.), Proceedings of the Computational Humanities Research Conference 2023. CEUR-WS. 2023. pp. 167-177
In the project "More than a Feeling: Media Sentiment as a Mirror of Investors' Expectations at the Berlin Stock Exchange, 1872-1930", which started in 2019, we used various deep learning methods to implement OCR and layout detection as well as text mining and sentiment analysis for a historical German stock exchange newspaper. More information can be found in the corresponding GEPRIS entry and on the project's website. The project is part of our research focus "Text Mining and Natural Language Processing".
Publications:
- Janos Borst, Lino Wehrheim, Manuel Burghardt (in review): “Money Can’t Buy Love?” Creating a Historical Sentiment Index for the Berlin Stock Exchange, 1872–1930. In: Proceedings of the ADHO Conference on Digital Humanities, Graz.
- Lino Wehrheim, Janos Borst, Manuel Burghardt & Andreas Niekler (to appear 2023): „Auch heute war die Stimmung im Allgemeinen fest.“ Zero-Shot Klassifikation zur Bestimmung des Media Sentiment an der Berliner Börse zwischen 1872 und 1930. In: Proceedings of the DHd 2023, Luxembourg/Trier.
- Lino Wehrheim, Bernhard Liebl, Manuel Burghardt (2022): Extracting Textual Data from Historical Newspaper Scans and its Challenges for “Guerilla-Projects”. In: Regensburg Economic and Social History (RESH) Discussion Paper Series.
- Lino Wehrheim, Bernhard Liebl (2020): What’s in the news? (Erfolgs-)Rezepte für das wissenschaftliche Arbeiten mit digitalisierten Zeitungen. In: Schöch, Christof (ed.), DHd 2020 Spielräume: Digital Humanities zwischen Modellierung und Interpretation. Konferenzabstracts, pp. 70–72.
- Liebl, B. & Burghardt, M. (2020). From Historical Newspapers to Machine-Readable Data: The Origami OCR Pipeline. Proceedings of the 1st Workshop on Computational Humanities Research (CHR).
- Liebl, B. & Burghardt, M. (2020). An Evaluation of DNN Architectures for Page Segmentation of Historical Newspapers. 25th International Conference on Pattern Recognition, Milan. (Preprint https://arxiv.org/abs/2004.07317)
Previous Research Foci
OCR and Layout Recognition, i.e. the automated transformation of scans of physical text documents into machine-readable, digital documents, plays a crucial role in the Digital Humanities, especially when it comes to computational research into historical sources.
Publications:
- Weichselbaumer, N., Seuret, M., Limbach, S., Dong, R., Burghardt, M. & Christlein, V. (2020). New Approaches to OCR for Early Printed Books. In DigItalia 1-2020, DOI: 10.36181/DIGITALIA-00014
- Liebl, B. & Burghardt, M. (2020). From Historical Newspapers to Machine-Readable Data: The Origami OCR Pipeline. Proceedings of the 1st Workshop on Computational Humanities Research (CHR).
- Liebl, B. & Burghardt, M. (2020). An Evaluation of DNN Architectures for Page Segmentation of Historical Newspapers. 25th International Conference on Pattern Recognition, Milan. (Preprint https://arxiv.org/abs/2004.07317)
- Lehenmeier, C., Burghardt, M. & Mischka, B. (2020). Layout Detection and Table Recognition – Recent Challenges in Digitizing Historical Documents and Handwritten Tabular Data. 24th International Conference on Theory and Practice of Digital Libraries, Lyon.
In this area we use computational approaches to digitise and analyse symbolic music (sheet music).
Publications:
- Burghardt, M. & Fuchs, F. (2019). A Computational Approach to Analyzing Musical Complexity of the Beatles. In Book of Abstracts, DH 2019.
- Burghardt, M. (2018). Digital Humanities in der Musikwissenschaft – Computer-gestützte Erschließungsstrategien und Analyseansätze für handschriftliche Liedblätter. In B. Wiermann & A. Bonte (eds.): Bibliothek. Forschung und Praxis, special issue “Digitale Forschungsinfrastruktur für die Musikwissenschaft” (Preprint).
- Burghardt, M. & Lamm, L. (2017). Entwicklung eines Music Information Retrieval-Tools zur Melodic Similarity-Analyse deutschsprachiger Volkslieder. GI Workshop „Musik trifft Informatik“, INFORMATIK 2017, Chemnitz.
- Burghardt, M. & Spanner, S. (2017). Allegro: User-centered Design of a Tool for the Crowdsourced Transcription of Handwritten Music Scores. Proceedings of the DATeCH (Digital Access to Textual Cultural Heritage) conference. ACM.
- Burghardt, M., Spanner, S., Schmidt, T., Fuchs, F., Buchhop, K., Nickl, M. & Wolff, C. (2017). Digitale Erschließung einer Sammlung von Volksliedern aus dem deutschsprachigen Raum. In Book of Abstracts, DHd 2017.
- Burghardt, M., Lamm, L., Lechler, D., Schneider, M. & Semmelmann, T. (2016). Tool based Identification of Melodic Patterns in MusicXML Documents. In Book of Abstracts of the International Digital Humanities Conference (DH).
- Burghardt, M., Lamm, L., Lechler, D., Schneider, M. & Semmelmann, T. (2015). MusicXML Analyzer. Ein Analysewerkzeug für die computergestützte Identifikation von Melodie-Patterns. In Proceedings des 9. Hildesheimer Evaluierungs- und Retrievalworkshops (HiER) (pp. 29–42).
- Meier, F., Bazo, A., Burghardt, M. & Wolff, C. (2015). A Crowdsourced Encoding Approach for Handwritten Sheet Music. In P. Roland & J. Kepper (eds.), Music Encoding Conference Proceedings 2013 and 2014 (pp. 127–130).
In quantitative drama analysis, we use different methods from the fields of NLP and text mining to facilitate a distant reading of stage plays. A particular focus of our work in this area is sentiment analysis.
Publications:
- Schmidt, T., Burghardt, M., Dennerlein, K. & Wolff, C. (2019). Katharsis – A Tool for Computational Drametrics. In Book of Abstracts, DH 2019.
- Schmidt, T., Burghardt, M. & Wolff, C. (2019). Towards Multimodal Sentiment Analysis of Historic Plays: A Case Study with Text and Audio for Lessing’s Emilia Galotti. Proceedings of the DHN (DH in the Nordic Countries) Conference, Copenhagen.
- Schmidt, T. & Burghardt, M. (2018). An Evaluation of Lexicon-based Sentiment Analysis Techniques for the Plays of Gotthold Ephraim Lessing. Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (pp. 139-149). Santa Fe, New Mexico: Association for Computational Linguistics.
- Schmidt, T., Burghardt, M. & Dennerlein, K. (2018). Sentiment Annotation of Historic German Plays: An Empirical Study on Annotation Behavior. Sandra Kübler, Heike Zinsmeister (eds.), Proceedings of the Workshop on Annotation in Digital Humanities (annDH 2018) (pp. 47-52). Sofia, Bulgaria.
- Schmidt, T. & Burghardt, M. (2018). Toward a Tool for Sentiment Analysis for German Historic Plays. In: Piotrowski, M. (ed.), COMHUM 2018: Book of Abstracts for the Workshop on Computational Methods in the Humanities 2018 (pp. 46-48). Lausanne, Switzerland: Laboratoire lausannois d’informatique et statistique textuelle.
- Schmidt, T., Burghardt, M. & Wolff, C. (2018). Herausforderungen für Sentiment Analysis-Verfahren bei literarischen Texten. In: Burghardt, M. & Müller-Birn, C. (eds.), INF-DH-2018. Bonn: Gesellschaft für Informatik e.V.
- Schmidt, T., Burghardt, M. & Dennerlein, K. (2018). “Kann man denn auch nicht lachend sehr ernsthaft sein?” – Zum Einsatz von Sentiment Analyse-Verfahren für die quantitative Untersuchung von Lessings Dramen. In Book of Abstracts, DHd 2018.
- Wilhelm, T., Burghardt, M. & Wolff, C. (2013). “To See or Not to See” - An Interactive Tool for the Visualization and Analysis of Shakespeare Plays. In Tagungsband der Konferenz „Kultur und Informatik“: Visual Worlds & Interactive Spaces (pp. 175–185).