Research Guides: Digital Humanities: Tools

Contact Information

Sara Grosvald | Planning and Development

Bloomfield Library for the Humanities and Social Sciences

T +972.2.5882134 | saragr@savion.huji.ac.il

Sketch Engine

Sketch Engine is a tool for building, managing and exploring large text collections (text corpora) in dozens of languages, and for studying how language works. Its algorithms analyze authentic texts of billions of words to identify instantly what is typical in language and what is rare, unusual or emerging usage. It is also designed for text analysis and text mining applications. Sketch Engine contains 500 ready-to-use corpora in 90+ languages, each with up to 30 billion words.

SkELL (Sketch Engine for Language Learning) is a simple tool for students and teachers of English to easily check whether or how a particular phrase or a word is used by real speakers of English.

For access, please contact msperiodicals@savion.huji.ac.il

Open Access Tools

Zenodo
Zenodo is a general-purpose open-access repository developed under the European OpenAIRE program and operated by CERN. It allows researchers to deposit data sets, research software, reports, and any other research-related digital artifacts. For each submission, a persistent digital object identifier (DOI) is minted, which makes the stored items easily citable.

GitHub

GitHub is an American company that provides hosting for software development and version control using Git.
GitHub offers free, professional, and enterprise plans. Free GitHub accounts are commonly used to host open source projects. As of January 2019, GitHub offers unlimited private repositories on all plans, including free accounts. As of May 2019, GitHub reports having over 37 million users and more than 100 million repositories (including at least 28 million public repositories), making it the largest host of source code in the world.

Figshare

Figshare
Figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner. All file formats can be published, including videos and datasets that are often demoted to the supplemental materials section in current publishing models. Users of the site maintain full control over the management of their research while benefiting from global access, version control and secure backups in the cloud.

Python

Python
Python is a programming language that lets you work more quickly and integrate your systems more effectively.
You can learn to use Python and see almost immediate gains in productivity and lower maintenance costs.

Agisoft

Agisoft Website
Agisoft is among the pioneering developers of digital photogrammetry solutions.

ArcGIS

ArcGIS Online
Connect people, locations, and data using interactive maps. Work with smart, data-driven styles and intuitive analysis tools. Share your insights with the world or specific groups.

Microsoft Research Tools

Microsoft Research Tools
Datasets, SDKs, APIs and other open source code created by Microsoft researchers made available to the broader academic community.

WorldMap

WorldMap
Build your own mapping portal and publish it to the world or to just a few collaborators. WorldMap is open source software.

CoreNLP

CoreNLP
CoreNLP is your one-stop shop for natural language processing in Java! CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports 6 languages: Arabic, Chinese, English, French, German, and Spanish.

Voyant

Voyant
Voyant is a suite of analysis and exploration tools for digital texts. Voyant is largely built on the foundations of text analysis tool design and methodology from over 50 years of humanities computing research.
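
As a rough illustration of the kind of exploration such tools automate, the short Python sketch below counts the most frequent terms in a text. It is not Voyant's own code: the function name, the tiny stopword list, and the sample sentence are all assumptions made up for this example.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list; real tools use much longer ones.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "it"}

def term_frequencies(text, top_n=5):
    """Return the top_n most frequent non-stopword terms as (term, count) pairs."""
    # Lowercase the text and keep runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return counts.most_common(top_n)

# Invented sample text, used only to show the call.
sample = "The whale and the sea: the whale is in the sea, and the ship is on the sea."

for term, count in term_frequencies(sample, top_n=2):
    print(term, count)
```

A count like this is only a starting point; tools such as Voyant add visualizations and context views on top of it.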

Overviewdocs

Overviewdocs
Open-Source Document Mining
Overview is a document mining application originally built for investigative journalists. It’s also used for legal work, training machine learning models, and research of all types. It’s a visualization and analysis tool designed for sets of documents, from dozens to millions of pages of material.

Overview imports many formats and languages and includes built-in OCR, a sophisticated search engine, document annotation, word clouds, entity detection, and topic-based document clustering. It supports tagging and metadata as well as many export formats. If you need custom analysis, you can write your own plugins using the API.

Juxtacommons

Juxtacommons
Juxta is a tool that allows you to compare and collate versions of the same textual work.

Juxta Commons is an online space powered by the Juxta Web Service that lets you collate sets of texts and share visualizations with your peers.
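
To give a sense of what collating two versions of a text involves, here is a minimal sketch in Python using the standard library's difflib module. It does not reproduce how Juxta works internally; the function name and the two sample "witnesses" are hypothetical and chosen only for illustration.

```python
import difflib

def collate(witness_a, witness_b):
    """Return word-level variants between two versions of a text.

    Each item is (operation, words_in_a, words_in_b), where operation is
    'replace', 'delete' or 'insert'. Passages that match are skipped.
    """
    words_a = witness_a.split()
    words_b = witness_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    variants = []
    for op, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if op != "equal":
            variants.append(
                (op, " ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return variants

# Two invented versions of the same line.
version_1 = "the quick brown fox jumps over the lazy dog"
version_2 = "the quick red fox leaps over the lazy dog"

for op, a_text, b_text in collate(version_1, version_2):
    print(f"{op}: {a_text!r} -> {b_text!r}")
```

Splitting on whitespace keeps the sketch short; a real collation would also need to handle punctuation, spelling normalization, and more than two versions.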

OpenRefine

OpenRefine
OpenRefine (previously Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.

OpenRefine always keeps your data private on your own computer until you want to share or collaborate. Your private data never leaves your computer unless you want it to. (It works by running a small server on your computer, and you use your web browser to interact with it.)

OpenRefine is available in English, Chinese, Spanish, French, Russian, Portuguese (Brazil), German, Japanese, Italian, Hungarian, Hebrew, Filipino, Cebuano, and Tagalog.
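
One common way to clean messy data is to group spellings that differ only in case, punctuation, spacing, or word order. The Python sketch below is loosely modeled on the "fingerprint" key-collision idea described in OpenRefine's clustering documentation, but it is a simplified stand-in and not OpenRefine's implementation; the function names and sample values are assumptions for illustration.

```python
import string
from collections import defaultdict

def fingerprint(value):
    """Build a normalized key: trim, lowercase, drop ASCII punctuation, sort unique words."""
    text = value.strip().lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))

def cluster(values):
    """Group values sharing a fingerprint; keep only groups with more than one distinct spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]

# Invented sample of inconsistently entered names.
names = [
    "Hebrew University",
    "hebrew university ",
    "University, Hebrew",
    "Tel Aviv University",
]

for group in cluster(names):
    print(group)
```

Each printed group is a candidate for merging into a single spelling; a person should still review the groups, since different entities can share a key.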

FromThePage

FromThePage
Transcribe documents from anywhere: upload PDFs and pictures, or directly import documents from a digital library.
Free-form text transcription, field-based transcription, OCR correction, indexing, translation
Multiple export formats (TEI, CSV, HTML, IIIF/Open Annotation)
Collaborate: Make your transcription project public or select authorized collaborators.
Version control, automatic markup, in-document collaborator notes
Manage: Review the progress of your transcription project and the most recent activity in the transcriber community at a glance.
Project dashboard, transcription quality review
Collaborator engagement and time tracking tools
Digital library system integrations (IIIF, CONTENTdm (OCLC), Internet Archive and Omeka)
