A lot of these are just getting off the ground. But you're welcome to poke about.
A lot of websites about language include sound files so visitors can hear words. Often, when the visitor clicks the sound file, they are taken to a separate page in their browser with a single "player" to play the sound. This distracts from the flow of the page. There's a better way.
This is a small one-page web application for testing how various fonts handle multiple combining characters.
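A test string for this kind of experiment can be built in a few lines of Python. The sketch below is only an illustration and is not the code behind the web application; the function name `stack_marks` and the choice of marks are assumptions made for the example.

```python
import unicodedata

def stack_marks(base, mark_names):
    """Return `base` followed by the combining characters named in `mark_names`."""
    marks = "".join(unicodedata.lookup(name) for name in mark_names)
    return base + marks

# Hypothetical sample: one base letter carrying three combining marks.
sample = stack_marks(
    "a",
    ["COMBINING TILDE", "COMBINING ACUTE ACCENT", "COMBINING DOT BELOW"],
)

print(sample)
# List each code point with its Unicode name and combining class.
for ch in sample:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
```

Pasting the printed string into text set in different fonts is one quick way to see which of them position stacked marks legibly.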
It was created during a class on orthography taught by Keren Rice and Gwendolyn Hyslop at InField2010.
A few experiments in fuzzy lookup with minimal code.
Fuzzy matching is the common name for the process of finding approximate (non-exact) results when searching a textual database. The technique underlies familiar features such as spellchecking and search-engine suggestions. Fuzzy matching can also be very useful for linguists working with linguistic databases. This brief introduction covers some of the basics behind fuzzy matching and walks through a simple implementation.
For instance, using fuzzy matching, a query for _Nicolai_ might match _Nicolaï_.
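One simple way to get that particular behavior is to ignore diacritics and case when comparing strings. The sketch below shows just that one flavor of fuzzy matching and is not necessarily the implementation the introduction describes; the names `strip_diacritics` and `fuzzy_lookup` and the sample list are made up for illustration.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose to NFD, drop combining marks, and casefold the result."""
    decomposed = unicodedata.normalize("NFD", text)
    bare = "".join(c for c in decomposed if not unicodedata.combining(c))
    return bare.casefold()

def fuzzy_lookup(query, entries):
    """Return the entries that equal the query once diacritics and case are ignored."""
    key = strip_diacritics(query)
    return [entry for entry in entries if strip_diacritics(entry) == key]

# Hypothetical data for the example.
names = ["Nicolaï", "Nicolas", "Nikolai"]
print(fuzzy_lookup("Nicolai", names))  # ['Nicolaï']
```

This only catches differences in diacritics and case. Spelling variation such as _Nikolai_ would need a different technique, for example an edit-distance comparison.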
Further reading
[1] http://en.wikipedia.org/wiki/Fuzzy_matching
The Leipzig Glossing Rules are a practical characterization of traditional linguistic glossing technique. I'm working on tools to edit glosses in this format within the web browser.
The format in which to exchange linguistics scholarship has been a matter of contention for years. How should interlinear glosses be rendered: as tables in word processing files? But which sort of file? .doc? .docx? .rtf? .pdf? One of the various OpenOffice formats? I'd like to suggest that the most future-proof format is HTML5. This project contains some experiments that show how HTML5 can be used to create documents that look very much like a traditional linguistics paper.
Herein, some random experiments in marking up entries from various bilingual dictionaries as HTML.
A few conclusions: after encoding several examples, I've found that definition lists (<dl>s) seem to work best for encoding bilingual information. My reasoning is that if you turn off the CSS in your browser, the examples marked up as definition lists look much more reasonable than the others.
If you have a big list of words, how can you find good minimal pairs? Like this...
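As a rough sketch of the idea (not necessarily the method the linked page uses), one can compare every pair of words and keep those that have the same length and differ in exactly one position. This assumes that each character stands for one segment, which holds only for a roughly phonemic transcription; the function names and word list are illustrative.

```python
from itertools import combinations

def is_minimal_pair(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def minimal_pairs(words):
    """Return all minimal pairs among the distinct words, in sorted order."""
    unique = sorted(set(words))
    return [(a, b) for a, b in combinations(unique, 2) if is_minimal_pair(a, b)]

# Hypothetical word list for the example.
words = ["pat", "bat", "pit", "bit", "spat"]
print(minimal_pairs(words))
# [('bat', 'bit'), ('bat', 'pat'), ('bit', 'pit'), ('pat', 'pit')]
```

Comparing every pair is quadratic in the size of the list, so a very large word list would call for grouping words by length or by a wildcard pattern first.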
Linguists have traditionally used field notebooks to record their elicitations with speakers. While this is a noble tradition, the approach has drawbacks: above all, the text is not searchable. There is also no convenient way to link the transcriptions to recordings.
We propose a web-based tool for transcribing audio recordings.
Flickr.com offers a programmatic interface to its vast array of tagged images. Picnic is a tool for collecting freely licensed images from Flickr for use in language documentation and language learning. The code is hosted on GitHub here.
Suppose you are interested in studying reduplication, and suppose you have a big list of words in the language. How can you automatically collect many words which contain reduplication from the list of words? This article will tell you how.
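One possible approach, sketched here and not necessarily the one the article takes, is a regular expression with a backreference: capture two or more characters and require that the same characters repeat immediately. The pattern and function names are assumptions for illustration.

```python
import re

# Two or more word characters, followed directly by an exact copy of themselves.
REDUP = re.compile(r"(\w{2,})\1")

def find_reduplication(words):
    """Return (word, repeated_part) for each word containing an adjacent exact copy."""
    hits = []
    for word in words:
        match = REDUP.search(word)
        if match:
            hits.append((word, match.group(1)))
    return hits

# Hypothetical word list for the example.
words = ["wikiwiki", "banana", "bonbon", "table"]
print(find_reduplication(words))
# [('wikiwiki', 'wiki'), ('banana', 'an'), ('bonbon', 'bon')]
```

The _banana_ hit shows the limits of pattern matching: the expression finds repeated strings, not morphological reduplication, so the results are candidates to check by hand. Partial reduplication with vowel or consonant changes would not be found by this pattern.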
Suppose you have a bunch of .txt files filled with prose. How can
you turn that prose into a list of individual sentences? You can
start by reading this little tutorial.
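As a first approximation, and only as a sketch that is not taken from the tutorial itself, prose can be split wherever whitespace follows a period, question mark, or exclamation point. The names below are illustrative.

```python
import re

# Whitespace that comes right after sentence-final punctuation.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

def split_sentences(text):
    """Split text into sentences at whitespace following '.', '!' or '?'."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]

# Hypothetical sample text for the example.
prose = "The dog barked. Did the cat run?  It did! The end."
print(split_sentences(prose))
# ['The dog barked.', 'Did the cat run?', 'It did!', 'The end.']
```

This naive rule will also split after abbreviations such as "Dr." and will miss sentences that end in a closing quotation mark, so real text usually needs extra handling.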
Wals.info is an amazing web-based resource for cataloging the linguistic features of the languages of the world. Anyone interested in language or linguistics should check it out.
Interestingly, the project also allows users to access a SQL dump of its database. Herein, we take a look at some basic SQL, using that database to keep things interesting.
Python is a great language for working on linguistic tasks. It's readable and reasonably concise. It's also popular, which means there are a lot of existing libraries you can tap into to help you with your project.
Here's a super short tutorial on a simple linguistic task: how can we extract sentences from a text file?
## Philosophical Preliminaries
(Just kidding.)
A good way to think of programming is how you would learn to be a wizard. When Hermione tells you that there's a spell to
## Read the file
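A minimal sketch of this step, assuming the text is stored as UTF-8, might look like the following. The helper name `read_text_file` and the sample file are assumptions; the demo writes a temporary file first so that the example runs on its own.

```python
import tempfile
from pathlib import Path

def read_text_file(path):
    """Return the whole file as a single string, decoded as UTF-8."""
    return Path(path).read_text(encoding="utf-8")

# Demo: create a small sample file in a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.txt"
    sample_path.write_text("Ñandú is a bird. It runs fast.\n", encoding="utf-8")
    text = read_text_file(sample_path)

print(text)
```

Naming the encoding explicitly matters for linguistic data, since the platform default may not handle characters outside ASCII.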
Problem: There is a need for a browser-based method for inputting text in non-standard writing systems.
Proposed solution: The tool will be implemented as a jQuery plugin that can be included in any web page, and it will specify a format for defining new keyboard layouts (preferably by using an existing standard).
In which I attempt to use the <video> element.
A little HTML can go a long way toward bringing your work in linguistics to the web.
Herein I'll be collecting some basic documentation on how to produce simple web content for linguists.