John Goldsmith

Senior Fellow


I am interested in understanding the nature of symbolic representation, in both natural language and genomic sequences, through the development of software that induces structure from data.

In the area of natural language, my recent work has focused on developing a program called Linguistica, which induces the morphology (the word-internal structure) of a language from a corpus of that language. We are currently working on languages as diverse as Somali and Swahili, in addition to more familiar European languages. Our webpage is