If you hear scientists say something like “We don’t understand much about the climate on Jupiter” or “We don’t know why electrons behave in a particular way”, then that kind of ignorance seems reasonable. Jupiter is a huge planet, it’s far away, and it’s an inanimate object that can’t answer questions about itself. Same for electrons, except they are really small, presenting different problems.

But what about language sciences? Suppose a linguist says “we don’t really understand how determiners work in Salish” or “we don’t understand quantification in Hmong”. What does that mean? It doesn’t sound like a difficult problem to solve. It’s not like studying Jupiter or electrons. Dealing with languages means dealing with people. Can’t you just ask them how their language works? The short answer: no.


Let’s consider the simplest case: asking someone what the name of something is. It usually works well enough to get someone to name an item or activity for you. A question like “what do you call that thing?” (while pointing at it) can usually get you what you want.

But not always. There’s an object in my kitchen that I variably refer to as “saucer”, “saucepan”, and “pot”. Suppose for the sake of discussion that we live in an alternative world where English is a rare and endangered language that few people speak. Instead, suppose that most of the world speaks Zapotec and a Zapotecan linguist is doing fieldwork on English in my kitchen. I speak fluent Zapotec and English in this scenario, and the linguist knows only some highly accented and extremely limited English. If she asks me what that thing is, there are several answers I might give.

Suppose on this occasion I tell her it’s a “saucer”. Later, she’s going over the notes, and she notices that on another occasion, I called it a “saucepan”. Two different words, same object. They clearly contain a shared root “sauce”. Is “pan” a special suffix that gives a particular nuance to the base meaning ‘sauce’? The linguist looks through her notes, and sees that she recorded “pan” as an independent word on another occasion, but never as a suffix before. Now that she thinks about it, maybe this is the same “pan” that’s a prefix in ‘pancake’?

The linguist gets more curious. What happened to the ‘r’ in ‘saucer’ and why is the word not ‘saucerpan’ instead? Maybe the root is ‘sauce’ and the ‘r’ is from the agentive ‘-er’ suffix. She knows that I have a thing in my kitchen called a ‘toaster’, which is transparently related to ‘toast’, so maybe this is part of the same pattern. So the word is really sauce+er+pan, with /r/ deleted before /p/. I’m not exaggerating, by the way. These are exactly the kinds of things that linguists think about when confronted with data from an unfamiliar language. You just try anything to connect the pieces.

At the next meeting the confused Zapotecan linguist asks me why I called that thing both a ‘saucer’ and a ‘saucepan’. And I genuinely have no explanation. This isn’t part of my wild scenario – in the actual world I don’t have any basis for choosing one word over the other. So I’d have to tell the linguist “I don’t know”.

Or rather, I’m not aware of my reasons for choosing one over the other. I’m sure there’s a pattern. I use this same object for multiple reasons (boiling water for tea, heating soup, melting chocolate, occasionally frying things), so maybe my choice of words depends on why I need the thing in the first place. This would be something for the Zapotecan linguist to figure out. The point is that I would be of no help to her. Asking me why I use one word and not another isn’t going to be a fruitful line of inquiry.

This can be a problem in linguistic fieldwork: direct questions to speakers of the language often result in an ‘I don’t know’, which is of course a complete dead end for the researcher. Most people just use their language. They don’t really think about why or how, so even when confronted with something that might seem like an easy question (‘what is that thing called?’) it can be difficult to give a straight answer.

Another problem with asking for names is that sometimes things have different names under different circumstances, so the answer you get in one context wouldn’t be the same in another. For instance, a zipper on your pants is specifically called a “fly”, but this isn’t the word you’d use for the zipper on your jacket. Whether an animal is “cow” or “beef” depends on whether you’re going to eat it. A chair is called a throne when a monarch sits in it. And so on. These are usually issues you can clear up as you hear and use the language in a wider variety of contexts, but they illustrate the difficulty of accurately documenting a language, even for simple things like concrete nouns.

It’s even more difficult with words that don’t have a physical reference. Imagine now that the Zapotecan linguist wants to know the meaning of the words ‘so’, ‘well’, ‘actually’, or ‘oops’. How do you explain that? Give it a try and post in the comments. It’s not impossible, but it’s not easy either. You’ll probably find that it is easier to describe when you would use these words, rather than what they mean.

Recording a word list isn’t all that linguists are doing when they are documenting a language. Linguists also want to know about the grammar, the rules and patterns of the language. And you really can’t ask about these things. It just doesn’t work.

The main problem with asking about grammar is The Blank Stare. Not everyone has an education in formal grammar, and technical terms are meaningless to them. Here are some sample questions that a linguist might like to ask about a language. See how well you can answer them for English (or any language you might know):

How do you mark progressive and perfect aspect (if you do)?
How do you form pseudo-clefts?
What are some ways to nominalize a verb?
Is dative shift allowed?
Is there wh-movement?

Unless the person you are talking to has an advanced education in linguistics, they probably won’t know how to answer these questions. In even asking questions like these, you risk confusing, intimidating, or even embarrassing the person you are talking to. And that’s not going to help any future work.

It’s not a problem that people don’t know how to answer. Explicit knowledge of grammar isn’t required for fluency. If you don’t know how to answer these questions, you shouldn’t be embarrassed, or think that you have bad grammar. This is one of the most amazing things about human language: you don’t need to be aware of its immense complexity to be able to use it fluently or competently.

In any case, it’s best to avoid the direct questioning approach. Instead, you need to create the right kind of conversational context, to encourage the use of a particular kind of language. Stories work really well for this. Asking “how do you form the past tense?” is a dead end. But ask someone to tell a story about what they did before lunch, and they will be forced to use the past tense. From there, you can start to make some inferences about the language, and then it’s just the standard scientific method: hypothesize how a piece of grammar works, then test your hypothesis by further interacting with a native speaker.

For example, say you notice that all the past tense verbs you have encountered so far have the suffix -fu on them. You can hypothesize: “past tense is indicated by a suffix -fu”. Then you test this under different circumstances. Exactly how you go about this testing would depend on what theory of linguistics you were working with, and what aspect of the language interests you. Here are a few common things that a linguist would look for:

– Get the past tense for all possible person and number combinations. In other words, you want to get ‘I verb’, ‘you verb’, ‘he verb’, etc. This is a fairly obvious thing to do, since it’s very useful to have this information for basic communication purposes. It’s also extremely common for languages to have different suffixes for these purposes. In French, for instance, there are different past tense suffixes for nearly all of the different person/number combinations: ‘I could’ is je pouv-ais, ‘we could’ is nous pouv-ions, and ‘they could’ is elles pouv-aient.

– Get the past tense of verbs ending in as many different consonants and vowels as possible. Often suffixes change based on the final sound of a root. In Inuktitut, for example, person agreement suffixes start with a /t/ if the verb ends in a consonant, and they start with a /j/ if the verb ends in a vowel.

– Get the past tense with subjects of different genders, grammatical or actual. In Russian, the past tense (and only the past tense) changes depending on the (grammatical) gender of the subject.
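If you like to think in code, the checklist above amounts to filling in a grid. Here is a minimal Python sketch of that idea. The category names (three persons, two numbers, three genders, two kinds of root ending) are assumptions picked purely for illustration; a real language may carve things up differently, and a real fieldworker would adjust the dimensions as they learn more.

```python
from itertools import product

# Hypothetical elicitation dimensions. These categories are illustrative
# assumptions, not a claim about any particular language.
PERSONS = ["1st", "2nd", "3rd"]
NUMBERS = ["singular", "plural"]
GENDERS = ["masculine", "feminine", "neuter"]
FINAL_SOUNDS = ["vowel", "consonant"]


def elicitation_grid():
    """Return every combination of the dimensions as a list of dicts."""
    return [
        {"person": p, "number": n, "gender": g, "root_ends_in": s}
        for p, n, g, s in product(PERSONS, NUMBERS, GENDERS, FINAL_SOUNDS)
    ]


def remaining(grid, collected):
    """Return the cells of the grid that have not been elicited yet."""
    return [cell for cell in grid if cell not in collected]


grid = elicitation_grid()
collected = [grid[0]]  # pretend one combination came up in a story
todo = remaining(grid, collected)
print(len(grid), len(todo))
```

Even with these few made-up categories the grid has 36 cells, which gives a sense of how many sentences you need to coax out of a speaker before you can say much about a single suffix.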

If you check enough different things, chances are that at least one of them will contradict your original hypothesis that -fu is the past tense suffix (unless this language is incredibly regular). For example, it might turn out that for the small number of verbs ending with an /l/, the suffix becomes -lu instead of -fu, and you just didn’t happen to collect any of those verbs in your first few encounters with a speaker of the language. When you run into these counter-examples, you re-adjust your hypothesis to include this new information. In this particular case, a linguist would probably change her description of the language to now include the rule fu -> lu / l_ (read: -fu becomes -lu after an /l/).
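The hypothesize-test-revise loop just described can be sketched in a few lines of Python. Everything here is invented for illustration: the verb roots and “attested” forms are made-up data for the imaginary -fu language, and the function names (past_v1, past_v2, counterexamples) are hypothetical helpers, not part of any linguistic toolkit.

```python
# Made-up elicited data for the invented example language:
# (verb root, past-tense form the speaker actually produced).
ELICITED = [
    ("kata", "katafu"),
    ("mino", "minofu"),
    ("sil", "sillu"),
    ("tep", "tepfu"),
]


def past_v1(root):
    """First hypothesis: past tense is always root + -fu."""
    return root + "fu"


def past_v2(root):
    """Revised hypothesis: -fu becomes -lu after a root-final /l/."""
    suffix = "lu" if root.endswith("l") else "fu"
    return root + suffix


def counterexamples(hypothesis, data):
    """Return (root, attested, predicted) for every form the hypothesis gets wrong."""
    return [
        (root, attested, hypothesis(root))
        for root, attested in data
        if hypothesis(root) != attested
    ]


print(counterexamples(past_v1, ELICITED))  # the /l/-final verb shows up here
print(counterexamples(past_v2, ELICITED))  # nothing left for this toy data
```

An empty list of counter-examples doesn’t mean the revised rule is right. It only means it matches everything collected so far, which is the most a description of a language can claim before the next session with a speaker.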

So when you hear that linguists “don’t know” how part of a language works, it really means they haven’t come up with an accurate description that matches what real native speakers do. In a sense, linguistic discoveries are never really discoveries. Finding out how the past tense works will come as no new information to native speakers of that language. They already knew how to form the past tense (that’s part of what makes them native speakers in the first place); they just can’t answer direct questions about it. The challenge for linguists is finding a way to get access to this ‘hidden’ grammatical knowledge that we all carry around in our brains.


