We intuitively know what a WORD is. In written language, words are separated by spaces. In spoken language you can sometimes hear a pause between them, but in most cases nothing noticeable separates one word from the next.
We can distinguish the orthographic word, the grammatical word and the lexeme.
An ORTHOGRAPHIC WORD is a word form that is separated by spaces from other orthographic words in written text, together with its corresponding form in spoken language.
In the example:
She wanted to win the game.
there are six orthographic words: she, wanted, to, win, the and game.
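The count above can be reproduced mechanically. The following is a minimal sketch, assuming that splitting on spaces and trimming surrounding punctuation is a good enough approximation of an orthographic word; the function name orthographic_words is made up for this illustration.

```python
import string

def orthographic_words(sentence):
    """Split a sentence on whitespace and trim punctuation around each token."""
    tokens = (token.strip(string.punctuation) for token in sentence.split())
    # Drop tokens that consisted of punctuation only.
    return [token for token in tokens if token]

words = orthographic_words("She wanted to win the game.")
print(words)       # ['She', 'wanted', 'to', 'win', 'the', 'game']
print(len(words))  # 6
```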
A GRAMMATICAL WORD is a word form used for a specific grammatical purpose.
For example, in the sentence:
That man over there said that he would like to talk to you.
we have the word THAT used twice. This is one orthographic word, but we’re dealing with two grammatical words here: the first THAT is a demonstrative adjective and the other THAT is a conjunction.
A LEXEME is a group of word forms with the same basic meaning that belong to the same word class.
For example, the words AM, WAS and IS belong to one lexeme, as they have the same basic meaning and are all verbs. Likewise, the words COME and CAME belong to the same lexeme.
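One way to picture a lexeme is as a label shared by several word forms. The sketch below assumes a tiny hand-written table that covers only the forms mentioned above; the labels BE and COME and the names LEXEME_OF and same_lexeme are illustrative choices, not part of any standard resource.

```python
# Hypothetical lookup table: word form -> lexeme label.
LEXEME_OF = {
    "am": "BE",
    "is": "BE",
    "was": "BE",
    "come": "COME",
    "came": "COME",
}

def same_lexeme(form_a, form_b):
    """Return True if both word forms are listed under the same lexeme."""
    lexeme_a = LEXEME_OF.get(form_a.lower())
    lexeme_b = LEXEME_OF.get(form_b.lower())
    # Forms missing from the table are never treated as matching.
    return lexeme_a is not None and lexeme_a == lexeme_b

print(same_lexeme("AM", "WAS"))     # True
print(same_lexeme("COME", "CAME"))  # True
print(same_lexeme("IS", "CAME"))    # False
```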
How do they relate to one another?
In many cases orthographic and grammatical words overlap. For example, in the sentence:
They bought the house.
there are four orthographic words and four grammatical words, so there is a one-to-one correspondence in this case.
But if we slightly modify the sentence like so:
They didn’t buy the house.
there are now five orthographic words and six grammatical words. This is because the orthographic word DIDN’T represents a sequence of two grammatical words: DID + NOT.
It may also be the other way around. In the sentence:
I kind of like it.
there are five orthographic words, but only four grammatical words, because the two orthographic words KIND OF actually represent a single grammatical word.
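The three example sentences can be checked with a short sketch. It assumes two small hand-picked tables, CONTRACTIONS and MULTIWORD_UNITS, which are hypothetical and cover only the cases discussed here; a real analysis would need far more than a lookup table.

```python
import string

# Hypothetical tables covering only the examples in the text.
CONTRACTIONS = {"didn't": ["did", "not"]}
MULTIWORD_UNITS = {("kind", "of"): "kind of"}

def orthographic_words(sentence):
    """Lowercase, normalise the typographic apostrophe, split and trim."""
    cleaned = sentence.replace("\u2019", "'").lower()
    tokens = (token.strip(string.punctuation) for token in cleaned.split())
    return [token for token in tokens if token]

def grammatical_words(sentence):
    """Expand contractions and merge listed two-word units."""
    tokens = orthographic_words(sentence)
    result = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in MULTIWORD_UNITS:
            # Two orthographic words count as one grammatical word.
            result.append(MULTIWORD_UNITS[pair])
            i += 2
        elif tokens[i] in CONTRACTIONS:
            # One orthographic word counts as two grammatical words.
            result.extend(CONTRACTIONS[tokens[i]])
            i += 1
        else:
            result.append(tokens[i])
            i += 1
    return result

for sentence in ["They bought the house.",
                 "They didn\u2019t buy the house.",
                 "I kind of like it."]:
    print(len(orthographic_words(sentence)), len(grammatical_words(sentence)))
# 4 4
# 5 6
# 5 4
```

Note that the table treats every KIND OF as one unit, which would be wrong in a phrase such as "a kind of bird"; deciding that properly depends on how the words are used in the sentence.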