Comments I made in reply to a correspondent’s questions about delimiters and tokenizing in the Learner module may be worth sharing here.
As part of my Master’s work in psychology, I applied my program to a few samples of data from my advisor’s funded research study on family interactions. In one phase of the study, observers viewed videotaped sessions of family members (parent and child) interacting in various modes (play or work) and coded qualitative features of each moment’s activity over a period of time.
The following page describes the application in more detail and reflects on its implications for the conduct of scientific inquiry in general.
In this application a “word” or “string” is a fixed-length sequence of qualitative features and a “sentence” or “strand” is a sequence of words that ends with what the observer judges to be a significant pause in activity.
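To make the word and strand structure concrete, here is a minimal sketch in Python of how a flat stream of observer codes might be grouped. It is an illustration only, not the Learner module's actual code: the function name `tokenize`, the three-feature word length, the `"."` pause marker, and the sample codes are all assumptions chosen for the example.

```python
from typing import Iterable, List, Tuple

Word = Tuple[str, ...]   # one fixed-length sequence of qualitative features
Strand = List[Word]      # a sequence of words ending at a significant pause

PAUSE = "."              # hypothetical marker the observer enters at a pause
WORD_LENGTH = 3          # hypothetical number of features coded per moment


def tokenize(codes: Iterable[str],
             word_length: int = WORD_LENGTH,
             pause: str = PAUSE) -> List[Strand]:
    """Group a flat stream of feature codes into words, and words into strands."""
    strands: List[Strand] = []
    strand: Strand = []
    word: List[str] = []
    for code in codes:
        if code == pause:
            # A pause closes the current strand; it may not split a word.
            if word:
                raise ValueError("pause marker inside an incomplete word")
            if strand:
                strands.append(strand)
                strand = []
            continue
        word.append(code)
        if len(word) == word_length:
            strand.append(tuple(word))
            word = []
    if word:
        raise ValueError("stream ended inside an incomplete word")
    if strand:
        # Keep a final strand even if no closing pause was recorded.
        strands.append(strand)
    return strands


# Invented sample: (actor, mode, tone) coded for each moment of activity.
observations = ["parent", "play", "warm",
                "child", "play", "calm", ".",
                "parent", "work", "firm", "."]
strands = tokenize(observations)
```

With this sample, `strands` holds two strands: the first contains two words and the second contains one. Because every word has the same length, the only delimiter the observer has to supply is the pause marker; the word boundaries follow from counting.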
In the qualitative research phases of the study, one is simply trying to discern whatever significant or recurring patterns the data may hold.
In this case, the observers are tokenizing the observations according to a codebook that has passed enough intercoder reliability studies to give them all a measure of confidence that it captures meaningful aspects of whatever reality is passing before their eyes and ears.