I then began collecting Greek letters for the corpus. My goal is to collect letters from before 500 BCE through 500 CE. The purpose of this range is to build a large, diachronically balanced data set, in the hope of situating the New Testament letters within the history of the Greek language. Even a brief inspection shows that the New Testament letters also have features of ethical treatises and legal argumentation. Treatises and arguments from the philosophers are therefore also important for the analysis corpus. This way, the New Testament letters can be compared with a broad range of Greek letters, argumentative pieces, and ethical treatises.
The next step is to decide what should go in and what should be left out. It is not necessary to include every letter, legal discourse, or ethical discourse, since that would make the corpus too large to manage. More importantly, collecting all the available pieces would unbalance the corpus by weighting it too heavily toward certain centuries.
To keep the corpus diachronically balanced and manageable, only a sample of the pieces is taken from each era. Each era's sample is comparable in size to the text available from the eras with the fewest surviving texts. In addition, only letters above a certain length are included, since many very short letters survive that are little more than a quick note. It is deemed better to exclude these, as their brevity leaves room for few special constructions.
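The sampling procedure just described can be sketched in code. The following is only an illustration under stated assumptions, not the procedure actually used to build the corpus: it assumes each text is recorded with an era label and a word count, that the minimum length is a word-count threshold (the value 100 is a placeholder), and that the per-era budget is the total word count of the sparsest era. The function name `balanced_sample` and the field names are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(texts, min_words=100, seed=0):
    """Pick a per-era sample of roughly equal size.

    texts: list of dicts with hypothetical keys 'id', 'era', 'words'.
    min_words: assumed length threshold; shorter texts are dropped.
    Returns a dict mapping each era to its list of chosen texts.
    """
    # Step 1: drop very short notes and group the rest by era.
    by_era = defaultdict(list)
    for text in texts:
        if text["words"] >= min_words:
            by_era[text["era"]].append(text)
    if not by_era:
        return {}

    # Step 2: the word budget is set by the era with the least material.
    budget = min(sum(t["words"] for t in items) for items in by_era.values())

    # Step 3: in each era, add randomly ordered texts while they fit the budget.
    rng = random.Random(seed)
    sample = {}
    for era, items in by_era.items():
        shuffled = items[:]
        rng.shuffle(shuffled)
        chosen, total = [], 0
        for text in shuffled:
            if total + text["words"] <= budget:
                chosen.append(text)
                total += text["words"]
        sample[era] = chosen
    return sample
```

Under these assumptions, the sparsest era contributes all of its qualifying texts, and every other era contributes a random selection that does not exceed the same word budget. A text longer than the whole budget is never selected in this sketch; excerpting such a text would be one alternative.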